Density Ratio-based Causal Discovery from Bivariate Continuous-Discrete Data

arXiv:2505.08371v4 Announce Type: replace-cross
Abstract: We address the problem of inferring the causal direction between a continuous variable $X$ and a discrete variable $Y$ from observational data. For the model $X \to Y$, we adopt the threshold model used in prior work. For the model $Y \to X$, we consider two cases: (1) the conditional distributions of $X$ given different values of $Y$ form a location-shift family, and (2) they are mixtures of generalized normal distributions with independently parameterized components. We establish identifiability of the causal direction through three theoretical results. First, we prove that under $X \to Y$, the density ratio of $X$ conditioned on different values of $Y$ is monotonic. Second, we establish that under $Y \to X$ with non-location-shift conditionals, monotonicity of the density ratio holds only on a set of Lebesgue measure zero in the parameter space. Third, we show that under $X \to Y$, the conditional distributions can form a location-shift family only if the causal mechanism and the input distribution are precisely coordinated, which is non-generic under the principle of independent mechanisms. Together, these results imply that monotonicity of the density ratio characterizes the direction $X \to Y$, whereas non-monotonicity or location-shift conditionals characterize $Y \to X$. Based on this, we propose Density Ratio-based Causal Discovery (DRCD), a method that determines causal direction by testing for location-shift conditionals and monotonicity of the estimated density ratio. Experiments on synthetic and real-world datasets demonstrate that DRCD outperforms existing methods.
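The monotonicity contrast described in the abstract can be illustrated with a small numerical sketch. This is not the DRCD procedure itself, which works on estimated density ratios and includes a test for location-shift conditionals. It only evaluates closed-form ratios on a grid for two toy models chosen here as assumptions: a threshold model $Y = 1[X + E > 0]$ with logistic noise $E$ (so that $P(Y=1 \mid x)$ is the sigmoid of $x$), and a $Y \to X$ model whose conditionals are zero-mean normals with different scales. The helper names `normal_pdf`, `sigmoid`, and `is_monotonic` are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sigmoid(t):
    """Logistic CDF, used here as an assumed form of P(Y=1 | X=t)."""
    return 1.0 / (1.0 + math.exp(-t))


def is_monotonic(values, tol=1e-12):
    """True if the sequence is entirely non-decreasing or non-increasing."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return all(d >= -tol for d in diffs) or all(d <= tol for d in diffs)


# Evaluation grid on [-4, 4].
grid = [-4.0 + 0.05 * i for i in range(161)]

# Toy X -> Y model (assumption): Y = 1[X + E > 0] with logistic noise E.
# By Bayes' rule,
#   p(x | Y=1) / p(x | Y=0) = [P(Y=1|x) / P(Y=0|x)] * [P(Y=0) / P(Y=1)],
# and the second factor is a positive constant, so it is dropped here.
ratio_x_to_y = [sigmoid(x) / (1.0 - sigmoid(x)) for x in grid]

# Toy Y -> X model (assumption): X | Y=1 ~ N(0, 1) and X | Y=0 ~ N(0, 2^2).
# The conditionals differ in scale, so they are not a location-shift family.
ratio_y_to_x = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in grid]

x_to_y_monotonic = is_monotonic(ratio_x_to_y)
y_to_x_monotonic = is_monotonic(ratio_y_to_x)

print("X -> Y toy ratio monotonic:", x_to_y_monotonic)
print("Y -> X toy ratio monotonic:", y_to_x_monotonic)
```

In the first toy model the ratio is proportional to $e^{x}$, hence increasing; in the second it is proportional to $e^{-3x^2/8}$, which peaks at $x = 0$ and is therefore non-monotonic.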
