Weighting-Based Identification and Estimation in Graphical Models of Missing Data

arXiv:2602.10969v1 Announce Type: cross
Abstract: We propose a constructive algorithm for identifying complete data distributions in graphical models of missing data. The complete data distribution is unrestricted, while the missingness mechanism is assumed to factorize according to a conditional directed acyclic graph. Our approach follows an interventionist perspective in which missingness indicators are treated as variables that can be intervened on. A central challenge in this setting is that sequences of interventions on missingness indicators may induce and propagate selection bias, so that identification can fail even when a propensity score is invariant to available interventions. To address this challenge, we introduce a tree-based identification algorithm that explicitly tracks the creation and propagation of selection bias and determines whether it can be avoided through admissible intervention strategies. The resulting tree provides both a diagnostic and a constructive characterization of identifiability under a given missingness mechanism. Building on these results, we develop recursive inverse probability weighting procedures that mirror the intervention logic of the identification algorithm, yielding valid estimating equations for both the missingness mechanism and functionals of the complete data distribution. Simulation studies and a real-data application illustrate the practical performance of the proposed methods. An accompanying R package, flexMissing, implements all proposed procedures.
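
For readers unfamiliar with inverse probability weighting, the single-step idea that the recursive procedures build on can be sketched as follows. This is only an illustrative toy, not the paper's tree-based algorithm and not the flexMissing API: the function names, the data, and the assumption that the propensity of being observed is known and depends only on a fully observed covariate are all hypothetical.

```python
from typing import Optional, Sequence


def complete_case_mean(y: Sequence[Optional[float]]) -> float:
    """Average of the observed values only (ignores the missingness mechanism)."""
    observed = [v for v in y if v is not None]
    return sum(observed) / len(observed)


def ipw_mean(y: Sequence[Optional[float]], propensity: Sequence[float]) -> float:
    """Normalized (Hajek-type) inverse probability weighted mean.

    Each observed value is weighted by 1 / P(observed | covariates), so that
    units that are unlikely to be observed stand in for similar missing units.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, p in zip(y, propensity):
        if value is None:  # missing outcome contributes nothing
            continue
        weight = 1.0 / p
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical toy data: two strata of four units each.
# Stratum A: outcome 10, observed with probability 0.5 (2 of 4 observed).
# Stratum B: outcome 20, observed with probability 0.25 (1 of 4 observed).
# The complete-data mean is (4 * 10 + 4 * 20) / 8 = 15.
y = [10.0, 10.0, None, None, 20.0, None, None, None]
propensity = [0.5] * 4 + [0.25] * 4

naive = complete_case_mean(y)        # biased toward the better-observed stratum
weighted = ipw_mean(y, propensity)   # recovers the complete-data mean here
```

In this toy setting the complete-case average is pulled toward the stratum that is observed more often, while the weighted average matches the complete-data mean because the observed fractions equal the assumed propensities. The setting described in the abstract is harder: propensities may depend on variables that are themselves missing, which is why the weights there have to be built recursively.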
