Causal discovery is the task of learning causal models, which encode causal relationships, from a source of information such as an observational dataset. While many algorithms have been developed to discover causal models under varied sets of assumptions, the case in which the dataset is affected by missing data remains significantly underexplored. Naively applying standard causal discovery algorithms to listwise, test-wise, or regression-wise deleted datasets, or to imputed data, can introduce spurious associations between variables and bias function estimation in functional causal models. This issue arises when the data is missing at random (MAR) or missing not at random (MNAR). It ultimately invalidates the theoretical guarantees of these algorithms and prevents them from finding the true underlying causal model, even in the large-sample limit. An established family of causal models is the Linear Non-Gaussian Acyclic Model (LiNGAM), which assumes linear functional relationships and independent non-Gaussian noise terms. We propose a new causal discovery algorithm for LiNGAM, capable of recovering the underlying causal structure and providing unbiased estimates of the model’s parameters, even when the data is MNAR.
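
The bias that deletion introduces can be illustrated with a small simulation. The sketch below is not the proposed algorithm; it only shows, under assumptions chosen for illustration, how listwise deletion distorts the estimated coefficient in a two-variable linear non-Gaussian model. The function names (`simulate`, `ols_slope`), the coefficient value, the uniform noise, and the logistic self-masking mechanism (the chance that `y` is missing grows with `y` itself, one form of missing-not-at-random data) are all hypothetical choices, not taken from the paper.

```python
import numpy as np


def simulate(n=200_000, b=1.5, seed=0):
    """Draw (x, y) from x -> y with uniform noise, plus an observation mask for y."""
    rng = np.random.default_rng(seed)
    # Linear, acyclic, non-Gaussian model: y = b * x + noise.
    x = rng.uniform(-1.0, 1.0, size=n)
    y = b * x + rng.uniform(-1.0, 1.0, size=n)
    # Assumed self-masking mechanism: larger y is more likely to be missing.
    p_missing = 1.0 / (1.0 + np.exp(-3.0 * y))
    observed = rng.uniform(size=n) > p_missing
    return x, y, observed


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


x, y, observed = simulate()

# Slope on the complete data (unavailable in practice) versus the slope
# after listwise deletion, i.e. keeping only the rows where y is observed.
slope_full = ols_slope(x, y)
slope_listwise = ols_slope(x[observed], y[observed])

print(f"true coefficient:        1.5")
print(f"complete-data estimate:  {slope_full:.3f}")
print(f"listwise-deleted:        {slope_listwise:.3f}")
```

In this setup the complete-data estimate stays close to the true coefficient, while the listwise-deleted estimate is pulled toward zero, and drawing more samples does not close the gap because the surviving rows are a selected subpopulation.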