Abstract
Discovering causal relationships among observed variables is an important research focus in data mining. Existing causal discovery approaches are mainly based on constraint-based methods and functional causal models (FCMs). However, constraint-based methods cannot distinguish between causal structures that belong to the same Markov equivalence class, and functional causal models cannot identify the complex interrelationships that arise when multiple variables affect one variable. To address these two problems, we propose a new graph structure, the Causal Star Graph (CSG), and a corresponding framework, Causal Discovery via Causal Star Graphs (CD-CSG), which divides a causal directed acyclic graph into multiple CSGs for causal discovery. Within this framework, we also propose a generalized learning procedure in CSGs, based on a variational approach, that learns a representative intermediate variable for the non-central variables of a CSG. This generalized learning exposes the asymmetry between the forward and backward models of CD-CSG, which is used to identify the causal directions in the directed acyclic graph. We further divide CSGs into three categories and provide the causal identification principle for each category in the proposed framework. Experiments on synthetic data show that CD-CSG effectively identifies the causal relationships between variables and achieves higher accuracy than the best existing model. Applied to real-world data, CD-CSG can greatly improve the applicability and effectiveness of causal discovery.
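As an illustration of the Markov equivalence limitation noted above for constraint-based methods, the following minimal sketch checks whether two small DAGs are Markov equivalent using the standard graphical criterion (same skeleton and same v-structures). It is not part of CD-CSG; the function names, the edge-list representation, and the three-variable examples are assumptions made here for illustration only, and both graphs are assumed to share the same node set.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as (cause, effect) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Colliders a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for cause, effect in set(edges):
        parents.setdefault(effect, set()).add(cause)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges_1, edges_2):
    """Graphical criterion: identical skeletons and identical v-structures.

    Assumes both edge lists describe DAGs over the same set of nodes.
    """
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))


# Hypothetical three-variable examples.
chain = [("X", "Y"), ("Y", "Z")]     # X -> Y -> Z
fork = [("Y", "X"), ("Y", "Z")]      # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]  # X -> Y <- Z

print(markov_equivalent(chain, fork))      # chain and fork are equivalent
print(markov_equivalent(chain, collider))  # the collider is not
```

The chain and the fork imply the same conditional independences, so conditional independence tests alone cannot orient their edges. Additional asymmetries, such as the forward and backward model asymmetry described in the abstract, are needed to choose between them.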