Abstract
Causal discovery from data affected by latent confounders is an important and difficult challenge. Causal functional model-based approaches have not been used to indicate which variables have relationships affected by latent confounders, whereas some constraint-based methods can do so. This paper proposes a causal functional model-based method called repetitive causal discovery (RCD) to discover the causal structure of observed variables affected by latent confounders. RCD repeatedly infers the causal directions among a small number of observed variables and determines whether their relationships are affected by latent confounders. RCD finally produces a causal graph in which a bidirected arrow marks a pair of variables that share the same latent confounders, and a directed arrow gives the causal direction between a pair of variables that are not affected by the same latent confounder. Experiments on simulated and real-world data confirmed that RCD is effective in identifying latent confounders and causal directions between observed variables.
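The idea of inferring a causal direction between two observed variables can be illustrated with a minimal sketch. This is not the RCD algorithm itself: it assumes a linear model with non-Gaussian (here uniform) noise, fits a least-squares regression in each direction, and scores how strongly the residual still depends on the regressor using a biased HSIC estimate with Gaussian kernels. The function names (`gram`, `hsic`, `residual`, `dependence_scores`), the bandwidth choice, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def gram(v):
    """Gaussian kernel Gram matrix; bandwidth set by a median heuristic (an assumption)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    bandwidth2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * bandwidth2))


def hsic(a, b):
    """Biased HSIC estimate: larger values suggest stronger dependence."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(gram(a) @ H @ gram(b) @ H)) / n**2


def residual(target, regressor):
    """Residual of a simple least-squares regression of target on regressor."""
    slope = np.cov(target, regressor)[0, 1] / np.var(regressor, ddof=1)
    intercept = target.mean() - slope * regressor.mean()
    return target - (slope * regressor + intercept)


def dependence_scores(x, y):
    """Return (score for x -> y, score for y -> x).

    Each score is the dependence between the assumed cause and the
    residual of the assumed effect; a lower score favors that direction.
    """
    return hsic(x, residual(y, x)), hsic(y, residual(x, y))


# Simulated data (hypothetical): x causes y, both noise terms are uniform.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(-1.0, 1.0, n)
y = x + rng.uniform(-1.0, 1.0, n)

score_xy, score_yx = dependence_scores(x, y)
print(f"x -> y score: {score_xy:.5f}")
print(f"y -> x score: {score_yx:.5f}")
```

In this sketch the direction with the lower score is the one in which the residual looks closer to independent of the regressor. A fuller treatment would replace the raw comparison with a statistical independence test at a chosen significance level; one plausible reading of the abstract is that a pair for which independence is rejected in both directions is a candidate for being affected by a latent confounder.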
References
Chickering, D.M.: Optimal structure identification with greedy search. J. Mach. Learn. Res. 3, 507–554 (2002)
Colombo, D., Maathuis, M.H., Kalisch, M., Richardson, T.S.: Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Stat. 40(1), 294–321 (2012). https://doi.org/10.1214/11-AOS940
Darmois, G.: Analyse générale des liaisons stochastiques: étude particulière de l’analyse factorielle linéaire. Rev. Int. Stat. Inst. 21, 2–8 (1953)
Duncan, O.D., Featherman, D.L., Duncan, B.: Socioeconomic background and achievement. New York (1972)
Gretton, A., Fukumizu, K., Teo, C.H., Song, L., Schölkopf, B., Smola, A.J.: A kernel statistical test of independence. In: Platt, J.C., Koller, D., Singer, Y., Roweis, S.T. (eds.) Advances in Neural Information Processing Systems, pp. 585–592. Curran Associates, Inc, USA (2008)
Hoyer, P.O., Janzing, D., Mooij, J.M., Peters, J., Schölkopf, B.: Nonlinear causal discovery with additive noise models. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, pp. 689–696. Curran Associates, Inc, USA (2009)
Hoyer, P.O., Shimizu, S., Kerminen, A.J., Palviainen, M.: Estimation of causal effects using linear non-Gaussian causal models with hidden variables. Int. J. Approx. Reason. 49(2), 362–378 (2008). https://doi.org/10.1016/j.ijar.2008.02.006. (Special Section on Probabilistic Rough Sets and Special Section on PGM’06)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1), 503–528 (1989). https://doi.org/10.1007/BF01589116
Maeda, T.N., Shimizu, S.: RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders. In: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (AISTATS2020), pp. 735–745 (2020)
Mooij, J., Janzing, D., Peters, J., Schölkopf, B.: Regression by dependence minimization and its application to causal inference in additive noise models. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09, pp. 745–752. ACM, New York, NY, USA (2009). https://doi.org/10.1145/1553374.1553470
Ogarrio, J.M., Spirtes, P., Ramsey, J.: A hybrid causal search algorithm for latent variable models. In: Conference on Probabilistic Graphical Models, pp. 368–379 (2016)
Pearl, J.: Comment: graphical models, causality and intervention. Stat. Sci. 8(3), 266–269 (1993)
Pearl, J.: Causality: models, reasoning and inference. Cambridge University Press, Cambridge (2000)
Peters, J., Mooij, J.M., Janzing, D., Schölkopf, B.: Causal discovery with continuous additive noise models. J. Mach. Learn. Res. 15(1), 2009–2053 (2014)
Shapiro, S.S., Wilk, M.B.: An analysis of variance test for normality (complete samples). Biometrika 52(3/4), 591–611 (1965)
Shimizu, S., Hoyer, P.O., Hyvärinen, A., Kerminen, A.: A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 7, 2003–2030 (2006)
Shimizu, S., Inazumi, T., Sogawa, Y., Hyvärinen, A., Kawahara, Y., Washio, T., Hoyer, P.O., Bollen, K.: DirectLiNGAM: a direct method for learning a linear non-Gaussian structural equation model. J. Mach. Learn. Res. 12, 1225–1248 (2011)
Skitovitch, V.P.: On a property of the normal distribution. Doklady Akademii Nauk SSSR 89, 217–219 (1953)
Spirtes, P., Glymour, C.: An algorithm for fast recovery of sparse causal graphs. Social Sci. Comput. Rev. 9(1), 62–72 (1991)
Spirtes, P., Meek, C., Richardson, T.: Causal discovery in the presence of latent variables and selection bias. In: Cooper, G.F., Glymour, C.N. (eds.) Computation, causality, and discovery, pp. 211–252. AAAI Press, USA (1999)
Yamada, M., Sugiyama, M.: Dependence minimizing regression with model selection for non-linear causal inference under non-Gaussian noise. In: Twenty-Fourth AAAI Conference on Artificial Intelligence (2010)
Zhang, H., Zhou, S., Zhang, K., Guan, J.: Causal discovery using regression-based conditional independence tests. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006). https://doi.org/10.1198/016214506000000735
Acknowledgements
We thank Dr. Samuel Y. Wang for his useful comments on a previous version of our algorithm proposed in [9]. Takashi Nicholas Maeda has been partially supported by Grant-in-Aid for Scientific Research (C) from Japan Society for the Promotion of Science (JSPS) #20K19872. Shohei Shimizu has been partially supported by ONRG NICOP N62909-17-1-2034 and Grant-in-Aid for Scientific Research (C) from Japan Society for the Promotion of Science (JSPS) #16K00045 and #20K11708.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Maeda, T.N., Shimizu, S. Repetitive causal discovery of linear non-Gaussian acyclic models in the presence of latent confounders. Int J Data Sci Anal 13, 77–89 (2022). https://doi.org/10.1007/s41060-021-00282-0
DOI: https://doi.org/10.1007/s41060-021-00282-0