Causal discovery from multi-domain data using the independence of modularities

Qiao, Jie; Bai, Yiming; Cai, Ruichu; Hao, Zhifeng

doi:10.1007/s00521-021-06507-4

Original Article
Published: 17 September 2021

Volume 34, pages 1939–1949, (2022)
Neural Computing and Applications

Jie Qiao¹, Yiming Bai¹, Ruichu Cai¹,² (ORCID: orcid.org/0000-0001-8972-167X) & Zhifeng Hao³

Abstract

Finding causal relationships that generalize across scenarios is a fundamental problem in science. In real-world settings, however, data commonly exhibit distribution shift: the underlying generating process changes across domains. Such shift challenges causal discovery from observational data, because most current models assume a fixed causal mechanism even when the data are heterogeneous, and as a consequence the causal direction fails to be identified. Fortunately, in a general causal system, the distributions in the causal direction (but not in the anti-causal direction) change independently across domains, which suggests identifying the causal direction in multi-domain data by measuring this independent change. By investigating the modularity of the causal mechanism in discretized multi-domain data, we establish theoretical results on the identification of the causal direction under a mild technical condition. Going one step further, by utilizing the discretization technique, we propose a general framework for causal direction identification in multi-domain data that does not assume a specific causal mechanism or data type. We verify the effectiveness of the proposed methods on synthetic data and successfully identify the causal direction in two real-world datasets.
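The independent-change idea in the abstract can be illustrated with a toy sketch. The code below is not the authors' algorithm: it assumes binary (already discretized) variables, a synthetic generator in which X causes Y, and absolute Pearson correlation across domains as a crude stand-in for a proper independence measure. The function names `simulate_domains`, `dependence_score`, and `infer_direction`, and all parameter ranges, are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_domains(n_domains=300, n_samples=500, seed=0):
    """Toy generator (assumption): X -> Y, with P(X) and P(Y|X) redrawn
    independently in every domain. Returns counts[d, x, y]."""
    rng = np.random.default_rng(seed)
    tables = np.zeros((n_domains, 2, 2), dtype=int)
    for d in range(n_domains):
        p_x1 = rng.uniform(0.1, 0.9)    # P(X=1) in this domain
        p_y1_x0 = rng.uniform(0.05, 0.3)  # P(Y=1 | X=0) in this domain
        p_y1_x1 = rng.uniform(0.6, 0.95)  # P(Y=1 | X=1) in this domain
        x = rng.random(n_samples) < p_x1
        y = rng.random(n_samples) < np.where(x, p_y1_x1, p_y1_x0)
        cell = 2 * x.astype(int) + y.astype(int)
        tables[d] = np.bincount(cell, minlength=4).reshape(2, 2)
    return tables


def dependence_score(tables, eps=0.5):
    """Score how strongly the marginal of the candidate cause (axis 1) and
    the conditional of the candidate effect (axis 2) co-vary across domains.
    Lower means 'more independent change'."""
    t = tables.astype(float) + eps          # additive smoothing
    cause_counts = t.sum(axis=2)            # shape (domains, 2)
    p_cause1 = cause_counts[:, 1] / cause_counts.sum(axis=1)
    p_effect1_given_cause = t[:, :, 1] / cause_counts
    corrs = [
        abs(np.corrcoef(p_cause1, p_effect1_given_cause[:, c])[0, 1])
        for c in (0, 1)
    ]
    return max(corrs)


def infer_direction(tables):
    """Pick the direction whose marginal and conditional change more
    independently across domains."""
    score_xy = dependence_score(tables)                      # X as cause
    score_yx = dependence_score(tables.transpose(0, 2, 1))   # Y as cause
    return "X->Y" if score_xy < score_yx else "Y->X"


if __name__ == "__main__":
    print(infer_direction(simulate_domains()))
```

Pearson correlation only detects linear dependence, so this sketch can miss dependence that a kernel-based or information-theoretic measure would catch; it is meant only to make the asymmetry between the causal and anti-causal factorizations concrete.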

Acknowledgements

This research was supported in part by the Natural Science Foundation of China (61876043, 61976052), the Science and Technology Planning Project of Guangzhou (201902010058), and the Guangdong Provincial Science and Technology Innovation Strategy Fund (2019B121203012).

Author information

Authors and Affiliations

School of Computer Science, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
Jie Qiao, Yiming Bai & Ruichu Cai
Guangdong Provincial Key Laboratory of Public Finance and Taxation with Big Data Application, Guangzhou, 510320, Guangdong, China
Ruichu Cai
College of Science, Shantou University, Shantou, 515063, Guangdong, China
Zhifeng Hao

Corresponding author

Correspondence to Ruichu Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Qiao, J., Bai, Y., Cai, R. et al. Causal discovery from multi-domain data using the independence of modularities. Neural Comput & Applic 34, 1939–1949 (2022). https://doi.org/10.1007/s00521-021-06507-4

Received: 13 February 2021
Accepted: 31 August 2021
Published: 17 September 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s00521-021-06507-4
