Causal discovery from multi-domain data using the independence of modularities

  • Original Article
  • Neural Computing and Applications

Abstract

Finding causal relationships that generalize across scenarios is a fundamental problem in science. In real-world settings, however, data commonly exhibit distribution shift: the underlying generating process changes across domains. Such shift poses a challenge for causal discovery from observational data, because most current models assume a causal mechanism that stays fixed even when the data are heterogeneous. As a consequence, these models can fail to identify the causal direction. Fortunately, in a general causal system, the distributions in the causal direction (but not in the anti-causal direction) change independently across domains, which suggests identifying the causal direction in multi-domain data by measuring how independently those distributions change. By investigating the modularity of the causal mechanism in discretized multi-domain data, we establish theoretical results on the identification of the causal direction under a mild technical condition. Going one step further, we use the discretization technique to propose a general framework for causal direction identification in multi-domain data that does not assume a specific causal mechanism or data type. We verify the effectiveness of the proposed methods on synthetic data and successfully identify the causal direction in two real-world datasets.
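
The independent-change idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes binary (already discretized) variables, a simulated ground truth X -> Y in which P(X) and P(Y|X) are drawn independently in each domain, and a simple Pearson-correlation score standing in for whatever independence measure the article actually uses. All function names and parameter ranges are hypothetical.

```python
import numpy as np

def domain_parameters(cause, effect):
    """Estimate P(cause=1), P(effect=1|cause=0), P(effect=1|cause=1) in one domain."""
    marginal = cause.mean()
    cond0 = effect[cause == 0].mean()
    cond1 = effect[cause == 1].mean()
    return marginal, cond0, cond1

def dependence_score(domains, direction):
    """Largest |Pearson correlation|, across domains, between the marginal
    parameter and each conditional parameter of the hypothesised direction.
    (Assumption: a linear score is enough for this toy example.)"""
    params = []
    for x, y in domains:
        cause, effect = (x, y) if direction == "X->Y" else (y, x)
        params.append(domain_parameters(cause, effect))
    params = np.array(params)                  # shape: (n_domains, 3)
    corr = np.corrcoef(params, rowvar=False)   # 3 x 3 correlation matrix
    return max(abs(corr[0, 1]), abs(corr[0, 2]))

def simulate_domains(n_domains=200, n_samples=2000, seed=0):
    """Simulated data with true direction X -> Y; the marginal and the
    mechanism are sampled independently in every domain."""
    rng = np.random.default_rng(seed)
    domains = []
    for _ in range(n_domains):
        p_x = rng.uniform(0.2, 0.8)            # P(X=1) in this domain
        p_y_given_x0 = rng.uniform(0.1, 0.4)   # P(Y=1 | X=0)
        p_y_given_x1 = rng.uniform(0.6, 0.9)   # P(Y=1 | X=1)
        x = (rng.random(n_samples) < p_x).astype(int)
        p_y = np.where(x == 1, p_y_given_x1, p_y_given_x0)
        y = (rng.random(n_samples) < p_y).astype(int)
        domains.append((x, y))
    return domains

domains = simulate_domains()
scores = {d: dependence_score(domains, d) for d in ("X->Y", "Y->X")}
inferred = min(scores, key=scores.get)  # direction with the most independent change
print(scores, inferred)
```

In this toy setting the factorization along the true direction yields parameters that vary nearly independently across domains, while the reversed factorization mixes the marginal and the mechanism, so its parameters co-vary. Continuous variables would first have to be binned, and a general independence test could replace the correlation score; both are left out here.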

Acknowledgements

This research was supported in part by the Natural Science Foundation of China (61876043, 61976052), the Science and Technology Planning Project of Guangzhou (201902010058), and the Guangdong Provincial Science and Technology Innovation Strategy Fund (2019B121203012).

Author information

Corresponding author

Correspondence to Ruichu Cai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Qiao, J., Bai, Y., Cai, R. et al. Causal discovery from multi-domain data using the independence of modularities. Neural Comput & Applic 34, 1939–1949 (2022). https://doi.org/10.1007/s00521-021-06507-4

  • DOI: https://doi.org/10.1007/s00521-021-06507-4
