A Genetic Algorithm for Causal Discovery Based on Structural Causal Model

Chen, Zhengyin; Liu, Kun; Jiao, Wenpin

doi:10.1007/978-3-031-20503-3_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13606)

Included in the following conference series:

CAAI International Conference on Artificial Intelligence

Abstract

As large amounts of data accumulate in many fields, causal discovery from observational data is attracting growing attention and is considered a basis for realizing strong artificial intelligence. However, the main existing causal discovery methods, including constraint-based methods, structural causal model based methods, and scoring-based methods, cannot find real causal relations both accurately and quickly. In this paper, we propose a causal discovery method that combines the structural causal model, a scoring method, and a genetic search algorithm. The core of our method is to divide the causal relation discovery process into an evaluation phase based on the features of the structural causal model and a search phase based on the genetic algorithm. In the evaluation phase, a candidate causal graph is evaluated from three aspects: model deviation, noise independence, and causal graph cyclicity, which helps ensure the accuracy of causal discovery. In the search phase, an efficient random search is designed based on the genetic algorithm, which improves the efficiency of causal discovery. We implement the corresponding algorithm, SCM-GA (Structural Causal Model based Genetic Algorithm), and conduct experiments on several simulated datasets and one widely used real-world dataset. The experiments compare SCM-GA with five classic baseline algorithms, and the results show that SCM-GA improves accuracy, applicability, and efficiency. In particular, on the real-world dataset, SCM-GA achieves better results than the state-of-the-art algorithm, with a similar SHD (Structural Hamming Distance) value, a 40% higher recall rate, and an 83.3% shorter running time.
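The abstract refers to causal graph cyclicity as an evaluation criterion and to SHD and recall as reporting metrics. The following is a minimal sketch of how these three quantities are commonly computed on adjacency matrices; it is not taken from the SCM-GA implementation. The function names (`is_acyclic`, `shd`, `recall`), the 0/1 matrix encoding, and the convention that a reversed edge counts as a single SHD error are all assumptions made for illustration.

```python
from typing import List

# Assumed encoding: g[i][j] == 1 means there is a directed edge i -> j.
Graph = List[List[int]]


def is_acyclic(g: Graph) -> bool:
    """Kahn's algorithm: the graph is acyclic iff every node can be
    removed in topological order."""
    n = len(g)
    indegree = [sum(g[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in range(n):
            if g[node][child]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == n


def shd(pred: Graph, true: Graph) -> int:
    """Structural Hamming Distance: number of node pairs whose edge
    status differs (missing, extra, or reversed edge).
    Assumption: a reversed edge counts as one error, not two."""
    n = len(true)
    dist = 0
    for i in range(n):
        for j in range(i + 1, n):
            if (pred[i][j], pred[j][i]) != (true[i][j], true[j][i]):
                dist += 1
    return dist


def recall(pred: Graph, true: Graph) -> float:
    """Fraction of true directed edges recovered with the correct direction."""
    n = len(true)
    true_edges = [(i, j) for i in range(n) for j in range(n) if true[i][j]]
    if not true_edges:
        return 1.0
    hits = sum(1 for i, j in true_edges if pred[i][j])
    return hits / len(true_edges)


# Toy example: the true graph is 0 -> 1 -> 2.
true_graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
# The predicted graph keeps 0 -> 1, reverses 1 -> 2, and adds 0 -> 2.
pred_graph = [[0, 1, 1], [0, 0, 0], [0, 1, 0]]

print(is_acyclic(pred_graph))        # True
print(shd(pred_graph, true_graph))   # 2 (one reversed edge, one extra edge)
print(recall(pred_graph, true_graph))  # 0.5
```

In a genetic search of the kind the abstract describes, an acyclicity check like this could serve as a penalty term or a feasibility filter for candidate graphs, while SHD and recall are computed only afterwards, against a known ground-truth graph.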

Supported by National Science and Technology Major Project (Grant No. 2020AAA0109401).

Notes

1.
It is commonly referred to as causal discovery. This abbreviation is also used in the related literature, so we adopt it in this paper.

Author information

Authors and Affiliations

Institute of Software, School of Computer Science, Peking University, Beijing, 100871, China
Zhengyin Chen, Kun Liu & Wenpin Jiao
Key Laboratory of High Confidence Software Technology (Peking University), MOE, China
Zhengyin Chen, Kun Liu & Wenpin Jiao

Corresponding author

Correspondence to Wenpin Jiao.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Lu Fang
Xiaomi Inc., Beijing, China
Daniel Povey
Shanghai Jiao Tong University, Shanghai, China
Guangtao Zhai
JD Explore Academy, Beijing, China
Tao Mei
Chinese Academy of Sciences, Beijing, China
Ruiping Wang

About this paper

Cite this paper

Chen, Z., Liu, K., Jiao, W. (2022). A Genetic Algorithm for Causal Discovery Based on Structural Causal Model. In: Fang, L., Povey, D., Zhai, G., Mei, T., Wang, R. (eds) Artificial Intelligence. CICAI 2022. Lecture Notes in Computer Science, vol 13606. Springer, Cham. https://doi.org/10.1007/978-3-031-20503-3_4

DOI: https://doi.org/10.1007/978-3-031-20503-3_4
Published: 17 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20502-6
Online ISBN: 978-3-031-20503-3
eBook Packages: Computer Science, Computer Science (R0)
