
  • Perspective

Stable learning establishes some common ground between causal inference and machine learning

Abstract

Causal inference has recently attracted substantial attention in the machine learning and artificial intelligence community. It is usually positioned as a distinct strand of research that can broaden the scope of machine learning from predictive modelling to intervention and decision-making. In this Perspective, however, we argue that ideas from causality can also be used to improve the stronghold of machine learning, predictive modelling, if predictive stability, explainability and fairness are important. With the aim of bridging the gap between the tradition of precise modelling in causal inference and black-box approaches from machine learning, stable learning is proposed and developed as a source of common ground. This Perspective clarifies a source of risk for machine learning models and discusses the benefits of bringing causality into learning. We identify the fundamental problems addressed by stable learning, as well as the latest progress from both causal inference and learning perspectives, and we discuss relationships with explainability and fairness problems.
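To make the reweighting idea concrete, the following is a minimal, hypothetical sketch in Python. It learns sample weights that drive the weighted covariance between covariates towards zero on a toy dataset with a deliberately spurious correlation. The toy data-generating process, all variable names and the plain gradient-descent optimizer are illustrative assumptions, not the implementation of any specific stable-learning paper.

```python
# Hypothetical sketch (not the authors' implementation) of the sample-reweighting
# idea behind stable learning: learn sample weights under which covariates become
# nearly uncorrelated, so that a model fitted on the weighted sample leans less
# on spurious correlations. Toy data and names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x2 is spuriously correlated with x1.
n = 2000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
X = np.column_stack([x1, x2])

def centred_and_offdiag_cov(w, X):
    """Weighted-centred covariates and the weighted covariance matrix
    with its diagonal zeroed out (only cross-covariances are penalized)."""
    mu = w @ X
    Xc = X - mu
    cov = (w[:, None] * Xc).T @ Xc
    return Xc, cov - np.diag(np.diag(cov))

# Gradient descent on log-weights; the softmax keeps the weights on the
# simplex. The loss is the sum of squared weighted covariances between
# all covariate pairs (illustrative optimizer, not tuned).
theta = np.zeros(n)
lr, steps = 100.0, 3000
for _ in range(steps):
    w = np.exp(theta - theta.max())
    w /= w.sum()
    Xc, off = centred_and_offdiag_cov(w, X)
    g = 2.0 * np.einsum('ij,jk,ik->i', Xc, off, Xc)  # dLoss / dw_i
    theta -= lr * w * (g - w @ g)                    # softmax chain rule

w = np.exp(theta - theta.max())
w /= w.sum()

def weighted_corr(w, X):
    """Weighted Pearson correlation between the two covariates."""
    mu = w @ X
    Xc = X - mu
    cov = (w[:, None] * Xc).T @ Xc
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

uniform = np.full(n, 1.0 / n)
print('corr(x1, x2), uniform weights:', weighted_corr(uniform, X))
print('corr(x1, x2), learned weights:', weighted_corr(w, X))
# A downstream predictive model would then be trained with these sample
# weights, for example via weighted least squares or a weighted loss.
```

This is only a schematic of the decorrelation step; published stable-learning methods differ in how they measure and remove statistical dependence (for example, handling nonlinear dependence or binary covariates) and in how the learned weights enter the final predictive model.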


Fig. 1: Three ways of generating correlations.
Fig. 2: The physical processes for generating datasets used in predictive modelling, occurring over time.
Fig. 3: Comparison of different learning paradigms.



Acknowledgements

Peng Cui’s research is supported by the National Key R&D Program of China (no. 2018AAA0102004), the National Natural Science Foundation of China (no. U1936219), the Beijing Academy of Artificial Intelligence (BAAI) and the Guoqiang Institute of Tsinghua University.

Author information


Corresponding author

Correspondence to Peng Cui.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Kush Varshney and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Cui, P., Athey, S. Stable learning establishes some common ground between causal inference and machine learning. Nat Mach Intell 4, 110–115 (2022). https://doi.org/10.1038/s42256-022-00445-z


