Continuous treatment effect estimation via generative adversarial de-confounding

Kuang, Kun; Li, Yunzhe; Li, Bo; Cui, Peng; Yang, Hongxia; Tao, Jianrong; Wu, Fei

doi:10.1007/s10618-021-00797-x

Published: 22 September 2021

Volume 35, pages 2467–2497, (2021)

Data Mining and Knowledge Discovery

Kun Kuang (ORCID: orcid.org/0000-0001-5524-5185)¹, Yunzhe Li¹, Bo Li², Peng Cui², Hongxia Yang³, Jianrong Tao⁴ & Fei Wu¹

Abstract

A fundamental problem in causal inference is treatment effect estimation in observational studies, where the key challenge is handling the confounding bias induced by associations between the covariates and the treatment variable. In this paper, we study effect estimation for continuous treatments from observational data, going beyond previous work on binary treatments. Previous work on binary treatments focuses on de-confounding by balancing the distribution of covariates between the treated and control groups, using either propensity scores or confounder balancing techniques. In the continuous setting, those methods fail because we can hardly evaluate the distribution of covariates under each treatment level. To handle continuous treatments, we propose a novel Generative Adversarial De-confounding (GAD) algorithm that eliminates the associations between covariates and treatment variable in two main steps: (1) generating a “calibration” distribution without associations between covariates and treatment by randomly perturbing the treatment variable; (2) learning sample weights that transfer the distribution of the observed data to the “calibration” distribution for de-confounding with a Generative Adversarial Network. We show, both theoretically and empirically, that our GAD algorithm can remove the associations between covariates and treatment and hence precisely estimate the causal effect of a continuous treatment. Extensive experiments on both synthetic and real-world datasets demonstrate that our algorithm outperforms state-of-the-art methods for effect estimation of continuous treatments with observational data.
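
The two steps in the abstract can be illustrated with a rough sketch. This is not the authors' GAD implementation: the Generative Adversarial Network is replaced here by a plain logistic-regression discriminator on hand-picked quadratic features, and the weights are read off as its estimated density ratio. The function names (`calibration_sample`, `deconfounding_weights`, and so on), the synthetic data, and the choice of a random permutation as the "perturbation" are all assumptions made for illustration.

```python
import numpy as np

def calibration_sample(X, T, rng):
    # Step (1), assumed form: shuffle T across units. This keeps the marginals
    # of X and T but breaks any association between them.
    return X, rng.permutation(T)

def features(X, T):
    # Quadratic features so a linear discriminator can detect X-T dependence.
    T = T.reshape(-1, 1)
    ones = np.ones((len(T), 1))
    return np.hstack([ones, X, T, X * T, X**2, T**2])

def fit_logistic(F, y, steps=2000, lr=0.1):
    # Plain gradient descent on the logistic loss (stand-in for the GAN).
    coef = np.zeros(F.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(F @ coef)))
        coef -= lr * F.T @ (p - y) / len(y)
    return coef

def deconfounding_weights(X, T, rng):
    # Step (2), swapped-in technique: classify observed (label 0) against
    # calibration (label 1) samples; odds p/(1-p) estimate P_cal / P_obs.
    Xc, Tc = calibration_sample(X, T, rng)
    F_obs, F_cal = features(X, T), features(Xc, Tc)
    F = np.vstack([F_obs, F_cal])
    y = np.concatenate([np.zeros(len(T)), np.ones(len(Tc))])
    coef = fit_logistic(F, y)
    w = np.exp(F_obs @ coef)  # odds of "calibration" for each observed unit
    return w / w.mean()       # normalise to mean 1

def weighted_corr(a, b, w):
    # Pearson correlation of a and b under sample weights w.
    w = w / w.sum()
    ma, mb = np.sum(w * a), np.sum(w * b)
    cov = np.sum(w * (a - ma) * (b - mb))
    return cov / np.sqrt(np.sum(w * (a - ma) ** 2) * np.sum(w * (b - mb) ** 2))

# Synthetic, made-up data: the treatment depends on a single covariate.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 1))
T = 0.3 * X[:, 0] + rng.normal(size=n)

w = deconfounding_weights(X, T, rng)
before = weighted_corr(X[:, 0], T, np.ones(n))
after = weighted_corr(X[:, 0], T, w)
print(f"corr(X, T) unweighted: {before:.3f}, weighted: {after:.3f}")
```

If the weights do their job, the weighted correlation between covariate and treatment should be closer to zero than the unweighted one. A fixed feature map only captures the dependence it was designed for, which is one reason a learned adversarial discriminator may be preferable in practice.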

Notes

Units represent the objects of treatment. For example, in medical experiments, the units refer to the patients who take a particular medication.
\({\mathbf {X}}'\) should have the same marginal distribution as the observed covariates, that is, \(P({\mathbf {X}}') = P({\mathbf {X}})\).
https://en.wikipedia.org/wiki/Moment_(mathematics).

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61625107, 62006207), the National Key Research and Development Program of China (Nos. 2018AAA0101900, 2020YFC0832500), the Fundamental Research Funds for the Central Universities, and the Zhejiang Province Natural Science Foundation (No. LQ21F020020).

Author information

Authors and Affiliations

Department of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang Province, China
Kun Kuang, Yunzhe Li & Fei Wu
Tsinghua University, Beijing, China
Bo Li & Peng Cui
Alibaba Group, Hangzhou, Zhejiang Province, China
Hongxia Yang
NetEase Fuxi AI Lab, Hangzhou, Zhejiang Province, China
Jianrong Tao

Corresponding author

Correspondence to Kun Kuang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Responsible editor: Sriraam Natarajan.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kuang, K., Li, Y., Li, B. et al. Continuous treatment effect estimation via generative adversarial de-confounding. Data Min Knowl Disc 35, 2467–2497 (2021). https://doi.org/10.1007/s10618-021-00797-x

Received: 27 November 2020
Accepted: 04 September 2021
Published: 22 September 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s10618-021-00797-x
