DOI: 10.1145/3580305.3599531
research-article

Treatment Effect Estimation with Adjustment Feature Selection

Published: 04 August 2023

ABSTRACT

In causal inference, it is common to select a subset of the observed covariates, called the adjustment features, and to adjust for them when estimating the treatment effect. In real-world applications, many covariates are usually observed. Besides the confounders (variables that affect both the treatment and the outcome), they contain extra variables that are related only to the treatment (treatment-only variables, e.g., instrumental variables) or only to the outcome (outcome-only variables, e.g., precision variables). In principle, unbiased treatment effect estimation is achieved once the adjustment features contain all the confounders. In practice, however, the performance of empirical estimators varies considerably depending on which extra variables are included. To address this issue, variable separation and selection for treatment effect estimation has received growing attention in the setting where the extra variables are instrumental variables and precision variables.
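The claim that extra variables change estimation quality even when all confounders are adjusted for can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the linear regression estimator, and the names `c` (confounder), `z` (treatment-only), and `p` (outcome-only) are assumptions made for illustration. It uses only the Python standard library.

```python
import random
import statistics

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def simulate(n, rng, tau=1.0):
    """Toy data (assumed): c confounds, z affects only t, p affects only y."""
    rows = []
    for _ in range(n):
        c, z, p = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        t = 1.0 if c + 2.0 * z + rng.gauss(0, 1) > 0 else 0.0
        y = tau * t + c + 2.0 * p + rng.gauss(0, 1)
        rows.append({"c": c, "z": z, "p": p, "t": t, "y": y})
    return rows

def ols_effect(rows, names):
    """Coefficient on t when regressing y on [1, t] plus the named covariates."""
    X = [[1.0, r["t"]] + [r[k] for k in names] for r in rows]
    y = [r["y"] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)[1]

def spread(names, reps=200, n=400, seed=0):
    """Mean and standard deviation of the estimate over repeated samples."""
    rng = random.Random(seed)
    est = [ols_effect(simulate(n, rng), names) for _ in range(reps)]
    return statistics.mean(est), statistics.stdev(est)

if __name__ == "__main__":
    for names in (("c",), ("c", "z"), ("c", "p")):
        mean, sd = spread(names)
        print(names, round(mean, 3), round(sd, 3))
```

In this toy setup all three adjustment sets contain the confounder, so each estimate is centred on the true effect of 1.0. Adding the treatment-only variable widens the spread of the estimates, and adding the outcome-only variable narrows it.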

In this paper, assuming that no mediator variables exist, we consider a more general setting in which the observed covariates may contain post-treatment and post-outcome variables rather than only instrumental and precision variables. Our target is to separate the treatment-only variables from the adjustment features. To this end, we establish a metric named Optimal Adjustment Features (OAF), which empirically measures the asymptotic variance of the estimator. Theoretically, we show that the OAF metric is minimized if and only if the adjustment features consist of the confounders and the outcome-only variables, i.e., the treatment-only variables are perfectly separated. Because optimizing the OAF metric is a combinatorial optimization problem, we use Reinforcement Learning (RL) with a policy gradient method to search for the optimal adjustment set. Empirical results on both synthetic and real-world datasets demonstrate that (a) our method successfully finds the optimal adjustment features and (b) the selected adjustment features yield a more precise estimate of the treatment effect.
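The idea of searching over adjustment sets with a policy gradient can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define the OAF metric, so the reward here is a stand-in: minus the log of the classical OLS variance estimate for the treatment coefficient. The toy data, the independent Bernoulli inclusion policy, the moving-average baseline, and all names (`NAMES`, `reward`, `search`) are hypothetical. Nothing in this stand-in guarantees that confounders are retained in general; it happens to do so on this toy data.

```python
import math
import random

NAMES = ("c", "z", "p")  # toy labels: confounder, treatment-only, outcome-only

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def simulate(n, rng, tau=1.0):
    """Toy data (assumed): c confounds, z affects only t, p affects only y."""
    rows = []
    for _ in range(n):
        c, z, p = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        t = 1.0 if c + 2.0 * z + rng.gauss(0, 1) > 0 else 0.0
        y = tau * t + c + 2.0 * p + rng.gauss(0, 1)
        rows.append({"c": c, "z": z, "p": p, "t": t, "y": y})
    return rows

def reward(rows, names):
    """Stand-in reward (NOT the paper's OAF metric): minus the log of the
    classical OLS variance estimate for the treatment coefficient."""
    X = [[1.0, r["t"]] + [r[k] for k in names] for r in rows]
    y = [r["y"] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2
              for x, yi in zip(X, y))
    unit = [1.0 if i == 1 else 0.0 for i in range(k)]
    inv_tt = solve(xtx, unit)[1]  # entry of (X'X)^-1 for the treatment column
    return -math.log(rss / (len(y) - k) * inv_tt)

def search(rows, steps=4000, lr=0.1, seed=1):
    """REINFORCE over inclusion masks; returns final inclusion probabilities."""
    rng = random.Random(seed)
    theta = [0.0] * len(NAMES)  # one logit per candidate covariate
    cache, baseline = {}, None
    for _ in range(steps):
        probs = [1.0 / (1.0 + math.exp(-th)) for th in theta]
        mask = tuple(1 if rng.random() < p else 0 for p in probs)
        if mask not in cache:  # only 2**len(NAMES) distinct masks exist
            cache[mask] = reward(rows, [n for n, m in zip(NAMES, mask) if m])
        r = cache[mask]
        if baseline is None:
            baseline = r
        adv = r - baseline
        # Gradient of the Bernoulli log-probability w.r.t. each logit is (m - p).
        theta = [th + lr * adv * (m - p) for th, m, p in zip(theta, mask, probs)]
        baseline = 0.9 * baseline + 0.1 * r  # moving-average baseline
    return [1.0 / (1.0 + math.exp(-th)) for th in theta]

if __name__ == "__main__":
    rows = simulate(2000, random.Random(0))
    print({n: round(p, 2) for n, p in zip(NAMES, search(rows))})
```

With only three candidates an exhaustive search over the eight masks would be simpler; the sampling-based search is shown because the number of masks doubles with every additional covariate.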



    • Published in

      KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
      August 2023
      5996 pages
      ISBN: 9798400701030
      DOI: 10.1145/3580305

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
