ABSTRACT
A fundamental problem in causal inference is learning individual treatment effects (ITE), that is, assessing the causal effect of a treatment (e.g., prescribing a medicine) on an important outcome (e.g., curing a disease) for each data instance. The effectiveness of most existing methods is often limited by the existence of hidden confounders. Recent studies have shown that auxiliary relational information among data can be utilized to mitigate the confounding bias. However, these works assume that the observational data and the relations among them are static, whereas in reality both evolve continuously over time. We refer to such data as time-evolving networked observational data.
In this paper, we make an initial investigation of ITE estimation on such data. The problem remains difficult due to the following challenges: (1) modeling the evolution patterns of time-evolving networked observational data; (2) controlling for the hidden confounders with current data and historical information; (3) alleviating the discrepancy between the control group and the treated group. To tackle these challenges, we propose a novel ITE estimation framework, Dynamic Networked Observational Data Deconfounder (DNDC), which aims to learn representations of hidden confounders over time by leveraging both current networked observational data and historical information. Additionally, a novel adversarial-learning-based representation balancing method is incorporated toward unbiased ITE estimation. Extensive experiments validate the superiority of our framework when measured against state-of-the-art baselines. The implementation is available at https://github.com/jma712/DNDC.
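To make the role of hidden confounders concrete, the following is a minimal toy sketch in Python. It is not the DNDC framework: the data-generating process, the function name `simulate`, and all coefficients are assumptions chosen purely for illustration. It simulates both potential outcomes for each instance, so the true ITE is known, and compares the true average effect with the naive difference in observed group means when a hidden variable drives both treatment assignment and outcome.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Toy data with one hidden confounder u (hypothetical setup, not from the paper)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                    # hidden confounder, unobserved in practice
    p = 1.0 / (1.0 + np.exp(-2.0 * u))        # treatment is more likely when u is large
    t = rng.binomial(1, p)                    # observed treatment assignment
    y0 = u + rng.normal(scale=0.1, size=n)    # potential outcome without treatment
    y1 = y0 + 1.0 + 0.5 * u                   # potential outcome with treatment
    y = np.where(t == 1, y1, y0)              # only one outcome is observed per instance
    return u, t, y0, y1, y

u, t, y0, y1, y = simulate()

ite = y1 - y0                                 # individual treatment effects (known only in simulation)
true_ate = ite.mean()                         # average of the true ITEs
naive = y[t == 1].mean() - y[t == 0].mean()   # naive comparison of treated vs. control

print(f"true average effect:  {true_ate:.2f}")
print(f"naive group contrast: {naive:.2f}")
```

In this toy setting the naive contrast overstates the effect because treated instances tend to have larger values of the hidden variable. This gap between the two groups is the kind of bias that deconfounding and representation balancing aim to reduce.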
Index Terms
- Deconfounding with Networked Observational Data in a Dynamic Environment