DOI: 10.1145/3534678.3539427

Robust Event Forecasting with Spatiotemporal Confounder Learning

Published: 14 August 2022

ABSTRACT

Data-driven societal event forecasting methods exploit relevant historical information to predict future events. These methods rely on historical labeled data and cannot accurately predict events when data are limited or of poor quality. Studying causal effects between events goes beyond correlation analysis and can contribute to more robust event prediction. However, incorporating causality analysis into data-driven event forecasting is challenging for two reasons: (i) Events occur in a complex and dynamic social environment, where many unobserved variables, i.e., hidden confounders, affect both potential causes and outcomes. (ii) Spatiotemporal data are not independent and identically distributed (non-IID), so modeling hidden confounders for accurate causal effect estimation is not trivial. In this work, we introduce a deep learning framework that integrates causal effect estimation into event forecasting. We first study the problem of Individual Treatment Effect (ITE) estimation from observational event data with spatiotemporal attributes and present a novel causal inference model to estimate ITEs. We then incorporate the learned event-related causal information into event prediction as prior knowledge. Two robust learning modules, a feature reweighting module and an approximate constraint loss, are introduced to enable this prior knowledge injection. We evaluate the proposed causal inference model on real-world event datasets and validate the effectiveness of the proposed robust learning modules in event prediction by feeding the learned causal information into different deep learning methods. Experimental results demonstrate the strengths of the proposed causal inference model for ITE estimation in societal events and showcase the beneficial properties of the robust learning modules in societal event forecasting.
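To make the role of hidden confounders concrete, the following is a minimal synthetic sketch and not the model proposed in the paper. It uses the standard potential-outcomes definition of the ITE for a unit, Y(1) - Y(0), and assumes a single hidden confounder `u` that drives both treatment assignment and the outcome. All names (`simulate`, `naive_effect`, `adjusted_effect`, `TRUE_ITE`) and all numeric choices are assumptions made for illustration. A naive comparison of treated and untreated units is biased, while stratifying on the confounder (possible here only because the simulation exposes it) recovers the true effect.

```python
import random

TRUE_ITE = 1.0  # assumption: every unit has the same treatment effect Y(1) - Y(0)


def simulate(n=20000, seed=0):
    """Generate (u, t, y) rows where u confounds treatment t and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)              # hidden confounder
        p_treat = 0.8 if u > 0 else 0.2      # u raises the chance of treatment
        t = 1 if rng.random() < p_treat else 0
        y0 = 2.0 * u + rng.gauss(0.0, 1.0)   # potential outcome without treatment
        y1 = y0 + TRUE_ITE                   # potential outcome with treatment
        rows.append((u, t, y1 if t == 1 else y0))  # only one outcome is observed
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def diff_in_means(rows):
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def naive_effect(rows):
    """Ignore the confounder: compare treated and control units directly."""
    return diff_in_means(rows)


def adjusted_effect(rows):
    """Stratify on the sign of u, which fully determines treatment probability here."""
    strata = [[r for r in rows if r[0] > 0], [r for r in rows if r[0] <= 0]]
    return sum(len(s) / len(rows) * diff_in_means(s) for s in strata)


rows = simulate()
naive = naive_effect(rows)
adjusted = adjusted_effect(rows)
print(f"true effect: {TRUE_ITE:.2f}, naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In real event data the confounder is not observed, which is why the abstract frames learning a representation of hidden confounders from spatiotemporal data as the central difficulty.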



Published in: KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2022, 5033 pages

ISBN: 9781450393850

Proceedings DOI: 10.1145/3534678

Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 1,133 of 8,635 submissions (13%)
