DOI: 10.1145/3459637.3482309

Understanding Event Predictions via Contextualized Multilevel Feature Learning

Published: 30 October 2021


Deep learning models have been studied for forecasting human events from vast volumes of data, yet they still cannot be trusted in certain applications, such as healthcare and disaster assistance, because they lack interpretability. Providing explanations for event predictions not only helps practitioners understand the underlying mechanism of prediction behavior but also enhances the robustness of event analysis. Improving the transparency of event prediction models is challenging for the following reasons: (i) event data contain multilevel features, which makes it difficult to use different levels of data jointly; (ii) features across different levels and time steps are heterogeneous and dependent; and (iii) static model-level interpretations cannot be easily adapted to event forecasting, given the dynamic and temporal characteristics of the data. Recent interpretation methods have proven capable in tasks that deal with graph-structured or relational data. In this paper, we present a Contextualized Multilevel Feature learning framework, CMF, for interpretable temporal event prediction. It consists of a predictor for forecasting events of interest and an explanation module for interpreting model predictions. We design a new context-based feature fusion method to integrate multiple levels of heterogeneous features. We also introduce a temporal explanation module to determine the sequences of text and subgraphs that play crucial roles in a prediction. We conduct extensive experiments on several real-world datasets of political and epidemic events. We demonstrate that the proposed method is competitive with state-of-the-art models while possessing favorable interpretation capabilities.

Supplementary Material

MP4 File (cikm-rgfp0456-pre-recoding.mp4)
Presentation video. It introduces a novel framework for forecasting future events and providing multilevel explanations for predictions.








Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]



Publisher: Association for Computing Machinery, New York, NY, United States




Author Tags

  1. event prediction
  2. multilevel feature learning
  3. temporal explanation


  • Research-article



CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%





Article Metrics

  • Downloads (Last 12 months): 220
  • Downloads (Last 6 weeks): 27
Reflects downloads up to 20 Jan 2025



Cited By

  • (2024) Advances in Human Event Modeling: From Graph Neural Networks to Language Models. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6459-6469. DOI: 10.1145/3637528.3671466. Online publication date: 25-Aug-2024
  • (2024) On the Feasibility of Predicting Volumes of Fake News—The Spanish Case. IEEE Transactions on Computational Social Systems 11:4, 5230-5240. DOI: 10.1109/TCSS.2023.3297093. Online publication date: Aug-2024
  • (2024) Predicting multi-subsequent events and actors in public health emergencies. Computers and Industrial Engineering 187:C. DOI: 10.1016/j.cie.2023.109852. Online publication date: 12-Apr-2024
  • (2023) News event prediction by trigger evolution graph and event segment. Journal of Systems Engineering and Electronics 34:3, 615-626. DOI: 10.23919/JSEE.2023.000083. Online publication date: Jun-2023
  • (2023) A Temporal Attention-based Model for Social Event Prediction. 2023 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN54540.2023.10191427. Online publication date: 18-Jun-2023
  • (2022) Causality Enhanced Societal Event Forecasting With Heterogeneous Graph Learning. 2022 IEEE International Conference on Data Mining (ICDM), 91-100. DOI: 10.1109/ICDM54844.2022.00019. Online publication date: Nov-2022
  • (2022) Learning Dynamic Multimodal Implicit and Explicit Networks for Multiple Financial Tasks. 2022 IEEE International Conference on Big Data (Big Data), 825-834. DOI: 10.1109/BigData55660.2022.10020722. Online publication date: 17-Dec-2022
