ABSTRACT
Microblogging platforms like Twitter are heavily used to report and exchange information about natural disasters. The real-time data on these sites is highly helpful for gaining situational awareness and planning aid efforts. However, disaster-related messages are buried in a high volume of irrelevant information. The situational data of disaster events also varies greatly in information type, ranging from general situational awareness (caution, infrastructure damage, casualties) to individual needs, alongside messages not related to the crisis. Efficient methods are therefore required to handle data overload and prioritize the various types of information. This paper proposes an interpretable classification-summarization framework that first classifies tweets into different disaster-related categories and then summarizes those tweets. Unlike existing work, our classification model can provide explanations, or rationales, for its decisions. In the summarization phase, we employ an Integer Linear Programming (ILP) based optimization technique, with the help of the rationales, to generate summaries of event categories. Extensive evaluation on large-scale disaster events shows that (a) our model can classify tweets into disaster-related categories with an 85% Macro F1 score and high interpretability, and (b) the summarizer achieves a 5-25% improvement in ROUGE-1 F-score over most state-of-the-art approaches.
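For readers unfamiliar with the summarization metric named above, the following is a minimal sketch of how a ROUGE-1 F-score can be computed: the harmonic mean of unigram precision and recall between a generated summary and a reference summary. The function name `rouge1_f`, the lowercase whitespace tokenization, and the sample strings are assumptions made for illustration; the evaluation reported in the paper may use different preprocessing (for example stemming or stopword handling).

```python
from collections import Counter


def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F-score: harmonic mean of unigram precision and recall.

    Tokenization here is plain lowercase whitespace splitting, which is
    an illustrative assumption rather than the paper's exact setup.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped unigram matches: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical summary and reference, invented for this example.
    summary = "bridge collapsed near river"
    gold = "the bridge collapsed"
    print(round(rouge1_f(summary, gold), 4))
```

In this example two of the four candidate words match the three-word reference, so precision is 1/2, recall is 2/3, and the F-score is 4/7.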
Index Terms
- Towards an Interpretable Approach to Classify and Summarize Crisis Events from Microblogs