DOI: 10.1145/3485447.3512259
research-article

Towards an Interpretable Approach to Classify and Summarize Crisis Events from Microblogs

Published: 25 April 2022

ABSTRACT

Microblogging platforms such as Twitter are heavily used to report and exchange information about natural disasters. The real-time data on these sites is highly helpful for gaining situational awareness and planning aid efforts. However, disaster-related messages are buried in a high volume of irrelevant information. The situational data also varies widely in information type, ranging from general situational awareness (caution, infrastructure damage, casualties) to individual needs, alongside messages unrelated to the crisis. Efficient methods are therefore required to handle data overload and prioritize the different types of information. This paper proposes an interpretable classification-summarization framework that first classifies tweets into disaster-related categories and then summarizes those tweets. Unlike existing work, our classification model provides explanations, or rationales, for its decisions. In the summarization phase, we employ an Integer Linear Programming (ILP) based optimization technique, guided by the rationales, to generate summaries of event categories. Extensive evaluation on large-scale disaster events shows that (a) our model classifies tweets into disaster-related categories with an 85% Macro F1 score and high interpretability, and (b) the summarizer achieves a 5-25% improvement in ROUGE-1 F-score over most state-of-the-art approaches.
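For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Macro F1 and ROUGE-1 F-score are commonly computed. It is not the evaluation code used in the paper: the function names `rouge1_f` and `macro_f1` are hypothetical, and the whitespace tokenization with lowercasing is a simplifying assumption (standard ROUGE tooling may also apply stemming or stopword handling).

```python
from collections import Counter


def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F-score: harmonic mean of unigram precision and recall.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: list[str], predicted: list[str]) -> float:
    """Macro F1: unweighted mean of per-class F1 scores.

    Classes are taken from the union of gold and predicted labels.
    """
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy inputs invented for illustration; not data from the paper.
    print(rouge1_f("bridge collapsed near river", "the bridge collapsed"))
    print(macro_f1(
        ["damage", "damage", "casualty", "other"],
        ["damage", "casualty", "casualty", "other"],
    ))
```

Because Macro F1 averages per-class scores without weighting by class size, rare categories count as much as frequent ones, which matters when crisis-related categories are imbalanced.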

Published in

WWW '22: Proceedings of the ACM Web Conference 2022
April 2022
3764 pages
ISBN: 9781450390965
DOI: 10.1145/3485447

Copyright © 2022 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
