ABSTRACT
Twitter has become a crucial conduit for tactical information and emergency response during disasters. However, real-time information about crises is buried in a large volume of emotional and irrelevant posts. This creates a need for automatic tools that identify disaster-related messages and summarize them for data consumption and situation planning. Moreover, the explainability of such methods is crucial in determining their applicability in real-life scenarios. Recent studies also highlight the importance of learning a good latent representation of tweets for several downstream tasks. In this paper, we take advantage of state-of-the-art methods, such as transformers and contrastive learning, to build an interpretable classifier. Our proposed model classifies Twitter messages into different humanitarian categories and also extracts rationale snippets as supporting evidence for its output decisions. The contrastive learning framework helps to learn better representations of tweets by bringing related tweets closer in the embedding space. Furthermore, we employ classification labels and rationales to efficiently generate summaries of crisis events. Extensive experiments over different crisis datasets show that (i) our classifier obtains the best performance-interpretability trade-off, and (ii) the proposed summarizer shows superior performance (1.4%-22% improvement) with significantly lower computation cost than baseline models.
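To make the idea of "bringing related tweets closer in the embedding space" concrete, the following is a minimal sketch of a generic supervised contrastive loss in the style of Khosla et al. (2020). It is not the paper's exact objective: the function name, the temperature value, and the toy two-dimensional embeddings are assumptions chosen for illustration, and the abstract does not specify how rationales enter the loss.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative sketch only).

    Tweets that share a label are treated as positives for each other;
    all other tweets in the batch act as negatives.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)

    self_mask = np.eye(n, dtype=bool)
    # Positives: same label as the anchor, excluding the anchor itself.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        return 0.0  # no positive pairs in the batch, nothing to pull together

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    # Exclude self-similarity from the softmax denominator.
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Average log-probability of the positives for each anchor that has any.
    sum_pos = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    loss_per_anchor = -sum_pos[has_pos] / pos_counts[has_pos]
    return float(loss_per_anchor.mean())


# Toy embeddings: the first two points and the last two points form two clusters.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# Labels that agree with the clusters give a lower loss than labels that cut across them.
loss_aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
print(loss_aligned < loss_mixed)
```

Minimizing a loss of this shape rewards an encoder for placing tweets of the same humanitarian category near each other, which is the effect the abstract describes.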
Index Terms
- Rationale Aware Contrastive Learning Based Approach to Classify and Summarize Crisis-Related Microblogs