research-article

Rationale Aware Contrastive Learning Based Approach to Classify and Summarize Crisis-Related Microblogs

Authors:
Thi Huyen Nguyen, L3S Research Center, Hanover, Germany
Koustav Rudra, Indian Institute of Technology (Indian School of Mines) Dhanbad, Dhanbad, India

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, October 2022, Pages 1552–1562, https://doi.org/10.1145/3511808.3557426

Published: 17 October 2022

ABSTRACT

The way information now propagates on Twitter makes the platform a crucial conduit for tactical data and emergency response during disasters. However, real-time information about crises is buried in a large volume of emotional and irrelevant posts. This creates the need for an automatic tool that identifies disaster-related messages and summarizes the information for data consumption and situation planning. In addition, the explainability of such methods is crucial in determining their applicability in real-life scenarios. Recent studies also highlight the importance of learning a good latent representation of tweets for several downstream tasks. In this paper, we take advantage of state-of-the-art methods, such as transformers and contrastive learning, to build an interpretable classifier. Our proposed model classifies Twitter messages into different humanitarian categories and also extracts rationale snippets as supporting evidence for its output decisions. The contrastive learning framework helps to learn better representations of tweets by bringing related tweets closer together in the embedding space. Furthermore, we employ classification labels and rationales to efficiently generate summaries of crisis events. Extensive experiments over different crisis datasets show that (i) our classifier obtains the best performance-interpretability trade-off, and (ii) the proposed summarizer shows superior performance (1.4%-22% improvement) with significantly less computation cost than baseline models.
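
The abstract does not state the exact training objective, so the following is only an illustrative sketch of how a label-aware contrastive loss can pull related tweets together in the embedding space. It assumes the general supervised contrastive formulation (one positive set per anchor, softmax over all other items in the batch), written in plain Python. The function names, the temperature value, the toy 2-D embeddings, and the category names "rescue" and "damage" are all hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    # Assumed formulation: for each anchor, positives are the other items
    # that share its label; the loss is the mean negative log-probability
    # of picking a positive among all other items in the batch.
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner contributes nothing
        # Numerically stable log-sum-exp over all other items.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy example: hypothetical 2-D tweet embeddings with hypothetical labels.
labels = ["rescue", "rescue", "damage", "damage"]
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # same-label tweets close
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]      # same-label tweets far apart

loss_clustered = supervised_contrastive_loss(clustered, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

In this toy setting the loss is lower when tweets of the same category already sit close together, which is the behaviour a contrastive objective rewards during training. In practice the embeddings would come from a transformer encoder and the loss would be computed on batched tensors in a deep learning framework.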

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Clustering and classification
      2. Summarization

Published in
CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022
5274 pages
ISBN: 9781450392365
DOI: 10.1145/3511808
General Chairs: Mohammad Al Hasan (Indiana University Purdue University, Indianapolis, USA), Li Xiong (Emory University, Atlanta, USA)
Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
classification
contrastive learning
crisis events
interpretability
summarization

Acceptance Rates
CIKM '22 Paper Acceptance Rate: 621 of 2,257 submissions, 28%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%