GNoM: Graph Neural Network Enhanced Language Models for Disaster Related Multilingual Text Classification

Authors:
Samujjwal Ghosh
Department of Computer Science and Engineering, IIT Hyderabad, India

Subhadeep Maji
Amazon, India

Maunendra Sankar Desarkar
Department of Computer Science and Engineering, IIT Hyderabad, India

WebSci '22: Proceedings of the 14th ACM Web Science Conference 2022, June 2022, Pages 55–65
https://doi.org/10.1145/3501247.3531561

Published: 26 June 2022

ABSTRACT

Online social media serves as a source of valuable and actionable information during disasters. Because the content is user generated, this information may be available in multiple languages. An effective system for automatically identifying and categorizing this actionable information should be able to handle multiple languages and to work under limited supervision. However, existing work mostly focuses on English only and assumes that sufficient labeled data is available. To overcome these challenges, we propose a multilingual disaster-related text classification system that can work in monolingual, cross-lingual and multilingual scenarios and under limited supervision. Our end-to-end trainable framework combines the versatility of graph neural networks, applied over the corpus, with the power of transformer-based large language models, applied over individual examples, using cross-attention between the two. We evaluate our framework on a total of nine English, non-English and monolingual datasets in monolingual, cross-lingual and multilingual classification scenarios. Our framework outperforms state-of-the-art models in the disaster domain and a multilingual BERT baseline in terms of weighted F1 score. We also show the generalizability of the proposed model under limited supervision.
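
The abstract reports results as weighted F1. As a minimal sketch of what that metric computes, assuming the usual definition (per-class F1 averaged with weights proportional to each class's number of true examples), the plain-Python function below may help. The function name `weighted_f1` and the example labels are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)    # true examples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its support.
        total += n_true * f1
    return total / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["relevant", "relevant", "relevant", "irrelevant"]
y_pred = ["relevant", "relevant", "irrelevant", "irrelevant"]
score = weighted_f1(y_true, y_pred)
```

Because each class is weighted by its support, frequent classes dominate the score, which is one reason the metric is commonly reported for imbalanced classification data.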

Supplemental Material

WS22_S1_65.mp4 (mp4, 958.4 MB)

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information systems applications

Index terms have been assigned to the content through auto-classification.

Published in
WebSci '22: Proceedings of the 14th ACM Web Science Conference 2022
June 2022
479 pages
ISBN: 9781450391917
DOI: 10.1145/3501247
General Chairs:
Ricardo Baeza-Yates (Northeastern University, MA, USA & Universitat Pompeu Fabra, Spain)
Katrin Weller (GESIS & Center for Advanced Internet Studies, Germany)
Organizing Chair:
Manuel Portela (Universitat Pompeu Fabra, Spain)
Program Chairs:
Oshani Seneviratne (Rensselaer Polytechnic Institute, NY, USA)
Ingmar Weber (Qatar Computing Research Institute, Qatar)
Taha Yasseri (University College Dublin, Ireland)
Publications Chairs:
Anna Bon (Vrije Universiteit Amsterdam, Netherlands)
Srinath Srinivas (International Institute of Information Technology, Bangalore, India)
Luis-Daniel Ibáñez (University of Southampton, UK)
Copyright © 2022 ACM
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Disaster Management
Graph Neural Networks
Multilingual Learning
Natural Language Processing
Text Classification
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 218 of 875 submissions, 25%