DOI: 10.1145/3501247.3531561
research-article

GNoM: Graph Neural Network Enhanced Language Models for Disaster Related Multilingual Text Classification

Published: 26 June 2022

ABSTRACT

Online social media serves as a source of valuable, actionable information during disasters. Because the content is user generated, this information may be available in multiple languages. An effective system for automatically identifying and categorizing such actionable information should be able to handle multiple languages and work under limited supervision. However, existing work mostly focuses on English only and assumes that sufficient labeled data is available. To overcome these challenges, we propose a multilingual disaster-related text classification system that can work in monolingual, cross-lingual, and multilingual scenarios and under limited supervision. Our end-to-end trainable framework combines the versatility of graph neural networks, applied over the corpus, with the power of transformer-based large language models, applied over examples, using cross-attention between the two. We evaluate our framework on a total of nine English, non-English, and monolingual datasets in monolingual, cross-lingual, and multilingual classification scenarios. Our framework outperforms state-of-the-art models in the disaster domain and a multilingual BERT baseline in terms of weighted F1 score. We also show the generalizability of the proposed model under limited supervision.
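
The abstract reports results as a weighted F1 score but does not spell out the computation. A minimal sketch, assuming the common definition (per-class F1 averaged with weights proportional to each class's number of true examples), is given below. The function name `weighted_f1` and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Assumption: this mirrors the usual "weighted" averaging scheme;
    the paper's exact evaluation code is not shown in the source.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)    # true examples per class
    predicted = Counter(y_pred)  # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n_true
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        total += f1 * n_true     # weight each class by its support
    return total / len(y_true)


# Hypothetical labels for six disaster-related messages.
y_true = ["rescue", "rescue", "damage", "other", "other", "other"]
y_pred = ["rescue", "damage", "damage", "other", "other", "rescue"]
score = weighted_f1(y_true, y_pred)
```

Weighting by support keeps frequent classes from being under-counted, which matters when label distributions are skewed, as they often are in disaster-related message collections.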

Supplemental Material

WS22_S1_65.mp4 (mp4, 958.4 MB)

Published in

WebSci '22: Proceedings of the 14th ACM Web Science Conference 2022
June 2022
479 pages
ISBN: 9781450391917
DOI: 10.1145/3501247

Copyright © 2022 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 218 of 875 submissions, 25%
