DOI:10.1145/3441501.3441517
research-article

Overview of the HASOC Track at FIRE 2020: Hate Speech and Offensive Language Identification in Tamil, Malayalam, Hindi, English and German

Published: 17 January 2021

ABSTRACT

This paper presents the HASOC track and its two parts. HASOC is dedicated to evaluating technology for identifying offensive language and hate speech. It creates test collections for low-resource languages, with English included for comparison. The first track within HASOC continued the work from 2019 and provided a testbed of Twitter posts in Hindi, German and English. The second track created test resources for Tamil and Malayalam in native and Latin script. Posts were extracted mainly from YouTube and Twitter. Both tracks attracted considerable interest: over 40 research groups participated and described their approaches in papers. In this overview, we present the tasks, the data and the main results.


  • Published in

    ACM Other conferences
    FIRE '20: Proceedings of the 12th Annual Meeting of the Forum for Information Retrieval Evaluation
    December 2020
    70 pages
    ISBN:9781450389785
    DOI:10.1145/3441501

    Copyright © 2020 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 19 of 64 submissions, 30%
