research-article

Abusive and Hate speech Classification in Arabic Text Using Pre-trained Language Models and Data Augmentation

Published: 21 November 2024

Abstract

Hateful content on social media is a worldwide problem that adversely affects not just the targeted individuals but also anyone exposed to that content. Most studies on the automatic identification of inappropriate content have addressed English, given the availability of resources. Consequently, a number of low-resource languages still need more attention from the community. This article focuses on Arabic dialects, which have several specificities that make the use of non-Arabic models inappropriate. Our hypothesis is that leveraging pre-trained language models (PLMs) specifically designed for Arabic, along with data augmentation techniques, can significantly enhance the detection of hate speech in Arabic mono- and multi-dialect texts.
To test this hypothesis, we conducted a series of experiments addressing three key research questions: (RQ1) Does text augmentation enhance the final results compared to using an unaugmented dataset? (RQ2) Do Arabic PLMs outperform other models utilizing techniques such as fastText and AraVec word embeddings? (RQ3) Does training and fine-tuning models on a multilingual dataset yield better results than training them on a monolingual dataset?
Our methodology compared PLMs based on transfer learning, specifically examining the performance of the DziriBERT, AraBERT v2, and BERT-base-arabic models. We implemented text augmentation techniques and evaluated their impact on model performance. The tools used included fastText and AraVec for word embeddings and several PLMs for transfer learning.
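The abstract does not state which augmentation operations were applied. As a minimal illustration only, the sketch below assumes two simple token-level operations, random swap and random deletion, on whitespace-separated tokens. The function names (`random_swap`, `random_delete`, `augment`) and all parameter values are hypothetical and are not taken from the article.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps randomly chosen position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, n_copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Produce n_copies perturbed variants of text (assumed operations, not the authors')."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    tokens = text.split()
    if not tokens:
        return []
    variants = []
    for _ in range(n_copies):
        swapped = random_swap(tokens, n_swaps, rng)
        variants.append(" ".join(random_delete(swapped, p_delete, rng)))
    return variants
```

Each variant keeps the label of its source text, so a call such as `augment(tweet, n_copies=2)` would add two extra training examples per original example. Whitespace tokenization is a simplification; Arabic text would normally be normalized and tokenized with a dedicated tool first.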
The results demonstrate a notable improvement in classification accuracy, with augmented datasets showing an increase in performance metrics (accuracy, precision, recall, and F1-score) by up to 15–21% compared to non-augmented datasets. This underscores the potential of data augmentation in enhancing the models’ ability to generalize across the nuanced spectrum of Arabic dialects.
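For reference, the four reported metrics can be computed from predictions as in the sketch below. It assumes a binary setting with a single positive class; the abstract does not say whether the article uses binary, macro, or weighted averaging, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn

    # Undefined ratios (zero denominators) are reported as 0.0 by convention here.
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Comparing the dictionaries returned for a model trained with and without augmentation, on the same held-out test set, is one way to measure the kind of gain the abstract reports.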

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 11
November 2024
248 pages
EISSN: 2375-4702
DOI: 10.1145/3613714

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 21 November 2024
Online AM: 03 August 2024
Accepted: 07 July 2024
Revised: 06 April 2024
Received: 21 September 2023
Published in TALLIP Volume 23, Issue 11

Author Tags

1. Natural Language Processing (NLP)
2. hate speech detection
3. Arabic mono/multi-dialect NLP
4. transfer learning
5. Pre-trained Language Models (PLMs)
6. data augmentation
7. DziriBERT
8. AraBERT v2
9. BERT-base-arabic
10. text classification
