Comparative Performance of Machine Learning and Deep Learning Algorithms for Arabic Hate Speech Detection in OSNs

Omar, Ahmed; Mahmoud, Tarek M.; Abd-El-Hafeez, Tarek

doi:10.1007/978-3-030-44289-7_24

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1153)

Included in the following conference series:

The International Conference on Artificial Intelligence and Computer Vision

Abstract

Online Social Networks (OSNs) are now among the most popular and interactive media used to express feelings, communicate, and share information between people. However, along with useful and interesting content, unsuitable or abusive content such as hate speech and insults is sometimes published on these networks. Hate speech covers many forms of online abuse, such as cyberbullying, discrimination, abusive language, profanity, flaming, toxicity, and harassment. Most hate speech detection attempts have concentrated on English text, while work on Arabic text is sparse. In this paper, we construct a standard Arabic dataset that can be used for hate speech and abuse detection. In contrast to most previous work, in which datasets were collected from a single platform, the proposed dataset is collected from several social network platforms (Facebook, Twitter, Instagram, and YouTube). To validate the effectiveness of the proposed dataset, twelve machine learning algorithms and two deep learning architectures were used. A Recurrent Neural Network (RNN) outperformed the other classifiers with an accuracy of 98.7%.
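The evaluation described in the abstract, several classifiers compared by accuracy on the same labelled data, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two toy classifiers, the English placeholder texts, and the names `accuracy`, `compare`, `TRAIN`, and `TEST` are hypothetical and are not the authors' algorithms, dataset, or code.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]            # (text, label), label is "hate" or "normal"
Classifier = Callable[[str], str]   # maps a text to a predicted label


def accuracy(predict: Classifier, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for text, label in samples if predict(text) == label)
    return correct / len(samples)


def make_majority_classifier(train: List[Sample]) -> Classifier:
    """Baseline: always predict the most frequent training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda text: majority


def make_keyword_classifier(train: List[Sample]) -> Classifier:
    """Toy model: flag a text if it contains a token seen only in 'hate' texts."""
    hate_tokens, normal_tokens = set(), set()
    for text, label in train:
        target = hate_tokens if label == "hate" else normal_tokens
        target.update(text.split())
    flagged = hate_tokens - normal_tokens
    return lambda text: "hate" if flagged & set(text.split()) else "normal"


def compare(train: List[Sample], test: List[Sample],
            builders: Dict[str, Callable[[List[Sample]], Classifier]]) -> Dict[str, float]:
    """Train every classifier on `train` and report its accuracy on `test`."""
    return {name: accuracy(build(train), test) for name, build in builders.items()}


# Hypothetical placeholder data; the paper's dataset is Arabic and far larger.
TRAIN = [
    ("you are awful idiot", "hate"),
    ("have a nice day", "normal"),
    ("nice photo today", "normal"),
    ("such an idiot", "hate"),
    ("good morning friends", "normal"),
]
TEST = [
    ("what an idiot", "hate"),
    ("nice day friends", "normal"),
    ("awful person", "hate"),
    ("good photo", "normal"),
]

results = compare(TRAIN, TEST, {
    "majority": make_majority_classifier,
    "keyword": make_keyword_classifier,
})
for name, score in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1%}")
```

The majority baseline is included because accuracy alone can look high on imbalanced data; comparing each model against such a baseline shows how much it actually learns.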

Author information

Authors and Affiliations

Computer Science Department, Faculty of Science, Minia University, EL-Minia, Egypt
Ahmed Omar, Tarek M. Mahmoud & Tarek Abd-El-Hafeez
Deraya University, EL-Minia, Egypt
Tarek Abd-El-Hafeez

Authors

Ahmed Omar
Tarek M. Mahmoud
Tarek Abd-El-Hafeez

Corresponding author

Correspondence to Ahmed Omar.

Editor information

Editors and Affiliations

Information Technology Department, Cairo University, Faculty of Computers and Information, Giza, Egypt
Aboul-Ella Hassanien
Faculty of Computers and Information, Benha University, Banha, Egypt
Ahmad Taher Azar
Faculty of Computers and Information, Suez Canal University, Ismailia, Egypt
Tarek Gaber
Departamento de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Guadalajara, Jalisco, Mexico
Diego Oliva
Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
Fahmy M. Tolba

About this paper

Cite this paper

Omar, A., Mahmoud, T.M., Abd-El-Hafeez, T. (2020). Comparative Performance of Machine Learning and Deep Learning Algorithms for Arabic Hate Speech Detection in OSNs. In: Hassanien, AE., Azar, A., Gaber, T., Oliva, D., Tolba, F. (eds) Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020). AICV 2020. Advances in Intelligent Systems and Computing, vol 1153. Springer, Cham. https://doi.org/10.1007/978-3-030-44289-7_24

DOI: https://doi.org/10.1007/978-3-030-44289-7_24
Published: 24 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44288-0
Online ISBN: 978-3-030-44289-7
eBook Packages: Intelligent Technologies and Robotics (R0)