Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach

Al-Makhadmeh, Zafer; Tolba, Amr

doi:10.1007/s00607-019-00745-0

Published: 01 August 2019

Computing, volume 102, pages 501–522 (2020)

Zafer Al-Makhadmeh¹ & Amr Tolba¹,²

Abstract

Over the last decade, the growing use of social media has been accompanied by a rise in hateful activity on social networks. Hate speech is among the most dangerous of these activities, and users of platforms such as YouTube, Facebook and Twitter need protection from it. This paper introduces a method that combines natural language processing with a machine learning technique to predict hate speech on social media websites. After the hate speech data is collected, stemming, token splitting, character removal and inflection elimination are performed before the hate speech recognition process. The collected data is then examined using a killer natural language processing optimization ensemble deep learning approach (KNLPEDNN). This method detects hate speech on social media websites using an effective learning process that classifies the text as neutral, offensive or hate language. The performance of the system is then evaluated using overall accuracy, F-score, precision and recall. The system attained minimal deviations (mean square error 0.019, cross-entropy loss 0.015 and logarithmic loss 0.0238) and 98.71% accuracy.
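
The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' KNLPEDNN implementation, which is not described on this page. The function names (preprocess, strip_suffix, per_class_scores, accuracy), the regular expressions and the crude suffix list are all assumptions made for illustration. A real pipeline would use a proper stemmer and a trained classifier.

```python
import re

# The three classes named in the abstract.
LABELS = ("neutral", "offensive", "hate")


def strip_suffix(token):
    """Crude inflection removal (an assumed stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the stem remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Character removal, token splitting and suffix stripping."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop URLs and user mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # drop digits and punctuation
    return [strip_suffix(token) for token in text.split()]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f_score)} using one-vs-rest counts."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[label] = (precision, recall, f_score)
    return scores


if __name__ == "__main__":
    print(preprocess("Running dogs!!! http://x.co @user #Hated"))
    y_true = ["neutral", "offensive", "hate", "hate"]
    y_pred = ["neutral", "hate", "hate", "offensive"]
    print(accuracy(y_true, y_pred))
    print(per_class_scores(y_true, y_pred))
```

Reporting precision, recall and F-score per class, rather than accuracy alone, matters for this task because hate speech is typically a small minority class, so a high overall accuracy can hide poor recall on the class of interest.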

Acknowledgements

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through Research Group No. RG-1439-088.

Author information

Authors and Affiliations

Computer Science Department, Community College, King Saud University, Riyadh, 11437, Saudi Arabia
Zafer Al-Makhadmeh & Amr Tolba
Mathematics and Computer Science Department, Faculty of Science, Menoufia University, Shebin-El-Kom, 32511, Egypt
Amr Tolba

Corresponding author

Correspondence to Zafer Al-Makhadmeh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Al-Makhadmeh, Z., Tolba, A. Automatic hate speech detection using killer natural language processing optimizing ensemble deep learning approach. Computing 102, 501–522 (2020). https://doi.org/10.1007/s00607-019-00745-0

Received: 02 April 2019
Accepted: 24 July 2019
Published: 01 August 2019
Issue Date: February 2020
DOI: https://doi.org/10.1007/s00607-019-00745-0
