Abstract
Many activities take place on social media platforms, such as promoting products, sharing news and sharing achievements. Because users enjoy freedom and anonymity on these platforms, hate speech and harassment are common. Posts that spread hate and offense should be detected and deleted as soon as possible, because they spread very quickly and have many negative consequences. Detection becomes significantly more difficult when online users write code-mixed text in regions where English is not the primary language. Hate speech detection has recently been reduced to a binary classification task, without taking into account its topical focus and target-oriented nature. Because there is no combined annotated dataset or scientific study that offers insight into the relationship between offense traits, existing techniques usually examine only one or two offense traits at a time. Furthermore, these techniques are not efficient in multilingual settings, where most conversations are code-mixed. In this paper, we propose an optimal feature extraction and hybrid diagonal gated recurrent neural network (FE-DGRNN) for hate speech detection and sentiment analysis in multilingual code-mixed texts. The proposed FE-DGRNN technique consists of three processes. After preprocessing, we first introduce an improved seagull optimization (ISO) algorithm to extract multiple features from the given code-mixed texts. Then, we use a quantum search optimization algorithm to optimize the extracted features, which reduces data dimensionality in the subsequent detection phase. Next, a hybrid diagonal gated recurrent neural network (Hyb-DGRNN) is introduced to detect hate speech and analyze the sentiment of the text. To validate the effectiveness of the proposed FE-DGRNN technique, we conducted experiments on the HASOC 2019 dataset. This dataset includes posts written in English, Hindi and German, allowing us to evaluate the performance of our approach across multiple languages. From the simulation results, we observed that the accuracy of FE-DGRNN is 87.74%, 88.98% and 84.74% for Task-1, Task-2 and Task-3, respectively, on the multilingual code-mixed text dataset. Overall, the proposed FE-DGRNN technique shows a significant improvement in accuracy, precision, recall and F-measure compared to other classifiers, indicating its potential to be a robust and effective tool for hate speech detection and sentiment analysis in multilingual code-mixed texts.
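The four evaluation measures named in the abstract (accuracy, precision, recall and F-measure) can be illustrated with a minimal sketch. It assumes a one-vs-rest view in which a single label is treated as the positive class. The function name `binary_metrics`, the label strings and the toy predictions are hypothetical and are not taken from the paper or from its experiments.

```python
def binary_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall and F-measure for one positive class.

    Assumption: one-vs-rest scoring, with undefined ratios reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels for illustration only (not results from the paper).
gold = ["HOF", "HOF", "NOT", "NOT", "HOF"]
pred = ["HOF", "NOT", "NOT", "HOF", "HOF"]
scores = binary_metrics(gold, pred, positive="HOF")
print(scores)
```

For tasks with more than two classes, one option is to call the same function once per label and average the results; whether the paper uses that or another averaging scheme is not stated in the abstract.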
Data availability
The HASOC 2019 dataset is available on the HASOC website.
Change history
20 March 2024
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s11227-024-06074-0
Author information
Contributions
Both authors contributed to the manuscript. PK prepared the algorithms, tables and figures. SD prepared the literature survey and introduction. All authors reviewed and revised the manuscript.
Ethics declarations
Conflict of interest
The authors do not have any conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kar, P., Debbarma, S. RETRACTED ARTICLE: Multilingual hate speech detection sentimental analysis on social media platforms using optimal feature extraction and hybrid diagonal gated recurrent neural network. J Supercomput 79, 19515–19546 (2023). https://doi.org/10.1007/s11227-023-05361-6
DOI: https://doi.org/10.1007/s11227-023-05361-6