MMHS: Multimodal Model for Hate Speech Intensity Prediction

Goel, Aman; Poswal, Abhishek

doi:10.1007/978-3-031-78014-1_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 15300)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

This paper presents a novel multimodal model that integrates image and large language model capabilities to improve hate intensity prediction, traditionally a purely text-based task. Accurately assessing hate speech intensity is crucial for moderating and regulating vast online communities by normalizing hate speech [1] while balancing free expression and responsible communication. Our approach leverages insights from both the visual and language domains, resulting in the Multimodal Model for Hate Speech (MMHS). We demonstrate that MMHS achieves state-of-the-art performance on the NACL dataset, surpassing previous benchmarks with a Root Mean Squared Error (RMSE) that is 0.350 lower and Pearson and Cosine scores that are 0.132 and 0.012 higher, respectively. Additionally, user preference surveys indicate that our predictions are significantly favored over those of Masud et al. [1], by 16.67%. This work advances the technical landscape of hate speech detection and enriches our understanding of online discourse, enabling more effective moderation strategies. (Disclaimer: This paper includes examples of hate speech that contain some profane words. These examples are included only for contextual understanding. We tried our best to censor vulgar, offensive, or hateful words. We assert that we do not support these views in any way.)

A. Goel and A. Poswal contributed equally to this work.
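
The three evaluation metrics named in the abstract (RMSE, Pearson, and Cosine) can be illustrated with a minimal sketch. It assumes that predicted and gold hate intensity scores are available as equal-length lists of numbers; the function names and the sample scores are hypothetical and are not taken from the paper, whose exact evaluation code is not shown here.

```python
import math


def rmse(pred, gold):
    """Root mean squared error between two score lists (lower is better)."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold))


def pearson(pred, gold):
    """Pearson correlation coefficient (higher is better, range -1 to 1)."""
    n = len(gold)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_g = math.sqrt(sum((g - mean_g) ** 2 for g in gold))
    return cov / (norm_p * norm_g)


def cosine(pred, gold):
    """Cosine similarity between the two score vectors (higher is better)."""
    dot = sum(p * g for p, g in zip(pred, gold))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_g = math.sqrt(sum(g * g for g in gold))
    return dot / (norm_p * norm_g)


if __name__ == "__main__":
    # Hypothetical intensity scores, for illustration only.
    gold_scores = [2.0, 5.0, 7.0, 9.0]
    pred_scores = [3.0, 4.0, 8.0, 9.0]
    print("RMSE:   ", rmse(pred_scores, gold_scores))
    print("Pearson:", pearson(pred_scores, gold_scores))
    print("Cosine: ", cosine(pred_scores, gold_scores))
```

RMSE is an error, so an improvement shows up as a lower value, while Pearson and Cosine are similarity scores where an improvement shows up as a higher value. This matches the direction of the differences reported in the abstract.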

References

Masud, S., et al.: Proactively reducing the hate intensity of online posts via hate speech normalization. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 3524–3534 (2022)
Awan, I., Zempi, I.: The affinity between online and offline anti-Muslim hate crime: dynamics and impacts. Aggression Violent Behav. 27, 1–8 (2016). https://doi.org/10.1016/j.avb.2016.02.001
Lupu, Y., et al.: Offline events and online hate. PLoS ONE 18(1), e0278511 (2023). https://doi.org/10.1371/journal.pone.0278511
Wiedlitzka, S., et al.: Hate in word and deed: the temporal association between online and offline islamophobia. J. Quant. Criminol. (2023)
Hee, M.S., et al.: Recent advances in hate speech moderation: multimodality and the role of large models. arXiv preprint (2024)
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning, ICML 2021, pp. 8748–8763 (2021)
Touvron, H., et al.: LLaMA: open and efficient foundation language models. arXiv preprint (2023)
Jiang, A.Q., et al.: Mistral 7B. arXiv preprint (2023)
Gemini Team, et al.: Gemini: a family of highly capable multimodal models. arXiv preprint (2024)
Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint (2019)
Waseem, Z., Hovy, D.: Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: NAACL SRW, pp. 88–93 (2016). https://doi.org/10.18653/v1/N16-2013
Founta, A.-M., et al.: Large scale crowdsourcing and characterization of Twitter abusive behavior. In: ICWSM (2018)
ElSherief, M., et al.: Latent hatred: a benchmark for understanding implicit hate speech. In: EMNLP, pp. 345–363 (2021). https://doi.org/10.18653/v1/2021.emnlp-main.29
Kennedy, B., et al.: The Gab hate corpus: a collection of 27k posts annotated for hate speech. In: PsyArXiv (2018)
Kiela, D., et al.: The hateful memes challenge: detecting hate speech in multimodal memes. In: NeurIPS (2020)
Wu, C., Bhandary, U.: Detection of hate speech in videos using machine learning. In: International Conference on Computational Science and Computational Intelligence. IEEE (2020)
Barakat, M.S., et al.: Detecting offensive user video blogs: an adaptive keyword spotting approach. In: ICALIP, pp. 419–425 (2012). https://doi.org/10.1109/ICALIP.2012.6376654
Wazir, A., et al.: Spectrogram-based classification of spoken foul language using deep CNN. In: MMSP (2020)
Boishakhi, F., et al.: Multi-modal hate speech detection using machine learning. In: Big Data. IEEE (2021)
Cao, R., Lee, R.: HateGAN: adversarial generative-based data augmentation for hate speech detection. In: COLING (2020)
Brown, T.B., et al.: Language models are few-shot learners. arXiv preprint (2020)
Huang, Z., et al.: Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint (2015)
Basile, V., et al.: SemEval-2019 task 5: multilingual detection of hate speech against immigrants and women in Twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval) (2019)
Chung, Y., et al.: CONAN - COunter NArratives through Nichesourcing: a multilingual dataset of responses to fight online hate speech. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp. 2819–2829 (2019). https://doi.org/10.18653/v1/P19-1271
Davidson, T., et al.: Racial bias in hate speech and abusive language detection datasets. In: Proceedings of the Third Workshop on Abusive Language Online, Florence, Italy, pp. 25–35 (2019)
Gibert, O., et al.: Hate speech dataset from a white supremacy forum. In: ALW, pp. 11–20 (2018)
Jha, A., Mamidi, R.: When does a compliment become sexist? Analysis and classification of ambivalent sexism using Twitter data. In: Proceedings of the Second Workshop on NLP and Computational Social Science, Vancouver, Canada, pp. 7–16 (2017)
Mathew, B., et al.: HateXplain: a benchmark dataset for explainable hate speech detection. Proc. AAAI Conf. Artif. Intell. 35(17), 14867–14875 (2021)
van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. JMLR 9, 2579–2605 (2008)

Acknowledgments

This work is based on sensitive material, so we would like to thank the human participants in our survey. (The authors have no conflicting interests related to the scope of this article.)

Author information

Authors and Affiliations

Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213, USA
Aman Goel
The Hong Kong University of Science and Technology, Kowloon, Hong Kong
Abhishek Poswal

Corresponding author

Correspondence to Aman Goel.

Editor information

Editors and Affiliations

St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
University of Novi Sad, Novi Sad, Serbia
Vlado Delić

About this paper

Cite this paper

Goel, A., Poswal, A. (2025). MMHS: Multimodal Model for Hate Speech Intensity Prediction. In: Karpov, A., Delić, V. (eds) Speech and Computer. SPECOM 2024. Lecture Notes in Computer Science, vol 15300. Springer, Cham. https://doi.org/10.1007/978-3-031-78014-1_8

DOI: https://doi.org/10.1007/978-3-031-78014-1_8
Published: 22 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78013-4
Online ISBN: 978-3-031-78014-1
eBook Packages: Computer Science, Computer Science (R0)
