
An Inter-modal Attention Framework for Multimodal Offense Detection

Conference paper in Intelligent Computing & Optimization (ICO 2022)

Abstract

In recent years, the posting of offensive content on social media platforms has increased, and memes play a vital role in continuing this trend. Memes usually come in multimodal form and can promulgate offensive views under the guise of humour. Because a meme’s visual and textual information is often disparate, such content can easily evade surveillance systems. Therefore, an automatic offensive meme detection system is required to combat malign content and maintain social media harmony. Past studies have explored various multimodal fusion techniques to combine the visual and textual information of memes. However, these techniques fail to learn the significant relationships within the multimodal content and perform poorly in offensive meme classification. This paper proposes an inter-modal attention framework that synergistically fuses the visual and textual information by focusing on modality-specific features. The state-of-the-art pre-trained VGG19 and BERT models are employed to extract the visual and textual features, respectively. The proposed method is compared against several baseline models on a publicly available multimodal offensive dataset. The results demonstrate that the proposed approach outperforms all baselines, achieving the highest weighted F1-score of 0.635.
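The abstract describes the fusion mechanism only at a high level. The snippet below is a minimal, hypothetical sketch (not the authors' released implementation) of how modality-specific features, e.g. a VGG19 fully-connected-layer vector and a BERT [CLS] embedding, could be fused with bidirectional cross-attention before classification; the dimensions, layer choices, and use of PyTorch's nn.MultiheadAttention are assumptions made purely for illustration.

```python
# Hypothetical inter-modal attention fusion block (illustrative sketch only).
# Assumes visual features from VGG19 (e.g., a 4096-d fc-layer output) and
# textual features from BERT (a 768-d [CLS] embedding) are already extracted.
import torch
import torch.nn as nn

class InterModalAttentionFusion(nn.Module):
    def __init__(self, visual_dim=4096, text_dim=768, hidden_dim=512, num_heads=4):
        super().__init__()
        # Project both modalities into a shared space.
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # Cross-attention in both directions: text attends to image and vice versa.
        self.text_to_visual = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.visual_to_text = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, 2),  # offensive vs. non-offensive
        )

    def forward(self, visual_feats, text_feats):
        # visual_feats: (batch, visual_dim), text_feats: (batch, text_dim)
        v = self.visual_proj(visual_feats).unsqueeze(1)  # (batch, 1, hidden)
        t = self.text_proj(text_feats).unsqueeze(1)      # (batch, 1, hidden)
        # Text queries attend over visual keys/values, and vice versa.
        t_attended, _ = self.text_to_visual(query=t, key=v, value=v)
        v_attended, _ = self.visual_to_text(query=v, key=t, value=t)
        fused = torch.cat([t_attended.squeeze(1), v_attended.squeeze(1)], dim=-1)
        return self.classifier(fused)

# Example usage with random tensors standing in for VGG19/BERT outputs.
model = InterModalAttentionFusion()
logits = model(torch.randn(8, 4096), torch.randn(8, 768))
print(logits.shape)  # torch.Size([8, 2])
```

Attending in both directions lets each modality re-weight the other's features, which is the intuition behind inter-modal attention; in practice the inputs would come from fine-tuned VGG19 and BERT encoders rather than random tensors.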




Author information

Correspondence to Mohammed Moshiul Hoque.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG


Cite this paper

Hossain, E., Hoque, M.M., Hossain, M.A. (2023). An Inter-modal Attention Framework for Multimodal Offense Detection. In: Vasant, P., Weber, GW., Marmolejo-Saucedo, J.A., Munapo, E., Thomas, J.J. (eds) Intelligent Computing & Optimization. ICO 2022. Lecture Notes in Networks and Systems, vol 569. Springer, Cham. https://doi.org/10.1007/978-3-031-19958-5_81
