research-article

MMHFND: Fusing Modalities for Multimodal Multiclass Hindi Fake News Detection via Contrastive Learning

Published: 21 November 2024

Abstract

Multimodal content carries more deception than unimodal information, causing significant social and economic harm. Current techniques often focus on a single modality and neglect knowledge fusion, and while most studies have concentrated on English fake news detection, this study explores multimodality for low-resource languages such as Hindi. This work introduces the MMHFND model, based on M-CLIP, which uses late fusion in both a coarse-grained (Fake vs. Real) and a fine-grained (World vs. India vs. Politics vs. News vs. Fact-Check) configuration. To analyze multimodal Hindi news written in Devanagari script, we extract deep representations from images and text using the ResNet-50 image encoder; a BERT-based L3cube-HindRoberta text transformer that handles headlines, article content, OCR-extracted text, and image captions; paired M-CLIP transformers; and an Error-Level Analysis (ELA) image forensics method built on EfficientNet B0. M-CLIP integrates cross-modal similarity mapping of images and texts with the retrieved multimodal features. The extracted features undergo redundancy reduction before being channeled into the final classifier. We introduce a Modality Attention Mechanism (MAM) that generates weights for each modality individually. The MMHFND model computes a modality divergence score to identify dissonance between modalities and applies a modified contrastive loss to that score. We thoroughly analyze the HinFakeNews dataset in a multimodal context, achieving high accuracy in both coarse- and fine-grained configurations, and we undertake an ablation study to evaluate outcomes and explore alternative fusion processes across three setups. The results show that MMHFND detects fake news in Hindi with an accuracy of 0.986, outperforming existing multimodal approaches.
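The interplay of modality attention, the divergence score, and a contrastive loss on that score can be illustrated with a minimal sketch. The code below is an illustrative approximation, not the authors' implementation: the scalar scoring inside `modality_attention` (the paper learns these weights), the function names, and the hinge-style `margin` are all assumptions made for this example.

```python
import math

def modality_attention(features):
    """Softmax weights over per-modality scalar scores.

    `features` maps modality name -> feature vector. Here each score is the
    vector's L2 norm as a stand-in for a learned scorer.
    """
    names = list(features)
    scores = [math.sqrt(sum(x * x for x in features[n])) for n in names]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return {n: e / total for n, e in zip(names, exps)}

def divergence_score(text_vec, image_vec):
    """Cosine dissimilarity between two modality embeddings (0 = agreement)."""
    dot = sum(a * b for a, b in zip(text_vec, image_vec))
    nt = math.sqrt(sum(a * a for a in text_vec))
    nv = math.sqrt(sum(b * b for b in image_vec))
    return 1.0 - dot / (nt * nv)

def divergence_contrastive_loss(score, label, margin=0.5):
    """Contrastive-style loss on the divergence score.

    Real news (label 0) is pushed toward agreeing modalities (low divergence);
    fake news (label 1) is pushed toward divergence of at least `margin`.
    """
    if label == 0:
        return score ** 2
    return max(0.0, margin - score) ** 2
```

For example, identical text and image embeddings yield a divergence of 0, so a real-labeled pair incurs zero loss, while a fake-labeled pair whose divergence already exceeds the margin also incurs zero loss.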


    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 23, Issue 11
    November 2024
    248 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613714

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 November 2024
    Online AM: 12 August 2024
    Accepted: 25 July 2024
    Revised: 20 July 2024
    Received: 28 June 2024
    Published in TALLIP Volume 23, Issue 11


    Author Tags

    1. Hindi fake news
    2. multimodal
    3. multiclass
    4. CLIP
    5. feature fusion
    6. contrastive learning

