Multimodal Rumor Detection by Using Additive Angular Margin with Class-Aware Attention for Hard Samples

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Abstract

Several factors currently limit the practicality of multimodal rumor detection (MRD): incomplete fusion of multimodal features, the weak discriminative power of softmax-based losses, and the detrimental effect of hard negative samples on learning. To address these issues, we propose an MRD framework that combines a supervised contrastive loss with an additive angular margin and incorporates class-aware attention. A multi-layer fusion (MLF) module aligns and fuses token-level features from the text and image modalities to improve multimodal feature fusion. Adding an angular margin to the loss function strengthens the discriminative power of the contrastive loss, and the class-aware attention module mitigates the impact of hard negative samples on the supervised contrastive loss. Extensive experiments on three real-world multimodal datasets show that the proposed learning objective yields an embedding space that effectively separates rumors from truths. The framework also significantly improves the efficacy of rumor detection, supporting prompt identification and containment of rumor propagation.
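To make the idea of an additive angular margin in a supervised contrastive loss more concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common formulation: for an anchor and a same-class positive, the cosine similarity cos(θ) is replaced by cos(θ + m) before temperature scaling, and the denominator holds that penalized positive plus all negatives. The function name, the default `margin` and `temperature` values, and the exact form of the denominator are illustrative assumptions. The class-aware attention and MLF modules are not modeled here.

```python
import math


def _unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_angular_margin_loss(embeddings, labels, margin=0.2, temperature=0.1):
    """Supervised contrastive loss with an additive angular margin (sketch).

    Assumption: the margin is added to the angle between an anchor and each
    of its positives; negatives are left unchanged. The paper's exact
    formulation may differ. No correction is applied when theta + margin
    exceeds pi, where cos stops being monotonic.
    """
    unit = [_unit(e) for e in embeddings]
    n = len(unit)

    def cos_sim(i, j):
        dot = sum(a * b for a, b in zip(unit[i], unit[j]))
        return max(-1.0, min(1.0, dot))  # clamp for a safe acos

    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        neg_sum = sum(
            math.exp(cos_sim(i, j) / temperature)
            for j in range(n)
            if labels[j] != labels[i]
        )
        anchor_loss = 0.0
        for p in positives:
            theta = math.acos(cos_sim(i, p))
            # Penalize the positive pair by widening its angle by `margin`.
            pos = math.exp(math.cos(theta + margin) / temperature)
            anchor_loss -= math.log(pos / (pos + neg_sum))
        total += anchor_loss / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters standing in for "rumor" and "truth" embeddings.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
loss_without_margin = supcon_angular_margin_loss(emb, labels, margin=0.0)
loss_with_margin = supcon_angular_margin_loss(emb, labels, margin=0.2)
```

On this toy batch the margin makes each positive pair look less similar than it is, so the loss with the margin is larger than the loss without it. Minimizing the penalized loss therefore pushes same-class embeddings closer together than the plain contrastive loss would require.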

This work was supported by the open project of key laboratory, Xinjiang Uygur Autonomous Region (No. 2023D04079).

Author information

Corresponding author

Correspondence to Xiuhong Li.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhou, C. et al. (2024). Multimodal Rumor Detection by Using Additive Angular Margin with Class-Aware Attention for Hard Samples. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14425. Springer, Singapore. https://doi.org/10.1007/978-981-99-8429-9_27

  • DOI: https://doi.org/10.1007/978-981-99-8429-9_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8428-2

  • Online ISBN: 978-981-99-8429-9

  • eBook Packages: Computer Science, Computer Science (R0)
