DOI: 10.1145/3613330.3613340

Unsupervised Cross-Domain Rumor Detection from Multiple Sources Based on RoBERTa and Multi-CNN

Published: 28 September 2023

ABSTRACT

Internet rumors are prevalent and harmful to society, so automatic rumor detection is essential. However, supervised learning methods are impractical in the early stage of rumor propagation because data labeling is costly. Moreover, rumors can originate from multiple source domains, a scenario that single-source domain adaptation methods cannot handle. To address these challenges, this paper proposes a rumor detection model that combines unsupervised learning and multi-source domain adaptation, built on the RoBERTa pre-trained model and multiple convolutional neural networks. The model uses only microblog text content to transfer knowledge from multiple source domains to the target domain. We find that RoBERTa effectively extracts dynamic text representations, and that the domain discrepancy is better reduced by aligning the distributions of each pair of source and target domains in multiple feature spaces. Additionally, in terms of F1 score and time cost, dynamic distribution adaptation performs best in the quantitative evaluation. Finally, extensive experiments demonstrate that the proposed model outperforms transfer-learning-based baseline models on rumor detection.
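The abstract describes the pipeline only at a high level. The sketch below is one plausible reading of it, not the authors' released code: RoBERTa token features are passed through a TextCNN with several kernel sizes, and each labeled source domain is aligned with the unlabeled target domain via a maximum mean discrepancy (MMD) penalty. The checkpoint name, kernel sizes, single-Gaussian kernel, and loss weight are illustrative assumptions; the paper's best-performing variant uses dynamic distribution adaptation rather than plain MMD.

```python
# Minimal sketch, assuming a HuggingFace RoBERTa checkpoint for Chinese
# microblog text (the name below is a guess) and a single-Gaussian-kernel MMD
# as the source/target alignment criterion. Not the paper's actual model.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiCNNEncoder(nn.Module):
    """TextCNN over RoBERTa token embeddings with several kernel sizes."""
    def __init__(self, hidden=768, channels=128, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, channels, k) for k in kernel_sizes)

    def forward(self, token_states):          # (batch, seq_len, hidden)
        x = token_states.transpose(1, 2)       # Conv1d expects (batch, hidden, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return torch.cat(pooled, dim=1)        # (batch, channels * num_kernels)

def gaussian_mmd(xs, xt, sigma=1.0):
    """Squared MMD with one Gaussian kernel (multi-kernel variants are common)."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(xs, xs).mean() + k(xt, xt).mean() - 2 * k(xs, xt).mean()

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")  # assumed checkpoint
roberta = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")
encoder, classifier = MultiCNNEncoder(), nn.Linear(128 * 3, 2)

def features(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(roberta(**batch).last_hidden_state)

# Align one (source, target) pair; in the multi-source setting this term is
# computed for every source domain, while the classifier is trained on
# labeled source batches only.
src_feat = features(["example labeled microblog post from a source domain"])
tgt_feat = features(["example unlabeled microblog post from the target domain"])
loss = nn.functional.cross_entropy(classifier(src_feat), torch.tensor([1])) \
       + 0.5 * gaussian_mmd(src_feat, tgt_feat)  # 0.5 is an assumed trade-off weight
```

In this reading, each source domain contributes its own alignment term and classifier loss, so swapping the MMD term for a subdomain or dynamic distribution adaptation criterion changes only the discrepancy function, not the overall structure.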


Published in

ICDLT '23: Proceedings of the 2023 7th International Conference on Deep Learning Technologies
July 2023, 115 pages
ISBN: 9798400707520
DOI: 10.1145/3613330

Copyright © 2023 ACM
            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • research-article
            • Research
            • Refereed limited
