ABSTRACT
Internet rumors are prevalent and harmful to society, making automatic rumor detection essential. However, supervised learning methods are impractical in the early stage of rumor propagation because of the high cost of data labeling. Moreover, rumors can originate from several different source domains, a scenario that single-source domain adaptation methods cannot handle. To address these challenges, this paper proposes a rumor detection model based on the RoBERTa pre-trained model and multiple convolutional neural networks (multi-CNN) that combines unsupervised learning with multi-source domain adaptation. Using only microblog text content, the model transfers knowledge from multiple source domains to the target domain. We find that RoBERTa effectively extracts dynamic text representations, and that aligning the distributions of each pair of source and target domains in multiple feature spaces better reduces the domain discrepancy. In addition, measured by F1 score and time cost, dynamic distribution adaptation performs best in the quantitative evaluation. Finally, extensive experiments demonstrate that the proposed model outperforms baseline models based on transfer learning in rumor detection.
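The abstract describes reducing cross-domain discrepancy by aligning the distributions of each source–target pair in feature space. A standard measure for such alignment is the maximum mean discrepancy (MMD); the sketch below is a minimal NumPy estimator with a Gaussian kernel, offered as an illustration only — the paper's exact alignment loss, kernel, and bandwidth are assumptions here.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of a and b.
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two feature samples."""
    k_ss = gaussian_kernel(source, source, sigma)
    k_tt = gaussian_kernel(target, target, sigma)
    k_st = gaussian_kernel(source, target, sigma)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()

# Toy demo with hypothetical extracted features: a mean-shifted target
# domain yields a larger discrepancy than a same-distribution sample.
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (200, 8))   # source-domain features
tgt = rng.normal(1.0, 1.0, (200, 8))   # shifted target-domain features
print(mmd2(src, tgt))
```

In a multi-source setting, one such discrepancy term per source–target pair can be summed into the training objective alongside the classification loss, which matches the pairwise alignment the abstract describes.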
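The abstract also reports that dynamic distribution adaptation performed best. In the usual formulation, this combines a marginal discrepancy and class-conditional discrepancies with a balance factor μ that is re-estimated during training. The sketch below illustrates only the combined distance, using a simplified linear-kernel MMD (squared distance between sample means); the helper names, the pseudo-label source, and the estimator choice are assumptions, not the paper's implementation.

```python
import numpy as np

def linear_mmd2(a, b):
    # Linear-kernel MMD^2: squared Euclidean distance between sample means.
    diff = a.mean(axis=0) - b.mean(axis=0)
    return float(diff @ diff)

def dynamic_distribution_distance(xs, ys, xt, yt_pseudo, mu):
    """(1 - mu) * marginal discrepancy + mu * mean conditional discrepancy.

    yt_pseudo are target pseudo-labels (in unsupervised adaptation these
    would come from the current classifier); mu in [0, 1] balances the
    marginal and conditional terms and can be updated each epoch.
    """
    marginal = linear_mmd2(xs, xt)
    classes = np.intersect1d(np.unique(ys), np.unique(yt_pseudo))
    conditional = float(np.mean(
        [linear_mmd2(xs[ys == c], xt[yt_pseudo == c]) for c in classes]
    ))
    return (1.0 - mu) * marginal + mu * conditional
```

With μ = 0 this reduces to purely marginal alignment and with μ = 1 to purely conditional alignment; dynamic adaptation interpolates between the two as training progresses.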