DOI: 10.1145/3477314.3507134
research-article

D-FEND: a diffusion-based fake news detection framework for news articles related to COVID-19

Published: 06 May 2022

ABSTRACT

The social confusion caused by the COVID-19 pandemic has been further amplified by fake news diffused via social media. For this reason, many studies have proposed methods to detect fake news as early as possible. Content-based detection methods exploit the differences between the content of true and fake news articles. However, they suffer from two serious limitations: (1) a publisher can easily manipulate the content of a news article, and (2) the content depends on the language in which the article is written. Diffusion-based fake news detection methods have been proposed to overcome these limitations; they exploit the differences between the diffusion patterns of true and fake news articles on social media. Despite their success, the lack of diffusion information on COVID-19-related fake news has hindered the study of diffusion-based detection methods in this setting. To address this limitation, we propose a diffusion-based fake news detection framework (D-FEND), which consists of four components: (C1) diffusion data collection, (C2) data analysis and feature extraction, (C3) model training, and (C4) inference. Our work contributes to the effort to mitigate the risk of infodemics during a pandemic by (1) building a new diffusion dataset, named CoAID+, (2) identifying and addressing the class imbalance problem of CoAID+, and (3) demonstrating that D-FEND detects fake news articles with an average model accuracy of 88.89%.
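
To make component (C2) and the class imbalance issue more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify which diffusion features D-FEND extracts or how CoAID+ is rebalanced, so the three features (cascade size, depth, maximum breadth), the function names `cascade_features` and `random_oversample`, and the choice of plain random oversampling are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def cascade_features(edges, root):
    """Return (size, depth, max_breadth) of a diffusion tree.

    edges: list of (parent, child) pairs, e.g. a post and a repost of it.
    Assumes the edges form a tree rooted at `root` (no cycles).
    These features are illustrative, not the paper's feature set.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    size, depth, max_breadth = 0, 0, 0
    level = [root]
    # Walk the tree level by level (breadth-first).
    while level:
        size += len(level)
        max_breadth = max(max_breadth, len(level))
        next_level = [c for node in level for c in children[node]]
        if next_level:
            depth += 1
        level = next_level
    return size, depth, max_breadth


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until classes are balanced.

    A simple stand-in for whatever balancing technique is actually used.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for x, y in zip(samples, labels):
        by_label[y].append(x)

    target = max(len(xs) for xs in by_label.values())
    out_x, out_y = [], []
    for y, xs in by_label.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y
```

In such a sketch, each news article's cascade would be reduced to a feature vector by `cascade_features`, and `random_oversample` would be applied to the training split only, so that duplicated samples never leak into the evaluation data.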

    • Published in

      SAC '22: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
      April 2022
      2099 pages
      ISBN: 9781450387132
      DOI: 10.1145/3477314

      Copyright © 2022 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
