DOI: 10.1145/3633083.3633212
research-article

Explainability in NLP model: Detection of Covid-19 Twitter Fake News

Published: 14 December 2023

ABSTRACT

Fake news has found fertile ground on social media, and a global health crisis such as COVID-19 further accelerates its spread. Much research has been done to develop AI systems that classify news as real or fake. However, there is growing concern about trust in these AI systems. To this end, we attempt to improve the trustworthiness of AI text classification systems. We use tools to explore data, explain feature extraction techniques, interpret the ML models implemented, and explain the decision-making process of AI systems. We compared five ML classifiers: Naive Bayes, Support Vector Machines (SVMs), Logistic Regression, Decision Tree, and Random Forest. The models were trained on 10,700 tweets related to COVID-19, of which 5,600 were real and 5,100 were fake. The SVM model performed best, with a detection accuracy of 0.93 and F1 scores of 0.94 and 0.93 for real and fake news, respectively. Global and local explanations are included to help users understand the overall model behavior, ensuring transparency and fostering confidence among AI users. We chose the SVM model for the explanation section because it was the best-performing model in this study.
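
The distinction between global and local explanations can be illustrated with a minimal sketch for a linear classifier. The abstract does not name the feature extraction method or the explanation tools used, so everything below is an assumption made for illustration: the vocabulary, the weights, the bias, and the function names are hypothetical and are not taken from the paper. A global explanation ranks features by how strongly they influence the model overall, while a local explanation shows how much each word contributed to the score of one particular tweet.

```python
# Hypothetical weights of a linear classifier over word-count features.
# Positive weights push the score toward "fake", negative toward "real".
# These values are made up for illustration; they are NOT from the paper.
WEIGHTS = {
    "cure": 1.4,
    "miracle": 1.1,
    "hoax": 0.9,
    "cases": -0.8,
    "reported": -1.0,
    "ministry": -1.2,
}
BIAS = 0.1  # hypothetical intercept


def featurize(text):
    """Count how often each known vocabulary word occurs in the text."""
    counts = {}
    for token in text.lower().split():
        if token in WEIGHTS:
            counts[token] = counts.get(token, 0) + 1
    return counts


def global_explanation(top_k=3):
    """Rank vocabulary words by absolute weight (overall model behavior)."""
    ranked = sorted(WEIGHTS.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]


def local_explanation(text):
    """Per-word contribution (weight * count) to the score of one input."""
    counts = featurize(text)
    contributions = {word: WEIGHTS[word] * n for word, n in counts.items()}
    score = BIAS + sum(contributions.values())
    label = "fake" if score > 0 else "real"
    return label, score, contributions


if __name__ == "__main__":
    print("Global:", global_explanation())
    print("Local:", local_explanation("Miracle cure reported by ministry"))
```

For a linear model the two views share the same weights: the global ranking is independent of any input, whereas the local contributions depend on which words actually appear in the tweet being explained.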

Published in

HCAIep '23: Proceedings of the 2023 Conference on Human Centered Artificial Intelligence: Education and Practice
December 2023
63 pages
ISBN: 9798400716461
DOI: 10.1145/3633083

Copyright © 2023 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article, Research, Refereed limited