ABSTRACT
Fake news has found fertile ground on social media, and a global health crisis such as COVID-19 accelerates its spread. Much research has gone into developing AI systems that classify news as real or fake. However, there is growing concern about trust in these AI systems. To this end, we attempt to improve the trustworthiness of AI text classification systems. We use tools to explore the data, explain the feature extraction techniques, interpret the ML models implemented, and explain the decision-making process of the AI systems. We compared five ML classifiers: Naive Bayes, Support Vector Machine (SVM), Logistic Regression, Decision Tree, and Random Forest. The models were trained on 10,700 COVID-19-related tweets, of which 5,600 were real and 5,100 were fake. The SVM model performed best, with a detection accuracy of 0.93 and F1 scores of 0.94 and 0.93 for real and fake news, respectively. Global and local explanations are included to make the model's behavior understandable, ensuring transparency and fostering users' confidence in AI. Because it was the best-performing model in this study, the SVM is the model used in the explanation section.
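The abstract reports one overall accuracy and a separate F1 score for each class. As a minimal sketch of how such per-class figures are typically derived from predictions, the following pure-Python example computes accuracy, precision, recall, and F1 for the two labels. The function name `per_class_report` and the toy labels are assumptions made for illustration; they are not the paper's code or data, and the numbers they produce are unrelated to the reported results.

```python
from collections import Counter


def per_class_report(y_true, y_pred, labels=("real", "fake")):
    """Return overall accuracy plus precision/recall/F1 for each label.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))

    correct = sum(pairs[(label, label)] for label in labels)
    report = {"accuracy": correct / len(y_true)}

    for label in labels:
        others = [other for other in labels if other != label]
        tp = pairs[(label, label)]                      # predicted label, was label
        fp = sum(pairs[(other, label)] for other in others)  # predicted label, was other
        fn = sum(pairs[(label, other)] for other in others)  # predicted other, was label

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    return report


# Toy labels invented for this example (NOT the paper's dataset).
y_true = ["real", "real", "real", "real", "fake", "fake", "fake", "fake"]
y_pred = ["real", "real", "real", "fake", "fake", "fake", "fake", "fake"]

report = per_class_report(y_true, y_pred)
print(report)
```

Reporting F1 per class, rather than a single averaged value, shows whether a classifier is noticeably weaker on real or on fake tweets, which a single accuracy figure can hide.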
Index Terms
- Explainability in NLP model: Detection of Covid-19 Twitter Fake News