Abstract
Social media is becoming a source of news for many people due to its ease and freedom of use. As a result, fake news has been spreading quickly and easily regardless of its credibility, especially in the last decade. Fake news publishers take advantage of critical situations such as the Covid-19 pandemic and the American presidential elections to affect societies negatively. Fake news can seriously impact society in many fields including politics, finance, sports, etc. Many studies have been conducted to help detect fake news in English, but research conducted on fake news detection in the Arabic language is scarce. Our contribution is twofold: first, we have constructed a large and diverse Arabic fake news dataset. Second, we have developed and evaluated transformer-based classifiers to identify fake news while utilizing eight state-of-the-art Arabic contextualized embedding models. The majority of these models had not been previously used for Arabic fake news detection. We conduct a thorough analysis of the state-of-the-art Arabic contextualized embedding models as well as comparison with similar fake news detection systems. Experimental results confirm that these state-of-the-art models are robust, with accuracy exceeding 98%.








Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
khaleejtimes.com.
References
Jardaneh G, Abdelhaq H, Buzz M, Johnson D (2019) Classifying Arabic tweets based on credibility using content and user features. In: 2019 IEEE jordan international joint conference on electrical engineering and information technology, JEEIT 2019—proceedings. institute of electrical and electronics engineers Inc., pp 596–601
Injadat M, Salo F, Nassif AB (2016) Data mining techniques in social media: a survey. Neurocomputing 214. https://doi.org/10.1016/j.neucom.2016.06.045
Mehta D, Dwivedi A, Patra A, Anand Kumar M (2021) A transformer-based architecture for fake news classification. Soc Netw Anal Min 11:39. https://doi.org/10.1007/s13278-021-00738-y
de Souza JV, Gomes J Jr, Souza Filhode, FM et al (2020) A systematic mapping on automatic classification of fake news in social media. Soc Netw Anal Min 10:48. https://doi.org/10.1007/s13278-020-00659-2
Injadat M, Moubayed A, Nassif AB, Shami A (2021) Machine learning towards intelligent systems: applications, challenges, and opportunities. Artif Intell Rev 54:3299–3348
Nassif AB, Shahin I, Attili I et al (2019) Speech recognition using deep neural networks: a systematic review. IEEE Access 7:19143–19165. https://doi.org/10.1109/ACCESS.2019.2896880
Nassif AB, Shahin I, Bader M et al (2022) COVID-19 detection systems using deep-learning algorithms based on speech and image data. Mathematics 10:564
Hijazi H, Abu Talib M, Hasasneh A et al (2021) Wearable Devices, smartphones, and interpretable artificial intelligence in combating COVID-19. Sensors 21. https://doi.org/10.3390/s21248424
Douai A (2019) Global, Arab media in the post-truth era: globalization, authoritarianism and fake news. IEMed Mediterr Yearb 2019:124–132
Nassif AB, Darya AM, Elnagar A (2022) Empirical evaluation of shallow and deep learning classifiers for Arabic sentiment analysis. Trans Asian Low-Resour Lang Inf Process 21:1–25
Oueslati O, Cambria E, Ben HM, Ounelli H (2020) A review of sentiment analysis research in Arabic language. Futur Gener Comput Syst 112:408–430. https://doi.org/10.1016/j.future.2020.05.034
Nassif AB, Elnagar A, Shahin I, Henno S (2020) Deep learning for Arabic subjective sentiment analysis: challenges and research opportunities. Appl Soft Comput 98:106836
Boudad N, Faizi R, Thami ROH, Chiheb R (2018) Sentiment analysis in arabic: a review of the literature. Ain Shams Eng J 9:2479–2490. https://doi.org/10.1016/j.asej.2017.04.007
Vilares D, Peng H, Satapathy R, Cambria E (2019) BabelSenticNet: a Commonsense reasoning framework for multilingual sentiment analysis. In: Proc 2018 IEEE Symp Ser Comput Intell SSCI 2018 1292–1298. https://doi.org/10.1109/SSCI.2018.8628718
Alwaneen TH, Azmi AM, Aboalsamh HA et al (2021) Arabic question answering system: a survey. Artif Intell Rev. https://doi.org/10.1007/S10462-021-10031-1
Elmadany A, Abdul-Mageed M, Alhindi T (2020) Machine generation and detection of Arabic manipulated and fake news. In: Proceedings of the Fifth Arabic Natural Language Processing Workshop, pp. 69–84
Saadany H, Mohamed E, Or˘ C (2020) Fake or real? A study of arabic satirical fake news. Online
Helwe C, Elbassuoni S, Al Zaatari A, El-Hajj W (2019) Assessing arabic weblog credibility via deep co-learning. association for computational linguistics (ACL), pp 130–136
Rangel F, Rosso P, Charfi A, Zaghouani W (2019) Detecting deceptive tweets in arabic for cyber-security. In: 2019 IEEE international conference on intelligence and security informatics, ISI 2019. Institute of Electrical and Electronics Engineers Inc., pp 86–91
El Ballouli R, El-Hajj W, Ghandour A, et al (2017) CAT: credibility analysis of arabic content on Twitter. Association for computational linguistics (ACL), pp 62–71
Haouari F, Ali ZS, Elsayed T (2019) bigIR at CLEF 2019: automatic verification of arabic claims over the Web. undefined
Sabbeh SF, Baatwah SY (2018) Arabic news credibility on twitter: an enhanced model using hybrid features. J Theor Appl Inf Technol 96(8):2327–2338
Sutanto DH, Ghani MKA (2015) A benchmark of classification framework for non-communicable disease prediction: a review. ARPN J Eng Appl Sci 10:9941–9955
Alkhair M, Meftouh K, Smaïli K, Othman N (2019) An arabic corpus of fake news: collection, analysis and classification. Commun Comput Inf Sci 1108:292–302. https://doi.org/10.1007/978-3-030-32959-4_21
Hadj Ameur MS, Aliane H (2021) AraCOVID19-MFH: arabic COVID-19 multi-label fake news and hate speech detection dataset. Procedia CIRP 189:232–241. https://doi.org/10.1016/j.procs.2021.05.086
Al-Yahya M, Al-Khalifa H, Al-Baity H et al (2021) Arabic fake news detection: comparative study of neural networks and transformer-based approaches. Complexity 2021. https://doi.org/10.1155/2021/5516945
Ozbay FA, Alatas B (2020) Fake news detection within online social media using supervised artificial intelligence algorithms. Phys A Stat Mech Appl 540:123174. https://doi.org/10.1016/j.physa.2019.123174
Traylor T, Straub J, Gurmeet, Snell N (2019) Classifying fake news articles using natural language processing to identify in-article attribution as a supervised learning estimator. In: Proceedings—13th IEEE international conference on semantic computing, ICSC 2019. Institute of Electrical and Electronics Engineers Inc., pp 445–449
Yang KC, Niven T, Kao HY (2019) Fake news detection as natural language inference. In: 12th ACM International conference on web search and data mining (WSDM-2019) (in Fake News Classification Challenge, WSDM Cup 2019)
Kaliyar RK (2018) Fake news detection using a deep neural network. In: 2018 4th international conference on computing communication and automation, ICCCA 2018. Institute of Electrical and Electronics Engineers Inc.
Antoun W, Baly F, Achour R, et al (2020) State of the art models for fake news detection tasks. In: 2020 IEEE international conference on informatics, IoT, and enabling technologies, ICIoT 2020. Institute of Electrical and Electronics Engineers Inc., pp 519–524
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, ..., Stoyanov V (2020) Unsupervised cross-lingual representation learning at scale. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp. 8440–8451
Lan W, Chen Y, Xu W, Ritter A (2020) An empirical study of pre-trained transformers for arabic information extraction. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP)
Antoun W, Baly F, Hajj H (2020) AraBERT: Transformer-based model for arabic language understanding. In: LREC 2020 Workshop language resources and evaluation conference, p. 9
Chowdhury SA, Abdelali A, Darwish K, Soon-Gyo J, Salminen J, Jansen BJ (2020) Improving arabic text categorization using transformer training diversification. In: Proceedings of the fifth arabic natural language processing workshop, pp. 226–236
Safaya A, Abdullatif M, Yuret D (2020) KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media. Online
Wang H, Zheng H (2013) True positive rate. In: Encyclopedia of systems biology. Springer, New York, pp 2302–2303
Nagoudi EMB, Elmadany A, Abdul-Mageed M, Alhindi T, Cavusoglu H (2020) Machine generation and detection of arabic manipulated and fake news. arXiv preprint arXiv:2011.03092.
Acknowledgements
The authors would like to convey their thanks and appreciation to the “University of Sharjah” for supporting the work through the research group—Machine Learning and Arabic Language Processing.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Informed consent
This study does not involve any experiments on animals.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Nassif, A.B., Elnagar, A., Elgendy, O. et al. Arabic fake news detection based on deep contextualized embedding models. Neural Comput & Applic 34, 16019–16032 (2022). https://doi.org/10.1007/s00521-022-07206-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00521-022-07206-4