Arabic fake news detection based on deep contextualized embedding models

  • Original Article
  • Neural Computing and Applications

Abstract

Social media has become a source of news for many people because of its ease and freedom of use. As a result, fake news has spread quickly and easily, especially over the last decade. Publishers of fake news exploit critical situations such as the COVID-19 pandemic and the American presidential elections to affect societies negatively. Fake news can seriously harm society in many fields, including politics, finance, and sports. Many studies have addressed fake news detection in English, but research on fake news detection in Arabic is scarce. Our contribution is twofold. First, we have constructed a large and diverse Arabic fake news dataset. Second, we have developed and evaluated transformer-based classifiers that identify fake news using eight state-of-the-art Arabic contextualized embedding models. Most of these models had not previously been used for Arabic fake news detection. We conduct a thorough analysis of these models and compare them with similar fake news detection systems. Experimental results confirm that these state-of-the-art models are robust, with accuracy exceeding 98%.
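
To make the reported accuracy figure concrete, the following is a minimal sketch of how accuracy, precision, recall (true positive rate), and F1 can be computed for a binary fake/real classifier. It is not the authors' evaluation code: the function name `binary_metrics` is hypothetical, the choice of label 1 for "fake" is an assumption, and the labels and predictions are invented toy values rather than results from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1, treating label 1 as "fake".

    Hypothetical helper for illustration; label 1 = fake is an assumption.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake, predicted fake
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real, predicted real
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real, predicted fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake, predicted real

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (1 = fake, 0 = real); illustrative only, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can be misleading when the fake and real classes are imbalanced, which is why the sketch also reports precision, recall, and F1.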

Acknowledgements

The authors would like to thank the University of Sharjah for supporting this work through the Machine Learning and Arabic Language Processing research group.

Author information

Corresponding author

Correspondence to Ali Bou Nassif.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

This study does not involve any experiments on animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nassif, A.B., Elnagar, A., Elgendy, O. et al. Arabic fake news detection based on deep contextualized embedding models. Neural Comput & Applic 34, 16019–16032 (2022). https://doi.org/10.1007/s00521-022-07206-4

  • DOI: https://doi.org/10.1007/s00521-022-07206-4
