Abstract
The dissemination of fake news on social media platforms is a matter of considerable concern, as it can be used to misinform or mislead people, which is particularly troubling in the context of political events. The 2019 Hong Kong protests triggered an outburst of fake news posts on Twitter, which were identified, promptly removed, and compiled into datasets to promote research. In previous work, these datasets, which focus on linguistic content, were used to classify tweets as spreading fake or real news using traditional machine learning algorithms (Zervopoulos et al., in: IFIP international conference on artificial intelligence applications and innovations, Springer, Berlin, 2020). In this paper, experimentation on the previously constructed dataset is extended with deep learning algorithms and a diverse set of input features, ranging from raw text to handcrafted features. Experiments showed that the deep learning algorithms outperformed the traditional approaches, reaching F1 scores as high as 99.3%, with the multilingual state-of-the-art model XLM-RoBERTa outperforming the other algorithms when using raw untranslated text. Combining traditional and deep learning algorithms allows for increased performance through the latter, while the interpretability of the former provides insight into tweet structure.
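For readers unfamiliar with the metric reported above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `f1_score`, the toy labels, and the choice of "fake" (label 1) as the positive class are assumptions made purely for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score of the `positive` class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration: 1 = fake news, 0 = real news (assumed encoding).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

On these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75. The score is low only when either precision or recall is low, which makes it a stricter summary than accuracy when the two classes are imbalanced.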
References
Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M et al (2016) Tensorflow: a system for large-scale machine learning. In: 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265–283
Ahmed H, Traore I, Saad S (2018) Detecting opinion spams and fake news using text classification. Secur Priv 1(1):e9
Amara A, Taieb MAH, Aouicha MB (2021) Multilingual topic modeling for tracking covid-19 trends based on facebook data analysis. Appl Intell 1–22
Bajaj S (2017) The pope has a new baby! fake news detection using deep learning. Technical report, Stanford University
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Cao J, Sheng Q, Qi P, Zhong L, Wang Y, Zhang X (2019) False news detection on social media. arXiv preprint arXiv:1908.10818
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357
Chu SKW, Xie R, Wang Y (2020) Cross-language fake news detection. Data Inf Manag 5(1):100–109
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, Grave E, Ott M, Zettlemoyer L, Stoyanov V (2019) Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, Grave E, Ott M, Zettlemoyer L, Stoyanov V (2020) Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116
Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Fang Y, Gao J, Huang C, Peng H, Wu R (2019) Self multi-head attention-based convolutional neural networks for fake news detection. PLoS ONE 14(9):1–13. https://doi.org/10.1371/journal.pone.0222713
Faustini PHA, Covões TF (2020) Fake news detection in multiple platforms and languages. Expert Syst Appl 158:113503
Hamdi T, Slimi H, Bounhas I, Slimani Y (2020) A hybrid approach for fake news detection in twitter based on user features and graph embedding. In: International conference on distributed computing and internet technology. Springer, Berlin, pp 266–280
Helmstetter S, Paulheim H (2018) Weakly supervised learning for fake news detection on twitter. In: 2018 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 274–277
Jónsson E, Stolee J (2015) An evaluation of topic modelling techniques for twitter. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (short papers), pp 489–494
Kaliyar RK, Goswami A, Narang P, Sinha S (2020) Fndnet—a deep convolutional neural network for fake news detection. Cogn Syst Res 61:32–44. https://doi.org/10.1016/j.cogsys.2019.12.005
Khan JY, Khondaker MTI, Iqbal A, Afroz S (2019) A benchmark study on machine learning methods for fake news detection. arXiv:1905.04749
Long Y, Lu Q, Xiang R, Li M, Huang CR (2017) Fake news detection through multi-perspective speaker profiles. In: Proceedings of the eighth international joint conference on natural language processing (volume 2: short papers), pp 252–256
Nikiforos MN, Vergis S, Stylidou A, Augoustis N, Kermanidis KL, Maragoudakis M (2020) Fake news detection regarding the Hong Kong events from tweets. In: IFIP international conference on artificial intelligence applications and innovations. Springer, Berlin, pp 177–186
Oshikawa R, Qian J, Wang WY (2018) A survey on natural language processing for fake news detection. arXiv preprint arXiv:1811.00770
Parmelee JH, Bichard SL (2011) Politics and the Twitter revolution: how tweets influence the relationship between political leaders and the public. Lexington Books
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830
Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. In: Empirical methods in natural language processing (EMNLP), pp 1532–1543
Purbrick M (2019) A report of the 2019 Hong Kong protests. Asian Aff 50(4):465–487
Shu K, Sliva A, Wang S, Tang J, Liu H (2017) Fake news detection on social media: a data mining perspective. SIGKDD Explor Newsl 19(1):22–36
Singhal S, Shah RR, Chakraborty T, Kumaraguru P, Satoh S (2019) Spotfake: a multi-modal framework for fake news detection. In: 2019 IEEE Fifth international conference on multimedia big data (BigMM). IEEE, pp 39–47
Vosoughi S, Roy D, Aral S (2018) The spread of true and false news online. Science 359(6380):1146–1151. https://science.sciencemag.org/content/359/6380/1146.full.pdf
Wang WY (2017) “liar, liar pants on fire”: a new benchmark dataset for fake news detection. arXiv preprint arXiv:1705.00648
Wang Y, Ma F, Jin Z, Yuan Y, Xun G, Jha K, Su L, Gao J (2018) Eann: event adversarial neural networks for multi-modal fake news detection. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. ACM, pp 849–857
Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M et al (2019) Huggingface’s transformers: state-of-the-art natural language processing. arXiv preprint arXiv:1910.03771
Yang Y, Zheng L, Zhang J, Cui Q, Li Z, Yu PS (2018) TI-CNN: Convolutional neural networks for fake news detection. arXiv:1806.00749
Zervopoulos A, Alvanou AG, Bezas K, Papamichail A, Maragoudakis M, Kermanidis K (2020) Hong Kong protests: using natural language processing for fake news detection on twitter. In: IFIP international conference on artificial intelligence applications and innovations. Springer, Berlin, pp 408–419
Zhou X, Zafarani R (2020) A survey of fake news: fundamental theories, detection methods, and opportunities. ACM Comput Surv (CSUR) 53(5):1–40
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zervopoulos, A., Alvanou, A.G., Bezas, K. et al. Deep learning for fake news detection on Twitter regarding the 2019 Hong Kong protests. Neural Comput & Applic 34, 969–982 (2022). https://doi.org/10.1007/s00521-021-06230-0
DOI: https://doi.org/10.1007/s00521-021-06230-0