Abstract
Social media platforms and virtual networking hubs such as Twitter and Facebook have become an integral part of daily life. The recent boom in multimedia technology and wider internet access have produced a hyper-connected world. However, these networks often serve as conduits for fake news, which can seriously harm a healthy social environment and disrupt harmony among users. This calls for a tool that classifies news articles as real or fake. Much research has addressed this problem, including approaches based on Artificial Intelligence (AI). In this work, we propose a deep learning based hybrid framework that combines Word2Vec embeddings and LSTM for fake news detection. We first generate Word2Vec embeddings to obtain vector representations of the news excerpts; these embeddings yield context-free, data-agnostic feature vectors for the news articles. Stacked LSTM layers then process the feature vectors to extract topic-relevant salient features. Two fully connected dense layers follow and classify the news excerpt under consideration as real or fake. We also tune hyperparameters to improve the model's performance. The proposed model is context-free and independent of datasets and topics. We compare its performance with traditional Machine Learning baseline models, deep learning models, the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model used via transfer learning, and recently proposed state-of-the-art models. All models are trained and tested on four datasets from different domains. In extensive experiments, the proposed technique outperforms the other well-known methods on various performance metrics.
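The pipeline described in the abstract (word vectors, then stacked LSTM layers, then two dense layers ending in a real/fake score) can be sketched as a forward pass in plain Python. This is only an illustrative sketch, not the authors' implementation: the embedding table holds random vectors as a stand-in for trained Word2Vec vectors, all weights are untrained, and the class names (`LSTMLayer`, `FakeNewsClassifier`), the choice of two LSTM layers, the layer sizes, and the ReLU/sigmoid activations are assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dense(weights, bias, vector, activation):
    return [activation(s + b) for s, b in zip(matvec(weights, vector), bias)]


class LSTMLayer:
    """Forward pass of one LSTM layer with randomly initialised (untrained) weights."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        self.w = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for k in self.GATES}
        self.b = {k: [0.0] * hidden_dim for k in self.GATES}

    def forward(self, sequence):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        outputs = []
        for x in sequence:
            z = x + h  # list concatenation of input and previous hidden state
            i = dense(self.w["input"], self.b["input"], z, sigmoid)
            f = dense(self.w["forget"], self.b["forget"], z, sigmoid)
            o = dense(self.w["output"], self.b["output"], z, sigmoid)
            g = dense(self.w["candidate"], self.b["candidate"], z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)
        return outputs  # one hidden vector per time step


class FakeNewsClassifier:
    """Hypothetical embedding -> 2 stacked LSTM layers -> 2 dense layers pipeline."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, dense_dim=4, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        # ASSUMPTION: random vectors stand in for trained Word2Vec embeddings.
        self.embedding = {w: [rng.uniform(-1, 1) for _ in range(embed_dim)] for w in sorted(vocab)}
        self.lstm1 = LSTMLayer(embed_dim, hidden_dim, rng)
        self.lstm2 = LSTMLayer(hidden_dim, hidden_dim, rng)
        self.w1, self.b1 = rand_matrix(dense_dim, hidden_dim, rng), [0.0] * dense_dim
        self.w2, self.b2 = rand_matrix(1, dense_dim, rng), [0.0]

    def predict_proba(self, text):
        unknown = [0.0] * self.embed_dim  # zero vector for out-of-vocabulary tokens
        tokens = text.lower().split()
        sequence = [self.embedding.get(t, unknown) for t in tokens] or [unknown]
        # The second LSTM layer consumes the full output sequence of the first.
        top = self.lstm2.forward(self.lstm1.forward(sequence))
        hidden = dense(self.w1, self.b1, top[-1], relu)  # first dense layer
        return dense(self.w2, self.b2, hidden, sigmoid)[0]  # second dense layer: score in (0, 1)


if __name__ == "__main__":
    model = FakeNewsClassifier(["breaking", "news", "story"])
    print(model.predict_proba("Breaking news story"))
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how data flows through the stages. In practice the embeddings would come from a trained Word2Vec model and the network would be trained with a deep learning framework.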
Data Availability
In this manuscript, we have used publicly available data and performed analysis on those data for our study. We have cited all such datasets used in the paper.
Funding
The authors declare that they have not received any funding.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Mallik, A., Kumar, S. Word2Vec and LSTM based deep learning technique for context-free fake news detection. Multimed Tools Appl 83, 919–940 (2024). https://doi.org/10.1007/s11042-023-15364-3
DOI: https://doi.org/10.1007/s11042-023-15364-3