Abstract
During India's COVID-19 lockdown, which lasted from March to June 2020, most users were active on the Internet and created a number of social networking accounts. A massive amount of information is now disseminated online through these accounts. Some of it is false or fake, taking the form of "government letters or resolutions, religious comments, hate speech, and so on", and it has spread like wildfire. This has contributed to major social problems in areas such as unemployment, politics, healthcare, poverty, and religious cleavages. Given the sheer volume of data containing this kind of information, manual detection of fake news or false information is challenging, and automatic detection needs immediate attention. With this motivation, we present a novel 'ConFake' algorithm, which uses a set of eighty content-based features to identify fake news. In the experiments, content-based features and word vector features were extracted from the textual content of news stories, combined, and fed into machine learning classifiers. To validate the findings, we ran all experiments on five publicly available datasets (Kaggle, McIntire, Reuter, BuzzFeed, and PolitiFact) and on one synthetically generated ConFake dataset that combines those five. The proposed model achieved an accuracy of 97.31%, the highest among the state-of-the-art models it was compared against.
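The combination of content-based features with word vector features described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the ConFake implementation: the abstract does not list the eighty features, so the four content-based features here (word count, exclamation marks, share of all-caps words, mean word length), the bag-of-words counts standing in for word vectors, and all function names are assumptions made for the example.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z']+")


def content_features(text):
    """Return four illustrative content-based features (hypothetical choices)."""
    words = WORD_RE.findall(text)
    n_words = len(words)
    n_upper = sum(1 for w in words if len(w) > 1 and w.isupper())
    return [
        float(n_words),                                            # word count
        float(text.count("!")),                                    # exclamation marks
        n_upper / n_words if n_words else 0.0,                     # share of all-caps words
        sum(len(w) for w in words) / n_words if n_words else 0.0,  # mean word length
    ]


def build_vocabulary(texts):
    """Map every lower-cased word seen in the corpus to a column index."""
    vocab = sorted({w.lower() for t in texts for w in WORD_RE.findall(t)})
    return {w: i for i, w in enumerate(vocab)}


def word_vector(text, vocab):
    """Bag-of-words counts, used here as a simple stand-in for word vector features."""
    counts = Counter(w.lower() for w in WORD_RE.findall(text))
    vec = [0.0] * len(vocab)
    for word, count in counts.items():
        if word in vocab:
            vec[vocab[word]] = float(count)
    return vec


def combined_features(text, vocab):
    """Concatenate both feature groups into one row for a classifier."""
    return content_features(text) + word_vector(text, vocab)


texts = ["SHOCKING news today!!", "The council met today"]
vocab = build_vocabulary(texts)
rows = [combined_features(t, vocab) for t in texts]
```

Each row in `rows` could then be passed, together with a real/fake label, to any standard machine learning classifier.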



Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Notes
Source: https://www.truthorfiction.com/u-s-military-dogs-being-evacuated-from-afghanistan/
Source: https://www.truthorfiction.com/no-a-study-didnt-find-that-the-most-highly-educated-americans-are-also-the-most-vaccine-hesitant/
Funding
The authors did not receive support from any organization for the submitted work.
Ethics declarations
Conflicts of interest
The authors declare they have no financial interests.
About this article
Cite this article
Jain, M.K., Gopalani, D. & Meena, Y.K. ConFake: fake news identification using content based features. Multimed Tools Appl 83, 8729–8755 (2024). https://doi.org/10.1007/s11042-023-15792-1
DOI: https://doi.org/10.1007/s11042-023-15792-1