
ConFake: fake news identification using content based features

Multimedia Tools and Applications

Abstract

During India’s COVID-19 lockdown, which lasted from March to June 2020, the majority of users were active on the Internet and created a number of social networking accounts. A massive amount of information is now disseminated online through these accounts. Some of it is false or fake information, in the form of “government letters or resolutions, religious comments, hate speech, and so on”, and it has spread like wildfire. The result is major social problems in areas such as unemployment, politics, healthcare, poverty, and religious cleavages. Because the datasets comprising this kind of information are so vast, manual detection of fake news or false information is challenging, and automatic detection requires immediate attention. With this motivation, we present a novel ‘ConFake’ algorithm, which uses a set of eighty content-based features to identify fake news. In the experiments, content-based features and word vector features were extracted from the textual content of news stories, combined, and fed into machine learning classifiers. To validate the findings, we ran all experiments on five publicly available datasets, namely Kaggle, McIntire, Reuter, BuzzFeed, and PolitiFact, and on one synthetically generated ConFake dataset that combines these five. Compared with other state-of-the-art models, the proposed model achieved the highest accuracy, 97.31%.
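The pipeline outlined in the abstract (content-based features and word vector features, combined and passed to a classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the four hand-crafted features are not the paper's eighty, a plain term-count vector stands in for the word vector features, and a small nearest-centroid classifier stands in for the machine learning classifiers, none of which the abstract names. The toy texts and labels are invented.

```python
import math
import re
from collections import Counter


def content_features(text):
    """Return four hand-crafted content-based features (illustrative, not the paper's set)."""
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    n_chars = len(text)
    return [
        float(n_words),                                                # word count
        sum(len(w) for w in words) / n_words if n_words else 0.0,      # mean word length
        float(text.count("!")),                                        # exclamation marks
        sum(c.isupper() for c in text) / n_chars if n_chars else 0.0,  # uppercase ratio
    ]


def build_vocabulary(texts):
    """Sorted vocabulary of lowercase tokens seen in the given texts."""
    return sorted({w for t in texts for w in re.findall(r"[a-z']+", t.lower())})


def word_vector(text, vocab):
    """Term-count vector over vocab (a simple stand-in for word vector features)."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [float(counts[w]) for w in vocab]


def combined_features(text, vocab):
    """Concatenate content-based and word vector features into one vector."""
    return content_features(text) + word_vector(text, vocab)


class NearestCentroid:
    """Toy classifier: predict the label whose mean feature vector is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Column-wise mean of all training vectors with this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda lab: math.dist(x, self.centroids[lab]))
            for x in X
        ]


# Hypothetical toy data, invented for this sketch.
train_texts = [
    "SHOCKING!!! Miracle cure BANNED by doctors!",
    "You WON'T believe this secret government letter!!!",
    "The ministry published its quarterly employment report on Monday.",
    "Officials confirmed the schedule for the vaccination programme.",
]
train_labels = ["fake", "fake", "real", "real"]

vocab = build_vocabulary(train_texts)
X_train = [combined_features(t, vocab) for t in train_texts]
model = NearestCentroid().fit(X_train, train_labels)

predictions = model.predict([combined_features("BREAKING!!! Secret cure revealed!", vocab)])
print(predictions)
```

In practice the raw counts would usually be scaled before being combined, since a feature such as word count can otherwise dominate the distance. That step is left out here to keep the sketch short.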

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Notes

  1. Source: https://www.truthorfiction.com/u-s-military-dogs-being-evacuated-from-afghanistan/

  2. Source: https://www.truthorfiction.com/no-a-study-didnt-find-that-the-most-highly-educated-americans-are-also-the-most-vaccine-hesitant/

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Corresponding author

Correspondence to Mayank Kumar Jain.

Ethics declarations

Conflicts of interest

The authors declare they have no financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jain, M.K., Gopalani, D. & Meena, Y.K. ConFake: fake news identification using content based features. Multimed Tools Appl 83, 8729–8755 (2024). https://doi.org/10.1007/s11042-023-15792-1

  • DOI: https://doi.org/10.1007/s11042-023-15792-1
