Analysis of contextual features’ granularity for fake news detection

Multimedia Tools and Applications

Abstract

While the world battles the COVID-19 pandemic and its variants, netizens are combating an infodemic: the proliferation of fake news online. The spread of fake news during this global pandemic has dangerous consequences, so precise automated fake news detection is the need of the hour. This is the driving force behind this study. Intrinsic qualities of news data, namely precision and objectivity, can be studied to assess credibility. However, deriving further knowledge or inference from them is a challenge. To address this challenge, this work proposes dividing features into two categories: intrinsic word-level features (fine-grained) and sentence-level features (coarse-grained). For experimentation, fine-grained features are learned from word vector representations of the news articles. Psycho-linguistic and sentiment word-level features are derived using the Empath library. Coarse-grained (sentence-level) features comprise vectors generated by text summarization and Doc2Vec. Building on the advantages of existing approaches, this paper proposes a new framework, Granularity based Fake news Detection (GRAFED), that explores the fusion of fine- and coarse-grained features. The fused feature set is more powerful than the individual fine- or coarse-grained feature sets because it can capture the complex interdependence of words in a sentence along with the semantics. In exhaustive experiments using traditional classifiers, the hybrid granular feature vector of GRAFED outperformed existing approaches on the publicly available, state-of-the-art LIAR dataset. The results show that the hybrid feature set is superior to either individual feature set and compares favorably with existing state-of-the-art approaches, as analyzed in the comparative analysis section.
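The fusion idea in the abstract can be illustrated with a minimal sketch of feature-level fusion by concatenation. This is not the GRAFED implementation: the tiny lexicon below is a hypothetical stand-in for Empath-style categories, the sentence statistics are a hypothetical stand-in for summarization and Doc2Vec vectors, and all function names are invented for illustration.

```python
import re
from typing import Dict, List, Set

# Hypothetical toy lexicon standing in for Empath-style categories (assumption).
TOY_LEXICON: Dict[str, Set[str]] = {
    "negative_emotion": {"fear", "panic", "deadly", "hoax"},
    "certainty": {"always", "never", "proven", "definitely"},
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fine_grained_features(text: str) -> List[float]:
    """Word-level features: share of tokens that fall in each lexicon category.

    Categories are visited in sorted name order so the vector layout is stable.
    """
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [
        sum(token in words for token in tokens) / total
        for _, words in sorted(TOY_LEXICON.items())
    ]


def coarse_grained_features(text: str) -> List[float]:
    """Sentence-level features: sentence count and mean sentence length.

    A stand-in (assumption) for sentence or document embeddings.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    count = len(sentences)
    mean_length = sum(lengths) / count if count else 0.0
    return [float(count), mean_length]


def fuse(text: str) -> List[float]:
    """Hybrid vector: fine-grained features followed by coarse-grained ones."""
    return fine_grained_features(text) + coarse_grained_features(text)
```

Under these assumptions, the fused vector could be passed to any traditional classifier; concatenation keeps both granularities visible to the model without committing to a particular learner.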

Data availability

The supporting data is uploaded in the supplementary material section.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

Isha Agarwal conceived and designed the analysis; Raj Shah and Viren Kathiriya collected the data; Kalp Panwala contributed to and performed the analysis. All authors contributed to drafting the manuscript and reviewed it.

Corresponding author

Correspondence to Isha Agarwal.

Ethics declarations

Ethical approval and consent to participate

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Consent for publication

The authors have no competing interests to declare that are relevant to the content of this article.

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Agarwal, I., Rana, D., Panwala, K. et al. Analysis of contextual features’ granularity for fake news detection. Multimed Tools Appl 83, 51835–51851 (2024). https://doi.org/10.1007/s11042-023-17465-5

  • DOI: https://doi.org/10.1007/s11042-023-17465-5
