Abstract
While the world battles the COVID-19 pandemic and its variants, netizens are combating an infodemic: the proliferation of fake news online. The spread of fake news during this global pandemic has dangerous consequences, and precise automated fake news detection is urgently needed; this need is the driving force behind this study. Intrinsic qualities of news data, namely precision and objectivity, can be studied to assess credibility. However, drawing further knowledge or inferences from these qualities is a challenge. To address this challenge, this work derives features in two categories: intrinsic word-level features (fine-grained) and sentence-level features (coarse-grained). For experimentation, fine-grained features are learned from word vector representations of the news articles. Psycho-linguistic and sentiment word-level features are derived using the Empath library. Coarse-grained (sentence-level) features comprise vectors generated by text summarization and Doc2Vec. Building on the advantages of existing approaches, this paper proposes a new framework, Granularity based Fake news Detection (GRAFED), that explores the fusion of fine- and coarse-grained features. The fused feature set is more powerful than the individual fine- or coarse-grained feature sets because it can capture the complex interdependence of words in a sentence along with their semantics. In exhaustive experiments with traditional classifiers, the hybrid granular feature vector of GRAFED outperformed existing approaches on the publicly available state-of-the-art LIAR dataset. The results show that the hybrid feature set is superior to the individual feature sets and is promising when compared with existing state-of-the-art approaches, as analyzed in the comparative analysis section.
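The fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the GRAFED implementation: the abstract does not state how the feature sets are combined, so plain concatenation is assumed here. The hash-based `word_vector`, the tiny `LEXICON` (a stand-in for Empath categories), and the sentence statistics in `coarse_grained_features` (a stand-in for Doc2Vec and summarization vectors) are all hypothetical placeholders chosen so the example runs with the standard library alone.

```python
import hashlib
import re

# Hypothetical toy lexicon standing in for Empath-style categories (assumption).
LEXICON = {
    "negative_emotion": {"fear", "panic", "deadly", "hoax"},
    "health": {"virus", "vaccine", "cure", "hospital"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_vector(word, dim=4):
    """Deterministic pseudo-embedding from a hash; a stand-in for a trained word vector."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def fine_grained_features(text, dim=4):
    """Word-level features: mean word vector plus lexicon category frequencies."""
    tokens = tokenize(text)
    if not tokens:
        return [0.0] * (dim + len(LEXICON))
    vectors = [word_vector(t, dim) for t in tokens]
    mean = [sum(v[i] for v in vectors) / len(tokens) for i in range(dim)]
    lexical = [sum(t in words for t in tokens) / len(tokens)
               for words in LEXICON.values()]
    return mean + lexical


def coarse_grained_features(text):
    """Sentence-level features: simple statistics standing in for Doc2Vec vectors."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    if not lengths:
        return [0.0, 0.0, 0.0]
    return [float(len(lengths)), sum(lengths) / len(lengths), float(max(lengths))]


def fuse(text):
    """Hybrid feature vector: fine-grained features followed by coarse-grained ones."""
    return fine_grained_features(text) + coarse_grained_features(text)


if __name__ == "__main__":
    print(fuse("The virus is a hoax. Panic spreads fast!"))
```

In a fuller pipeline, the fused vector for each article would be passed to a traditional classifier. The concatenation step would stay the same if the placeholder extractors were replaced with trained embeddings.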






Data availability
The supporting data is uploaded in the supplementary material section.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
Isha Agarwal conceived and designed the analysis; Raj Shah and Viren Kathiriya collected the data; Kalp Panwala contributed to the analysis; performed the analysis; all authors contributed to drafting this manuscript, and all authors reviewed the manuscript.
Ethics declarations
Ethical approval and consent to participate
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Consent for publication
The authors have no competing interests to declare that are relevant to the content of this article.
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Agarwal, I., Rana, D., Panwala, K. et al. Analysis of contextual features’ granularity for fake news detection. Multimed Tools Appl 83, 51835–51851 (2024). https://doi.org/10.1007/s11042-023-17465-5
DOI: https://doi.org/10.1007/s11042-023-17465-5