A veracity assessment algorithm for classification of healthcare information using feature bag mash-up approach

The Journal of Supercomputing

Abstract

Information spreads rapidly across the web, which makes assessing its credibility challenging. Although the literature describes numerous automated approaches to veracity classification, generating a relevant and rich set of features remains an open need. To address this gap, the authors developed a novel feature mash-up approach that combines stance, pragmatic, and sentiment features. They further propose a veracity assessment algorithm (VAA), built on the newly generated feature bag, which assigns weights to the novel features using linear regression and classifies the veracity of information. In extensive experiments, VAA outperformed other machine learning classifiers, ensemble learning methods, a baseline classifier, and baseline studies from the literature, reaching 91.40% accuracy. When implemented with an incremental learning approach, VAA's accuracy improved to 94.47%. To test the robustness of the algorithm, the experiments were run on two datasets, and VAA outperformed the other algorithms on both. The newly generated feature bags can therefore also be used separately to classify stance, sentiment, and pragmatics in natural language processing problems, and can assist with problems in other research areas such as hate speech and sarcasm detection.
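The abstract does not give the details of VAA, so the following is only a minimal sketch of the general idea it describes: concatenate stance, pragmatic, and sentiment features into one feature bag, learn one weight per feature with linear regression, and threshold the weighted score to obtain a veracity label. Everything named here (`mash_up`, `WeightedFeatureBagClassifier`, the 0.5 threshold, the small ridge term, and the toy feature values) is an assumption made for illustration, not the authors' implementation. The `partial_fit` method shows one simple way to update such a model incrementally, by accumulating the least-squares normal equations batch by batch.

```python
import numpy as np


def mash_up(stance, pragmatic, sentiment):
    """Concatenate the three feature groups column-wise into one feature bag."""
    groups = [np.asarray(g, dtype=float) for g in (stance, pragmatic, sentiment)]
    return np.hstack(groups)


class WeightedFeatureBagClassifier:
    """Hypothetical sketch: linear-regression weights over a feature bag."""

    def __init__(self, n_features, ridge=1e-6, threshold=0.5):
        d = n_features + 1                 # one extra column for the bias term
        self.A = ridge * np.eye(d)         # running X^T X (ridge keeps it invertible)
        self.b = np.zeros(d)               # running X^T y
        self.w = np.zeros(d)               # learned feature weights (last entry: bias)
        self.threshold = threshold

    @staticmethod
    def _with_bias(X):
        X = np.asarray(X, dtype=float)
        return np.hstack([X, np.ones((X.shape[0], 1))])

    def partial_fit(self, X, y):
        """Fold a new batch into the normal equations and re-solve for the weights."""
        Xb = self._with_bias(X)
        self.A += Xb.T @ Xb
        self.b += Xb.T @ np.asarray(y, dtype=float)
        self.w = np.linalg.solve(self.A, self.b)
        return self

    def predict(self, X):
        """Label 1 (assumed to mean 'true') when the weighted score reaches the threshold."""
        scores = self._with_bias(X) @ self.w
        return (scores >= self.threshold).astype(int)


# Toy, made-up feature values for four documents (one column per feature group).
stance = [[1], [1], [0], [0]]
pragmatic = [[0], [1], [0], [1]]
sentiment = [[0.3], [0.9], [0.1], [0.6]]
labels = [1, 1, 0, 0]

X = mash_up(stance, pragmatic, sentiment)
clf = WeightedFeatureBagClassifier(n_features=X.shape[1])
clf.partial_fit(X[:2], labels[:2])   # first batch
clf.partial_fit(X[2:], labels[2:])   # later batch arrives, weights are refreshed
print(clf.w, clf.predict(X))
```

Because the normal equations are additive over batches, fitting in two steps yields the same weights as fitting on all rows at once; the incremental scheme reported in the paper may work differently.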

Data availability

The dataset will be accessible through Kaggle.

Funding

Not applicable.

Author information

Contributions

The authors confirm their contributions to the paper as follows: Shraddha Vaidya and Jatinderkumar Saini were involved in study conception and design; Shraddha Vaidya helped with data collection; Shraddha Vaidya and Jatinderkumar Saini contributed to the analysis and interpretation of results; Shraddha Vaidya and Jatinderkumar Saini were involved in preparing the draft manuscript. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Shraddha Vaidya.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saini, J.R., Vaidya, S. A veracity assessment algorithm for classification of healthcare information using feature bag mash-up approach. J Supercomput 81, 285 (2025). https://doi.org/10.1007/s11227-024-06500-3

  • DOI: https://doi.org/10.1007/s11227-024-06500-3
