A veracity assessment algorithm for classification of healthcare information using feature bag mash-up approach

Saini, Jatinderkumar R.; Vaidya, Shraddha

doi:10.1007/s11227-024-06500-3

Published: 13 December 2024

Volume 81, article number 285, (2025)

The Journal of Supercomputing



Abstract

Given the rampant spread of information on the web, assessing the credibility of online content is challenging. Although the literature describes numerous automated approaches to veracity classification, generating a relevant and rich set of features remains an open need. To address this gap, the authors developed a novel feature mash-up approach that combines stance, pragmatic, and sentiment features. They further propose a veracity assessment algorithm (VAA) built on the newly generated feature bag; it assigns weights to the novel features using linear regression and classifies the veracity of information. Exhaustive experimentation showed that VAA, with 91.40% accuracy, outperformed other machine learning and ensemble learning methods, a baseline classifier, and baseline studies in the literature. When implemented with an incremental learning approach, VAA reached an improved accuracy of 94.47%. To test the robustness of the algorithm, the experiments were performed on two datasets, and VAA outperformed the other algorithms on both. The newly generated feature bags can also be used separately to classify stance, sentiment, and pragmatics in natural language processing problems, and can assist with problems in other research areas such as hate speech and sarcasm detection.
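The abstract does not give the details of VAA, so the following is only a minimal sketch of the general idea it describes: combine stance, pragmatic, and sentiment features into one vector, learn a weight per feature with linear regression, and threshold the weighted score to obtain a veracity label. Everything here is an assumption made for illustration: the function names (`mash_up`, `fit_linear_weights`, `classify`), the use of one numeric score per feature group, the 0.5 threshold, the gradient-descent fit, and the synthetic data. It is not the authors' implementation.

```python
import random


def mash_up(stance, pragmatic, sentiment):
    """Combine three per-document scores into one feature vector (hypothetical layout)."""
    return [stance, pragmatic, sentiment]


def fit_linear_weights(X, y, lr=0.5, epochs=1000):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def classify(x, w, b, threshold=0.5):
    """Label a document 1 (credible) when its weighted score reaches the threshold."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= threshold else 0


# Synthetic data only: scores in [0, 1] and labels from an arbitrary made-up rule.
rng = random.Random(0)
samples = [mash_up(rng.random(), rng.random(), rng.random()) for _ in range(200)]
labels = [1 if 0.6 * s + 0.3 * p + 0.1 * t > 0.5 else 0 for s, p, t in samples]

weights, bias = fit_linear_weights(samples, labels)
predictions = [classify(x, weights, bias) for x in samples]
accuracy = sum(int(p == t) for p, t in zip(predictions, labels)) / len(labels)
print("learned feature weights:", weights)
print("training accuracy on synthetic data:", accuracy)
```

The learned weights indicate how strongly each feature group contributes to the score in this toy setting; the accuracy printed here says nothing about the figures reported in the abstract, which come from the authors' own datasets and feature bag.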

Data availability

The dataset will be made accessible through Kaggle.

Funding

Not applicable.

Author information

Authors and Affiliations

Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed University), Pune, India
Jatinderkumar R. Saini & Shraddha Vaidya

Contributions

The authors confirm their contribution to the paper: Shraddha Vaidya and Jatinderkumar Saini were involved in study conception and design; Shraddha Vaidya helped in data collection; Shraddha Vaidya and Jatinderkumar Saini contributed to analysis and interpretation of results; Shraddha Vaidya and Jatinderkumar Saini were involved in draft manuscript preparation. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Shraddha Vaidya.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saini, J.R., Vaidya, S. A veracity assessment algorithm for classification of healthcare information using feature bag mash-up approach. J Supercomput 81, 285 (2025). https://doi.org/10.1007/s11227-024-06500-3

Accepted: 02 October 2024
Published: 13 December 2024
DOI: https://doi.org/10.1007/s11227-024-06500-3
