Abstract
The onset of common symptoms, such as cough, fever, and loss of smell and taste, is often the starting point of a battle against the coronavirus. The RT-PCR test has become the standard method for confirming COVID-19 infection, but it is an inconvenient solution for both patients and medical staff because of its high cost, long turnaround time, and false-negative results. This has raised the need for reliable automatic detection systems that support early prediction of COVID-19 infection at a lower cost. In this work, we take advantage of advances in Machine Learning (ML) to provide a reliable and low-cost COVID-19 prediction system. The system relies on the starting point of the disease, namely the patients’ clinical symptoms, which remain under-explored. We developed seven predictive models with traditional ML classification algorithms, using a public dataset of evident high-risk factors drawn from patients’ clinical signs. The dataset first underwent a pre-processing phase consisting of feature engineering and resampling to address class imbalance. Our best classification model identifies true positives and true negatives while limiting false positives and false negatives, with an accuracy of 93%.





Data Availability Statement
Publicly available datasets were used in this study. The executable, source code and data are available at: https://github.com/boulbaba1981/COVID-19.
Notes
Coronavirus Disease 2019 Clinical Data Repository. Accessed from https://covidclinicaldata.org/.
Where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
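As an illustration of how these four counts combine, the sketch below computes accuracy with the standard definition (TP + TN) / (TP + TN + FP + FN). It assumes this is the metric the note refers to, since the abstract reports accuracy. The function name `accuracy` and the counts in the example are hypothetical and are not results from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts, chosen only for illustration (not taken from the study).
example = accuracy(tp=40, tn=50, fp=6, fn=4)
print(f"accuracy = {example:.2f}")
```

On an imbalanced dataset, accuracy alone can hide poor performance on the minority class, which is one reason to keep the four counts separate rather than reporting a single figure.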
Funding
This work was supported by the Ministry of Higher Education and Scientific Research in Tunisia (MoHESR) as part of the Federated Research Project PRFCOV19-D1-P1.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Considerations and Consent to Participate
In this study, we used a publicly available dataset of clinical characteristics of patients who have taken a COVID-19 test (https://covidclinicaldata.org/).
Consent for Publication
All authors have given their consent for publication.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ben Ammar, B., Salem, A., Ben Said, M. et al. Machine Learning Models for Early Prediction of COVID-19 Infections Based on Clinical Signs. SN COMPUT. SCI. 5, 158 (2024). https://doi.org/10.1007/s42979-023-02489-3
DOI: https://doi.org/10.1007/s42979-023-02489-3