Abstract
Atherosclerotic cardiovascular disease (ASCVD), which includes coronary heart disease (CHD) and ischemic stroke, is the leading cause of mortality globally. According to the European Society of Cardiology (ESC), 26 million people worldwide have heart disease, with 3.6 million diagnosed each year. Early detection of heart disease can help lower the mortality rate. The key issues in current research on heart disease prediction using artificial intelligence are the lack of diversity in training data and the difficulty of interpreting the findings of complex AI models. To address these issues, this paper proposes cardiac disease prediction using AI algorithms with SelectKBest. The data are standardized with StandardScaler, class-balanced with SMOTE, and reduced to the most informative features with SelectKBest. Machine learning models, namely support vector machine (SVM), K-nearest neighbor (KNN), decision tree (DT), logistic regression (LR), adaptive boosting (AB), naive Bayes (NB), random forest (RF), and extra tree (ET), and deep learning models, namely vanilla, bidirectional, and stacked long short-term memory (LSTM) networks and a deep neural network (DNN), are assessed on the Alizadeh Sani, combined (Cleveland, Hungarian, Switzerland, Long Beach VA, and Statlog), and Pakistan heart failure datasets. In the evaluation, the proposed DNN with SelectKBest gave the most promising heart disease predictions. In tenfold cross-validation experiments, it achieved an unweighted accuracy of 99% on the Alizadeh Sani dataset, 98% on the combined dataset, and 97% on the Pakistan dataset. The suggested approach can be used to diagnose heart disease in its early stages.
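The preprocessing chain named above (standardization, class balancing, feature selection) can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names, the toy data, the value of k, the binary-class simplification in the oversampler, and the use of the one-way ANOVA F statistic as the feature-scoring function are all assumptions made for illustration. The study itself only names the StandardScaler, SMOTE, and SelectKBest techniques.

```python
import math
import random


def standardize(rows):
    """Z-score each column: subtract the column mean, divide by its std."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Population standard deviation; fall back to 1.0 for constant columns.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def smote(rows, labels, minority, k=2, seed=0):
    """SMOTE-style oversampling for two classes (hypothetical helper).

    Each synthetic row lies on the segment between a minority row and one
    of its k nearest minority neighbours. Needs >= 2 minority rows.
    """
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    needed = (len(rows) - len(minority_rows)) - len(minority_rows)
    new_rows = []
    for _ in range(needed):
        base = rng.choice(minority_rows)
        neighbours = sorted((r for r in minority_rows if r is not base),
                            key=lambda r: math.dist(r, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        new_rows.append([b + gap * (o - b) for b, o in zip(base, other)])
    return rows + new_rows, labels + [minority] * needed


def f_scores(rows, labels):
    """One-way ANOVA F statistic of each column against the class labels."""
    classes = sorted(set(labels))
    n = len(rows)
    scores = []
    for col in zip(*rows):
        grand = sum(col) / n
        between = within = 0.0
        for c in classes:
            vals = [v for v, y in zip(col, labels) if y == c]
            mean = sum(vals) / len(vals)
            between += len(vals) * (mean - grand) ** 2
            within += sum((v - mean) ** 2 for v in vals)
        if within == 0:
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append((between / (len(classes) - 1))
                          / (within / (n - len(classes))))
    return scores


def select_k_best(rows, labels, k):
    """Keep the k highest-scoring columns; return (rows, kept indices)."""
    scores = f_scores(rows, labels)
    keep = sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:k])
    return [[r[j] for j in keep] for r in rows], keep


# Toy data (made up): columns 0 and 2 separate the classes, column 1 does not.
X = [[1.0, 5.0, 0.2], [1.2, 3.0, 0.1], [0.9, 4.0, 0.3],
     [1.1, 6.0, 0.2], [1.0, 2.0, 0.1], [0.8, 5.0, 0.3],   # class 0 (majority)
     [3.0, 4.0, 0.9], [3.2, 5.0, 0.8], [2.9, 3.0, 1.0]]   # class 1 (minority)
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_std = standardize(X)
X_bal, y_bal = smote(X_std, y, minority=1)
X_sel, kept = select_k_best(X_bal, y_bal, k=2)
print(len(X_bal), y_bal.count(0), y_bal.count(1), kept)
```

In a cross-validated evaluation, one common design choice is to fit the scaler, the oversampler, and the feature selector on the training folds only, so that no information from the held-out fold reaches the model. Whether the study follows this order is not stated in the abstract.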
Abbreviations
- ASCVD: Atherosclerotic cardiovascular disease
- CHD: Coronary heart disease
- ESC: European Society of Cardiology
- Region RWMA: Regional wall motion abnormality
- HTN: Hypertension
- DM: Diabetes mellitus
- BP: Blood pressure
- EF-TTE: Ejection fraction-transthoracic echocardiography
- FBS: Fasting blood sugar
- K: Potassium
- ESR: Erythrocyte sedimentation rate
- PR: Pulse rate
- Q Wave: Q waves that are both abnormally deep and wide, implying myocardial infarction
- TG: Triglyceride
- Lymph: Lymphocyte
- Neut: Neutrophil
- Poor R progression: Poor R-wave progression
- PLT: Platelet
- CRF: Chronic renal failure
- BUN: Blood urea nitrogen
- CR: Creatine
- Na: Sodium
- BMI: Body mass index
- WBC: White blood cell
- ST slope: The slope of the peak exercise ST segment
- oldpeak: ST depression induced by exercise relative to rest
Author information
Contributions
Jihad Hama prepared the introductory section of the paper and reviewed the initial draft. Mariwan wrote the remainder of the paper and led the practical work of the study, including acquiring, collating, preparing, examining, and evaluating the paper's dataset.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Appendix
Table 10
About this article
Cite this article
Saeed, M.H., Hama, J.I. Cardiac disease prediction using AI algorithms with SelectKBest. Med Biol Eng Comput 61, 3397–3408 (2023). https://doi.org/10.1007/s11517-023-02918-8
DOI: https://doi.org/10.1007/s11517-023-02918-8