Abstract
Heart disease is a complex condition that affects many people around the world. Largely because of unhealthy lifestyles, it is the most common cause of death worldwide. Machine learning plays an important role in medical treatment, and the goal of this research is to develop a machine learning model that helps diagnose heart disease quickly and accurately. We propose a robust ensemble model that combines the three best-performing classifiers, namely Random Forest, XGBoost and Gradient Boosting Machine, and merges their outputs with an ensemble voting method to improve the prediction of heart disease. We use a combined heart disease dataset built from five datasets (Hungary, Statlog, Switzerland, VA Long Beach and Cleveland). Feature selection algorithms (Pearson Correlation, Univariate Feature Selection, Recursive Feature Elimination, Boruta Feature Selection, Random Forest, and LightGBM) rank the features, and the most relevant ones are selected to improve classification accuracy. The proposed ensemble model is built on seven highly relevant features, and individual machine learning algorithms and ensemble learning techniques are compared on the same selected features. The model is evaluated using accuracy, sensitivity, precision, F1-score, Matthews correlation coefficient (MCC), negative predictive value (NPV) and area under the ROC curve (AUC). The ensemble model achieves a classification accuracy, sensitivity, and precision of 96.17%, 98.37%, and 94.53%, respectively. It performs better than existing models and individual classifiers. The results show that the proposed ensemble method can effectively predict the risk of heart disease.
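To make the voting step and the reported metrics concrete, the following is a minimal sketch in plain Python. It assumes hard (majority) voting over binary labels, which the abstract does not specify, and it uses small made-up prediction lists rather than the study's data or trained models. The names `majority_vote`, `binary_metrics`, `rf_pred`, `xgb_pred` and `gbm_pred` are hypothetical. AUC is left out because it requires predicted probabilities rather than hard labels.

```python
from collections import Counter
from math import sqrt


def majority_vote(*model_predictions):
    """Hard voting: for each sample, return the label predicted by most models.

    With three voters and binary labels a tie cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for labels 1 (disease) and 0 (healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        "npv": ratio(tn, tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Illustrative labels only (assumption): eight patients, three classifiers.
y_true   = [1, 1, 1, 1, 0, 0, 0, 0]
rf_pred  = [1, 1, 1, 0, 0, 0, 0, 1]  # stand-in for Random Forest output
xgb_pred = [1, 1, 0, 1, 0, 0, 1, 1]  # stand-in for XGBoost output
gbm_pred = [1, 0, 1, 1, 0, 1, 0, 0]  # stand-in for Gradient Boosting output

ensemble = majority_vote(rf_pred, xgb_pred, gbm_pred)
metrics = binary_metrics(y_true, ensemble)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case each classifier makes two or three errors, but they mostly fall on different patients, so the vote corrects all but one of them. That is the intuition behind combining classifiers, although the gain on real data depends on how correlated their errors are.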
Funding
This work was supported in part by the National Key Research and Development Program of China under Grant 2019YFA0706400 and Grant 2019YFA0706402, and in part by the Pre-Research Funds for Equipment of China under Grant 61409220115.
Author information
Contributions
Sohaib Asif: Data curation, Methodology, Validation, Writing – original draft. Wenhui Yi: Conceptualization, Investigation, Supervision, Writing – review & editing. Jin Hou and Jinhai Si: Formal analysis, Validation, Investigation. Qurrat ul Ain and Yueyang Yi: Analysis, Validation, Writing – review & editing. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Informed consent
For this type of study, formal consent is not required.
Experiments involving human and/or animal participants
This paper does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Asif, S., Wenhui, Y., ul Ain, Q. et al. Improving the accuracy of diagnosing and predicting coronary heart disease using ensemble method and feature selection techniques. Cluster Comput 27, 1927–1946 (2024). https://doi.org/10.1007/s10586-023-04062-2
DOI: https://doi.org/10.1007/s10586-023-04062-2