Risk assessment of coronary heart disease based on cloud-random forest

Wang, Jing; Rao, Congjun; Goh, Mark; Xiao, Xinping

doi:10.1007/s10462-022-10170-z

Published: 24 March 2022

Volume 56, pages 203–232, (2023)

Artificial Intelligence Review

Jing Wang¹, Congjun Rao¹, Mark Goh² & Xinping Xiao¹

Abstract

Coronary heart disease (CHD) is a major public health problem that affects a nation’s economic and social development. Assessing the risk of CHD in a timely manner helps to stop, reverse, and reduce the spread of many chronic diseases and health hazards. This paper proposes a cloud-random forest (C-RF) model, which combines the cloud model and random forest, to assess the risk of CHD. In this model, starting from the traditional classification and regression tree (CART), a weight-determining algorithm based on the cloud model and the decision-making trial and evaluation laboratory (DEMATEL) method is applied to obtain the weights of the evaluation attributes. For each attribute, the attribute weight and the gain value of the smallest Gini coefficient are weighted and summed, and this weighted sum replaces the original gain value. This rule serves as a new CART node split criterion for constructing new decision trees, which together form a new random forest, namely the C-RF. The Framingham dataset from the Kaggle platform is the research sample for the empirical analysis. The C-RF model is compared with CART, support vector machine (SVM), convolutional neural network (CNN), and random forest (RF) using standard performance evaluation indexes such as accuracy, error rates, the ROC curve, and the AUC value. The results show that the classification accuracy of the C-RF model is 85%, which is 8, 9, 4 and 3% higher than that of CART, SVM, CNN and RF, respectively. The type I error rate is 13.99%, which is 6.99, 7.44, 4.47 and 3.02% lower than that of CART, SVM, CNN and RF, respectively. The AUC value is 0.85, which is also higher than that of the other comparison models. Thus, the C-RF model offers better classification performance in the risk assessment of CHD.
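
The node split rule summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact weighting formula, so the mixing coefficient `alpha`, the helper names, the attribute weights, and the toy data below are all assumptions made for illustration. In the paper the attribute weights come from the cloud model and DEMATEL; here they are simply supplied by hand.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_gain(values, labels):
    """Largest impurity reduction over all binary thresholds on one attribute,
    i.e. the gain of the split with the smallest weighted Gini index."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:  # the largest value cannot split
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def weighted_split_scores(columns, labels, weights, alpha=0.5):
    """Assumed combination rule: alpha * gain + (1 - alpha) * attribute weight.
    The paper's actual weighting scheme may differ."""
    return {
        name: alpha * best_gini_gain(col, labels) + (1 - alpha) * weights[name]
        for name, col in columns.items()
    }


# Toy data (hypothetical, not taken from the Framingham dataset).
columns = {"age": [40, 50, 60, 70], "chol": [200, 240, 200, 240]}
labels = [0, 0, 1, 1]
weights = {"age": 0.6, "chol": 0.4}  # stand-ins for cloud model/DEMATEL weights

scores = weighted_split_scores(columns, labels, weights)
split_attribute = max(scores, key=scores.get)
print(scores, split_attribute)
```

In this sketch the attribute with the highest combined score is chosen for the split, so an attribute with a modest Gini gain can still be preferred if its assigned weight is large. Repeating this choice at every node of every tree would give a forest in the spirit of the C-RF described above.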

Data availability

The authors confirm that the data supporting the findings of this study are available within the article.

Acknowledgements

We would like to thank the editors and the anonymous reviewers for their helpful comments.

Funding

This work is supported by the National Natural Science Foundation of China (Nos. 72071150, 71671135, 71871174).

Author information

Authors and Affiliations

School of Science, Wuhan University of Technology, Wuhan, 430070, People’s Republic of China
Jing Wang, Congjun Rao & Xinping Xiao
NUS Business School and The Logistics Institute-Asia Pacific, National University of Singapore, Singapore, 119623, Singapore
Mark Goh

Contributions

JW: software, writing—original draft. CR: conceptualization, methodology, data curation. MG: formal analysis, supervision, writing—review & editing. XX: visualization, investigation.

Corresponding author

Correspondence to Congjun Rao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethical approval and consent to participate

No ethical approval or patient consent to participate was required for this study.

Consent for publication

The authors confirm that all authors have reviewed and approved the final version of the manuscript and consent to its publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, J., Rao, C., Goh, M. et al. Risk assessment of coronary heart disease based on cloud-random forest. Artif Intell Rev 56, 203–232 (2023). https://doi.org/10.1007/s10462-022-10170-z

Accepted: 11 March 2022
Published: 24 March 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10462-022-10170-z
