Risk assessment of coronary heart disease based on cloud-random forest

Artificial Intelligence Review

Abstract

Coronary heart disease (CHD) is a major public health problem affecting a nation’s economic and social development. Assessing CHD risk in a timely manner helps to stop, reverse, and reduce the spread of many chronic diseases and health hazards. This paper proposes a cloud-random forest (C-RF) model, which combines the cloud model with random forest, to assess the risk of CHD. In this model, building on the traditional classification and regression tree (CART), a weight-determining algorithm based on the cloud model and the decision-making trial and evaluation laboratory (DEMATEL) method is applied to obtain the weights of the evaluation attributes. For each attribute, the attribute weight and the gain value corresponding to its smallest Gini coefficient are weighted and summed, and the weighted sum replaces the original gain value. This rule serves as a new CART node-splitting criterion for constructing new decision trees, which together form a new random forest, namely the C-RF. The Framingham dataset from the Kaggle platform is used as the research sample for the empirical analysis. The C-RF model is compared with CART, support vector machine (SVM), convolutional neural network (CNN), and random forest (RF) using standard performance evaluation indexes such as accuracy, error rate, ROC curve, and AUC value. The results show that the classification accuracy of the C-RF model is 85%, which is 8%, 9%, 4%, and 3% higher than that of CART, SVM, CNN, and RF, respectively. The type I error rate is 13.99%, which is 6.99%, 7.44%, 4.47%, and 3.02% lower than that of CART, SVM, CNN, and RF, respectively. The AUC value is 0.85, which is also higher than that of the other comparison models. Thus, the C-RF model shows superior classification performance in the risk assessment of CHD.
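
The weighted split criterion described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the weighted sum is formed, so the mixing coefficient `alpha`, the function names, the toy data, and the fixed attribute weights below are all assumptions made for illustration; in the paper the weights come from the cloud model and DEMATEL procedure, which is not reproduced here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_gain(values, labels):
    """Largest impurity reduction over all binary thresholds on one attribute.

    The largest reduction corresponds to the split with the smallest
    weighted Gini impurity of the two child nodes.
    """
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Skip the largest value so the right-hand child is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def choose_split_attribute(columns, labels, weights, alpha=0.5):
    """Pick the attribute with the highest weighted score.

    ASSUMPTION: score = alpha * gain + (1 - alpha) * weight. The exact
    combination used by the C-RF authors is not given in the abstract.
    """
    scores = {
        name: alpha * best_gini_gain(col, labels) + (1 - alpha) * weights[name]
        for name, col in columns.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two attributes, four patients, binary CHD label.
columns = {"age": [40, 50, 60, 70], "sbp": [120, 150, 125, 155]}
labels = [0, 0, 1, 1]
# Hypothetical attribute weights standing in for the cloud-model/DEMATEL output.
weights = {"age": 0.2, "sbp": 0.8}

plain_choice, _ = choose_split_attribute(columns, labels, weights, alpha=1.0)
weighted_choice, _ = choose_split_attribute(columns, labels, weights, alpha=0.5)
print(plain_choice, weighted_choice)
```

With `alpha=1.0` the rule reduces to the ordinary CART choice based on Gini gain alone, while smaller values of `alpha` let the attribute weights shift which attribute is selected at a node.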

Data availability

The authors confirm that the data supporting the findings of this study are available within the article.

Acknowledgements

We would like to thank the editors and the anonymous reviewers for their helpful comments.

Funding

This work is supported by the National Natural Science Foundation of China (No. 72071150, 71671135, 71871174).

Author information

Contributions

JW: software, writing—original draft. CR: conceptualization, methodology, data curation. MG: formal analysis, supervision, writing—review & editing. XX: visualization, investigation.

Corresponding author

Correspondence to Congjun Rao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethical approval and consent to participate

No ethical approval or patient consent to participate is required for this study.

Consent for publication

The authors confirm that the final version of the manuscript has been reviewed, approved, and consented for publication by all authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, J., Rao, C., Goh, M. et al. Risk assessment of coronary heart disease based on cloud-random forest. Artif Intell Rev 56, 203–232 (2023). https://doi.org/10.1007/s10462-022-10170-z

  • DOI: https://doi.org/10.1007/s10462-022-10170-z
