Abstract
Clinical decision support systems have long assisted physicians in diagnosing diseases. Coronary artery disease (CAD) is responsible for a large percentage of deaths, which has motivated researchers to propose more accurate prediction models. This paper employs neural networks (NN) and a boosted C5.0 decision tree model to predict CAD on the well-known Cleveland Heart Disease dataset. We tune the size and configuration of the neural networks and identify the insensitive features in both models, then assess the effect of eliminating those features on the results. Both models are evaluated through ten experiments, each of which uses different training and testing datasets of the same size. The most and the least important input features in each model are determined. The performance of the models on the reduced dataset, i.e., the dataset with the insignificant features removed, is evaluated through statistical tests. Our results show no significant difference between the NN and C5.0 algorithms on the initial dataset in terms of three performance criteria: positive prediction value (PPV), negative prediction value (NPV) and total accuracy value (TAV). Regarding the TAV criterion, the NN applied to the reduced dataset outperforms the C5.0 model at the 95% confidence level. Finally, further discussion shows the trade-off between the NPV and PPV.
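As an illustration of the three performance criteria named in the abstract, the following minimal sketch computes them from the counts of a binary confusion matrix. It assumes the standard definitions (PPV = TP / (TP + FP), NPV = TN / (TN + FN), TAV = overall accuracy); the abstract does not spell these formulas out, and the class name, function names, and example counts are hypothetical, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier; 'positive' means CAD present."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def ppv(c: ConfusionCounts) -> float:
    # Assumed definition: share of positive predictions that are correct.
    return c.tp / (c.tp + c.fp)


def npv(c: ConfusionCounts) -> float:
    # Assumed definition: share of negative predictions that are correct.
    return c.tn / (c.tn + c.fn)


def tav(c: ConfusionCounts) -> float:
    # Assumed definition: share of all predictions that are correct.
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Made-up counts for illustration only, not data from the study.
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
print(ppv(example), npv(example), tav(example))
```

Because PPV depends on false positives and NPV on false negatives, moving a classifier's decision threshold typically raises one at the expense of the other, which is the kind of trade-off the abstract refers to.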
Cite this article
Ahmadi, E., Weckman, G.R. & Masel, D.T. Decision making model to predict presence of coronary artery disease using neural network and C5.0 decision tree. J Ambient Intell Human Comput 9, 999–1011 (2018). https://doi.org/10.1007/s12652-017-0499-z
DOI: https://doi.org/10.1007/s12652-017-0499-z