Abstract
Machine learning tools are now widely used in medical diagnosis, mainly because classification and diagnosis systems built to assist medical professionals during the diagnostic phase have produced effective results. The primary objective of this study is to improve classification accuracy in medical diagnosis problems. To this end, experiments were carried out on three datasets: heart disease, Parkinson’s disease (PD) and BUPA liver disorders. A key feature of these datasets is that they are not linearly separable. A new method, k-medoids clustering-based attribute weighting (kmAW), is proposed as a data preprocessing step, and a support vector machine (SVM) is used in the classification phase. Performance was evaluated using classification accuracy, sensitivity, specificity, F-measure, the kappa statistic and ROC analysis. Experimental results showed that the resulting hybrid system, kmAW + SVM, gave better results than other methods described in the literature. Consequently, this hybrid intelligent system can serve as a useful medical decision support tool.
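The threshold-based evaluation measures named in the abstract (accuracy, sensitivity, specificity, F-measure and the kappa statistic) can all be derived from a two-class confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the study's own evaluation code. The names `BinaryConfusion` and `evaluate` and the example counts are hypothetical and are not results from the paper. ROC analysis is omitted because it needs continuous classifier scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryConfusion:
    """Hypothetical container for two-class counts (positive = disease present)."""
    tp: int  # diseased cases predicted as diseased
    fn: int  # diseased cases predicted as healthy
    fp: int  # healthy cases predicted as diseased
    tn: int  # healthy cases predicted as healthy


def evaluate(cm: BinaryConfusion) -> dict:
    """Compute the standard measures from the counts.

    Assumes both classes occur and at least one positive prediction exists,
    so none of the denominators below is zero.
    """
    n = cm.tp + cm.fn + cm.fp + cm.tn
    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)  # true positive rate (recall)
    specificity = cm.tn / (cm.tn + cm.fp)  # true negative rate
    precision = cm.tp / (cm.tp + cm.fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column marginals.
    p_pos = ((cm.tp + cm.fn) / n) * ((cm.tp + cm.fp) / n)
    p_neg = ((cm.fp + cm.tn) / n) * ((cm.fn + cm.tn) / n)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    example = BinaryConfusion(tp=40, fn=10, fp=5, tn=45)
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.4f}")
```

Kappa is worth reporting next to accuracy when the classes are imbalanced, since it discounts the agreement a classifier would reach by chance.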








This article is part of the Topical Collection on Systems-Level Quality Improvement.
Cite this article
Peker, M. A decision support system to improve medical diagnosis using a combination of k-medoids clustering based attribute weighting and SVM. J Med Syst 40, 116 (2016). https://doi.org/10.1007/s10916-016-0477-6
DOI: https://doi.org/10.1007/s10916-016-0477-6