Abstract
This paper proposes an attribute weighting method based on cluster centers that aims to increase the discrimination between classes. The method is applied to nonlinearly separable datasets: two medical datasets (the mammographic mass dataset and the BUPA liver disorders dataset) and a 2-D spiral dataset. Its goal is to gather the data points closer to their cluster centers, thereby transforming a nonlinearly separable dataset into a linearly separable one. Three clustering algorithms are used: k-means clustering, fuzzy c-means clustering, and subtractive clustering. The resulting attribute weighting methods are k-means clustering based attribute weighting (KMCBAW), fuzzy c-means clustering based attribute weighting (FCMCBAW), and subtractive clustering based attribute weighting (SCBAW). Each is applied as a preprocessing step before classifier algorithms including the C4.5 decision tree and the adaptive neuro-fuzzy inference system (ANFIS). The proposed method is evaluated using recall, precision, true negative rate (TNR), G-mean1, G-mean2, f-measure, and classification accuracy. The results show that subtractive clustering based attribute weighting gave the best classification performance on all three datasets.
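The evaluation measures named in the abstract can all be derived from the four counts of a binary confusion matrix. The sketch below is illustrative only: it assumes the commonly used definitions G-mean1 = sqrt(recall × precision) and G-mean2 = sqrt(recall × TNR), which the abstract itself does not spell out. The function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification measures from confusion-matrix counts.

    Assumed definitions (not stated in the abstract):
      G-mean1 = sqrt(recall * precision)
      G-mean2 = sqrt(recall * TNR)
    Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)        # true positive rate (sensitivity)
    precision = ratio(tp, tp + fp)     # positive predictive value
    tnr = ratio(tn, tn + fp)           # true negative rate (specificity)
    f_measure = ratio(2 * precision * recall, precision + recall)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)

    return {
        "recall": recall,
        "precision": precision,
        "tnr": tnr,
        "g_mean1": math.sqrt(recall * precision),
        "g_mean2": math.sqrt(recall * tnr),
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting G-mean1 and G-mean2 alongside accuracy is useful when the classes are imbalanced, since both geometric means fall toward zero if either class is poorly recognised even when overall accuracy stays high.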
Cite this article
Polat, K. Application of Attribute Weighting Method Based on Clustering Centers to Discrimination of Linearly Non-Separable Medical Datasets. J Med Syst 36, 2657–2673 (2012). https://doi.org/10.1007/s10916-011-9741-y
DOI: https://doi.org/10.1007/s10916-011-9741-y