Abstract
With the growing demand for clustering categorical data, this paper proposes a new hybrid clustering algorithm, Rough Set Based Fuzzy K-Modes, which integrates the principles of rough sets and fuzzy sets. The lower and upper approximations of rough sets give better handling of uncertainty, vagueness, and incompleteness in class definition, while the membership function of fuzzy sets enables efficient handling of overlapping partitions. The superiority of the proposed method over state-of-the-art methods is demonstrated quantitatively on two artificial and two real-life categorical data sets. A statistical significance test has also been carried out to establish the significance of the clustering results.
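To make the combination of fuzzy memberships and rough approximations concrete, the following is a minimal sketch and not the authors' implementation, since the abstract does not give the exact formulation. It assumes the simple matching dissimilarity and the standard fuzzy k-modes membership formula, and it assumes a thresholding rule common in rough-fuzzy clustering: a record enters the lower approximation of its best cluster only when its two largest memberships differ by more than a threshold. The names `fuzzy_memberships`, `rough_assign`, and the parameter `delta` are hypothetical.

```python
def matching_dissimilarity(a, b):
    """Simple matching: count the attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def fuzzy_memberships(record, modes, m=2.0):
    """Fuzzy membership of one categorical record in each cluster mode.

    Assumes the standard fuzzy k-modes update with fuzzifier m > 1.
    """
    d = [matching_dissimilarity(record, mode) for mode in modes]
    # A record identical to a mode belongs fully to that cluster.
    # Splitting evenly across several identical modes is an assumption.
    zeros = [i for i, di in enumerate(d) if di == 0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(modes))]
    exponent = 1.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exponent for dj in d) for di in d]


def rough_assign(memberships, delta=0.2):
    """Place a record in lower/upper approximations (assumed rule).

    If the best membership exceeds the second best by more than delta,
    the record is a certain member (lower approximation) of the best
    cluster. Otherwise it lies in the boundary and is added only to the
    upper approximations of its two best clusters.
    """
    order = sorted(range(len(memberships)),
                   key=lambda i: memberships[i], reverse=True)
    best, second = order[0], order[1]
    if memberships[best] - memberships[second] > delta:
        return {"lower": [best], "upper": [best]}
    return {"lower": [], "upper": [best, second]}


if __name__ == "__main__":
    modes = [("a", "x", "p"), ("b", "y", "q")]   # illustrative modes
    for record in [("a", "x", "q"), ("a", "y", "z")]:
        u = fuzzy_memberships(record, modes)
        print(record, u, rough_assign(u))
```

In this sketch, the first record differs from the two modes on one and two attributes respectively, so it falls in the lower approximation of the first cluster. The second record is equally far from both modes and therefore lands only in the upper approximations of both, which is the overlap case that the fuzzy memberships are meant to describe.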
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saha, I., Sarkar, J.P., Maulik, U. (2012). Rough Set Based Fuzzy K-Modes for Categorical Data. In: Panigrahi, B.K., Das, S., Suganthan, P.N., Nanda, P.K. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2012. Lecture Notes in Computer Science, vol 7677. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35380-2_38
DOI: https://doi.org/10.1007/978-3-642-35380-2_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35379-6
Online ISBN: 978-3-642-35380-2
eBook Packages: Computer Science (R0)