Abstract
Clustering a set of objects into homogeneous classes is a fundamental operation in data mining. Categorical data clustering based on rough set theory has been an active research area in machine learning. However, pure rough set theory is not well suited to analyzing noisy information systems. This paper proposes an alternative technique for categorical data clustering that uses the Variable Precision Rough Set (VPRS) model and is based on the classification quality defined in VPRS theory. The technique is implemented in MATLAB. Experimental results on three benchmark UCI datasets indicate that the technique can be used successfully to analyze grouped categorical data because it produces better clustering results.
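The abstract does not spell out the exact criterion used by the technique, so the following is only a minimal sketch of the standard VPRS classification quality it builds on, written in Python rather than the authors' MATLAB. It assumes the usual definitions: the inclusion error of an equivalence class E in a target set X is 1 - |E ∩ X| / |E|, the β-lower approximation of X collects every class whose error is at most β (with 0 ≤ β < 0.5), and the classification quality is the fraction of objects that fall in some β-lower approximation. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def partition(rows, attr):
    """Group object indices into equivalence classes by one attribute's value."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[row[attr]].add(i)
    return list(classes.values())


def beta_lower(cond_classes, target, beta):
    """Union of condition classes whose inclusion error in `target` is <= beta."""
    lower = set()
    for e in cond_classes:
        error = 1 - len(e & target) / len(e)
        if error <= beta:
            lower |= e
    return lower


def classification_quality(rows, cond, dec, beta):
    """Fraction of objects in the beta-positive region of `dec` given `cond`."""
    cond_classes = partition(rows, cond)
    positive = set()
    for target in partition(rows, dec):
        positive |= beta_lower(cond_classes, target, beta)
    return len(positive) / len(rows)


# Hypothetical toy information system: one "noisy" red object is small.
rows = [
    {"color": "red", "size": "big"},
    {"color": "red", "size": "big"},
    {"color": "red", "size": "big"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "small"},
    {"color": "blue", "size": "small"},
]

strict = classification_quality(rows, "color", "size", beta=0.0)
tolerant = classification_quality(rows, "color", "size", beta=0.3)
print(strict, tolerant)
```

With β = 0 (the classical rough set case) only the two blue objects are classified with certainty, so the quality is about 0.33. With β = 0.3 the red class tolerates its single exception and the quality rises to 1.0, which illustrates why a variable precision model can be more forgiving of noise.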
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yanto, I.T.R., Saedudin, R.R., Hartama, D., Herawan, T. (2017). Clustering Based on Classification Quality (CCQ). In: Herawan, T., Ghazali, R., Nawi, N.M., Deris, M.M. (eds) Recent Advances on Soft Computing and Data Mining. SCDM 2016. Advances in Intelligent Systems and Computing, vol 549. Springer, Cham. https://doi.org/10.1007/978-3-319-51281-5_33
DOI: https://doi.org/10.1007/978-3-319-51281-5_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51279-2
Online ISBN: 978-3-319-51281-5
eBook Packages: Engineering, Engineering (R0)