Abstract
This paper presents a rough set-based fuzzy clustering algorithm in which the objects of fuzzy clustering are initial clusters obtained from equivalence relations. Initial clustering is performed directly by testing whether equivalence relations are equal, rather than by computing the intersection of equivalence classes as is usual, and the correctness of this approach is proved using rough set theory. A secondary clustering step, based on a fuzzy similarity defined between pairs of initial clusters, suppresses the excessive generation of small classes. Consequently, the dimension of the fuzzy similarity matrix is reduced. An integrated approximation precision is defined as a measure of clustering validity. The algorithm can dynamically adjust its parameter setting to obtain the optimal result. Experiments were performed to validate the method. The results show that the algorithm handles clustering problems on both numerical and nominal data well.
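The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the paper's similarity definition or merging rule, so the attribute-agreement measure in `similarity`, the max-min transitive closure, and the threshold `lam` are all assumptions chosen for illustration, and every function name here is hypothetical. Stage one groups objects that are indiscernible on all attributes; stage two merges those initial clusters with a standard fuzzy equivalence (lambda-cut) step, so the similarity matrix is sized by the number of initial clusters rather than the number of objects.

```python
from collections import defaultdict


def initial_clusters(rows):
    """Group row indices whose attribute tuples are identical (indiscernible)."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row)].append(i)
    return list(groups.values())


def similarity(rows, a, b):
    """Assumed measure: share of attributes on which two clusters agree.

    Every member of an initial cluster has the same attribute values,
    so the first member serves as the cluster's representative.
    """
    ra, rb = rows[a[0]], rows[b[0]]
    return sum(x == y for x, y in zip(ra, rb)) / len(ra)


def transitive_closure(m):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    n = len(m)
    while True:
        sq = [
            [max(m[i][j], max(min(m[i][k], m[k][j]) for k in range(n)))
             for j in range(n)]
            for i in range(n)
        ]
        if sq == m:  # entries are only selected, never recomputed, so == is exact
            return m
        m = sq


def merge_clusters(rows, lam):
    """Merge initial clusters whose closure similarity is at least lam."""
    clusters = initial_clusters(rows)
    n = len(clusters)
    sim = [[similarity(rows, clusters[i], clusters[j]) for j in range(n)]
           for i in range(n)]
    closure = transitive_closure(sim)
    merged, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        # The lambda-cut of a fuzzy equivalence is a crisp equivalence,
        # so each block is a complete class of the resulting partition.
        block = [j for j in range(n) if closure[i][j] >= lam]
        assigned.update(block)
        merged.append(sorted(x for j in block for x in clusters[j]))
    return merged


# Made-up nominal data: five objects described by three attributes.
sample = [
    ("a", "x", "1"),
    ("a", "x", "1"),
    ("a", "x", "2"),
    ("b", "y", "3"),
    ("b", "y", "3"),
]
print(initial_clusters(sample))
print(merge_clusters(sample, 0.6))
```

In this sketch, lowering `lam` merges more initial clusters and raising it keeps more small classes, which is one way to picture the parameter adjustment the abstract mentions.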
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yaqin, Z., Xianzhong, Z., Guizhong, T. (2005). A Rough Set-Based Fuzzy Clustering. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_31
DOI: https://doi.org/10.1007/11562382_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29186-2
Online ISBN: 978-3-540-32001-2
eBook Packages: Computer Science (R0)