Abstract
The transitive closure method is one of the most frequently used fuzzy clustering techniques. Building the transitive closure requires repeated matrix compositions, which cost O(n³ log₂ n) time and O(n²) space. These costs limit its application to large-scale databases. In this paper, we propose a fast fuzzy clustering algorithm that avoids matrix multiplication, together with a principle by which the clustering results are obtained directly from the λ-cut of the fuzzy similarity relation on the objects. Moreover, the similarity matrix of the objects need not be computed and stored beforehand. The time complexity of the presented algorithm is at most O(n²), and its space complexity is O(1). Theoretical analysis and experiments demonstrate that, although the new algorithm is equivalent to the transitive closure method, it is better suited to large-scale datasets because of its higher computational efficiency.
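The idea that clusters can be read directly from the λ-cut, without composing matrices, can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in the abstract. It relies on the standard fact that the λ-cut classes of the max-min transitive closure coincide with the connected components of the graph linking every pair whose similarity is at least λ. The names `lambda_cut_clusters` and `closeness` are hypothetical, and the sketch uses a union-find array of size n, so its auxiliary space is O(n) rather than the O(1) stated above. Similarities are computed on the fly, so no n × n matrix is stored.

```python
from typing import Callable, Dict, List, Sequence

Point = Sequence[float]


def lambda_cut_clusters(
    objects: Sequence[Point],
    similarity: Callable[[Point, Point], float],
    lam: float,
) -> List[List[int]]:
    """Group object indices into the connected components of the lambda-cut graph."""
    n = len(objects)
    parent = list(range(n))  # union-find forest: O(n) extra space (assumption of this sketch)

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit each unordered pair once: O(n^2) similarity evaluations,
    # each computed on demand instead of being read from a stored matrix.
    for i in range(n):
        for j in range(i + 1, n):
            if similarity(objects[i], objects[j]) >= lam:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def closeness(x: Point, y: Point) -> float:
    """Hypothetical similarity in [0, 1]: one minus the largest coordinate gap."""
    gap = max(abs(a - b) for a, b in zip(x, y))
    return max(0.0, 1.0 - gap)


points = [(0.0,), (0.1,), (0.25,), (0.9,), (1.0,)]
print(lambda_cut_clusters(points, closeness, 0.8))  # → [[0, 1, 2], [3, 4]]
```

In this example, objects 0 and 2 have a direct similarity of only 0.75, yet they fall in the same cluster at λ = 0.8 because both are sufficiently similar to object 1. This chaining is exactly what the transitive closure would capture.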
This work was supported by the Science-Technology Development Project of Tianjin (No. 04310941R).
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shi, L., He, P. (2005). A Fast Fuzzy Clustering Algorithm for Large-Scale Datasets. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_24
DOI: https://doi.org/10.1007/11527503_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science (R0)