Trail-and-Error Approach for Determining the Number of Clusters

Sun, Haojun; Sun, Mei

doi:10.1007/11739685_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

Automatically determining the number of clusters is an important issue in cluster analysis. In this paper, we explore a “trial-and-error” approach to determining the number of clusters in a given data set. The fuzzy clustering algorithm FCM is selected as the basic “trial” algorithm, and cluster validity optimization corresponds to the “error” procedure. To improve computation speed, we propose two strategies, eliminating and splitting, which make the FCM-based algorithms more efficient. To improve on existing validity measures, we use a new validity function that is particularly suited to data sets containing overlapping clusters. Experimental results are given to illustrate the performance of the new algorithms.
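
The basic loop described in the abstract can be illustrated with a minimal sketch: run FCM for each candidate number of clusters (the "trial") and keep the candidate with the best validity score (the "error" check). This is not the authors' algorithm. It omits their eliminating and splitting strategies, and it substitutes the well-known Xie-Beni index for the paper's new validity function, which the abstract does not define. The function names (`fcm`, `xie_beni`, `choose_c`), the farthest-first initialization, and the toy data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, c):
    """Deterministic initialization (an assumption): start from the first
    point, then repeatedly add the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < c:
        centers.append(max(points, key=lambda p: min(sq_dist(p, v) for v in centers)))
    return [list(v) for v in centers]

def memberships(points, centers, m):
    """Standard FCM membership update; each row sums to 1."""
    rows = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows

def fcm(points, c, m=2.0, max_iter=200, tol=1e-9):
    """Plain fuzzy c-means: alternate membership and center updates."""
    centers = farthest_first(points, c)
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[k] for wi, p in zip(w, points)) / total for k in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index (stand-in validity measure): compactness divided by
    separation, so lower values indicate a better partition."""
    compact = sum(
        u[i][j] ** m * sq_dist(p, v)
        for i, p in enumerate(points)
        for j, v in enumerate(centers)
    )
    sep = min(sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return float("inf") if sep == 0.0 else compact / (len(points) * sep)

def choose_c(points, c_max, m=2.0):
    """Trial-and-error loop: try every c in 2..c_max, keep the best score."""
    scores = {}
    for c in range(2, c_max + 1):
        centers, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores

# Toy data (hypothetical): three tight, well-separated blobs in the plane.
offsets = [(-0.4, 0.0), (0.4, 0.0), (0.0, -0.4), (0.0, 0.4), (0.0, 0.0),
           (-0.3, 0.3), (0.3, -0.3), (0.3, 0.3), (-0.3, -0.3)]
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(bx + dx, by + dy) for bx, by in blobs for dx, dy in offsets]

best_c, scores = choose_c(points, c_max=5)
print(best_c, scores)
```

Every candidate c here triggers a full FCM run from scratch, which is the cost the eliminating and splitting strategies mentioned in the abstract aim to reduce. The Xie-Beni index is used only because it is simple and widely known; it is not designed for overlapping clusters, the case the paper's validity function targets.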

Author information

Authors and Affiliations

College of Mathematics and Computer Science, University of Hebei, Baoding, Hebei, 071002, China
Haojun Sun & Mei Sun

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, P.O. Box, Hong Kong, China
Daniel S. Yeung
School of Creative Media, City University of Hong Kong, China
Zhi-Qiang Liu
Department of Mathematics and Computer Science, Hebei University, 071002, Baoding, Hebei, P.R. China
Xi-Zhao Wang
School of Electrical and Information Engineering, University of Sydney, 2006, NSW, Australia
Hong Yan

About this paper

Cite this paper

Sun, H., Sun, M. (2006). Trail-and-Error Approach for Determining the Number of Clusters. In: Yeung, D.S., Liu, Z.-Q., Wang, X.-Z., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_24

DOI: https://doi.org/10.1007/11739685_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33584-9
Online ISBN: 978-3-540-33585-6
eBook Packages: Computer Science, Computer Science (R0)
