Trail-and-Error Approach for Determining the Number of Clusters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3930)

Abstract

Automatically determining the number of clusters is an important issue in cluster analysis. In this paper, we explore a “trial-and-error” approach to determining the number of clusters in a given data set. The fuzzy clustering algorithm, FCM, is selected as the basic “trial” algorithm, and cluster validity optimization corresponds to the “error” procedure. To improve the computation speed, we propose two strategies, eliminating and splitting, which make the FCM-based algorithms more efficient. To improve on existing validity measures, we make use of a new validity function that is particularly suited to data sets containing overlapping clusters. Experimental results are given to illustrate the performance of the new algorithms.
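
The trial-and-error loop described in the abstract can be sketched as follows: run FCM for each candidate number of clusters, score each result with a validity index, and keep the best-scoring candidate. This is only a minimal illustration under stated assumptions, not the authors' algorithm. The abstract does not define the paper's new validity function, so the well-known Xie-Beni index is used here as a stand-in; the eliminating and splitting speed-ups are not shown; the farthest-first initialization and all function names (`fcm`, `xie_beni`, `choose_c`) are hypothetical choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, c):
    """Deterministic initial centers (an assumption of this sketch)."""
    centers = [list(points[0])]
    while len(centers) < c:
        nxt = max(points, key=lambda p: min(sq_dist(p, v) for v in centers))
        centers.append(list(nxt))
    return centers

def memberships(points, centers, m):
    """Standard FCM membership update; each row sums to 1."""
    u = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u

def fcm(points, c, m=2.0, max_iter=200, tol=1e-9):
    """The "trial" step: plain fuzzy c-means for a fixed c."""
    centers = farthest_first(points, c)
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[j] for wi, p in zip(w, points)) / total for j in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def xie_beni(points, centers, u, m=2.0):
    """Stand-in validity index (Xie-Beni); lower is better."""
    compact = sum(
        u[i][k] ** m * sq_dist(p, v)
        for i, p in enumerate(points)
        for k, v in enumerate(centers)
    )
    sep = min(sq_dist(a, b) for i, a in enumerate(centers) for b in centers[i + 1:])
    return compact / (len(points) * max(sep, 1e-12))

def choose_c(points, c_max):
    """The "error" step: score every candidate c and keep the minimum."""
    scores = {}
    for c in range(2, c_max + 1):
        centers, u = fcm(points, c)
        scores[c] = xie_beni(points, centers, u)
    return min(scores, key=scores.get), scores

# Toy data: three tight, well-separated groups of five points each.
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(bx + dx, by + dy) for bx, by in blobs for dx, dy in offsets]
best_c, scores = choose_c(points, c_max=5)
print(best_c, scores)
```

Running FCM from scratch for every candidate is the expensive part of such a loop, which is presumably what the eliminating and splitting strategies are meant to reduce. The Xie-Beni index is also known to behave less reliably on overlapping clusters, the case the paper's new validity function targets.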

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sun, H., Sun, M. (2006). Trail-and-Error Approach for Determining the Number of Clusters. In: Yeung, D.S., Liu, ZQ., Wang, XZ., Yan, H. (eds) Advances in Machine Learning and Cybernetics. Lecture Notes in Computer Science, vol 3930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11739685_24

  • DOI: https://doi.org/10.1007/11739685_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33584-9

  • Online ISBN: 978-3-540-33585-6

  • eBook Packages: Computer Science, Computer Science (R0)
