Extension of the gap statistics index to fuzzy clustering

  • Methodologies and Application
  • Soft Computing

Abstract

The well-known gap statistic index proposed by Tibshirani et al. has been successfully applied in many clustering evaluations. However, it cannot evaluate the partitions produced by fuzzy clustering algorithms, because fuzzy clustering does not provide the within-cluster similarity measure that the gap statistic index relies on. This limits the range of problems to which the index can be applied. In this paper, we present a new method that extends the gap statistic index to fuzzy clustering by using fuzzy membership notations. The proposed method broadens the applicability of the gap statistic index and outperforms other existing fuzzy indices in several aspects. Experiments on eight sets of synthetic and real data verify the applicability and efficiency of the proposed method.
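
The abstract does not state the formula of the extended index, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the crisp within-cluster dispersion of the gap statistic is replaced by the membership-weighted dispersion used as the fuzzy c-means objective; that substitution, the function names `fuzzy_dispersion` and `gap`, and the fuzzifier default `m=2.0` are all assumptions made for illustration. The gap value and its corrected standard error follow the standard definition of Tibshirani et al.: the mean of the log dispersions over B reference datasets minus the observed log dispersion, with s = sd · sqrt(1 + 1/B).

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_dispersion(points, memberships, m=2.0):
    """Membership-weighted within-cluster dispersion (assumed stand-in for W_k).

    memberships[i][k] is the degree to which point k belongs to cluster i.
    With 0/1 memberships this reduces to the crisp sum of squared
    distances to the cluster means.
    """
    dim = len(points[0])
    total = 0.0
    for u_row in memberships:
        weights = [u ** m for u in u_row]
        wsum = sum(weights)
        if wsum == 0.0:
            continue  # a cluster with no membership mass contributes nothing
        # Weighted centroid of the cluster.
        center = [
            sum(w * p[d] for w, p in zip(weights, points)) / wsum
            for d in range(dim)
        ]
        total += sum(w * sq_dist(p, center) for w, p in zip(weights, points))
    return total


def gap(log_w, ref_log_ws):
    """Return (gap value, corrected standard error) for one cluster count.

    log_w is the log dispersion of the observed data; ref_log_ws holds the
    log dispersions of the B reference datasets.
    """
    b = len(ref_log_ws)
    mean_ref = sum(ref_log_ws) / b
    sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_log_ws) / b)
    return mean_ref - log_w, sd * math.sqrt(1.0 + 1.0 / b)
```

In use, one would cluster the observed data and each reference dataset for every candidate number of clusters k, pass the resulting log dispersions to `gap`, and pick the smallest k whose gap value is at least the gap at k + 1 minus the standard error at k + 1, which is the selection rule of the original gap statistic.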

References

  • Arima C, Hakamada K, Okamoto M, Hanai T (2008) Modified fuzzy Gap statistic for estimating preferable number of clusters in fuzzy k-means clustering. J Biosci Bioeng 105(3):273–281

  • Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York

  • Bezdek JC, Pal NR (1998) Some new indexes of cluster validity. IEEE Trans SMC-B 28(3):301–315

  • Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Patt Anal Mach Intell 1(2):224–227

  • Dunn JC (1973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cybern 3(3):32–57

  • Huang JZ, Ng MK, Rong H (2005) Automated variable weighting in k-means type clustering. IEEE Trans Patt Anal Mach Intell 27(5):657–668

  • Kamel MS, Selim SZ (1991) A thresholded fuzzy c-means algorithm for semi-fuzzy clustering. Pattern Recogn 24(9):825–833

  • Kim M, Ramakrishna RS (2005) New indices for cluster validity assessment. Pattern Recogn Lett 26(15):2353–2363

  • Kim DJ, Park YW, Park DJ (2001) A novel validity index for determination of the optimal number of clusters. IEICE Trans Inform Syst 84(2):281–285

  • Kim DJ, Lee KH, Lee D (2004) On cluster validity index for estimation of the optimal number of fuzzy clusters. Pattern Recogn 37(10):2009–2025

  • Kwon S (1998) Cluster validity index for fuzzy clustering. Electron Lett 34(22):176–177

  • Lange T, Roth V, Braun ML, Buhmann JM (2004) Stability-based validation of clustering solutions. Neural Comput 16(6):1299–1323

  • Maulik U, Bandyopadhyay S (2002) Performance evaluation of some clustering algorithms and validity indices. IEEE Trans Patt Anal Mach Intell 24(12):1650–1654

  • Pakhira MK, Bandyopadhyay S, Maulik U (2004) Validity index for crisp and fuzzy clusters. Pattern Recogn 37(3):487–501

  • Pakhira MK, Bandyopadhyay S, Maulik U (2005) A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification. Fuzzy Sets Syst 155(2):191–214

  • Parizeau M, Lee SW (1995) A fuzzy-syntactic approach to allograph modeling for cursive script recognition. IEEE Trans Patt Anal Mach Intell 17(7):702–712

  • Pedrycz W (1996) Conditional fuzzy c-means. Pattern Recogn Lett 17(6):625–631

  • Pedrycz W (2002) Collaborative fuzzy clustering. Pattern Recogn Lett 23(14):1675–1686

  • Pedrycz W, Loia V, Senatore S (2010) Fuzzy clustering with viewpoints. IEEE Trans Fuzzy Syst 18(2):274–284

  • Rezaee B (2010) A cluster validity index for fuzzy clustering. Fuzzy Sets Syst 161(23):3014–3025

  • Sentelle C, Hong S, Georgiopoulos M, Anagnostopoulos GC (2007) A fuzzy gap statistic for fuzzy c-means. In: Proceedings of the 11th international conference on artificial intelligence and soft computing, pp 68–73

  • Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a dataset via the gap statistic. J R Soc-B 63(2):411–423

  • Wang J, Chiang J (2008) A cluster validity measure with outlier detection for support vector clustering. IEEE Trans SMC-B 38(1):78–89

  • Wu S, Chow WS (2004) Clustering of the self-organizing map using a clustering validity index based on inter-cluster and intra-cluster density. Pattern Recogn 37(2):175–188

  • Wu K, Yang M (2002) Alternative c-means clustering algorithms. Pattern Recogn 35(10):2267–2278

  • Wu K, Yang M (2005) A cluster validity index for fuzzy clustering. Pattern Recogn Lett 26(9):1275–1291

  • Xie XL, Beni G (1991) A validity measure for fuzzy clustering. IEEE Trans Pattern Anal Mach Intell 13(8):841–847

  • Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678

  • Xu Z, Chen J, Wu J (2008) Clustering algorithm for intuitionistic fuzzy sets. Inf Sci 178(19):3775–3790

  • Yue S, Wei M, Wang J, Wang H (2008) A general grid-based clustering algorithm. Pattern Recogn Lett 29(9):1372–1384

  • Yue S, Wang J, Wu T, Wang H (2010a) A new separation measure for improving the effectiveness of validity indices. Inf Sci 180(5):748–764

  • Yue S, Wang J, Tao G, Wang H (2010b) An unsupervised grid-based approach for clustering analysis. Sci China Inf Sci 53(7):1345–1357

  • Yue S, Wu T, Liu Z, Zhao X (2011) Fused multi-characteristic validity index: an application to reconstructed image valuation in electrical tomography 4(5):1052–1061

  • Zhang Y, Wang W, Zhang X, Li Y (2008) A cluster validity index for fuzzy clustering. Inf Sci 178(4):1205–1218

Acknowledgments

This work was supported by the National Science Foundation of China under Grant Nos. 61774014, 60772080, and 60532020, and by the Tianjin Science Foundation of China under Grant No. 08JCYBJC13800.

Author information

Corresponding author

Correspondence to Shihong Yue.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Yue, S., Wang, P., Wang, J. et al. Extension of the gap statistics index to fuzzy clustering. Soft Comput 17, 1833–1846 (2013). https://doi.org/10.1007/s00500-013-1023-9

  • DOI: https://doi.org/10.1007/s00500-013-1023-9
