Abstract
The well-known gap statistic index proposed by Tibshirani et al. has been successfully applied in many clustering evaluations. However, it cannot evaluate the partitions produced by fuzzy clustering algorithms, because fuzzy clustering does not provide the within-cluster similarity measure on which the gap statistic index relies. The applicable range of the index is therefore limited. In this paper, we present a method that extends the gap statistic index to fuzzy clustering by using fuzzy membership notations. The proposed method broadens the applicability of the gap statistic index and outperforms other existing fuzzy indices in several aspects. Experiments on eight synthetic and real datasets verify the applicability and efficiency of the proposed method.
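To make the obstacle concrete, the sketch below contrasts the crisp within-cluster dispersion used by the gap statistic with one possible membership-weighted analogue. It is only an illustration under stated assumptions: the function names, the fuzzifier `m`, and the weighted form (the standard fuzzy c-means objective) are chosen here for exposition and are not taken from the paper, whose actual definition may differ.

```python
import numpy as np

def crisp_dispersion(X, labels):
    """Within-cluster dispersion W_k for a hard partition.

    With squared Euclidean distance, the pooled pairwise sum D_r / (2 n_r)
    equals the sum of squared distances to the cluster mean.
    """
    total = 0.0
    for r in np.unique(labels):
        pts = X[labels == r]
        total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return float(total)

def fuzzy_dispersion(X, U, m=2.0):
    """Membership-weighted dispersion (an assumed analogue, not the paper's formula).

    X: (n, d) data; U: (n, c) memberships with rows summing to 1;
    m: fuzzifier (hypothetical default of 2).
    """
    W = U ** m                                        # (n, c) weights
    centers = (W.T @ X) / W.sum(axis=0)[:, None]      # (c, d) weighted means
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (n, c)
    return float((W * d2).sum())

# Toy data: two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
labels = np.array([0, 0, 1, 1])
U_hard = np.eye(2)[labels]   # one-hot memberships encode the same hard partition

w_crisp = crisp_dispersion(X, labels)
w_fuzzy = fuzzy_dispersion(X, U_hard)
```

With one-hot memberships the weighted form reduces to the crisp dispersion, which is the property an extension of this kind would be expected to preserve.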
References
Arima C, Hakamada K, Okamoto M, Hanai T (2008) Modified fuzzy gap statistic for estimating preferable number of clusters in fuzzy k-means clustering. J Biosci Bioeng 105(3):273–281
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York
Bezdek JC, Pal NR (1998) Some new indexes of cluster validity. IEEE Trans SMC-B 28(3):301–315
Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Patt Anal Mach Intell 1(2):224–227
Dunn JC (1973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cybern 3(3):32–57
Huang JZ, Ng MK, Rong H (2005) Automated variable weighting in k-means type clustering. IEEE Trans Patt Anal Mach Intell 27(5):657–668
Kamel MS, Selim SZ (1991) A thresholded fuzzy c-means algorithm for semi-fuzzy clustering. Pattern Recogn 24(9):825–833
Kim M, Ramakrishna RS (2005) New indices for cluster validity assessment. Pattern Recogn Lett 26(15):2353–2363
Kim DJ, Park YW, Park DJ (2001) A novel validity index for determination of the optimal number of clusters. IEICE Trans Inform Syst 84(2):281–285
Kim DJ, Lee KH, Lee D (2004) On cluster validity index for estimation of the optimal number of fuzzy clusters. Pattern Recogn 37(10):2009–2025
Kwon S (1998) Cluster validity index for fuzzy clustering. Electron Lett 34(22):176–177
Lange T, Roth V, Braun ML, Buhmann JM (2004) Stability-based validation of clustering solutions. Neural Comput 16(6):1299–1323
Maulik U, Bandyopadhyay S (2002) Performance evaluation of some clustering algorithms and validity indices. IEEE Trans Patt Anal Mach Intell 24(12):1650–1654
Pakhira MK, Bandyopadhyay S, Maulik U (2004) Validity index for crisp and fuzzy clusters. Pattern Recogn 37(3):487–501
Pakhira MK, Bandyopadhyay S, Maulik U (2005) A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification. Fuzzy Sets Syst 155(2):191–214
Parizeau M, Lee SW (1995) A fuzzy-syntactic approach to allograph modeling for cursive script recognition. IEEE Trans Patt Anal Mach Intell 17(7):702–712
Pedrycz W (1996) Conditional fuzzy c-means. Pattern Recogn Lett 17(6):625–631
Pedrycz W (2002) Collaborative fuzzy clustering. Pattern Recogn Lett 23(14):1675–1686
Pedrycz W, Loia V, Senatore S (2010) Fuzzy clustering with viewpoints. IEEE Trans Fuzzy Syst 18(2):274–284
Rezaee B (2010) A cluster validity index for fuzzy clustering. Fuzzy Sets Syst 161(23):3014–3025
Sentelle C, Hong S, Georgiopoulos M, Anagnostopoulos GC (2007) A fuzzy gap statistic for fuzzy c-means. In: Proceedings of the 11th international conference on artificial intelligence software computing, pp 68–73
Tibshirani R, Walther G, Hastie T (2001) Estimating the number of clusters in a data set via the gap statistic. J R Stat Soc B 63(2):411–423
Wang J, Chiang J (2008) A cluster validity measure with outlier detection for support vector clustering. IEEE Trans SMC-B 38(1):78–89
Wu S, Chow WS (2004) Clustering of the self-organizing map using a clustering validity index based on inter-cluster and intra-cluster density. Pattern Recogn 37(2):175–188
Wu K, Yang M (2002) Alternative c-means clustering algorithms. Pattern Recogn 35(10):2267–2278
Wu K, Yang M (2005) A cluster validity index for fuzzy clustering. Pattern Recogn Lett 26(9):1275–1291
Xie XL, Beni G (1991) A validity measure for fuzzy clustering. IEEE Trans Patt Anal Mach Intell 13(8):841–847
Xu R, Wunsch D (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
Xu Z, Chen J, Wu J (2008) Clustering algorithm for intuitionistic fuzzy sets. Inf Sci 178(19):3775–3790
Yue S, Wei M, Wang J, Wang H (2008) A general grid-based clustering algorithm. Pattern Recogn Lett 29(9):1372–1384
Yue S, Wang J, Wu T, Wang H (2010a) A new separation measure for improving the effectiveness of validity indices. Inf Sci 180(5):748–764
Yue S, Wang J, Tao G, Wang H (2010b) An unsupervised grid-based approach for clustering analysis. Sci China Inf Sci 53(7):1345–1357
Yue S, Wu T, Liu Z, Zhao X (2011) Fused multi-characteristic validity index: an application to reconstructed image valuation in electrical tomography 4(5):1052–1061
Zhang Y, Wang W, Zhang X, Li Y (2008) A cluster validity index for fuzzy clustering. Inf Sci 178(4):1205–1218
Acknowledgments
This work was supported by the National Science Foundation of China under Grant Nos. 61774014, 60772080, and 60532020, and by the Tianjin Science Foundation of China under Grant No. 08JCYBJC13800.
Additional information
Communicated by V. Loia.
Cite this article
Yue, S., Wang, P., Wang, J. et al. Extension of the gap statistics index to fuzzy clustering. Soft Comput 17, 1833–1846 (2013). https://doi.org/10.1007/s00500-013-1023-9
DOI: https://doi.org/10.1007/s00500-013-1023-9