Abstract
Partitional clustering algorithms are the most widely used approach to clustering problems. However, how to evaluate their clustering performance remains an open question, because there is no efficient measure that accurately represents the separation among the partitioned clusters. In this paper, building on the two most commonly used partitional clustering algorithms, c-means and fuzzy c-means, and their variants, we develop a new measure, called the dual center, that efficiently represents the separation among clusters. Based on this measure, we propose a new validity index for evaluating the clustering performance of partitional algorithms. Two groups of benchmark datasets with different characteristics were used to validate the proposed index. The experimental results provide evidence that it outperforms some existing representative validity indexes on both groups of datasets.
Acknowledgments
This work is supported by the National Science Foundation of China (Grant Nos. 61174014, 60572065, 60772080) and the National Science Foundation of Tianjin (Grant No. 08JCYBJC13800).
Additional information
Communicated by V. Loia.
About this article
Cite this article
Yue, S., Wang, J., Wang, J. et al. A new validity index for evaluating the clustering results by partitional clustering algorithms. Soft Comput 20, 1127–1138 (2016). https://doi.org/10.1007/s00500-014-1577-1
DOI: https://doi.org/10.1007/s00500-014-1577-1