Abstract
Clustering is the process of grouping a set of physical or abstract objects into clusters of similar objects. Fuzzy C-means (FCM) clustering is one of the most widely used clustering methods, and a central problem in its study is finding the optimal number of clusters for a data set, which determines whether the data can be partitioned effectively. A clustering validity function is used to evaluate clustering quality and to determine the optimal number of clusters. Following a component-based approach, six cluster performance evaluation components are proposed to describe the compactness, variation, similarity, overlap and separation of data sets. A new validity function for the FCM clustering algorithm is then synthesized from these six components. Finally, the proposed validity function is compared with eight typical validity functions on five artificial data sets and eight UCI data sets. The simulation results show that the proposed validity function evaluates the clustering results more effectively and determines the optimal number of clusters for the different data sets.
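The role a validity function plays in choosing the number of clusters can be illustrated with a small sketch. It is not the six-component function proposed in the article, whose definition does not appear in this section. As a stand-in it uses the classical Xie-Beni index, where a lower value indicates a better partition. The function names, the toy data, and the deterministic initialization (evenly spaced data points, chosen only to keep the example reproducible) are all assumptions made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m):
    """Standard FCM membership update for fixed centers."""
    u = []
    for p in points:
        # Clamp distances so a point sitting on a center does not divide by zero.
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u


def fcm(points, c, m=2.0, max_iter=200, tol=1e-8):
    """Plain fuzzy c-means; returns (centers, membership matrix)."""
    n, dim = len(points), len(points[0])
    # Assumption: evenly spaced data points as initial centers (illustrative only).
    centers = [list(points[(k * n) // c]) for k in range(c)]
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total for d in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the minimum center separation."""
    n = len(points)
    compact = sum(
        u[i][k] ** m * sq_dist(points[i], centers[k])
        for i in range(n)
        for k in range(len(centers))
    )
    sep = min(sq_dist(a, b) for j, a in enumerate(centers) for b in centers[j + 1:])
    return math.inf if sep == 0 else compact / (n * sep)


def best_cluster_number(points, c_min=2, c_max=5, m=2.0):
    """Run FCM for each candidate c and keep the one with the lowest index value."""
    scores = {}
    for c in range(c_min, c_max + 1):
        centers, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores


# Toy data: three well-separated groups of 20 points each (hypothetical values).
offsets = [(0.3 * a, 0.3 * b) for a in range(-2, 3) for b in range(-2, 2)]
blobs = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
data = [(cx + dx, cy + dy) for cx, cy in blobs for dx, dy in offsets]

best_c, scores = best_cluster_number(data)
print(best_c, scores)
```

Any other validity function with the same signature could be swapped in for `xie_beni`; for an index where higher is better, the selection step would use `max` instead of `min`.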
Acknowledgements
This work was supported by the Basic Scientific Research Project of Institution of Higher Learning of Liaoning Province (Grant No. LJKZ0293), and the Project by Liaoning Provincial Natural Science Foundation of China (Grant No. 20180550700).
Author information
Contributions
GW participated in the data collection, analysis, algorithm simulation, and drafting of the manuscript. J-SW contributed to the concept, design, and interpretation, and commented on the manuscript. H-YW participated in the critical revision of this paper.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this article.
About this article
Cite this article
Wang, G., Wang, JS. & Wang, HY. Fuzzy C-Means Clustering Validity Function Based on Multiple Clustering Performance Evaluation Components. Int. J. Fuzzy Syst. 24, 1859–1887 (2022). https://doi.org/10.1007/s40815-021-01243-2
DOI: https://doi.org/10.1007/s40815-021-01243-2