Abstract
The fuzzy c-means (FCM) clustering algorithm is one of the most widely used clustering algorithms. It addresses the unrealistic assumption that every data point belongs to exactly one class by defining a membership matrix. However, FCM requires the number of clusters to be set in advance, which is almost impossible without prior knowledge of the data set, so some scholars have put forward the concept of a validity index. Because a validity index depends on the membership matrix and on the distances between the data points and the cluster centers, feature weighting is a natural way to evaluate all the features of the data when searching for the optimal number of clusters. This paper therefore presents an improved validity index that integrates a weight index, a compactness index and a separability index. The index first determines the relationship between the features of the data points and the data points themselves. By defining new compactness and separability functions, the weight of each feature in the data set is obtained, and the validity index is then combined with the FCM algorithm to determine the number of clusters effectively. The proposed algorithm is tested on two artificial data sets and on real data sets; the experimental results demonstrate its advantages in image processing and show that it obtains reliable classification results.
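The general workflow the abstract describes, running FCM for several candidate cluster counts and keeping the count that a validity index scores best, can be sketched as follows. This is a minimal illustration, not the index proposed in the paper: the abstract does not give the weighted compactness and separability functions, so the classical Xie-Beni index (lower is better) is swapped in as a stand-in. All function names (`fcm`, `xie_beni`, `choose_c`, `maximin_init`), the farthest-point initialization, the candidate range 2 to 6 and the synthetic three-blob data are assumptions made for the example.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def maximin_init(points, c):
    """Deterministic start (an assumption of this sketch): take the first
    point, then repeatedly add the point farthest from the chosen centers."""
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(sqdist(p, v) for v in centers))
        centers.append(list(far))
    return centers

def memberships(points, centers, m):
    """Membership matrix u[k][i]: degree to which point k belongs to cluster i."""
    power = 1.0 / (m - 1.0)
    u = []
    for p in points:
        inv = [(1.0 / max(sqdist(p, v), 1e-12)) ** power for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])  # each row sums to 1
    return u

def fcm(points, c, m=2.0, iters=200, tol=1e-8):
    """Standard fuzzy c-means with fuzzifier m; returns (centers, memberships)."""
    centers = maximin_init(points, c)
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            s = sum(w)
            new_centers.append(
                [sum(wk * p[j] for wk, p in zip(w, points)) / s for j in range(dim)]
            )
        shift = max(sqdist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)

def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the smallest
    squared distance between two centers (a stand-in for the paper's index)."""
    n, c = len(points), len(centers)
    compact = sum(u[k][i] ** m * sqdist(points[k], centers[i])
                  for k in range(n) for i in range(c))
    sep = min(sqdist(centers[i], centers[j])
              for i in range(c) for j in range(i + 1, c))
    return float("inf") if sep == 0 else compact / (n * sep)

def choose_c(points, c_min=2, c_max=6, m=2.0):
    """Score every candidate cluster count and return the best one."""
    scores = {}
    for c in range(c_min, c_max + 1):
        centers, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores

# Illustrative data: three well-separated 2-D Gaussian blobs (synthetic).
rng = random.Random(0)
blob_means = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(mx + rng.gauss(0, 0.5), my + rng.gauss(0, 0.5))
          for mx, my in blob_means for _ in range(30)]
best_c, scores = choose_c(points)
print(best_c, scores)
```

A feature-weighted index of the kind the abstract proposes would replace `xie_beni` and the unweighted `sqdist`; the outer loop in `choose_c` would stay the same.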
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant Nos. 61573157 and 61561024, the Science and Technology Planning Project of Guangdong Province under Grant No. 2017A010101037, the Science and Technology Research Project of Jiangxi Province under Grant Nos. GJJ160631 and GJJ160930, and the Science Foundation of Jiangxi University of Science and Technology under Grant Nos. NSFJ2015-K13 and NSFJ2014-K11.
Cite this article
Li, W., Li, K., Guo, L. et al. A new validity index adapted to fuzzy clustering algorithm. Multimed Tools Appl 77, 11339–11361 (2018). https://doi.org/10.1007/s11042-017-5550-8
DOI: https://doi.org/10.1007/s11042-017-5550-8