A fuzzy clustering algorithm based on evolutionary programming
Introduction
Cluster analysis is a major analytic tool in image processing, spatial remote sensing, data mining, gene data processing, signal compression, and other fields. Partitional clustering is the most commonly used family of algorithms in pattern recognition. Classic partition-based clustering methods include the conventional c-means algorithm, the fuzzy c-means (FCM) algorithm and the maximum entropy method. None of these methods optimizes the features of the given data set; they optimize the clusters directly. At present, the best-known fuzzy clustering method is FCM. The algorithm works for small, low-dimensional data sets, but it suffers from several inherent drawbacks: (1) to apply the method, the user must have prior knowledge, such as the number of clusters; (2) different random initial choices can produce different clustering solutions, or even no solution; (3) as an objective function-based algorithm, it searches for the optimum by a gradient method and is therefore easily trapped in a local minimum. To address these drawbacks, genetic algorithms (GAs) were used to optimize the objective function of the k-means algorithm (Murthy & Chowdhury, 1996). This approach avoids drawbacks (2) and (3), but because it encodes the clusters and is not an iterative process, its efficiency is low. A modified algorithm (Bandyopadhyay & Maulik, 2002) is more efficient because it encodes the cluster centers and embeds the iterative k-means algorithm in the GA search. An evolutionary programming algorithm (Sarkar, Yegnanarayana, & Khemani, 1997) solved the above problems efficiently by optimizing the DB index of the k-means algorithm. A variable string length genetic algorithm (FVGA) (Pakhira & Bandyopadhyay, 2005), in which cluster validity indices serve as the objective functions, can find the optimal number of clusters during the search.
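To make the FCM drawbacks concrete, the following is a minimal sketch of the standard alternating FCM updates, not the authors' implementation. It assumes the usual objective J_m = sum over points k and clusters i of u_ki^m * ||x_k - v_i||^2 with fuzzifier m > 1. The function names, the `init` parameter and the default tolerance are illustrative assumptions.

```python
import math
import random


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m):
    """u[k][i]: membership of point k in cluster i; each row sums to 1."""
    u = []
    for x in points:
        # Floor the squared distance so a point lying exactly on a center
        # does not cause a division by zero.
        w = [max(sqdist(x, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(w)
        u.append([wi / total for wi in w])
    return u


def fcm(points, c, m=2.0, init=None, max_iter=100, tol=1e-6, seed=0):
    """Plain FCM; returns (centers, memberships, objective value)."""
    rng = random.Random(seed)
    # The number of clusters c must be supplied by the user (drawback 1).
    # Without `init`, the start is random, so results can vary (drawback 2).
    start = init if init is not None else rng.sample(points, c)
    centers = [list(v) for v in start]
    dim = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(
                [sum(wk * x[j] for wk, x in zip(w, points)) / total
                 for j in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        # Purely local descent: it stops at the nearest stationary point,
        # which need not be the global minimum (drawback 3).
        if shift < tol:
            break
    u = memberships(points, centers, m)
    objective = sum(
        u[k][i] ** m * sqdist(points[k], centers[i])
        for k in range(len(points)) for i in range(c)
    )
    return centers, u, objective
```

Running `fcm` on the same data with several values of `seed` and comparing the returned objective values is one simple way to observe the dependence on initialization.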
This paper proposes an evolutionary programming-based fuzzy c-means clustering algorithm (EPFCM) built on single-point mutation evolutionary programming (SPMEP) (Ji, Tang, & Guo, 2004), which uses cluster validity indices to evaluate the clustering result. To speed up convergence, the Modify algorithm is applied to vary the number of cluster centers dynamically. Experiments demonstrate that EPFCM can find the proper number of clusters, that the clustering result does not depend critically on the choice of the initial cluster centers, and that the probability of being trapped in a local optimum is much lower than for FCM.
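The details of SPMEP are not given in this excerpt, so the following is only a rough sketch of the general idea of single-point mutation: each offspring differs from its parent in exactly one randomly chosen coordinate. The Gaussian perturbation, the step-size shrink and reset rule, and the simplified one-parent selection are assumptions made for illustration, and the names `single_point_mutate` and `evolve` are hypothetical.

```python
import random


def single_point_mutate(individual, step, rng):
    """Return a copy of `individual` with one random coordinate perturbed."""
    child = list(individual)
    j = rng.randrange(len(child))
    child[j] += step * rng.gauss(0.0, 1.0)
    return child


def evolve(fitness, start, step=1.0, shrink=0.9, min_step=1e-4,
           generations=200, seed=0):
    """Minimise `fitness`, keeping a child only if it improves on the parent.

    Assumption: a simplified one-parent, one-child scheme rather than the
    population-based selection used in full evolutionary programming.
    """
    rng = random.Random(seed)
    initial_step = step
    best, best_fit = list(start), fitness(start)
    for _ in range(generations):
        child = single_point_mutate(best, step, rng)
        child_fit = fitness(child)
        if child_fit < best_fit:
            best, best_fit = child, child_fit
        else:
            # Assumed rule: shrink the step after a failed mutation and
            # reset it once it becomes too small to be useful.
            step *= shrink
            if step < min_step:
                step = initial_step
    return best, best_fit
```

In a clustering setting, `individual` could hold the flattened cluster-center coordinates and `fitness` could be a validity index to be minimised; both choices are assumptions here, not statements about EPFCM.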
Basic idea
For most clustering algorithms, the objective function is not convex and hence may contain local minima. Therefore, while minimizing the objective function, there is a possibility of getting stuck in local minima (and also in local maxima and saddle points). To solve this problem, evolutionary algorithms for fuzzy clustering have been proposed and have achieved better results; since the objective function decreases monotonically with the number of cluster centers, there are two cluster
EPFCM algorithm
We select single-point mutation evolutionary programming (SPMEP) (Ji et al., 2004) as the optimization algorithm, FCM as the clustering algorithm, and one of four indices (Dong, Huang, Zhou, & He, 2007) as the validity index: PC (Bezdek, 1975), PE (Bezdek, 1974), PBMF (Maulik & Bandyopadhyay, 2000; Pakhira & Bandyopadhyay, 2005) or XB (Xie & Beni, 1991).
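As a hedged illustration of three of these indices, the sketch below uses their commonly stated forms: the partition coefficient PC (larger is better), the partition entropy PE (smaller is better) and the Xie-Beni index XB (smaller is better). It assumes a membership matrix laid out as `u[k][i]` for point k and cluster i, and the function names are illustrative. PBMF is omitted because its exact form is not given in this excerpt.

```python
import math


def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_coefficient(u):
    """PC = (1/n) * sum of squared memberships; equals 1 for a crisp partition."""
    return sum(x * x for row in u for x in row) / len(u)


def partition_entropy(u):
    """PE = -(1/n) * sum of u * ln(u); equals 0 for a crisp partition."""
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / len(u)


def xie_beni(points, centers, u, m=2.0):
    """XB = compactness / (n * minimum squared distance between centers)."""
    n, c = len(points), len(centers)
    compactness = sum(
        u[k][i] ** m * sqdist(points[k], centers[i])
        for k in range(n) for i in range(c)
    )
    separation = min(
        sqdist(centers[i], centers[j])
        for i in range(c) for j in range(i + 1, c)
    )
    return compactness / (n * separation)
```

Because PC is maximised while PE and XB are minimised, a search procedure that uses them as fitness needs a consistent sign convention, for example minimising the negative of PC.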
Results and discussion
In this section, we first describe the data sets used in the experiments on EPFCM and report the results of the numerical experiments. This includes a description of the data sets and a comparison of the optimal number of clusters and the validity index values obtained with the different indices. The results also show how the validity index values vary with the number of clusters. Second, we give the results of the automatic clustering algorithm, which include the clustering performance, and
Conclusion
Clustering is a data analysis tool used across a wide variety of application and research fields. Many practical problems can be cast as clustering problems; for example, cluster analysis plays an important role in areas such as information retrieval and data mining. Cluster analysis tools are included in statistical software such as S-Plus, SPSS and SAS. With the fast development of information retrieval,
Acknowledgements
This work is partially supported by the Natural Science Foundation of Heilongjiang Province, China, No. F200605 and the Heilongjiang Province Office of Education Overseas Scholars Cooperation Projects, China, No. 1153h21.
References (13)
- et al. An evolutionary technique based on k-means algorithm for optimal clustering in R^N. Inform. Sci. (2002)
- et al. Wei Hou: Evolutionary programming using a mixed mutation strategy. Inform. Sci. (2007)
- et al. A single-point mutation evolutionary programming. Inform. Process. Lett. (2004)
- et al. Genetic algorithm-based clustering technique. Pattern Recog. (2000)
- et al. In search of optimal clusters using genetic algorithms. Pattern Recog. Lett. (1996)
- et al. A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification. Fuzzy Sets Syst. (2005)