An Evolutionary Neuro-Fuzzy C-means Clustering Technique☆
Introduction
Innovation in the computing world and sophisticated experimentation have nurtured the tremendous surge in the amount of data generated in recent times. One way of handling large volumes of data is to partition them into logical groups using unsupervised learning techniques (Segatori et al., 2018). Clustering is one such method, which segregates an ensemble of data points, called feature vectors, into groups demonstrating a distinctive pattern (Dowlatshahi and Nezamabadi-Pour, 2014). It was introduced in the 20th century by anthropologists (Driver and Kroeber, 1932), and since then it has witnessed significant revamping, leading to a substantial rise in its modern-day applicability (Chen et al., 2018, Forouzanfar et al., 2010, Hosseini and Kiani, 2019, Jain, 2010, Kuila and Jana, 2014, Niknama et al., 2011). The existing clustering algorithms can be broadly classified into two categories, namely (a) hard or crisp clustering, where each entity belongs to only one cluster, e.g., K-means, K-medoids, hierarchical clustering; and (b) soft clustering, where a soft partitioning of the feature space associates each entity with more than one cluster, e.g., Fuzzy C-Means (FCM), Possibilistic C-Means (PCM) (Xenaki et al., 2016). Compared to hard clustering techniques, soft clustering methods are preferable in most real-life applications due to their flexible grouping of data. For example, in marketing, customers with the same brand choices can be grouped into one cluster along with their degree of belongingness to that group using FCM. Grouping in this way helps identify customers who are interested in more than one brand and to what extent.
Among the soft clustering methods such as FCM, PCM, and others listed in Xenaki et al. (2016), FCM is considered in this study due to its increasing popularity and utility in various fields of research (Gamino-Sánchez et al., 2018, He et al., 2018, Shokouhifar and Jalali, 2017). Similar to many traditional clustering techniques (Xenaki et al., 2016), the FCM algorithm assumes the number of clusters to be known a priori. The degree of belongingness of a point to a cluster is quantified using a fuzzy number called the membership function (MF) value, as shown in Fig. 1a and 1b. In contrast, Fig. 1c shows the result of a hard clustering technique (K-means), where each data point belongs to exactly one of the two clusters. In both K-means and FCM, every cluster is represented by a cluster center, called the cluster representative, as shown in Fig. 1. To ensure that all points with similar features are grouped while maintaining maximum separation among groups with different features, a distance-based objective functional is minimized by optimally estimating the MF values and cluster representatives. The constraints in FCM signify that each cluster must contain all the data points with different degrees of belongingness and that no cluster should remain empty. This formulation translates the FCM algorithm into a Non-Linear Programming (NLP) problem involving a non-convex objective functional and linear sum-to-one constraints, with the Decision Variables (DVs) being the cluster representatives and MF values (Jayaram and Klawonn, 2013).
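For concreteness, the NLP described above can be written in standard FCM notation (a sketch in our symbols, not necessarily the paper's): with data points $x_k$ ($k=1,\dots,n$), cluster representatives $v_i$ ($i=1,\dots,c$), MF values $u_{ik}$, and fuzzifier $m>1$,

```latex
\min_{U,\,V} \; J_m(U, V) \;=\; \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m}\,\lVert x_k - v_i \rVert^2
\quad \text{s.t.} \quad
\sum_{i=1}^{c} u_{ik} = 1 \;\;\forall k, \qquad
0 < \sum_{k=1}^{n} u_{ik} < n \;\;\forall i, \qquad
u_{ik} \in [0, 1].
```

The sum-to-one constraints are the linear constraints mentioned above, while the strict inequalities encode that no cluster is empty and no cluster absorbs all points.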
The presence of a large volume of data leads to a manifold increase in the number of decision variables and constraints, owing to the rise in the number of MF values that must be computed for each point corresponding to each cluster. This makes the NLP problem very large, which prevents the use of evolutionary solvers that have demonstrated their ability to converge close to global solutions (Chiang et al., 2010, Miriyala et al., 2018, Yildiz, 2013). Consequently, conventional classical solvers, which utilize the Karush–Kuhn–Tucker (KKT) conditions to solve the Lagrangian formulation of the NLP problem, have become popular for this class of problem (Łęski, 2003). However, since such solvers consider only the necessary conditions of optimality and cannot escape a local optimum, they may yield suboptimal estimates of the cluster representatives and MF values.
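For reference, the KKT-derived alternating updates mentioned above take a simple closed form in the standard FCM setting. The sketch below is a generic textbook implementation in plain NumPy, not the authors' code; parameter names are ours:

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Classical FCM via the KKT-derived alternating updates.

    X: (n, d) data, c: number of clusters, m > 1: fuzzifier.
    Returns (centers V, memberships U); each row of U sums to one.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)            # feasible start
    for _ in range(max_iter):
        Um = U ** m
        # center update: fuzzy weighted mean of the data
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # membership update: u_ik proportional to d_ik^(-2/(m-1))
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                    # guard against zero distance
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return V, U

# usage: two well-separated Gaussian blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
V, U = fcm(X, 2)   # argmax over rows of U recovers the two blobs
```

Note that each iteration touches every MF value, which is exactly why the problem size grows with the number of data points in the conventional formulation.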
Implementation of evolutionary solvers for hard clustering is well established and has been reported by several authors in the past (Bandyopadhyay and Maulik, 2002, Krishna and Murty, 1999, Lucasius et al., 1993, Naldi et al., 2011, Sheng and Liu, 2006). Krishna and Murty (1999) proposed an approach that uses a K-means operator instead of crossover to reach locally optimal partitions and a biased mutation to widen the search space toward the global optimum. However, this method works only when the number of clusters is small. Despite the popularity that evolutionary solvers enjoy in scientific and industrial research, the evolutionary algorithms used in soft clustering are mostly extensions of evolutionary hard clustering algorithms (Babu and Murty, 1994, Balaji et al., 2016, Campello et al., 2009, Hruschka et al., 2009). These methods are restricted to finding the optimal cluster representatives through an evolutionary optimizer and then recovering the MF values using the necessary conditions, or they use some variant of FCM as a local search operator. Estimating the optimal number of fuzzy partitions with less computational burden on the optimizer remains a long-standing issue.
In this paper, the issues related to clustering large volumes of data and estimating the optimal number of clusters are addressed using a novel algorithm called Neuro-Fuzzy C-Means Clustering (NFCM). The NFCM algorithm is built on the idea that the MF values are mapped to the data points through suitable function approximators. The parameters of the function approximator are then optimally estimated to predict the optimal MF values. This reformulation makes the size of the optimization problem independent of the number of data points to be clustered. Artificial Neural Networks (ANNs), well-known function approximators, are implemented in this work to learn the mapping between data points and MF values (Haykin, 2004). The parameters of the ANN, i.e., the weights and biases, are estimated using a binary-coded, elitist, population-based Genetic Algorithm (GA) (Deb, 2001) to prevent suboptimal clustering solutions. Further, to eliminate the heuristics involved in ANN modeling, such as the design of the ANN architecture and the choice of activation function, an intelligent framework is constructed that allows the data to build optimal ANNs for clustering. This framework further ensures that the MF values obtained as ANN outputs are always feasible, thereby eliminating the need for constraints, such as sum-to-one, in the NFCM algorithm. The most significant outcome of this framework is that it enables the estimation of the optimal number of clusters, as elucidated in the formulation section. The amalgamation of ANNs with the FCM approach led to the naming of the algorithm as Neuro-Fuzzy C-Means clustering.
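A minimal sketch of the core reformulation idea, as we read it: a small feed-forward network maps each data point to c outputs, and a softmax output layer guarantees non-negative, sum-to-one MF values by construction, so a GA can score flat weight vectors without any feasibility constraints. The architecture, names, and sizes below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

def mf_network(X, w, n_hidden, n_clusters):
    """Map each data point to cluster MF values via a one-hidden-layer
    ANN whose flat weight vector `w` would serve as the GA chromosome.
    The softmax output makes every row of U lie in [0, 1] and sum to
    one, so feasibility holds by construction (hypothetical sketch)."""
    d = X.shape[1]
    # unpack the flat chromosome into layer weights and biases
    i = 0
    W1 = w[i:i + d * n_hidden].reshape(d, n_hidden); i += d * n_hidden
    b1 = w[i:i + n_hidden];                          i += n_hidden
    W2 = w[i:i + n_hidden * n_clusters].reshape(n_hidden, n_clusters)
    i += n_hidden * n_clusters
    b2 = w[i:i + n_clusters]
    H = np.tanh(X @ W1 + b1)           # hidden layer
    Z = H @ W2 + b2
    Z -= Z.max(axis=1, keepdims=True)  # numerically stable softmax
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

# usage: any chromosome yields feasible MF values; a GA would rank
# chromosomes by a clustering objective evaluated on these outputs
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
w = rng.normal(size=(2 * 4 + 4 + 4 * 3 + 3))  # d=2, 4 hidden, 3 clusters
U = mf_network(X, w, 4, 3)                    # rows of U sum to 1
```

Because the chromosome length depends only on the network size, not on the number of data points, the optimization problem stays small even for voluminous data, which is the property the paragraph above emphasizes.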
Extensive simulation studies of NFCM are performed on nine different test data sets for which the actual clustering solution is known a priori. These data sets are chosen primarily to test the efficiency of the NFCM algorithm; in general, clustering with NFCM does not require knowing the solution in advance. The test data sets were selected to ensure the presence of data points with dimensions ranging from 2 to 30, while the number of data points considered was up to 2300. The obtained clustering results are further compared with those of FCM and other well-known state-of-the-art clustering methods through standard measures of clustering efficiency. NFCM could not only estimate the number of clusters accurately but also perform the clustering most efficiently. The rest of the paper is organized as follows. Section 2 revisits the conventional FCM approach in brief along with its shortcomings. Section 3 demonstrates the motivation behind recommending a new algorithm, followed by Section 4, which illustrates the formulation and working of the proposed algorithm, NFCM. In Section 5, NFCM is tested with six real and three synthetic benchmark case studies, or test data sets, and comprehensive comparison studies are presented before the work is concluded in Section 6.
Section snippets
FCM review, issues, and possible solutions
The conventional FCM clustering algorithm is reviewed in this section followed by the issues associated while dealing with voluminous data. Further, the possible solutions employed in literature are presented for addressing the problems.
Rationale behind NFCM
In soft clustering algorithms, the belongingness of an arbitrary feature vector to a cluster is quantified using a fuzzy number. This number, also called the MF value, represents the distribution of belongingness of a data point over the entire feature space. The objective of soft clustering algorithms is to group the data points having similar MF distributions. Thus, if clustering is assumed to be possible, then there must be a pattern which allows them to be
NFCM: Formulation and solution strategy
The detailed description of the proposed algorithm, NFCM including the formulation, working principle and stepwise implementation are presented in this section. Besides this, the difference in the work flow between the conventional FCM algorithm and NFCM approach is highlighted.
Results and discussions
In this section, the performance of NFCM is tested on nine different data sets of varying dimensions ranging from 2 to 30. Among these, three are simulated data sets considered for demonstrating the proof of concept; six are real data sets taken from the UCI machine learning repository (Lichman, 2013). The characteristics of these nine data sets are presented in Table 1.
NFCM is implemented in the 2017 version of MATLAB® without using any specific toolbox. The parameters used in GA are specified in
Conclusions
A novel soft clustering method termed NFCM, which stands for Neuro-Fuzzy C-Means clustering, is proposed in this work. The basis of NFCM is building a map between the feature-vector space and the membership-function space by learning a multi-layered artificial neural network. The optimal configuration of the ANN and the optimal transfer function are estimated by solving an Integer Nonlinear Programming problem, which results in the determination of the optimal number of clusters. The neuro-fuzzy
CRediT authorship contribution statement
Priyanka D. Pantula: Conceptualization, Investigation, Methodology, Visualization, Software, Writing - original draft. Srinivas S. Miriyala: Conceptualization, Formal analysis, Validation, Writing - review & editing. Kishalay Mitra: Supervision, Resources, Writing - review & editing.
Acknowledgment
The authors would like to thank Dr. B. Jayaram, Department of Mathematics, Indian Institute of Technology Hyderabad, for his valuable technical discussions and encouragement throughout this work. The authors also acknowledge the support provided by the SPARC project (SPARC/2018-2019/P1084/SL), funded by the Ministry of Human Resource Development (MHRD), Government of India, for this work.
References (39)

- Clustering with evolution strategies. Pattern Recognit. (1994)
- An evolutionary technique based on K-Means algorithm for optimal clustering in R^N. Inf. Sci. (Ny) (2002)
- A novel image segmentation method based on fast density clustering algorithm. Eng. Appl. Artif. Intell. (2018)
- A 2-Opt based differential evolution for global optimization. Appl. Soft Comput. J. (2010)
- GGSA: A grouping gravitational search algorithm for data clustering. Eng. Appl. Artif. Intell. (2014)
- Parameter optimization of improved fuzzy c-means clustering algorithm for brain MR image segmentation. Eng. Appl. Artif. Intell. (2010)
- Block-matching fuzzy C-means clustering algorithm for segmentation of color images degraded with Gaussian noise. Eng. Appl. Artif. Intell. (2018)
- A wavelet tensor fuzzy clustering scheme for multi-sensor human activity recognition. Eng. Appl. Artif. Intell. (2018)
- A big data driven distributed density based hesitant fuzzy clustering using apache spark with application to gene expression microarray. Eng. Appl. Artif. Intell. (2019)
- Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. (2010)
- Energy efficient clustering and routing algorithms for wireless sensor networks: Particle swarm optimization approach. Eng. Appl. Artif. Intell.
- Towards a robust fuzzy clustering. Fuzzy Sets and Systems
- On k-medoid clustering of large data sets with the aid of a genetic algorithm: background, feasibility and comparison. Anal. Chim. Acta
- TRANSFORM-ANN for online optimization of complex industrial processes: Casting process as case study. Eur. J. Oper. Res.
- Efficiency issues of evolutionary k-means. Appl. Soft Comput. J.
- Validity index for crisp and fuzzy clusters. Pattern Recognit.
- The sailfish optimizer: A novel nature-inspired metaheuristic algorithm for solving constrained engineering optimization problems. Eng. Appl. Artif. Intell.
- Farmland fertility: A new metaheuristic algorithm for solving continuous optimization problems. Appl. Soft Comput.
- Optimized sugeno fuzzy clustering algorithm for wireless sensor networks. Eng. Appl. Artif. Intell.
☆ No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.engappai.2019.103435.