An Evolutionary Neuro-Fuzzy C-means Clustering Technique

https://doi.org/10.1016/j.engappai.2019.103435

Abstract

One of the standard approaches for data analysis in unsupervised machine learning is cluster analysis, or clustering, where data possessing similar features are grouped into a certain number of clusters. Among the several significant ways of performing clustering, Fuzzy C-means (FCM) is a methodology in which every data point is hypothesized to be associated with all the clusters through a fuzzy membership function value. FCM is performed by minimizing an objective functional, under a constrained environment, through optimal estimation of the decision variables, namely the membership function values and cluster representatives. With this approach, a marginal increase in the number of data points leads to an enormous increase in the number of decision variables. This explosion, in turn, prevents the application of evolutionary optimization solvers in FCM, which thereby leads to inefficient data clustering. In this paper, a Neuro-Fuzzy C-Means Clustering algorithm (NFCM) is presented to resolve the issues mentioned above by adopting a novel Artificial Neural Network (ANN) based clustering approach. In NFCM, a functional map is constructed between the data points and membership function values, which enables a significant reduction in the number of decision variables. Additionally, NFCM implements an intelligent framework to optimally design the ANN structure, as a result of which the optimal number of clusters is identified. Results on 9 different data sets with dimensions ranging from 2 to 30 are presented, along with a comprehensive comparison with current state-of-the-art clustering methods, to demonstrate the efficacy of the proposed algorithm.

Introduction

Innovation in the computing world and sophisticated experimentation have nurtured the tremendous surge in the amount of data generated in recent times. One way of handling large volumes of data is to partition them into logical groups using unsupervised learning techniques (Segatori et al., 2018). Clustering is one such method, which segregates an ensemble of data points, called feature vectors, into groups demonstrating a distinctive pattern (Dowlatshahi and Nezamabadi-Pour, 2014). It was introduced in the 20th century by anthropologists (Driver and Kroeber, 1932), and since then it has witnessed significant revamping, leading to a substantial rise in its modern-day applicability (Chen et al., 2018, Forouzanfar et al., 2010, Hosseini and Kiani, 2019, Jain, 2010, Kuila and Jana, 2014, Niknama et al., 2011). The existing clustering algorithms can be broadly classified into two categories, namely: (a) hard or crisp clustering, where each entity belongs to only one cluster, e.g., K-means, K-medoids, hierarchical clustering; and (b) soft clustering, where the soft partitioning of feature space associates each entity with more than one cluster, e.g., Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) (Xenaki et al., 2016). Compared to hard clustering techniques, soft clustering methods are preferable in most real-life applications due to their flexible manner of grouping the data. For example, in marketing, customers having the same brand choices can be grouped into one cluster, along with their degree of belongingness to that group, using FCM. Grouping in this way helps in identifying the customers who are interested in more than one brand and to what extent.

Among the soft clustering methods such as FCM, PCM, and others listed in Xenaki et al. (2016), FCM is considered in this study due to its increasing popularity and utility in various fields of research (Gamino-Sánchez et al., 2018, He et al., 2018, Shokouhifar and Jalali, 2017). As in many traditional clustering techniques (Xenaki et al., 2016), in the FCM algorithm the number of clusters is assumed to be known a priori. The degree of belongingness of a point to a cluster is quantified using a fuzzy number called the membership function (MF) value, as shown in Fig. 1a and 1b. In contrast, Fig. 1c shows the result of a hard clustering technique (K-means), where the data points belong to either of the two clusters. In both K-means and FCM, every cluster is represented by a cluster center, called a cluster representative, as shown in Fig. 1. To ensure that all the points with similar features are grouped while maintaining maximum separation among groups with different features, a distance-based objective functional is minimized by optimally estimating the MF values and cluster representatives. The constraints in FCM signify that each cluster must contain all the data points with different degrees of belongingness and that none of the clusters should remain empty. This formulation translates the FCM algorithm into a Non-Linear Programming (NLP) problem involving a non-convex objective functional and linear sum-to-one constraints, with the Decision Variables (DVs) being the cluster representatives and MF values (Jayaram and Klawonn, 2013).
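For reference, the distance-based formulation described above takes the standard FCM form; written out with our own symbol choices (the paper's exact notation is not reproduced here), with fuzzifier p > 1:

```latex
\min_{U,\,C} \; J_p(U, C) \;=\; \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ij}^{\,p}\, \lVert x_i - c_j \rVert^2
\quad \text{s.t.} \quad \sum_{j=1}^{m} u_{ij} = 1 \;\; (1 \le i \le n), \qquad u_{ij} \in [0, 1],
```

where the n feature vectors are x_i, the m cluster representatives are c_j, and the MF values u_ij together with the c_j constitute the DVs — hence the n × m membership entries that dominate the problem size for large data.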

The presence of a large volume of data leads to a manifold increase in the number of decision variables and constraints due to a rise in the number of MF values that must be computed for each point corresponding to each cluster. This makes the size of the NLP problem very large, which thereby prevents the implementation of evolutionary solvers that have demonstrated their ability to converge close to global solutions (Chiang et al., 2010, Miriyala et al., 2018, Yildiz, 2013). Thus, conventional classical solvers, utilizing the Karush–Kuhn–Tucker (KKT) conditions to solve the Lagrangian formulation of the NLP problem, have become popular for this kind of problem (Łęski, 2003). Since these solvers consider only the necessary conditions of optimality and cannot escape a local optimum, they may yield suboptimal estimates of the cluster representatives and MF values.
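The classical KKT-based alternating scheme mentioned above is well established in the literature; the following minimal NumPy sketch of the textbook closed-form updates (our own illustration, not the paper's code) makes the scaling issue concrete: the membership matrix U alone holds n × m decision variables, one per point per cluster.

```python
import numpy as np

def fcm(X, m, p=2.0, iters=100, seed=0):
    """Classical FCM via the standard KKT closed-form alternating
    updates. X: (n, d) data, m: cluster count, p: fuzzifier (> 1)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random feasible initialization: each row of U sums to one.
    U = rng.random((n, m))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        # Cluster representatives from the stationarity condition:
        # c_j = sum_i u_ij^p x_i / sum_i u_ij^p
        W = U ** p
        C = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared distances of every point to every representative.
        D = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        D = np.fmax(D, 1e-12)  # guard against a zero distance
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(1/(p-1))
        inv = D ** (-1.0 / (p - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, C
```

Because each iteration only applies these two closed-form updates, the method converges quickly but, as noted above, only to a stationary point of the non-convex functional.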

Implementation of evolutionary solvers for hard clustering is available in practice and has been reported by several authors in the past (Bandyopadhyay and Maulik, 2002, Krishna and Murty, 1999, Lucasius et al., 1993, Naldi et al., 2011, Sheng and Liu, 2006). An approach that uses a K-means operator instead of crossover to reach locally optimal partitions, together with a biased mutation to widen the search toward the global optimum, was proposed in Krishna and Murty (1999). However, this method works well only when the number of clusters is small. In spite of the popularity that evolutionary solvers enjoy in scientific and industrial research, the evolutionary algorithms in soft clustering methods are mostly extensions of evolutionary hard clustering algorithms (Babu and Murty, 1994, Balaji et al., 2016, Campello et al., 2009, Hruschka et al., 2009). They were restricted to finding the optimal cluster representatives through an evolutionary optimizer and then recovering the MF values from the necessary conditions, or some variants of FCM were used as a local search operator. Estimating the optimal number of fuzzy partitions without placing a heavy computational burden on the optimizer remains a long-standing issue.

In this paper, the issues related to clustering of large volumes of data and estimation of an optimal number of clusters are addressed using a novel algorithm called Neuro-Fuzzy C-Means Clustering (NFCM). The NFCM algorithm is built on the idea that the MF values are mapped to the data points through suitable function approximators. The parameters of the function approximator are then optimally estimated to predict the optimal MF values. This reformulation makes the size of the optimization problem independent of the number of data points to be clustered. Artificial Neural Networks (ANNs), well-known function approximators, are implemented in this work to learn the mapping between data points and MF values (Haykin, 2004). The parameters of the ANN, i.e., the weights and biases, are estimated using a binary-coded elitist population-based Genetic Algorithm (GA) (Deb, 2001) to prevent suboptimal clustering solutions. Further, to eliminate the heuristics involved in ANN modeling, such as the design of the ANN architecture and the choice of activation function, an intelligent framework is constructed which allows the data to build optimal ANNs for clustering. This framework further ensures that the MF values, which are obtained as ANN outputs, are always feasible, thereby eliminating the necessity of constraints, such as sum-to-one, in the NFCM algorithm. The most significant outcome of this framework is that it allows for the estimation of the optimal number of clusters, as will be elucidated in the formulation section. The amalgamation of ANNs with the FCM approach led to the naming of the algorithm as Neuro-Fuzzy C-Means clustering.
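The paper's exact architecture, activation choices, and GA settings are not reproduced in this excerpt; the following sketch (our own construction, with illustrative names) shows the core reformulation: a small feed-forward network maps each data point to its MF values, a softmax output layer guarantees the sum-to-one feasibility by construction, and the FCM functional is evaluated on the network outputs, so an evolutionary optimizer only searches over a fixed-size weight vector.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax: each output row is non-negative and sums to one."""
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def mf_network(X, theta, h, m):
    """One-hidden-layer network mapping points (n, d) to MF values (n, m).
    theta is the flat weight vector an evolutionary solver would evolve."""
    d = X.shape[1]
    W1 = theta[: d * h].reshape(d, h)
    b1 = theta[d * h : d * h + h]
    W2 = theta[d * h + h : d * h + h + h * m].reshape(h, m)
    b2 = theta[-m:]
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2)

def fcm_objective(X, U, p=2.0):
    """FCM functional evaluated on network-predicted memberships U."""
    W = U ** p
    C = (W.T @ X) / W.sum(axis=0)[:, None]      # induced representatives
    D = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return (W * D).sum()
```

The flat vector theta has d·h + h + h·m + m entries regardless of the number of points n — this is the decision-variable reduction the paper describes, and it is what makes a GA search over theta tractable where a direct search over the n × m memberships is not.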

Extensive simulation studies of NFCM are performed on nine different test data sets, where the actual clustering solution is known a priori. These data sets are chosen to test the efficiency of the NFCM algorithm; in general, there is no requirement of knowing the solution in advance for clustering with NFCM. The test data sets were selected to ensure the presence of data points with dimensions ranging from 2 to 30, while the number of data points was up to 2300. The obtained clustering results are further compared with those of FCM and other well-known state-of-the-art clustering methods through standard measures of clustering efficiency. NFCM could not only estimate the number of clusters accurately but also perform the clustering most efficiently. The rest of the paper is organized as follows. Section 2 revisits the conventional FCM approach in brief along with its shortcomings. Section 3 demonstrates the motivation behind recommending a new algorithm, followed by Section 4, which illustrates the formulation and working of the proposed algorithm, NFCM. In Section 5, NFCM is tested with six real and three synthetic benchmark case studies, or test data sets, and comprehensive comparison studies are presented before the work is concluded in Section 6.

Section snippets

FCM review, issues, and possible solutions

The conventional FCM clustering algorithm is reviewed in this section, followed by the issues that arise while dealing with voluminous data. Further, the possible solutions employed in the literature for addressing these problems are presented.

Rationale behind NFCM

In soft clustering algorithms, the belongingness of an arbitrary feature vector x_i (i ∈ ℤ⁺, 1 ≤ i ≤ n) to a cluster j (j ∈ ℤ⁺, 1 ≤ j ≤ m) is quantified using fuzzy numbers u_ij. This number, also called the MF value, represents the distribution of belongingness of a data point to the entire feature space. The objective of soft clustering algorithms is to group the data points having similar MF distributions. Thus, if clustering is assumed to be possible, then there must be a pattern which allows them to be

NFCM: Formulation and solution strategy

The detailed description of the proposed algorithm, NFCM, including the formulation, working principle, and stepwise implementation, is presented in this section. Besides this, the difference in workflow between the conventional FCM algorithm and the NFCM approach is highlighted.

Results and discussions

In this section, the performance of NFCM is tested on nine different data sets with dimensions ranging from 2 to 30. Among these, three are simulated data sets considered for demonstrating the proof of concept; six are real data sets taken from the UCI machine learning repository (Lichman, 2013). The characteristics of these nine data sets are presented in Table 1.

NFCM is implemented in the 2017 version of MATLAB® without using any specific toolbox. The parameters used in GA are specified in

Conclusions

A novel soft clustering method, termed Neuro-Fuzzy C-Means (NFCM) clustering, is proposed in this work. The basis of NFCM is building a map between the feature vector space and the membership function space by learning a multi-layered artificial neural network. The optimal configuration of the ANN and the optimal transfer function are estimated by solving an Integer Nonlinear Programming problem, which results in the determination of the optimal number of clusters. The neuro-fuzzy

CRediT authorship contribution statement

Priyanka D. Pantula: Conceptualization, Investigation, Methodology, Visualization, Software, Writing - original draft. Srinivas S. Miriyala: Conceptualization, Formal analysis, Validation, Writing - review & editing. Kishalay Mitra: Supervision, Resources, Writing - review & editing.

Acknowledgment

The authors would like to thank Dr. B. Jayaram, Department of Mathematics, Indian Institute of Technology Hyderabad, for his valuable technical discussions and encouragement throughout this work. The authors also acknowledge the support provided for this work by the SPARC project (SPARC/2018-2019/P1084/SL), funded by the Ministry of Human Resource Development (MHRD), Government of India.



No author associated with this paper has disclosed any potential or pertinent conflict of interest with this work. For full disclosure statements refer to https://doi.org/10.1016/j.engappai.2019.103435.
