Two cluster validity indices for the LAMDA clustering method
Mathematical expressions and symbols
See Table 1.
LAMDA
LAMDA is a fuzzy method for clustering and classification tasks. For clustering, LAMDA computes the adequacy between a sample and a class from historical data. To do so, the algorithm evaluates how each feature (attribute) of the sample contributes to the class, and then combines these contributions into the global adequacy between the sample and the class [33]. Its operation is explained in four steps.
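The adequacy computation described above can be illustrated with a minimal sketch. The paper's own formulas are not reproduced in this excerpt, so everything below is an assumption drawn from common LAMDA usage: the fuzzy binomial function for the marginal adequacy degree (MAD), a linear mix of the min t-norm and max t-conorm for the global adequacy degree (GAD), and a non-informative class (NIC) with all parameters equal to 0.5. The names `mad_binomial`, `gad_min_max`, `assign`, and the exigency parameter `alpha` are hypothetical.

```python
def mad_binomial(x, rho):
    """Marginal adequacy of each feature x_j (normalized to [0, 1]) to a class
    with parameters rho_j in (0, 1). Assumed form: fuzzy binomial function."""
    return [r ** xj * (1.0 - r) ** (1.0 - xj) for xj, r in zip(x, rho)]


def gad_min_max(mads, alpha=0.8):
    """Global adequacy as a linear mix of a t-norm (min) and a t-conorm (max).
    alpha is an assumed exigency parameter in [0, 1]."""
    return alpha * min(mads) + (1.0 - alpha) * max(mads)


def assign(x, class_rhos, alpha=0.8):
    """Return (class index, GAD per class). Index -1 means the sample fits the
    non-informative class best, which would trigger creation of a new class."""
    # With rho = 0.5 every binomial MAD equals 0.5, so the NIC GAD is 0.5.
    nic_gad = gad_min_max(mad_binomial(x, [0.5] * len(x)), alpha)
    gads = [gad_min_max(mad_binomial(x, rho), alpha) for rho in class_rhos]
    best = max(range(len(gads)), key=gads.__getitem__)
    return (best if gads[best] > nic_gad else -1), gads


if __name__ == "__main__":
    sample = [0.9, 0.8]
    classes = [[0.2, 0.3], [0.85, 0.75]]  # illustrative class parameters
    print(assign(sample, classes))
```

In this sketch a sample is assigned to the class with the largest GAD only if that GAD exceeds the NIC value; otherwise it is left unassigned, which is one simple way to model the automatic generation of classes.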
New cluster validity indices for the LAMDA algorithm
First, the two indices are defined; then the importance of their expressions and the interactions between them are explained.
Computational complexity of the LAMDA algorithm and metrics
To analyze the computational complexity of the LAMDA algorithm and the cluster validity indices, two steps are defined: (1) applying the LAMDA algorithm and (2) applying the cluster validity indices.
1. First step: The LAMDA algorithm calculates the MAD and GAD functions according to the size of the historical data. The MAD function generates a computational complexity but it can increase to if or is a large value. This is related to the automatic generation of classes.
Experimental settings
In this section, we explain two kinds of experiments to apply the , , and indices. Note that all experiments use the LAMDA algorithm with the different GAD functions explained in Section 2.1.
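To make the idea of running LAMDA with different GAD functions concrete, the following sketch compares two aggregation choices on the same marginal adequacies. Section 2.1 is not reproduced in this excerpt, so the two connective families shown (min/max and product/probabilistic sum) are assumptions taken from common LAMDA usage rather than the paper's exact list, and the names `GAD_FUNCTIONS`, `compare_gads`, and `alpha` are hypothetical.

```python
from math import prod


def gad_min_max(mads, alpha):
    """Linear mix of the min t-norm and the max t-conorm."""
    return alpha * min(mads) + (1.0 - alpha) * max(mads)


def gad_product(mads, alpha):
    """Linear mix of the product t-norm and the probabilistic-sum t-conorm."""
    t_norm = prod(mads)
    t_conorm = 1.0 - prod(1.0 - m for m in mads)
    return alpha * t_norm + (1.0 - alpha) * t_conorm


# Assumed registry of interchangeable GAD functions.
GAD_FUNCTIONS = {"min-max": gad_min_max, "product": gad_product}


def compare_gads(mads, alpha=0.8):
    """Evaluate every registered GAD function on the same MAD values."""
    return {name: gad(mads, alpha) for name, gad in GAD_FUNCTIONS.items()}


if __name__ == "__main__":
    for name, value in compare_gads([0.5, 0.5]).items():
        print(f"{name}: {value:.3f}")
```

Because the product t-norm is never larger than the minimum, the two families can rank classes differently for the same sample, which is why the choice of GAD function matters when partitions are later scored with validity indices.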
Results and discussion
This section presents the main results and a general discussion. To clarify the main results, several points are noted:
- The symbols () and () mean and value, respectively.
- These symbols are also used for other metrics.
- MAD and GAD functions are represented by and , where indicates the kind of function.
- The TIGAD function, the GTD function, and the intuitionistic fuzzy complement function are represented as , where indicates the kind
Conclusion
In this paper, two cluster validity indices, CVGED and CVCOD, are proposed for the LAMDA clustering algorithm. The CVCOD index shows the best performance in experiments 1 and 2, where the optimal number of clusters and the best clustering quality were obtained. One advantage of the CVCOD index is that it improves clustering stability (experiment 1) and generates the best data partition as assessed by other indices and metrics (experiment 2). Therefore, the CVCOD index can find the most optimal
CRediT authorship contribution statement
Javier Fernando Botía Valderrama: Conceptualization, Methodology, Formal analysis, Writing - original draft, Project administration, Validation, Supervision. Diego José Luis Botía Valderrama: Software, Investigation, Writing - review & editing, Validation, Visualization, Resources.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We thank “Alcaldía Municipal de Medellín - Información y Evaluación Estratégica” and the open data providers “Ministerio de las TICs”, “Secretaría de Educación de Antioquia”, “Departamento Administrativo de Planeación - Subdirección de Información y Evaluación Estratégica” and “MEData” for allowing access to and use of the historical data on the population projections 1995–2005–2015 and 2016–2020 of Medellín city, the multidimensional index quality of life survey of Medellín city,
References (67)
- Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. (2010)
- FCM: The fuzzy C-means clustering algorithm. Comput. Geosci. (1984)
- Process situation assessment: From a fuzzy partition to a finite state machine. Eng. Appl. Artif. Intell. (2006)
- Detection of functional states by the ‘LAMDA’ classification technique: application to a coagulation process in drinking water treatment. C. R. Phys. (2005)
- Situation prediction based on fuzzy clustering for industrial complex processes. Inform. Sci. (2014)
- A new criterion to validate and improve the classification process of LAMDA algorithm applied to diesel engines. Eng. Appl. Artif. Intell. (2017)
- Image color segmentation using the fuzzy tree algorithm T-LAMDA. Fuzzy Sets and Systems (2007)
- Fuzzy cellular automata and intuitionistic fuzzy sets applied to an optical frequency comb spectral shape. Eng. Appl. Artif. Intell. (2017)
- Similarity-margin based feature selection for symbolic interval data. Pattern Recognit. Lett. (2011)
- Membership-margin based feature selection for mixed type and high-dimensional data: Theory and applications. Inform. Sci. (2015)
- On fuzzy cluster validity indices. Fuzzy Sets and Systems
- Automaton based on fuzzy clustering methods for monitoring industrial processes. Eng. Appl. Artif. Intell.
- Latent connectives in human decision making. Fuzzy Sets and Systems
- A novel intuitionistic fuzzy c means clustering algorithm and its application to medical images. Appl. Soft Comput.
- Fuzzy entropy and conditioning. Inform. Sci.
- Use of a fuzzy granulation – degranulation criterion for assessing cluster validity. Fuzzy Sets and Systems
- Exploring the uniform effect of FCM clustering: A data distribution perspective. Knowl.-Based Syst.
- Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math.
- Measuring the congruence of fuzzy partitions in fuzzy c-means clustering. Appl. Soft Comput.
- Integrating cluster validity indices based on data envelopment analysis. Appl. Soft Comput.
- Collaborative clustering: Why, when, what and how. Inf. Fusion
- Statistical and Machine-Learning Data Mining: Techniques for Better Predictive Modeling and Analysis of Big Data
- Fuzzy prototypes: From a cognitive view to a machine learning principle
- Fuzzy clustering with a fuzzy covariance matrix
- Multiple kernel fuzzy clustering. IEEE Trans. Fuzzy Syst.
- Picture fuzzy clustering: a new computational intelligence method. Soft Comput.
- Controlling selectivity in nonstandard pattern recognition algorithms, the process of classification and learning the meaning of linguistic descriptors of concepts
- Search algorithm for image recognition based on learning algorithm for multivariate data analysis
- Methodology for Predicting the Behavior of Optical Frequency Comb
- Fuzzy logic selection as a new reliable tool to identify molecular grade signatures in breast cancer – the INNODIAG study. BMC Med. Genom.
- Reinforced operators in fuzzy clustering systems
- A general version of the triple operator. Int. J. Intell. Syst.
- Yager–Rybalov triple operator as a means of reducing the number of generated clusters in unsupervised anuran vocalization recognition