Performance evaluation of FMIG clustering using fuzzy validity indexes

Tlili, Monia; Ayadi, Thouraya; Hamdani, Tarek M.; Alimi, Adel M.

doi:10.1007/s00500-014-1478-3

Published: 21 October 2014

Volume 19, pages 3515–3528, (2015)
Soft Computing

Monia Tlili¹,
Thouraya Ayadi¹,
Tarek M. Hamdani^1,2 &
…
Adel M. Alimi¹


Abstract

Clustering high-dimensional data is a critical computational problem. It is therefore convenient to arrange the cluster centres on a low-dimensional grid, which reduces the computational cost and can be easily visualized. Moreover, in real applications there is no sharp boundary between classes: real datasets are naturally defined in a fuzzy context, so fuzzy clustering is better suited to determining the best distribution of complex real datasets. The self-organizing map (SOM) technique is appropriate for clustering and vector quantization of high-dimensional data. In this paper we present a new fuzzy learning approach called FMIG (fuzzy multilevel interior growing self-organizing maps). The proposed algorithm is a fuzzy version of MIGSOM (multilevel interior growing self-organizing maps). The main contribution of FMIG is to define a fuzzy process of data mapping and to take into account the fuzzy aspect of high-dimensional real datasets. The new algorithm is able to auto-organize the map according to the fuzzy training property of the nodes. In a second step, the map produced by FMIG is clustered by means of fuzzy C-means (FCM) to discover the interior class distribution of the learned data. The FCM partitions are validated by applying six validity indexes. The superiority of the new method is demonstrated by comparing it with the crisp MIGSOM, GSOM (growing SOM) and FKCN (fuzzy Kohonen clustering network) techniques. The new method shows improvement in terms of quantization error, topology preservation and clustering ability.


References

Abonyi J, Migaly S, Szeifert F (2002) Fuzzy self-organizing map based on regularized fuzzy c-means clustering. Advances in soft computing, Engineering Design and Manufacturing
Abraham A, Nath B (2001) A neuro-fuzzy approach for modelling electricity demand in Victoria. Appl Soft Comput 1(2):127–138
Aguilera PA, Frenich AG (2001) Application of the Kohonen neural network in coastal water management: methodological development for the assessment and prediction of water quality. Water Res 35:4053–4062
Alimi AM, Hassine R, Selmi M (2003) Beta fuzzy logic systems: approximation properties in the MIMO case. Int J Appl Math Comput Sci 13(2):225–238
Alimi AM (2000) The beta system: toward a change in our use of neuro-fuzzy systems. Int J Manag, Invited Paper, no. June, pp 15–19
Alimi AM (2003) Beta neuro-fuzzy systems. TASK Q J, Special Issue on Neural Networks edited by Duch W and Rutkowska D, vol 7, no 1, pp 23–41
Amarasiri R, Alahakoon D, Smith KA (2004) HDGSOM: a modified growing self-organizing map for high dimensional data clustering. Fourth international conference on hybrid intelligent systems (HIS’04), pp 216–221
Ayadi T, Ellouze M, Hamdani TM, Alimi AM (2012) Movie scenes detection with MIGSOM based on shots semi-supervised clustering. Neural Comput Appl. doi:10.1007/s00521-012-0930-5
Ayadi T, Hamdani TM, Alimi AM, Khabou MA (2007) 2IBGSOM: interior and irregular boundaries growing self-organizing maps. IEEE sixth international conference on machine learning and applications, pp 387–392
Ayadi T, Hamdani TM, Alimi AM (2010) A new data topology matching technique with multilevel interior growing self-organizing maps. IEEE international conference on systems, man, and cybernetics, pp 2479–2486
Ayadi T, Hamdani TM, Alimi AM (2011) On the use of cluster validity for evaluation of MIGSOM clustering. ISCIII: 5th international symposium on computational intelligence and intelligent informatics, pp 121–126
Ayadi T, Hamdani TM, Alimi AM (2012) MIGSOM: multilevel interior growing self-organizing maps for high dimensional data clustering. Neural Process Lett 36:235–256
Barrea A (2011) Local fuzzy c-means clustering for medical spectroscopy images. Appl Math Sci 5(30):1449–1458
Bezdek JC (1974) Cluster validity with fuzzy sets. J Cybern 58–72
Bezdek JC (1987) Convergence theory for fuzzy c-means. IEEE Trans SMC, pp 873–877
Bezdek JC, Harris JD (1978) Fuzzy partitions and relation: an axiomatic basic for clustering fuzzy sets and systems. Academic Press, Massachusetts
Bezdek JC, Castelaz PF (1977) Prototype classification and feature selection with fuzzy sets. IEEE Trans Syst Man Cybern 7(2):87–92
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York
Blake CL, Merz CJ (1998) UCI repository of machine learning databases, [http://www.ics.uci.edu/learn/MLRepository.html]. University of California, Department of Information and Computer Science, Irvine, CA
Chang PC, Liao TW (2006) Combining SOM and fuzzy rule base for flow time prediction in semiconductor manufacturing factory. Appl Soft Comput 6(2):198–206
Das S, Abraham A, Konar A (2008) Automatic kernel clustering with a Multi-Elitist Particle Swarm Optimization Algorithm. Patt Recogn Lett 29:688–699
Das S, Abraham A (2008) Automatic clustering using an improved differential evolution algorithm. IEEE Trans SMC, vol 38, no 1, pp 218–237
Denaï MA, Palis F, Zeghbib A (2007) Modeling and control of non-linear systems using soft computing techniques. Appl Soft Comput 7(3):728–738
Dhahri H, Alimi AM (2005) Hierarchical learning algorithm for the beta basis function neural network. In: Proceedings of third international conference on systems, signals & devices: SSD’2005, Sousse, Tunisia, March 2005
Dunn JC (1974) Well separated clusters and optimal fuzzy partitions. J Cybern 4:95–104
El Malek J, Alimi AM, Tourki R (2002) Problems in pattern classification in high dimensional spaces: behavior of a class of combined neuro-fuzzy classifiers. Fuzzy Sets Syst 128(1):15–33
Fritzke B (1994) Growing cell structure: a self organizing network for supervised and un-supervised learning. Neural Netw 7:1441–1460
Hamdani TM, Alimi AM, Khabou MA (2011) An iterative method for deciding SVM and single layer neural network structures. Neural Process Lett 33(2):171–186
Hamdani TM, Alimi AM, Karray F (2008) Enhancing the structure and parameters of the centers for BBF fuzzy neural network classifier construction based on data structure. In: Proceedings of IEEE international joint conference on neural networks, Hong Kong, IJCNN, art. no. 4634247, pp 3174–3180
Hamdani TM, Won JM, Alimi AM, Karray F (2011) Hierarchical genetic algorithm with new evaluation function and bi-coded representation for features selection with confidence rate. Appl Soft Comput 11(1):2501–2509, ISSN 1568–4946
Hsu AL, Halgamuge SK (2001) Enhanced topology preservation of dynamic self-organising maps for data visualization. IFSA world congress and 20th NAFIPS international conference, vol 3, pp 1786–1791
Hu W, Xie D, Tan T, Maybank S (2004) Learning activity patterns using fuzzy self-organizing neural network. IEEE Trans Syst Man Cybern B 34(3):1618–1626
Kim DW, Lee KH, Lee D (2004) On cluster validity index for estimation of the optimal number of fuzzy clusters. Patt Recogn 37(10):2009–2025
Kim M, Ramakrishna RS (2005) New indices for cluster validity assessment. Patt Recogn Lett 26(15):2353–2363
Kohonen T (1998) Statistical pattern recognition with neural networks: benchmark studies. In: Proceedings of the second annual IEEE international conference on neural networks, vol 1
Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43:59–69
Kohonen T (1984) Self-organization and associative memory. Springer, Berlin
Kohonen T (1997) Self-organizing maps, 2nd edn. Springer, Berlin
Kohonen T (1998) The self-organizing map. Neurocomputing 21:1–6
Kohonen T (2001) Self-organizing maps. Springer, Berlin
Li Hui-Ya, Hwang Wen-Jyi, Chang Chia-Yen (2011) Efficient fuzzy C-means architecture for image segmentation. Sensors 2011(11):6697–6718. doi:10.3390/s110706697
Mingoti SA, Lima JO (2006) Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms. Eur J Oper Res 174:1742–1759
Pal NR, Bezdek JC (1992) On cluster validity for fuzzy c-means model. IEEE Trans Fuzzy Syst 3:370–379
Pascual A, Barcena M, Merelo JJ, Carazo JM (2000) Mapping and fuzzy classification of macromolecular images using self-organizing neural networks. Ultramicroscopy 84:85–99
Pettersson F, Chakraborti N, Saxén H (2007) A genetic algorithms based multi-objective neural net applied to noisy blast furnace data. Appl Soft Comput 7(1):387–397
Ravi V, Kurniawan H, Thai PN, Kumar PR (2008) Soft computing system for bank performance prediction. Appl Soft Comput 8(1):305–315
Theodoridis Y (1996) Spatial datasets—an unofficial collection. Available from: http://www.dias.cti.gr/~ytheod/research/datasets/spatial.html
Tlili M, Ayadi T, Hamdani TM, Alimi AM (2012) FMIG: fuzzy multilevel interior growing self-organizing maps. IEEE international conference on tools with artificial intelligence (ICTAI)
Tlili M, Hamdani TM, Alimi AM, Abraham A (2014) FVINOS: fuzzy validity index with noise-overlap separation for clustering algorithms. Patt Recogn Lett (in press)
Tsekouras GE, Sarimveis H (2004) A new approach for measuring the validity of the fuzzy c-means algorithm. Adv Eng Softw 35(8–9):567–575
Velmurugan T, Santhanam T (2010) Performance evaluation of K-means and fuzzy C-means clustering algorithms for statistical distributions of input data points. Eur J Sci Res ISSN 1450–216X, 46(3):320–330
Wu KL, Yang MS (2005) A cluster validity index for fuzzy clustering. Patt Recogn Lett 26(9):1275–1291
Xie XL, Beni G (1991) A validity measure for fuzzy clustering. IEEE Trans Pattern Anal Mach Intell 13(8):841–847
Yuksel ME, Besdok E (2004) A simple neuro-fuzzy impulse detector for efficient blur reduction of impulse noise removal operators for digital images. IEEE Trans Fuzzy Syst 12(6):854–865


Acknowledgments

The authors would like to acknowledge the financial support of this work by grants from General Direction of Scientific Research (DGRST), REGIM-Lab (LR11ES48) Tunisia, under the ARUB program.

Author information

Authors and Affiliations

REGIM Lab: Research Groups on Intelligent Machines, National Engineering School of Sfax (ENIS), University of Sfax, BP 1173, Sfax, 3038, Tunisia
Monia Tlili, Thouraya Ayadi, Tarek M. Hamdani & Adel M. Alimi
College of Science and Arts at Al-Ula, Taibah University, Al-Madinah al-Munawwarah, Saudi Arabia
Tarek M. Hamdani


Corresponding author

Correspondence to Monia Tlili.

Additional information

Communicated by E. Lughofer.

Appendix

1.1 Description of the used validity indexes

Let $X=\left\{ x_1 ,x_2 ,\ldots ,x_n \right\} $ be a dataset in an m-dimensional Euclidean space $\mathbb{R}^m$ with its ordinary Euclidean norm $\left\| \cdot \right\| $, and let $C$ be the matrix of cluster centres, with $c_i $ the centre of the cluster $C_i $ ($1\le i\le \mathrm{nbc}$). In fuzzy clustering, a pattern (point) may belong to all the clusters with a certain fuzzy membership grade. Consider the matrix $U=\left[ u_{ik} \right] $, where $u_{ik} \in [0,1]$ is the membership value of the point $k$ in cluster $i$ ($1\le k\le n$).

We present below the validity indices studied in this work depending on the geometric aspects of clusters (compactness, separation) and the data properties (noise, overlap) evaluated by each index.

1.1.1 SVI index

SVI (Separation Validity Index) (Wu and Yang 2005) is given by

$$\begin{aligned} \mathrm{SVI}=\frac{\boldsymbol{\pi }}{S} \end{aligned}$$

(17)

$\boldsymbol{\pi }$ is the global compactness of the partition, computed as the sum of the $\pi _i $, $1\le i\le \mathrm{nbc}$.

The compactness of the cluster $c_i $ is defined by

$$\begin{aligned} \pi _i =\frac{\sigma _i }{n_i },\quad 1\le i\le \mathrm{nbc} \end{aligned}$$

(18)

The variation $\sigma _i $ and the cardinality $n_i $ of the cluster $c_i $ are given, respectively, as

$$\begin{aligned} \sigma _i =\sum \limits _{k=1}^n (u_{ik} )^m\left\| x_k -\nu _i \right\| ^{2},\quad n_i =\sum \limits _{k=1}^n u_{ik} ,\quad 1\le i\le \mathrm{nbc} \end{aligned}$$

(19)

where $u_{ik} $ is the membership degree of the vector $x_k $ to the cluster $c_i $ and $\nu _i $ is the centre of $c_i $.
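As an illustration of equations (18) and (19), the following minimal Python sketch computes the per-cluster compactness $\pi _i $ and their sum. The function names, the toy data and the default fuzzifier `m=2.0` are assumptions made for the example and do not come from the paper; note that here `m` denotes the fuzzifier exponent of equation (19), not the dimension of the data space.

```python
from math import fsum


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return fsum((p - q) ** 2 for p, q in zip(a, b))


def compactness(points, centres, u, m=2.0):
    """Return ([pi_1, ..., pi_nbc], global pi) following equations (18)-(19).

    u[i][k] is the membership of point k in cluster i; m is the fuzzifier
    (an assumed default, not a value taken from the paper).
    """
    pis = []
    for i, centre in enumerate(centres):
        # sigma_i: membership-weighted squared distances to the centre.
        sigma_i = fsum(u[i][k] ** m * sq_dist(x, centre)
                       for k, x in enumerate(points))
        # n_i: fuzzy cardinality of the cluster (assumed non-zero).
        n_i = fsum(u[i])
        pis.append(sigma_i / n_i)
    return pis, fsum(pis)


# Hypothetical toy data: three points, two clusters.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centres = [(0.0, 0.0), (4.0, 0.0)]
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
pis, global_pi = compactness(points, centres, u)
```

Dividing `global_pi` by the separation $S$ defined next gives the SVI value of equation (17).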

$S$ is the global separation between the nbc clusters defined as

$$\begin{aligned} S=\sum \limits _{i=1}^{\mathrm{nbc}+1} \sum \limits _{\begin{array}{c} j=1\\ j\ne i \end{array}}^{\mathrm{nbc}+1} (\mathrm{dev}_{ij} )^2 \end{aligned}$$

(20)

$\mathrm{dev}_{ij} $ is the deviation between the centres of $c_i $ and $c_j $:

$$\begin{aligned} \mathrm{dev}_{ij} =(\mu _{ij} )^{(2+\omega )/(2\omega )}\left\| z_j -z_i \right\| \end{aligned}$$

(21)

$\left[ z_1 ,z_2 ,\ldots ,z_{\mathrm{nbc}} ,z_{\mathrm{nbc}+1} \right] =\left[ \nu _1 ,\nu _2 ,\ldots ,\nu _{\mathrm{nbc}} ,\overline{x} \right] ^T$, where $\overline{x} $ is the mean of $X$.

$$\begin{aligned} \overline{x} =\sum \limits _{k=1}^n {{x_k }/n} \end{aligned}$$

(22)

$\mu _{ij} $ is the degree of membership of $z_j $ to the centre $z_i $.

$$\begin{aligned}&\mu _{ij} =\frac{1}{\sum \limits _{\begin{array}{c} l=1\\ l\ne j \end{array}}^{\mathrm{{nbc}}+1} {\left( {\frac{\left\| {z_j -z_i } \right\| }{\left\| {z_j -z_l } \right\| }}\right) ^\omega } },1\le i\le \mathrm{{nbc}}+1;\nonumber \\&\qquad \qquad 1\le j\le \mathrm{{nbc}}+1,j\ne i \end{aligned}$$

(23)
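The separation term of equations (20) to (23) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the default `omega=2.0` and the toy data are hypothetical, and the code assumes that all entries of `z` (the cluster centres followed by the dataset mean) are pairwise distinct, since otherwise the ratio in equation (23) divides by zero.

```python
from math import dist, fsum


def separation(z, omega=2.0):
    """Global separation S of equation (20).

    z lists the cluster centres followed by the dataset mean, as in the
    definition of [z_1, ..., z_{nbc+1}]. omega is an assumed default.
    """
    size = len(z)
    squared_devs = []
    for i in range(size):
        for j in range(size):
            if j == i:
                continue
            d_ji = dist(z[j], z[i])
            # Equation (23): membership of z_j to the centre z_i.
            denom = fsum((d_ji / dist(z[j], z[l])) ** omega
                         for l in range(size) if l != j)
            mu_ij = 1.0 / denom
            # Equation (21): deviation between z_i and z_j.
            dev_ij = mu_ij ** ((2.0 + omega) / (2.0 * omega)) * d_ji
            squared_devs.append(dev_ij ** 2)
    return fsum(squared_devs)


# Hypothetical one-dimensional example.
points = [(0.0,), (2.0,), (4.0,)]
centres = [(0.0,), (4.0,)]
# Equation (22): mean of the dataset, appended as z_{nbc+1}.
mean = tuple(fsum(col) / len(points) for col in zip(*points))
S = separation(centres + [mean])
```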

1.1.2 XBI index

This index is normalized; its minimum value indicates the best number of clusters (Mingoti and Lima 2006):

$$\begin{aligned} \mathrm{XBI}(\mathrm{nbc})=\frac{\max _{k=1,\ldots ,\mathrm{nbc}} \left\{ \sum \limits _{j=1}^n \frac{u_{kj}^2 \left\| x_j -c_k \right\| ^2}{n_k } \right\} +\max \mathrm{Diff}(\mathrm{nbc})}{\min _{i,j\ne i} \left\| c_i -c_j \right\| ^2} \end{aligned}$$

(24)

where

$\max _{k=1,\ldots ,\mathrm{nbc}} \left\{ \sum \limits _{j=1}^n \frac{u_{kj}^2 \left\| x_j -c_k \right\| ^2}{n_k } \right\} $ gives the maximum fuzzy distance between the points $x_j $, weighted by their membership degrees $u_{kj} $, and the centre $c_k $.

The index introduces the $\max \mathrm{Diff}(\mathrm{nbc})$ factor in order to compare the nbc-partitions obtained by increasing nbc, so that the best partition can be identified.

$$\begin{aligned}&\max \mathrm{{Diff}}(\mathrm{{nbc}})=\max \nolimits _{\mathrm{{nbc}}\max ,...,\mathrm{{nbc}}} \mathrm{{diff}}_{dw}\end{aligned}$$

(25)

$$\begin{aligned}&\mathrm{{diff}}_{dw} =dw(\mathrm{{nbc}})-dw(\mathrm{{nbc}}+1) \end{aligned}$$

(26)

where $dw$ is the intra-cluster distance measured for different values of nbc.
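A minimal Python sketch of equation (24) is given below. It rests on two assumptions that the text above does not settle: $n_k $ is taken to be the fuzzy cardinality $\sum _j u_{kj} $ (as for the SVI index), and the $\max \mathrm{Diff}(\mathrm{nbc})$ term is passed in as a precomputed number (`max_diff`) rather than derived from the $dw$ values. All names and data are hypothetical.

```python
from math import fsum


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return fsum((p - q) ** 2 for p, q in zip(a, b))


def xbi(points, centres, u, max_diff=0.0):
    """Equation (24) with n_k assumed to be the fuzzy cardinality of cluster k.

    max_diff stands for maxDiff(nbc) and must be supplied by the caller.
    """
    # Numerator: largest normalised fuzzy within-cluster scatter.
    worst_scatter = max(
        fsum(u[k][j] ** 2 * sq_dist(x, c_k) for j, x in enumerate(points))
        / fsum(u[k])
        for k, c_k in enumerate(centres)
    )
    # Denominator: smallest squared distance between two distinct centres.
    min_sep = min(
        sq_dist(c_i, c_j)
        for i, c_i in enumerate(centres)
        for j, c_j in enumerate(centres)
        if i != j
    )
    return (worst_scatter + max_diff) / min_sep


# Hypothetical toy data: three points, two clusters.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centres = [(0.0, 0.0), (4.0, 0.0)]
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
value = xbi(points, centres, u)
```

Lower values of `value` indicate a better partition, in line with the text.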

1.1.3 PCAESN index

The Partition Coefficient And Exponential Separation (PCAES) validity index (Hu et al. 2004) combines two factors, a normalized partition coefficient and an exponential separation measure, to validate each fuzzy cluster $i$ separately. PCAES is defined as:

$$\begin{aligned} \mathrm{PCAES}(\mathrm{nbc})=\sum \limits _{i=1}^{\mathrm{nbc}} \mathrm{PCAES}_i \end{aligned}$$

(27)

The validity index of cluster $i$ is measured by:

$$\begin{aligned} \mathrm{PCAES}_i =\sum \limits _{j=1}^n \mu _{ij}^2 /\mu _M -\exp \left( -\min \nolimits _{k\ne i} \left\{ \left\| a_i -a_k \right\| ^{2}\right\} /\beta _\mathrm{T} \right) \end{aligned}$$

(28)

where $\sum \limits _{j=1}^n \mu _{ij}^2 /\mu _M $ is the compactness of the cluster $i$ compared to the most compact cluster, whose compactness value $\mu _M $ is measured by:

$$\begin{aligned} \mu _M =\min \limits _{1\le i\le \mathrm{nbc}} \left\{ \sum \limits _{j=1}^n \mu _{ij}^2 \right\} \end{aligned}$$

(29)

$\mu _{ij} $ is the membership degree of vector $j$ ($1\le j\le n$) to cluster $i$ ($1\le i\le \mathrm{nbc}$),

$\exp \left( -\min \left\{ \left\| a_i -a_k \right\| ^{2}\right\} /\beta _\mathrm{T} \right) $ is the separation measure relative to the total average separation $\beta _\mathrm{T} $ of the nbc clusters given by:

$$\begin{aligned} \beta _\mathrm{T} =\frac{\sum \nolimits _{l=1}^{nbc} {\left\| {a_l -\overline{a} } \right\| ^2} }{nbc} \end{aligned}$$

(30)

$a_i $ is the center of cluster $i$, and $\overline{a} =\sum \nolimits _{i=1}^{\mathrm{nbc}} a_i /\mathrm{nbc}$ is the average of the center vectors $a_i $.

Normalisation of PCAES index:

Each per-cluster measure $\mathrm{PCAES}_i $ ($i=1,\ldots ,\mathrm{nbc}$) is obtained by subtracting the separation value from the compactness value, both of which are defined in the interval [0, 1]. Consequently, the PCAES values satisfy

$$\begin{aligned} -\mathrm{{nbc}}< \text{ PCAES } ( {\mathrm{{nbc}}})\, <\mathrm{{nbc}} \end{aligned}$$

(31)

Thus, to obtain a PCAES value in [0, 1], we define PCAESN (Tlili et al. 2014) as follows:

$$\begin{aligned} \mathrm{{PCAESN}}(\mathrm{{nbc}})=0.5+\left[ {( {\mathrm{{PCAES}}(\mathrm{{nbc}})/nbc})/2} \right] \end{aligned}$$

(32)
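To make the chain from equation (27) to equation (32) concrete, the following Python sketch evaluates PCAES and its normalised form PCAESN on a small hypothetical example. The function names and data are assumptions; the minimum inside the exponential is taken over the other centres ($k\ne i$), which is an interpretation of equation (28), and the code assumes at least two distinct centres so that $\beta _\mathrm{T} $ is non-zero.

```python
from math import exp, fsum


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return fsum((p - q) ** 2 for p, q in zip(a, b))


def pcaes(centres, u):
    """Equations (27)-(30); u[i][j] is the membership of point j in cluster i."""
    nbc = len(centres)
    sq_sums = [fsum(v * v for v in row) for row in u]
    mu_m = min(sq_sums)                                   # equation (29)
    mean_centre = [fsum(col) / nbc for col in zip(*centres)]
    beta_t = fsum(sq_dist(a, mean_centre) for a in centres) / nbc  # eq. (30)
    total = 0.0
    for i, a_i in enumerate(centres):
        # Squared distance to the closest *other* centre (assumed k != i).
        nearest = min(sq_dist(a_i, a_k)
                      for k, a_k in enumerate(centres) if k != i)
        total += sq_sums[i] / mu_m - exp(-nearest / beta_t)  # equation (28)
    return total


def pcaesn(centres, u):
    """Equation (32): rescale PCAES using the number of clusters."""
    return 0.5 + (pcaes(centres, u) / len(centres)) / 2.0


# Hypothetical toy data: two clusters, three points.
centres = [(0.0, 0.0), (4.0, 0.0)]
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
score = pcaesn(centres, u)
```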

1.1.4 VOS index

The Validity Overlap Separation (VOS) index (Pal and Bezdek 1992) takes values in the interval [0, 1]; the best partition corresponds to its minimum value.

This index is given by

$$\begin{aligned} \mathrm{VOS}(\mathrm{nbc},U)=\frac{\mathrm{Overlap}^N(\mathrm{nbc},U)}{\mathrm{Sep}^N(\mathrm{nbc},U)}, \end{aligned}$$

(33)

where $\mathrm{Overlap}^N(\mathrm{nbc},U)$ gives the inter-cluster overlap for different values of nbc, normalized by the maximum overlap over $\mathrm{nbc}=2,\ldots ,\mathrm{nbc}_{\max }$:

$$\begin{aligned} \mathrm{Overlap}^N(\mathrm{nbc},U)=\frac{\mathrm{Overlap}(\mathrm{nbc},U)}{\mathrm{Overlap}_{\max } } \end{aligned}$$

(34)

$\mathrm{Sep}^N(\mathrm{nbc},U)$ calculates the separation between clusters for different values of nbc, normalized by the maximum separation over $\mathrm{nbc}=2,\ldots ,\mathrm{nbc}_{\max }$:

$$\begin{aligned} \mathrm{Sep}^N(\mathrm{nbc},U)=\frac{\mathrm{Sep}(\mathrm{nbc},U)}{\mathrm{Sep}_{\max } } \end{aligned}$$

(35)

1.2 Overlap function

$$\begin{aligned} \mathrm{{Overlap}}(\mathrm{{nbc}},U)=\frac{2}{\mathrm{{nbc}}(\mathrm{{nbc}}-1)}\sum \limits _{p=1}^{\mathrm{{nbc}}-1} {\sum \limits _{q=p+1}^{nbc} {P(\overline{F} _p ,\overline{F} _q )} } \end{aligned}$$

(36)

$P(\overline{F} _p ,\overline{F} _q )$ defines the total overlap between two fuzzy clusters $\overline{F} _p $ and $\overline{F} _q $:

$$\begin{aligned} P(\overline{F} _p ,\overline{F} _q )=\sum \limits _\mu {f(\mu :} \overline{F} _p ,\overline{F} _q ) \end{aligned}$$

(37)

The function $f(\mu :\overline{F} _p ,\overline{F} _q )$ calculates the overlap degree between two fuzzy clusters $\overline{F} _p $ and $\overline{F} _q $ at a membership degree $\mu $ given by

$$\begin{aligned} f(\mu :\overline{F} _p ,\overline{F} _q )=\sum \limits _{j=1}^{n} \delta (x_j ,\mu :\overline{F} _p ,\overline{F} _q ) \end{aligned}$$

(38)

$\delta (x_j ,\mu :\overline{F} _p ,\overline{F} _q )$ indicates whether two clusters overlap at the membership degree $\mu $ for the data point $x_j $. It returns an overlap value of $\omega (x_j )$ when the membership degrees of $x_j $ in both clusters are greater than or equal to $\mu $; otherwise it returns 0:

$$\begin{aligned}&\delta (x_j , \mu :\overline{F} _p ,\overline{F} _q )\nonumber \\&\quad =\left\{ {\begin{array}{l} \omega (x_j )\quad \text{ if } (\mu _{\overline{F} _p } (x_j )\ge \mu ) \text{ and }\, (\mu _{\overline{F} _q } (x_j )\ge \mu ), \\ 0 \quad \text{ otherwise } \\ \end{array}} \right. \end{aligned}$$

(39)

$\omega (x_j )$ is a weight factor for each point $x_j $ between clusters. $\omega (x_j )$ is a value in [0 1].
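The overlap function of equations (36) to (39) can be sketched in Python as below. Several ingredients are assumptions rather than the paper's choices: the set of membership levels $\mu $ over which equation (37) sums is passed in as `levels`, the weight $\omega (x_j )$ is passed in as a function and defaults to 1 for every point, and the inner sum of equation (38) is taken over all $n$ data points.

```python
from itertools import combinations
from math import fsum


def overlap(u, levels, weight=lambda j: 1.0):
    """Average pairwise overlap between fuzzy clusters, equation (36).

    u[p][j] is the membership of point j in cluster p.
    levels is the assumed collection of membership thresholds mu.
    weight(j) plays the role of omega(x_j); the constant default is a
    placeholder, not the weighting used in the paper.
    """
    nbc = len(u)
    n = len(u[0])

    def pair_overlap(p, q):
        # Equations (37)-(39): add omega(x_j) whenever both memberships
        # of point j reach the level mu.
        return fsum(
            weight(j)
            for mu in levels
            for j in range(n)
            if u[p][j] >= mu and u[q][j] >= mu
        )

    total = fsum(pair_overlap(p, q) for p, q in combinations(range(nbc), 2))
    return 2.0 / (nbc * (nbc - 1)) * total


# Hypothetical memberships: only the middle point is shared by both clusters.
u = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]
value = overlap(u, levels=[0.1, 0.3, 0.5])
```

With this toy input only the shared point contributes, once per level, so the choice of `levels` directly scales the result; this is one reason the index normalises the overlap in equation (34).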

1.3 Separation function

The separation measure is obtained using the similarity distance $S(\overline{F} _p ,\overline{F} _q )$ between two fuzzy clusters $\overline{F} _p $ and $\overline{F} _q $. It is defined as

$$\begin{aligned} \mathrm{Sep}(\mathrm{nbc},U)=1-\min \limits _{p\ne q} S(\overline{F} _p ,\overline{F} _q ) \end{aligned}$$

(40)

$S(\overline{F} _p ,\overline{F} _q )$ is the similarity between two clusters $\overline{F} _p $ and $\overline{F} _q $, that is, the largest degree in the interval [0, 1] to which a single point belongs to both clusters:

$$\begin{aligned} S(\overline{F} _p ,\overline{F} _q )=\max \limits _{x\in X} \min (\mu _{\overline{F} _p } (x),\mu _{\overline{F} _q } (x)), \end{aligned}$$

(41)

where $\mu _{\overline{F} _p } (x)$ and $\mu _{\overline{F} _q } (x)$ are the membership degrees of the vector $x$ in $\overline{F} _p $ and $\overline{F} _q $, respectively.
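Equations (40) and (41) translate almost directly into code. The short Python sketch below follows the formulas exactly as written above, including the minimum over cluster pairs in equation (40); the function names and the membership matrix are hypothetical.

```python
from itertools import combinations


def similarity(u, p, q):
    """Equation (41): largest degree to which one point belongs to both clusters."""
    return max(min(a, b) for a, b in zip(u[p], u[q]))


def separation(u):
    """Equation (40) as written: one minus the minimum pairwise similarity."""
    pairs = combinations(range(len(u)), 2)
    return 1.0 - min(similarity(u, p, q) for p, q in pairs)


# Hypothetical memberships of three points (columns) in three clusters (rows).
u = [[0.7, 0.2, 0.1],
     [0.2, 0.6, 0.3],
     [0.1, 0.2, 0.6]]
sep = separation(u)
```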

1.3.1 Dunn index

The Dunn validity index (Dunn 1974) is based on the inter-cluster distance and the diameter of a hyperspherical cluster. It is given as

$$\begin{aligned}&\mathrm{Dunn}(\mathrm{nbc})=\min \nolimits _{i=1,\ldots ,\mathrm{nbc}} \left\{ \min \nolimits _{j=i+1,\ldots ,\mathrm{nbc},j\ne i} \left\{ \frac{d(c_i ,c_j )}{\max _{k=1,\ldots ,\mathrm{nbc}} \left\{ \mathrm{diam}(c_k ) \right\} } \right\} \right\} ,\\&d(c_i ,c_j )=\min \nolimits _{x\in C_i ,\,y\in C_j } \left\{ d(x,y) \right\} \end{aligned}$$

$\mathrm{diam}(c_k )$ is the diameter of the cluster $c_k $.
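A small Python sketch of the Dunn index follows. It works on crisp clusters given as lists of points and assumes that the diameter of a cluster is the largest distance between two of its points, which the text above does not spell out. The function name and data are hypothetical, and at least one cluster must contain two distinct points so that the denominator is non-zero.

```python
from itertools import combinations
from math import dist


def dunn(clusters):
    """Dunn index for crisp clusters, each given as a list of points.

    The diameter is assumed to be the maximum intra-cluster distance.
    """
    # Largest diameter over all clusters (0.0 for a singleton cluster).
    max_diam = max(
        max((dist(x, y) for x, y in combinations(cluster, 2)), default=0.0)
        for cluster in clusters
    )
    # Smallest distance between points of two different clusters.
    min_gap = min(
        min(dist(x, y) for x in c_i for y in c_j)
        for c_i, c_j in combinations(clusters, 2)
    )
    return min_gap / max_diam


# Hypothetical example: two well-separated clusters on a line.
clusters = [[(0.0, 0.0), (1.0, 0.0)],
            [(4.0, 0.0), (6.0, 0.0)]]
value = dunn(clusters)
```

Larger values correspond to compact clusters that lie far apart.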

1.3.2 FVINOS index

The Fuzzy Validity Index with Noise-Overlap Separation (FVINOS) (Tlili et al. 2014) is inspired by the Davies–Bouldin validity index (Mingoti and Lima 2006). FVINOS is defined as

$$\begin{aligned} \mathrm{FVINOS}(\mathrm{nbc})=\frac{1}{\mathrm{nbc}}\sum \limits _{i=1}^{\mathrm{nbc}} \left( \frac{\max _{k=1,\ldots ,\mathrm{nbc},k\ne i} \left\{ S_i +S_k \right\} +\max \mathrm{Diff}_i (\mathrm{nbc})}{\min _{l=1,\ldots ,\mathrm{nbc},l\ne i} \left\{ d_{i,l} \right\} }\right) \end{aligned}$$

(42)

$\min _{l=1,\ldots ,\mathrm{nbc},l\ne i} \left\{ d_{i,l} \right\} $ calculates the minimum separation between cluster $i$ and any other cluster $l$.

The $\max \mathrm{Diff}_i (\mathrm{nbc})$ factor compares each pair of successive partitions, which makes it possible to find the best partition of the dataset.

$$\begin{aligned} \max \mathrm{{Diff}}_i (\mathrm{{nbc}})=\max \nolimits _{\mathrm{{nbc}}\max ,...,\mathrm{{nbc}}} \mathrm{{diff}}_i (\mathrm{{nbc}}) \end{aligned}$$

(43)

$$\begin{aligned} \mathrm{diff}_i (\mathrm{nbc})&= \max \nolimits _{k=1,\ldots ,\mathrm{nbc},k\ne i} \left\{ S_i (\mathrm{nbc})+S_k (\mathrm{nbc}) \right\} \\&\quad -\max \nolimits _{k=1,\ldots ,\mathrm{nbc}+1,k\ne i} \left\{ S_i (\mathrm{nbc}+1)+S_k (\mathrm{nbc}+1) \right\} \end{aligned}$$

(44)

$\mathrm{diff}_i (\mathrm{nbc})$ calculates the difference between max sums of compactness in a pair of clusters $i$ and k; this is calculated for the obtained partition at nbc and nbc+1.

We define the average of fuzzy compactness relative to cluster $i$ as

$$\begin{aligned} S_i =\frac{1}{n_i }\sum \limits _{x\in C_i } u_i (x)^m\, d(x,c_i ), \end{aligned}$$

(45)

where $u_i (x)$ is the membership degree of point $x$ to cluster $i$, $m$ is the fuzzifier factor.

$d(x,c_i )$ is the Euclidean distance separating $x$ from the cluster centre $c_i $.
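The following Python sketch shows one way equations (42) and (45) could be evaluated. It relies on assumptions that go beyond the text: $C_i $ is taken to be the set of points whose highest membership is in cluster $i$, $n_i $ is the size of that set (every cluster is assumed non-empty), $d_{i,l} $ is the Euclidean distance between the centres of clusters $i$ and $l$, and the $\max \mathrm{Diff}_i (\mathrm{nbc})$ terms are supplied by the caller and default to zero. Function names and data are hypothetical.

```python
from math import dist, fsum


def fuzzy_compactness(points, centres, u, m=2.0):
    """Equation (45) with C_i assumed to be the max-membership points of cluster i."""
    nbc = len(centres)
    members = [[] for _ in range(nbc)]
    for k in range(len(points)):
        best = max(range(nbc), key=lambda i: u[i][k])
        members[best].append(k)
    # Each cluster is assumed to own at least one point (n_i > 0).
    return [
        fsum(u[i][k] ** m * dist(points[k], centres[i]) for k in members[i])
        / len(members[i])
        for i in range(nbc)
    ]


def fvinos(points, centres, u, m=2.0, max_diff=None):
    """Equation (42); max_diff[i] stands for maxDiff_i(nbc), zero if omitted."""
    nbc = len(centres)
    s = fuzzy_compactness(points, centres, u, m)
    if max_diff is None:
        max_diff = [0.0] * nbc
    terms = []
    for i in range(nbc):
        worst = max(s[i] + s[k] for k in range(nbc) if k != i)
        # d_{i,l}: assumed to be the distance between cluster centres.
        nearest = min(dist(centres[i], centres[l])
                      for l in range(nbc) if l != i)
        terms.append((worst + max_diff[i]) / nearest)
    return fsum(terms) / nbc


# Hypothetical toy data: four points, two clusters.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
centres = [(0.5, 0.0), (4.5, 0.0)]
u = [[0.9, 0.8, 0.1, 0.2],
     [0.1, 0.2, 0.9, 0.8]]
value = fvinos(points, centres, u)
```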


About this article

Cite this article

Tlili, M., Ayadi, T., Hamdani, T.M. et al. Performance evaluation of FMIG clustering using fuzzy validity indexes. Soft Comput 19, 3515–3528 (2015). https://doi.org/10.1007/s00500-014-1478-3

Issue Date: December 2015
DOI: https://doi.org/10.1007/s00500-014-1478-3
