Skip to main content
Log in

Performance evaluation of FMIG clustering using fuzzy validity indexes

  • Focus
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

The clustering of high-dimensional data presents a critical computational problem. Therefore, it is convenient to arrange the cluster centres on a grid with a small dimensional space that reduces computational cost and can be easily visualized. Moreover, in real applications there is no sharp boundary between classes, real datasets are naturally defined in a fuzzy context. Therefore, fuzzy clustering fits better for complex real datasets to determine the best distribution. Self-organizing map (SOM) technique is appropriate for clustering and vector quantization of high-dimensional data. In this paper we present a new fuzzy learning approach called FMIG (fuzzy multilevel interior growing self-organizing maps). The proposed algorithm is a fuzzy version of MIGSOM (multilevel interior growing self-organizing maps). The main contribution of FMIG is to define a fuzzy process of data mapping and to take into account the fuzzy aspect of high-dimensional real datasets. This new algorithm is able to auto-organize the map accordingly to the fuzzy training property of the nodes. In the second step, the introduced scheme for FMIG is clustered by means of fuzzy C-means (FCM) to discover the interior class distribution of the learned data. The validation of FCM partitions is directed by applying six validity indexes. Superiority of the new method is demonstrated by comparing it with crisp MIGSOM, GSOM (growing SOM) and FKCN (fuzzy Kohonen clustering network) techniques. Thus, our new method shows improvement in term of quantization error, topology preservation and clustering ability.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

References

  • Abonyi J, Migaly S, Szeifert F (2002) Fuzzy self-organizing map based on regularized fuzzy c-means clustering. Advances in soft computing, Engineering Design and Manufacturing

    Google Scholar 

  • Abraham A, Nath B (2001) A neuro-fuzzy approach for modelling electricity demand in Victoria. Appl Soft Comput 1(2):127–138

    Article  Google Scholar 

  • Aguilera PA, Frenich AG (2001) Application of the Kohonen neural network in coastal water management: methodological development for the assessment and prediction of water quality. Water Res 35:4053–4062

    Article  Google Scholar 

  • Alimi AM, Hassine R, Selmi M (2003) Beta fuzzy logic systems: approximation properties in the MIMO case. Int J Appl Math Comput Sci 13(2):225–238

  • Alimi AM (2000) The beta system: toward a change in our use of neuro-fuzzy systems. Int J Manag, Invited Paper, no. June, pp 15–19

  • Alimi AM (2003) Beta neuro-fuzzy systems. TASK Q J, Special Issue on Neural Networks edited by Duch W and Rutkowska D, vol 7, no 1, pp 23–41

  • Amarasiri R, Alahakoon D, Smith KA (2004) HDGSOM: a modified growing self-organizing map for high dimensional data clustering. Fourth international coriference on hybrid intelligent systems (HIS’04), pp 216–221

  • Ayadi T, Ellouze M, Hamdani TM, Alimi AM (2012) Movie scenes detection with MIGSOM based on shots semi-supervised clustering. Neural Comput Appl. doi 10.1007/s00521-012-0930-5

  • Ayadi T, Hamdani TM, Alimi AM, and Khabou MA (2007) 2IBGSOM: interior and irregular boundaries growing self-organizing maps. IEEE sixth international conference on machine learning and applications, pp 387–392

  • Ayadi T, Hamdani TM, Alimi AM (2010) A new data topology matching technique with multilevel interior growing self-organizing maps. IEEE international conference on systems, man, and cybernetics, pp 2479–2486

  • Ayadi T, Hamdani TM, Alimi AM (2011) On the use of cluster validity for evaluation of MIGSOM clustering. ISCIII: 5th international symposium on computational intelligence and intelligent informatics, pp 121–126

  • Ayadi T, Hamdani TM, Alimi AM (2012) MIGSOM: multilevel interior growing self-organizing maps for high dimensional data clustering. In: Neural processing letters, vol 36, pp 235 256

  • Barrea A (2011) Local fuzzy c-means clustering for medical spectroscopy images. Appl Math Sci 5(30):1449–1458

    MATH  Google Scholar 

  • Bezdek JC (1974) Cluster validity with fuzzy sets. J Cybern 58–72

  • Bezdek JC (1987) Convergence theory for fuzzy c-means. IEEE Trans SMC, pp 873–877

  • Bezdek JC, Harris JD (1978) Fuzzy partitions and relation: an axiomatic basic for clustering fuzzy sets and systems. Academic Press, Massachusetts

  • Bezdek JC, Castelaz PF (1977) Prototype classification and feature selection with fuzzy sets. IEEE Trans Syst Man Cybern 7(2):87–92

  • Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York

    Book  MATH  Google Scholar 

  • Blake CL, Merz CJ (1998) UCI repository of machine learning databases, [http://www.ics.uci.edu/learn/MLRepository.html]. University of California, Department of Information and Computer Science, Irvine, CA

  • Chang PC, Liao TW (2006) Combining SOM and fuzzy rule base for flow time prediction in semiconductor manufacturing factory. Appl Soft Comput 6(2):198–206

    Article  Google Scholar 

  • Das S, Abraham A, Konar A (2008) Automatic kernel clustering with a Multi-Elitist Particle Swarm Optimization Algorithm. Patt Recogn Lett 29:688–699

    Article  Google Scholar 

  • Das S, Abraham A (2008) Automatic clustering using an improved differential evolution algorithm. IEEE Trans SMC, vol 38, no 1, pp 218–237

  • Denaïa MA, Palisb F, Zeghbibb A (2007) Modeling and control of non-linear systems using soft computing techniques. Appl Soft Comput 7(3):728–738

    Article  Google Scholar 

  • Dhahri H, Alimi AM (2005) Hierarchical learning algorithm for the beta basis function neural network. In: Proceedings of third international conference on systems, signals & devices: SSD’2005, Sousse, Tunisia, March 2005

  • Dunn JC (1974) Well separated clusters and optimal fuzzy partitions. J Cybern 4:95–104

    Article  MathSciNet  Google Scholar 

  • El Malek J, Alimi AM, Tourki R (2002) Problems in pattern classification in high dimensional spaces: behavior of a class of combined neuro-fuzzy classifiers. Fuzzy Sets Syst 128(1):15–33

  • Fritzke B (1994) Growing cell structure: a self organizing network for supervised and un-supervised learning. Neural Netw 7:1441–1460

  • Hamdani TM,. Alimi AM, Khabou MA (2011) An iterative method for deciding SVM and single layer neural network structures. Neural Process Lett 33(2):171–186

  • Hamdani TM, Alimi AM, Karray F (2008) Enhancing the structure and parameters of the centers for BBF fuzzy neural network classifier construction based on data structure. In: Proceedings of IEEE international join conference on neural networks, Hong Kong, IJCNN, art. no. 4634247, pp 3174–3180

  • Hamdani TM, Won JM, Alimi AM, Karray F (2011) Hierarchical genetic algorithm with new evaluation function and bi-coded representation for features selection with confidence rate. Appl Soft Comput 11(1):2501–2509, ISSN 1568–4946

  • Hsu AL, Halgarmuge SK (2001) Enhanced topology preservation of dynamic self-organising maps for data visualization. IFSA world congress and 20th NAFIPS international conference, vol 3, pp 1786–1791

  • Hu W, Xie D, Tan T, Maybank S (2004) Learning activity patterns using fuzzy self-organizing neural network. IEEE Trans Syst Man Cybern B 34(3):1618–1626

    Article  Google Scholar 

  • Kim DW, Lee KH, Lee D (2004) On cluster validity index for estimation of the optimal number of fuzzy clusters. Patt Recogn 37(10):2009–2025

    Article  Google Scholar 

  • Kim M, Ramakrishna RS (2005) New indices for cluster validity assessment. Patt Recogn Lett 26(15):2353–2363

    Article  Google Scholar 

  • Kohonen T (1998) Statistical pattern recognition with neural networks: benchmark studies. In: Proceedings of the second annual IEEE international conference on neural networks, vol 1

  • Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43:59–69

    Article  MATH  MathSciNet  Google Scholar 

  • Kohonen T (1984) Self-organization and associative memory. Springer, Berlin

    MATH  Google Scholar 

  • Kohonen T (1997) Self-organizing maps, 2nd edn. Springer, Berlin

    Book  MATH  Google Scholar 

  • Kohonen T (1998) The self-organizing map. Neurocomputing 21:1–6

    Article  MATH  Google Scholar 

  • Kohonen T (2001) Self-organizing maps. Springer, Berlin

    Book  MATH  Google Scholar 

  • Li Hui-Ya, Hwang Wen-Jyi, Chang Chia-Yen (2011) Efficient fuzzy C-means architecture for image segmentation. Sensors 2011(11):6697–6718. doi:10.3390/s110706697

    Article  Google Scholar 

  • Mingoti SA, Lima JO (2006) Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms. Eur J Oper Res 174:1742–1759

  • Pal NR, Bezdek JC (1992) On cluster validity for fuzzy c-means model. IEEE Trans Fuzzy Syst 3:370–379

    Article  Google Scholar 

  • Pascual A, Barcena M, Merelo JJ, Carazo JM (2000) Mapping and fuzzy classification of macromolecular images using self-organizing neural networks. Ultramicroscopy 84:85–99

    Article  Google Scholar 

  • Petterssona F, Chakrabortib N, Saxéna H (2007) A genetic algorithms based multi-objective neural net applied to noisy blast furnace data. Appl Soft Comput 7(1):387–397

    Article  Google Scholar 

  • Ravia V, Kurniawana H, Thaia PN, Kumarb PR (2008) Soft computing system for bank performance prediction. Appl Soft Comput 8(1):305–315

    Article  Google Scholar 

  • Theodoridis Y (1996) Spatial datasets—an unofficial collection. Available from: http://www.dias.cti.gr/~ytheod/research/datasets/spatial.html

  • Tlili M, Ayadi T, Hamdani TM, Alimi AM (2012) FMIG: fuzzy multilevel interior growing self-organizing maps. IEEE international conference on tools with artificial intelligence (ICTAI)

  • Tlili M, Hamdani TM, Alimi AM, Abraham A (2014) FVINOS: fuzzy validity index with noise-overlap separation for clustering algorithms. Patt Recogn Lett (in press)

  • Tsekouras GE, Sarimveis H (2004) A new approach for measuring the validity of the fuzzy c-means algorithm. Adv Eng Softw 35(8–9):567–575

    Article  MATH  Google Scholar 

  • Velmurugan T, Santhanam T (2010) Performance evaluation of K-means and fuzzy C-means clustering algorithms for statistical distributions of input data points. Eur J Sci Res ISSN 1450–216X, 46(3):320–330

  • Wu KL, Yang MS (2005) A cluster validity index for fuzzy clustering. Patt Recogn Lett 26(9):1275–1291

    Article  Google Scholar 

  • Xie XL, Beni G (1991) A validity measure for fuzzy clustering. Pattern Analysis and Machine Intelligence, IEEE Transactions on Pattern Analysis and Machine Intelligence 13(8):841–847

    Article  Google Scholar 

  • Yuksel ME, Besdok E (2004) A simple neuro-fuzzy impulse detector for efficient blur reduction of impulse noise removal operators for digital images. IEEE Trans Fuzzy Syst 12(6):854–865

    Article  Google Scholar 

Download references

Acknowledgments

The authors would like to acknowledge the financial support of this work by grants from General Direction of Scientific Research (DGRST), REGIM-Lab (LR11ES48) Tunisia, under the ARUB program.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Monia Tlili.

Additional information

Communicated by E. Lughofer.

Appendix

Appendix

1.1 Description of the used validity indexes

Let \(X=\left\{ {x_1 ,x_2 ,...,x_n } \right\} \) be a dataset in an m-dimensional Euclidean space Rm with its ordinary Euclidean norm \(\left\| . \right\| \) and let C, be the matrix of cluster centres with \(c_i \), the centre of the cluster \(C_i \) (\(1\le i\le \mathrm{{nbc}})\). In case of fuzzy clustering, a pattern (point) may belong to all the clusters with a certain fuzzy membership grade. Consider the matrix \(U=\left| {u_{ik} } \right| \) where \(u_{ik} \) is the value of membership of the point k in cluster \(i\), and \(u_{ik} \) is in the interval [0,1] (\(1\le k\le n)\).

We present below the validity indices studied in this work depending on the geometric aspects of clusters (compactness, separation) and the data properties (noise, overlap) evaluated by each index.

1.1.1 SVI index

SVI (Separation Validity Index) (Wu and Yang 2005) is given by

$$\begin{aligned} \mathrm{{SVI}}=\frac{\varvec{\pi }}{S} \end{aligned}$$
(17)

\(\varvec{\pi }\) is a global compactness of the partition, it makes the sum of \(\pi _i \), \(1\le i\le \mathrm{{nbc}}\).

The compactness of the cluster c\(^i\) is defined by

$$\begin{aligned} \pi _i =\frac{\sigma _i }{n_{{i}} }1\le i\le \mathrm{{nbc}} \end{aligned}$$
(18)

The variation \(\sigma _i \) and the cardinality \(n_{i} \) of the cluster c\(_i\) are given, respectively, as

$$\begin{aligned} \sigma _i =\sum \limits _{k=1}^n ({u}_{ik} )^m\left\| {x_k \left. {-\nu _i } \right\| } \right. ^{2},n_i =\sum \limits _{k=1}^n {u_{ik} } ,1\le i\le \mathrm{{nbc}} \end{aligned}$$
(19)

with \({u}_{ik}\), the membership degree of the vector \(x_k\) to the cluster c\(_i \) and \(\nu _i\) the centre of c\(_i \).

\(S\) is the global separation between the nbc clusters defined as 

$$\begin{aligned} S=\sum \limits _{i=1}^{\mathrm{{nbc}}+1} {\sum \limits _{\begin{array}{c} j=1\\ j\ne i \end{array}}^{\mathrm{{nbc}}+1} {(\mathrm{de}\nu _{ij} )^2} } \end{aligned}$$
(20)

\(\mathrm{de}\nu _{ij} \) is the deviation between the centres of \(c_i\) and \(c_j \).

$$\begin{aligned} de\nu _{ij} =(\mu _{ij} )^{(2+\omega )/{2\omega }}\left\| {z_j -z_i } \right\| \end{aligned}$$
(21)

\(\left[ {z_1 ,z_2 ,...,z_{\mathrm{{nbc}},} z_{\mathrm{{nbc}}+1} } \right] =\left[ {\nu _1 ,\nu _2 ,...,\nu _{\mathrm{{nbc}},} \overline{x} } \right] ^T,\) where \(\overline{x} \) is the mean of X.

$$\begin{aligned} \overline{x} =\sum \limits _{k=1}^n {{x_k }/n} \end{aligned}$$
(22)

\(\mu _{ij} \) is the degree of membership of \(z_j \) to the centre \(z_i \).

$$\begin{aligned}&\mu _{ij} =\frac{1}{\sum \limits _{\begin{array}{c} l=1\\ l\ne j \end{array}}^{\mathrm{{nbc}}+1} {\left( {\frac{\left\| {z_j -z_i } \right\| }{\left\| {z_j -z_l } \right\| }}\right) ^\omega } },1\le i\le \mathrm{{nbc}}+1;\nonumber \\&\qquad \qquad 1\le j\le \mathrm{{nbc}}+1,j\ne i \end{aligned}$$
(23)

1.1.2 XBI index

This index is normalized and gives the best number of clusters at its minimum value (Mingoti and Lima 2006) as shown:

$$\begin{aligned} \mathrm{{XBI}}(\mathrm{{nbc}})=\frac{\max _{k=1,..,nbc} \left\{ {\sum \limits _{j=1}^n {\frac{u_{kj}^2 \left\| {x_j -c_k } \right\| ^2}{n_k }} } \right\} +\max \mathrm{{Diff}}(\mathrm{{nbc}})}{\min _{i,j\ne i} \left\| {c_i -c_j } \right\| ^2} \end{aligned}$$
(24)

where

\(\max _{k=1,..,\mathrm{{nbc}}} \left\{ {\sum \limits _{j=1}^n {\frac{u_{kj}^2 \left\| {x_j -c_k } \right\| ^2}{n_k }} } \right\} \) gives the max fuzzy distance between points \(x_j \) with membership degree \(u_{kj}\) to the center \(c_k \).

This index introduced the \(\max \mathrm{{Diff}}(\mathrm{{nbc}})\) factor in order to compare the nbc-partitions obtained by increasing nbc. Thus, we can find out the best partition.

$$\begin{aligned}&\max \mathrm{{Diff}}(\mathrm{{nbc}})=\max \nolimits _{\mathrm{{nbc}}\max ,...,\mathrm{{nbc}}} \mathrm{{diff}}_{dw}\end{aligned}$$
(25)
$$\begin{aligned}&\mathrm{{diff}}_{dw} =dw(\mathrm{{nbc}})-dw(\mathrm{{nbc}}+1) \end{aligned}$$
(26)

with \(dw\) intra-cluster distance measured for different values of nbc.

1.1.3 PCAESN index

Called Partition Coefficient And Exponential Separation (PCAES) (Hu et al. 2004), this validity index takes into account two factors with a normalized partition coefficient and an exponential separation measure to validate separately each fuzzy cluster \(i\). PCAES is then defined as:

$$\begin{aligned} \mathrm{{PCAES}}(\mathrm{{nbc}})=\sum \limits _{i=1}^{\mathrm{{nbc}}} {\mathrm{{PCAESi}}} \end{aligned}$$
(27)

The validity index of cluster i is measured by:

$$\begin{aligned} \mathrm{{PCAESi}}=\sum \limits _{j=1}^n {{\mu _{ij}^2 }/{\mu _M }-} \text{ exp } ( {-\min \{\left\| {a_\mathrm{i} -a_k} \right\| ^{2}\} /{\beta _\mathrm{T}}}) \end{aligned}$$
(28)

where\(\sum \limits _{j=1}^n {{\mu _{ij}^2 }/{\mu _M }} \): the compactness of the cluster i compared to the most compact cluster with its value of compactness \(\mu _M \)measured by:

$$\begin{aligned} \mu _M =\mathop {\min }\limits _{1\le i\le nbc} \left\{ {\sum \limits _{j=1}^n {\mu _{ij}^2 } } \right\} \end{aligned}$$
(29)

\(\mu _{ij}\) is the membership degree of vector j (\(1\le j\le n)\) to cluster i (\(1\le k\le \mathrm{{nbc}})\),

\( \text{ exp } ( {{-\min \{ }\left\| {a_\mathrm{i} -a_k } \right\| ^{2}{\} } / {\beta _\mathrm{T} }})\) is the separation measure relative to the total average separation \(\beta _\mathrm{T} \) of the nbc clusters given by:

$$\begin{aligned} \beta _\mathrm{T} =\frac{\sum \nolimits _{l=1}^{nbc} {\left\| {a_l -\overline{a} } \right\| ^2} }{nbc} \end{aligned}$$
(30)

\(a_\mathrm{i}\) is the center of cluster \(i\), and \(\overline{a} =\sum \nolimits _{i=1}^{\mathrm{{nbc}}} {a_i /nbc} \), the average of center vectors \(a_\mathrm{i} \).

Normalisation of PCAES index:

Basically, each validity index measure PCAESi (\(i =1{\ldots }{\mathrm{nbc}}\)) is obtained by a subtraction between the compactness and separation values which are defined in the interval [0 1]. Consequently, we present PCAES values as

$$\begin{aligned} -\mathrm{{nbc}}< \text{ PCAES } ( {\mathrm{{nbc}}})\, <\mathrm{{nbc}} \end{aligned}$$
(31)

Thus, to obtain the value of PCAES in [0 1], we have specified PCAESN Tlili et al. (2014) as following:

$$\begin{aligned} \mathrm{{PCAESN}}(\mathrm{{nbc}})=0.5+\left[ {( {\mathrm{{PCAES}}(\mathrm{{nbc}})/nbc})/2} \right] \end{aligned}$$
(32)

1.1.4 VOS index

Validity Overlap Separation (VOS) (Pal and Bezdek 1992) gives its values in the interval [0 1] and reaches its best partition for the minimum value.

This index is given by

$$\begin{aligned} \mathrm{{VOS}}(\mathrm{{nbc}},U)=\frac{Overlap^N(\mathrm{{nbc}},U)}{\mathrm{{Sep}}^N(\mathrm{{nbc}},U)}, \end{aligned}$$
(33)

where \(Overlap^N(nbc,U)\) gives an inter-cluster overlap for different values of nbc, normalized by the max overlap for nbc = 2...nbcmax.

$$\begin{aligned} \mathrm{{Overlap}}^N(\mathrm{{nbc}},U)=\frac{\mathrm{{Overlap}}(\mathrm{{nbc}},U)}{\mathrm{{Overlap}}_{\max } } \end{aligned}$$
(34)

\(Sep^N(nbc,U)\) calculates the separation between clusters for different values of nbc, normalized by the max separation for nbc = 2...nbcmax.

$$\begin{aligned} Sep^N(nbc,U)=\frac{Sep(nbc,U)}{Sep_{\max } } \end{aligned}$$
(35)

1.2 Overlap function

$$\begin{aligned} \mathrm{{Overlap}}(\mathrm{{nbc}},U)=\frac{2}{\mathrm{{nbc}}(\mathrm{{nbc}}-1)}\sum \limits _{p=1}^{\mathrm{{nbc}}-1} {\sum \limits _{q=p+1}^{nbc} {P(\overline{F} _p ,\overline{F} _q )} } \end{aligned}$$
(36)

\(P(\overline{F} _p ,\overline{F} _q )\) defines the total overlap between two fuzzy clusters \(\overline{F} _p \) and \(\overline{F} _q \):

$$\begin{aligned} P(\overline{F} _p ,\overline{F} _q )=\sum \limits _\mu {f(\mu :} \overline{F} _p ,\overline{F} _q ) \end{aligned}$$
(37)

The function \(f(\mu :\overline{F} _p ,\overline{F} _q )\) calculates the overlap degree between two fuzzy clusters \(\overline{F} _p \) and \(\overline{F} _q \) at a membership degree \(\mu \) given by

$$\begin{aligned} f(\mu :\overline{F} _p ,\overline{F} _q )=\sum \limits _{j=1}^\mathrm{nbc} {\delta (x_j ,} \mu :\overline{F} _p ,\overline{F} _q ) \end{aligned}$$
(38)

\(\delta (x_j , \mu :\overline{F} _p ,\overline{F} _q )\) indicates whether two clusters are overlapping at the membership degree \(\mu \) for data point \(x_j \). It returns an overlap value of \(\omega (x_j )\) when the membership degrees of two clusters are both greater than \(\mu \); otherwise it returns 0.0:

$$\begin{aligned}&\delta (x_j , \mu :\overline{F} _p ,\overline{F} _q )\nonumber \\&\quad =\left\{ {\begin{array}{l} \omega (x_j )\quad \text{ if } (\mu _{\overline{F} _p } (x_j )\ge \mu ) \text{ and }\, (\mu _{\overline{F} _q } (x_j )\ge \mu ), \\ 0 \quad \text{ otherwise } \\ \end{array}} \right. \end{aligned}$$
(39)

\(\omega (x_j )\)  is a weight factor for each point \(x_j \) between clusters. \(\omega (x_j )\) is a value in [0 1].

1.3 Separation function

The separation measure is obtained using the similarity distance \(S(\overline{F} _p ,\overline{F} _q )\) between two fuzzy clusters \(\overline{F} _p \) and \(\overline{F} _q \). It is defined as

$$\begin{aligned} Sep(nbc,U)=1-\underbrace{\min }_{p\ne q}S(\overline{F} _p ,\overline{F} _q ) \end{aligned}$$
(40)

\(S(\overline{F} _p ,\overline{F} _q )\) is the maximum membership degree between two clusters \(\overline{F} _p \) and \(\overline{F} _q \) in the interval [0 1]:

$$\begin{aligned} S(\overline{F} _p ,\overline{F} _q )=\underbrace{\max }_{x\in X}\min (\mu _{\overline{F} p} (x),\mu _{\overline{F} q} (x)), \end{aligned}$$
(41)

where \(\mu _{\overline{F} p} \text{( }x\text{) }\), \(\mu _{\overline{F} q} (x)\) are the membership degrees of vector \(x\), respectively, in \(\overline{F} _p \) and \(\overline{F} _q \).

1.3.1 Dunn index

The validity indice Dunn (1974) is based on inter-cluster distance and the diameter of a hyperspheric cluster. Dunn index is given as

$$\begin{aligned}&\mathrm{{Dunn}}(\mathrm{{nbc}})=\min \nolimits _{i=1,...,\mathrm{{nbc}}}\\&\quad \times \left\{ {\min \nolimits _{j=i+1,...,\mathrm{{nbc}},j\ne i} \left\{ {\left. {\frac{d(c_{i,} c_j )}{\max _{k=1,...,nbc} \left\{ {\mathrm{{diam}}(c_k )} \right\} }} \right\} } \right. } \right\} \\&d(c_{i,} c_j )=\min \nolimits _{\mathop {x\in C}\nolimits _i ,\mathop {y\in C}\nolimits _j } \left\{ {d(x,y)} \right\} \end{aligned}$$

\(diam(c_k )\) is the diameter of the cluster \(c_k \).

1.3.2 FVINOS index

Fuzzy Validity Index with Noise-Overlap Separation (Tlili et al. 2014) (FVINOS) is inspired from the Davies–Bouldin validity index Mingoti and Lima (2006). FVINOS is defined as

$$\begin{aligned}&\mathrm{{FVINOS}}(\mathrm{{nbc}})=\frac{1}{\mathrm{{nbc}}}\sum \limits _{i=1}^{\mathrm{{nbc}}} \nonumber \\&\quad \times {\left( {\frac{\max _{k=1,\ldots ,\mathrm{{nbc}},k\ne i} \left\{ {S_i +S_k } \right\} +\max \mathrm{{Diff}}_i (\mathrm{{nbc}})}{\min _{l=1,\ldots ,nbc,l\ne i} \left\{ {d_{i,l} } \right\} }}\right) } \end{aligned}$$
(42)

\(\min _{l=1,...,\mathrm{{nbc}},l\ne i} \left\{ {d_{i,l} } \right\} \) calculates the minimum separation between two clusters \(i\) and l.

max Diff(nbc) factor is given to compare each two successive obtained partitions. In consequence, the best partition of the dataset is found.

$$\begin{aligned} \max \mathrm{{Diff}}_i (\mathrm{{nbc}})=\max \nolimits _{\mathrm{{nbc}}\max ,...,\mathrm{{nbc}}} \mathrm{{diff}}_i (\mathrm{{nbc}}) \end{aligned}$$
(43)
$$\begin{aligned} \mathrm{{diff}}_i (\mathrm{{nbc}})&= \max _{k=1,..,\mathrm{{nbc}},k\ne i} \left\{ {S_i (\mathrm{{nbc}})+S_k (\mathrm{{nbc}})} \right\} \nonumber \\&-\max \nolimits _{k=1,..,\mathrm{{nbc}}\!+\!1,k\ne i} \left\{ {S_i (\mathrm{{nbc}}\!+\!1)\!+\!S_k (nbc\!+\!1)} \right\} \nonumber \\ \end{aligned}$$
(44)

\(\mathrm{diff}_i (\mathrm{nbc})\) calculates the difference between max sums of compactness in a pair of clusters \(i\) and k; this is calculated for the obtained partition at nbc and nbc+1.

We define the average of fuzzy compactness relative to cluster \(i\) as

$$\begin{aligned} S_i =\frac{1}{n_i }\sum \limits _{\mathop {x\in C}\nolimits _i } {(u_i (x)^m)\times d(x,c_i )}, \end{aligned}$$
(45)

where \(u_i (x)\) is the membership degree of point \(x\) to cluster \(i\), \(m\) is the fuzzifier factor.

\(d(x,c_i )\) is the Euclidean distance separating \(x\) from the cluster centre \(c_i \).

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Tlili, M., Ayadi, T., Hamdani, T.M. et al. Performance evaluation of FMIG clustering using fuzzy validity indexes. Soft Comput 19, 3515–3528 (2015). https://doi.org/10.1007/s00500-014-1478-3

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-014-1478-3

Keywords

Navigation