ABSTRACT
Many clustering techniques are highly dependent on their initialization. Introducing membership degrees, as used in fuzzy logic, helps avoid local minima; however, the global minimum can still be far from satisfactory, especially for clusters of varying density. Here we propose an initialization method that combines distance and density so that the initial centers lie as close as possible to the final cluster centroids. Comparisons with the KKZ method are given.
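For readers unfamiliar with the baseline, the following is a minimal sketch of KKZ initialization as it is commonly described: the first center is the point with the largest Euclidean norm, and each further center is the point farthest from its nearest already-chosen center. The function name `kkz_init` and the sample points are illustrative assumptions, not taken from the paper, and this is not the proposed distance-and-density method.

```python
import math


def kkz_init(points, k):
    """Pick k initial centers with the KKZ maximin rule (illustrative sketch)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: the point with the largest Euclidean norm.
    first = max(range(len(points)), key=lambda i: math.hypot(*points[i]))
    chosen = [first]

    # min_dist[i] holds the distance from point i to its nearest chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]

    while len(chosen) < k:
        # Next center: the point farthest from all centers chosen so far.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))

    return [points[i] for i in chosen]


# Hypothetical sample data: two small groups and one isolated point.
sample = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
centers = kkz_init(sample, 3)
print(centers)
```

Because the rule uses distance alone, an isolated point such as `(9.0, 0.0)` in the sample is selected as a center regardless of how few neighbors it has.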