Abstract
Clustering algorithms are unsupervised learning algorithms and, among the many available, k-Means is one of the most popular and successful. Its performance, however, depends heavily on a good initialization of the k group centers (centroids), as well as on the value assigned to k, the number of groups the final clustering should have. This chapter reports experiments with five initialization algorithms from the literature, namely Method1, k-Means++, CCIA, Maedeh&Suresh and SPSS, to empirically evaluate their contribution to improving k-Means performance.
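To make the dependence on initialization concrete, the following is a minimal sketch, not the chapter's own code, that compares uniformly random seeding with a k-Means++ style seeding (sampling each new seed with probability proportional to its squared distance from the nearest seed already chosen). The function names `sq_dist`, `kmeans_pp_seeds` and `kmeans`, the synthetic three-blob data, and the use of the sum of squared errors (SSE) as the quality measure are all assumptions made for illustration. The sketch also assumes the data contain at least k distinct points.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centroids by D^2-weighted sampling (k-Means++ style)."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Weight each point by its squared distance to the nearest seed so far.
        weights = [min(sq_dist(p, s) for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds

def kmeans(points, seeds, max_iter=100):
    """Plain Lloyd iterations from the given seeds; returns (centroids, SSE)."""
    centroids = list(seeds)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse

# Hypothetical toy data: three well-separated 2-D blobs of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

random_seeds = rng.sample(points, 3)          # uniform random initialization
pp_seeds = kmeans_pp_seeds(points, 3, rng)    # k-Means++ style initialization

_, sse_random = kmeans(points, random_seeds)
_, sse_pp = kmeans(points, pp_seeds)
print("SSE with random seeds:   ", sse_random)
print("SSE with k-Means++ seeds:", sse_pp)
```

A single run like this proves little either way. A fair comparison would repeat both initializations over many random seeds and several data sets, and would report a validity index alongside the SSE.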
Acknowledgments
The authors thank UNIFACCAMP and CNPq for their support. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior − Brasil (CAPES) − Finance Code 001.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
de Oliveira, A.F., do Carmo Nicoletti, M. (2020). Favoring the k-Means Algorithm with Initialization Methods. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-16657-1_3
DOI: https://doi.org/10.1007/978-3-030-16657-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16656-4
Online ISBN: 978-3-030-16657-1
eBook Packages: Intelligent Technologies and Robotics (R0)