Abstract
The k-Means algorithm is one of the most popular methods for cluster analysis. Like most clustering methods, k-Means optimises clusters in an unsupervised way. In this paper we present a method of cluster class-membership hesitation, which enables k-Means to learn from fully and partially labelled data. In the proposed method, the hesitation of a cluster during the optimisation step is controlled by the Metropolis-Hastings algorithm. The proposed method was compared with state-of-the-art methods for supervised and semi-supervised clustering on benchmark data sets. The obtained results show the same or better classification accuracy under both types of supervision.
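The abstract's idea of Metropolis-Hastings-controlled class-membership hesitation can be sketched as follows. This is an illustrative reconstruction, not the authors' exact formulation: the function name, the disagreement-count energy, the single-cluster label-flip proposal, and the temperature parameter are all assumptions introduced here for clarity. Labels in `y` use `-1` for unlabelled points.

```python
import numpy as np

def mh_hesitation_kmeans(X, y, k, n_iter=30, temp=1.0, seed=0):
    """Sketch of semi-supervised k-Means with cluster class-membership
    'hesitation'. Each cluster carries a tentative class label; after every
    assignment/update step one cluster's label is proposed to flip and the
    proposal is accepted with the Metropolis-Hastings probability
    min(1, exp(-dE / temp)), where the energy E counts labelled points
    assigned to a cluster whose class disagrees with their own label.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    labels_pool = np.unique(y[y >= 0])
    cls = rng.choice(labels_pool, k)          # tentative class per cluster

    def energy(assign, cls_vec):
        mask = y >= 0                          # labelled points only
        return np.sum(cls_vec[assign[mask]] != y[mask])

    for _ in range(n_iter):
        # assignment step: each point joins its nearest centre
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        # update step: move centres to the means of their clusters
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(0)
        # hesitation step: MH proposal on one cluster's class label
        j = rng.integers(k)
        prop = cls.copy()
        prop[j] = rng.choice(labels_pool)
        dE = energy(assign, prop) - energy(assign, cls)
        if dE <= 0 or rng.random() < np.exp(-dE / temp):
            cls = prop                         # accept the hesitation
    return centers, cls, assign
```

With fully labelled data this behaves as a supervised variant (every point contributes to the energy); with partial labels the unlabelled points influence only the centres, which is one natural way to realise the full/semi-supervised setting the abstract describes.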
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Płoński, P., Zaremba, K. (2013). Full and Semi-supervised k-Means Clustering Optimised by Class Membership Hesitation. In: Tomassini, M., Antonioni, A., Daolio, F., Buesser, P. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2013. Lecture Notes in Computer Science, vol 7824. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37213-1_23
DOI: https://doi.org/10.1007/978-3-642-37213-1_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37212-4
Online ISBN: 978-3-642-37213-1