
Full and Semi-supervised k-Means Clustering Optimised by Class Membership Hesitation

  • Conference paper
Adaptive and Natural Computing Algorithms (ICANNGA 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7824)


Abstract

The k-Means algorithm is one of the most popular methods for cluster analysis. Like the majority of clustering methods, k-Means optimises clusters in an unsupervised way. In this paper we present a method of cluster class membership hesitation, which enables k-Means to learn from fully and partially labelled data. In the proposed method, the hesitation of a cluster during the optimisation step is controlled by the Metropolis-Hastings algorithm. The proposed method was compared with state-of-the-art methods for supervised and semi-supervised clustering on benchmark data sets. The obtained results show the same or better classification accuracy for both types of supervision.
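Based only on the description in the abstract, the sketch below illustrates how a Metropolis-Hastings acceptance step might control cluster membership "hesitation" in a semi-supervised k-Means loop. The cost function, the label-mismatch penalty, and the fixed temperature are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np


def hesitant_kmeans(X, y, n_clusters, n_iters=100, temperature=1.0, seed=None):
    """Semi-supervised k-Means with Metropolis-Hastings 'hesitation' (illustrative sketch).

    X : (n_samples, n_features) data matrix.
    y : (n_samples,) integer class labels; use -1 for unlabelled points.
    The cost, penalty and temperature below are assumptions, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialise centroids from random samples (k-means++ seeding could be used instead).
    centroids = X[rng.choice(n, size=n_clusters, replace=False)].astype(float).copy()
    assign = rng.integers(0, n_clusters, size=n)

    def cost(i, c):
        # Squared distance to the candidate centroid ...
        base = np.sum((X[i] - centroids[c]) ** 2)
        # ... plus a penalty when a labelled point joins a cluster whose
        # majority label disagrees with it (assumed form of the supervision term).
        if y[i] >= 0:
            members = y[(assign == c) & (y >= 0)]
            if members.size and np.bincount(members).argmax() != y[i]:
                base += np.sum((X[i] - centroids[c]) ** 2)  # double the cost as a penalty
        return base

    for _ in range(n_iters):
        for i in range(n):
            proposed = int(np.argmin(np.linalg.norm(X[i] - centroids, axis=1)))
            current = int(assign[i])
            if proposed == current:
                continue
            delta = cost(i, proposed) - cost(i, current)
            # Metropolis-Hastings acceptance: improving moves are always accepted,
            # worsening moves are accepted with probability exp(-delta / T).
            if delta <= 0 or rng.random() < np.exp(-delta / temperature):
                assign[i] = proposed
        # Standard k-Means centroid update over the (possibly hesitant) assignments.
        for c in range(n_clusters):
            if np.any(assign == c):
                centroids[c] = X[assign == c].mean(axis=0)
    return centroids, assign
```

In this reading, a call such as `hesitant_kmeans(X, y, n_clusters=3)` with `y` set to -1 for unlabelled samples covers both the fully and the partially labelled settings described in the abstract: moves that conflict with the available labels are not forbidden, only accepted with reduced probability, which is the "hesitation" controlled by the Metropolis-Hastings rule.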




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Płoński, P., Zaremba, K. (2013). Full and Semi-supervised k-Means Clustering Optimised by Class Membership Hesitation. In: Tomassini, M., Antonioni, A., Daolio, F., Buesser, P. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2013. Lecture Notes in Computer Science, vol 7824. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37213-1_23


  • DOI: https://doi.org/10.1007/978-3-642-37213-1_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37212-4

  • Online ISBN: 978-3-642-37213-1

  • eBook Packages: Computer Science, Computer Science (R0)
