Abstract
This paper proposes a novel core-growing (CG) clustering method based on scoring k-nearest neighbors (CG-KNN). First, an initial core is obtained for each cluster; a tree-like structure is then constructed by sequentially absorbing data points into the existing cores according to the KNN linkage score. The KNN linkage strategy lets CG-KNN handle arbitrary cluster shapes. In addition, the method allows the membership of a previously assigned training pattern to be changed to a more suitable cluster, which is intended to enhance robustness. Experimental results on four real-data UCI benchmarks and Leukemia data sets indicate that the proposed CG-KNN algorithm outperforms several popular clustering algorithms, such as Fuzzy C-means (FCM) (Xu and Wunsch IEEE Transactions on Neural Networks 16:645–678, 2005), Hierarchical Clustering (HC) (Xu and Wunsch IEEE Transactions on Neural Networks 16:645–678, 2005), Self-Organizing Maps (SOM) (Golub et al. Science 286:531–537, 1999; Tamayo et al. Proceedings of the National Academy of Science USA 96:2907, 1999), and Non-Euclidean Norm FCM (NEFCM) (Karayiannis and Randolph-Gips IEEE Transactions on Neural Networks 16, 2005).
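The abstract does not define the KNN linkage score or say how the initial cores are found, so the following Python sketch only illustrates the general core-growing idea under stated assumptions. It assumes the cores are given as seed indices, uses the negative mean distance to the k nearest members of a cluster as a stand-in score, and runs a single reassignment pass at the end. The names `knn_score` and `core_growing_knn` are hypothetical and are not taken from the paper.

```python
import numpy as np


def knn_score(dist_row, labels, cluster, k, exclude=None):
    """Assumed linkage score (not the paper's definition): negative mean
    distance from one point to its k nearest members of `cluster`."""
    mask = labels == cluster
    if exclude is not None:
        mask[exclude] = False  # do not count the point as its own neighbor
    nearest = np.sort(dist_row[mask])[:k]
    return -nearest.mean() if nearest.size else -np.inf


def core_growing_knn(X, seeds, k=3):
    """Grow clusters from seed cores, then allow points to switch clusters.

    X     : (n, d) array of data points
    seeds : list of index lists, one initial core per cluster (assumed given)
    k     : number of nearest cluster members used in the score
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    labels = np.full(n, -1)
    for c, core in enumerate(seeds):
        labels[core] = c
    n_clusters = len(seeds)

    # Growing phase: repeatedly absorb the unassigned point with the
    # highest score into the cluster that gives that score.
    while (labels == -1).any():
        best_score, best_point, best_cluster = -np.inf, None, None
        for i in np.flatnonzero(labels == -1):
            for c in range(n_clusters):
                s = knn_score(dist[i], labels, c, k)
                if s > best_score:
                    best_score, best_point, best_cluster = s, i, c
        labels[best_point] = best_cluster

    # Reassignment pass: move a point only if another cluster scores
    # strictly higher, and never empty a cluster.
    for i in range(n):
        current = labels[i]
        if np.count_nonzero(labels == current) == 1:
            continue
        scores = [knn_score(dist[i], labels, c, k, exclude=i)
                  for c in range(n_clusters)]
        best = int(np.argmax(scores))
        if scores[best] > scores[current]:
            labels[i] = best

    return labels
```

With `k=1` the assumed score behaves like single linkage, which is one way a neighbor-based rule can follow elongated or curved clusters; larger `k` makes each absorption depend on more cluster members. The exhaustive search in the growing phase is kept for readability and is only practical for small data sets.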
References
Webb, A. R. (2002). Statistical pattern recognition (2nd ed.). Wiley.
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs: Prentice-Hall.
Camastra, F., & Verri, A. (2005). A novel kernel method for clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(5).
Karayiannis, N. B., & Randolph-Gips, M. M. (2005). Soft learning vector quantization and clustering algorithms based on non-Euclidean norms: single-norm algorithms. IEEE Transactions on Neural Networks, 16(2).
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286(15), 531–537.
Xu, R., & Wunsch, D., II. (2005). Survey of clustering algorithms. IEEE Transactions on Neural Networks, 16(3), 645–678.
Tamayo, P., et al. (1999). The SOM was constructed using the GENECLUSTER software. Proceedings of the National Academy of Science USA, 96, 2907.
Tamayo, P., & Ramaswamy, S. (2003). Cancer genomics and molecular pattern recognition. In M. Ladanyi & W. Gerald (Eds.), Expression profiling of human tumors: Diagnostic and research applications. Humana Press.
Kung, S. Y., Luo, Y., & Mak, M.-W. (2008). Feature selection for genomic signal processing: unsupervised, supervised, and self-supervised scenarios. Journal of Signal Processing Systems.
Wang, Y. P., Gunampally, M., Chen, J., Bittel, D., Butler, M. G., & Cai, W. W. (2008). A comparison of fuzzy clustering approaches for quantification of microarray gene expression. Journal of Signal Processing Systems, 50, 305–320.
Cite this article
Hsieh, T.W., Taur, J.S. & Kung, S.Y. A KNN-Scoring Based Core-Growing Approach to Cluster Analysis. J Sign Process Syst Sign Image Video Technol 60, 105–114 (2010). https://doi.org/10.1007/s11265-009-0406-8
DOI: https://doi.org/10.1007/s11265-009-0406-8