Abstract
In recent years there has been increasing interest in the comparative clustering abilities of k-means, moving methods and self-organising neural networks. However, most comparative studies have either been restricted to specific problem areas or have been subject to other limitations that prevent a general evaluation of the relative abilities of these methods under a wide variety of conditions. This report provides a systematic empirical evaluation of the clustering abilities of k-means, moving methods and two commonly used self-organising neural network architectures. The methods are evaluated by Monte Carlo simulation that examines the effects of cluster shape, dimensionality, noise, dispersion and number of clusters in the data. Results indicate that, on average, k-means, moving methods and ‘winner take all’ self-organising networks perform equally well in terms of clustering ability. However, because the moving method consistently converges faster than k-means, it may well represent a more appropriate benchmark for future comparisons between pattern partitioning methods in circumstances where convergence speed is an important factor.
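As a rough illustration of the kind of Monte Carlo comparison the abstract describes, the sketch below generates synthetic clusters under randomly varied conditions, runs a clustering method on each data set, and averages a recovery score and the number of passes needed to converge. It is not the authors' experimental design. The cluster generator, the parameter grids, the plain Rand index as the recovery measure, and all function names are assumptions made for this example. Only batch k-means is shown; the moving method and the self-organising networks would be plugged in at the same point in the loop.

```python
import random

def make_clusters(k, n_per, dim, spread, rng):
    """Draw k spherical Gaussian clusters with centres in the cube [0, 10]^dim."""
    points, labels = [], []
    for c in range(k):
        centre = [rng.uniform(0, 10) for _ in range(dim)]
        for _ in range(n_per):
            points.append([rng.gauss(m, spread) for m in centre])
            labels.append(c)
    return points, labels

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, max_iter=100):
    """Batch k-means; returns the labels and the number of passes used."""
    centres = [list(p) for p in rng.sample(points, k)]
    labels = None
    for it in range(1, max_iter + 1):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:  # no point changed cluster: converged
            return labels, it
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, max_iter

def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)

def monte_carlo(trials=20, seed=0):
    """Average recovery and convergence passes over randomly drawn conditions."""
    rng = random.Random(seed)
    scores, passes = [], []
    for _ in range(trials):
        k = rng.choice([2, 3, 4])             # number of clusters (assumed grid)
        dim = rng.choice([2, 4, 8])           # dimensionality (assumed grid)
        spread = rng.choice([0.5, 1.0, 2.0])  # dispersion (assumed grid)
        pts, truth = make_clusters(k, 30, dim, spread, rng)
        found, n_pass = kmeans(pts, k, rng)
        scores.append(rand_index(truth, found))
        passes.append(n_pass)
    return sum(scores) / trials, sum(passes) / trials

if __name__ == "__main__":
    mean_rand, mean_passes = monte_carlo()
    print(f"mean Rand index: {mean_rand:.3f}, mean passes: {mean_passes:.1f}")
```

Recording the pass count alongside the recovery score mirrors the abstract's two criteria, clustering ability and convergence speed. Cluster shape and noise, which the study also varies, are left out of this sketch for brevity.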
Cite this article
Tyree, E.W., Long, J.A. A Monte Carlo evaluation of the moving method, k-means and two self-organising neural networks. Pattern Analysis & Applic 1, 79–90 (1998). https://doi.org/10.1007/BF01237937
DOI: https://doi.org/10.1007/BF01237937