Abstract
This paper presents a least squares framework for divisive clustering. Two popular divisive clustering methods, Bisecting K-Means and Principal Direction Division (PDD), turn out to be versions of the same least squares approach. PDD has recently been enhanced with a stopping criterion that takes into account the minima of the corresponding one-dimensional density function (the dePDDP method). We extend this approach to Bisecting K-Means by projecting the data onto random directions, and we compare the methods thus modified. The dePDDP method appears superior on datasets with relatively small numbers of clusters, whatever the cluster intermix, whereas our version of Bisecting K-Means appears superior at greater numbers of clusters with noise entities added to the cluster structure.
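The idea of combining Bisecting K-Means with a density-minimum stopping rule on random projections can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes the general scheme described in the abstract. The function names, the normal-reference bandwidth, the `depth` and `min_peak` thresholds, the rule "split if any random direction shows a valley", the farthest-point initialisation, and the `min_size` cutoff are all hypothetical choices made for illustration.

```python
import numpy as np

def has_density_valley(values, grid_size=200, depth=0.5, min_peak=0.2):
    """Check a 1D sample for an interior minimum of its kernel density estimate.

    Assumption: a "valley" is a grid point whose density is below `depth` times
    the lower of the highest peaks to its left and right, where that lower peak
    is at least `min_peak` times the global maximum.
    """
    spread = values.std()
    if spread == 0:
        return False
    h = 1.06 * spread * values.size ** (-0.2)  # normal-reference bandwidth
    grid = np.linspace(values.min(), values.max(), grid_size)
    z = (grid[:, None] - values[None, :]) / h
    dens = np.exp(-0.5 * z ** 2).sum(axis=1)  # unnormalised; only ratios matter
    left_peak = np.maximum.accumulate(dens)
    right_peak = np.maximum.accumulate(dens[::-1])[::-1]
    lower_peak = np.minimum(left_peak, right_peak)
    valley = (dens < depth * lower_peak) & (lower_peak >= min_peak * dens.max())
    return bool(valley.any())

def should_split(X, rng, n_directions=10):
    """Assumed rule: split if any random unit direction shows a density valley."""
    directions = rng.normal(size=(n_directions, X.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    projections = X @ directions.T
    return any(has_density_valley(projections[:, j]) for j in range(n_directions))

def two_means(X, n_iter=50):
    """Lloyd's algorithm with K=2; returns a boolean mask marking the second part."""
    # Deterministic start (an assumption): the point farthest from the mean,
    # and the point farthest from that one.
    first = X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]
    second = X[np.argmax(((X - first) ** 2).sum(axis=1))]
    centers = np.stack([first, second])
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        mask = dist[:, 1] < dist[:, 0]
        if mask.all() or not mask.any():
            break
        new_centers = np.stack([X[~mask].mean(axis=0), X[mask].mean(axis=0)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return mask

def divisive_clustering(X, n_directions=10, min_size=20, seed=0):
    """Bisect clusters until no random projection shows a density valley."""
    rng = np.random.default_rng(seed)
    labels = np.zeros(X.shape[0], dtype=int)
    pending = [0]      # labels of clusters still to be examined
    next_label = 1
    while pending:
        c = pending.pop()
        idx = np.flatnonzero(labels == c)
        if idx.size < min_size or not should_split(X[idx], rng, n_directions):
            continue   # cluster is kept as a leaf
        mask = two_means(X[idx])
        if mask.all() or not mask.any():
            continue   # degenerate split; keep the cluster whole
        labels[idx[mask]] = next_label
        pending += [c, next_label]
        next_label += 1
    return labels
```

Calling `divisive_clustering(X)` on an `(n, d)` NumPy array returns one integer label per row. The number of clusters is not supplied in advance; it follows from where the stopping rule halts the bisection, which is the point of replacing a fixed K with a density-based criterion.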
Cite this article
Kovaleva, E.V., Mirkin, B.G. Bisecting K-Means and 1D Projection Divisive Clustering: A Unified Framework and Experimental Comparison. J Classif 32, 414–442 (2015). https://doi.org/10.1007/s00357-015-9186-y
DOI: https://doi.org/10.1007/s00357-015-9186-y