An Exact Algorithm for the Two-Mode KL-Means Partitioning Problem

Journal of Classification

Abstract

Two-mode partitioning applications are increasingly common in the physical and social sciences, with a variety of models and methods spanning these applications. Two-mode KL-means partitioning (TMKLMP) is one type of two-mode partitioning model whose conceptual appeal stems largely from the fact that it generalizes the ubiquitous (one-mode) K-means clustering problem. A number of heuristic methods have been proposed for TMKLMP, ranging from a two-mode version of the K-means heuristic to metaheuristic approaches based on simulated annealing, genetic algorithms, variable neighborhood search, fuzzy steps, and tabu search. We present an exact algorithm for TMKLMP based on branch-and-bound programming and demonstrate its utility for the clustering of brand switching, manufacturing cell formation, and journal citation data. Although the proposed branch-and-bound algorithm does not obviate the need for approximation methods for large two-mode data sets, it does provide a first step in the development of methods that afford a guarantee of globally optimal solutions for TMKLMP.
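To make the problem concrete, the following minimal sketch assumes the usual least-squares formulation of TMKLMP, which the abstract does not spell out: rows are partitioned into K clusters, columns into L clusters, and the criterion is the sum of squared deviations of each entry from the mean of its block. The function names (`two_mode_sse`, `labelings`, `brute_force_tmklmp`) and the example matrix are hypothetical, and the search shown is plain exhaustive enumeration, not the branch-and-bound algorithm of the article. It only illustrates what a globally optimal solution means on a tiny instance.

```python
from itertools import product


def two_mode_sse(X, row_labels, col_labels, K, L):
    """Sum of squared deviations of each entry from its block mean.

    Assumed criterion: block (k, l) holds the entries whose row is in
    row cluster k and whose column is in column cluster l.
    """
    sums = [[0.0] * L for _ in range(K)]
    counts = [[0] * L for _ in range(K)]
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            k, l = row_labels[i], col_labels[j]
            sums[k][l] += x
            counts[k][l] += 1
    sse = 0.0
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            k, l = row_labels[i], col_labels[j]
            sse += (x - sums[k][l] / counts[k][l]) ** 2
    return sse


def labelings(n, n_clusters):
    """Yield every assignment of n items to n_clusters non-empty clusters."""
    for labels in product(range(n_clusters), repeat=n):
        if len(set(labels)) == n_clusters:
            yield labels


def brute_force_tmklmp(X, K, L):
    """Exhaustive search over all row and column partitions.

    Feasible only for very small matrices; equivalent relabelings are
    not pruned, so the same partition is visited more than once.
    """
    best = None
    for rows in labelings(len(X), K):
        for cols in labelings(len(X[0]), L):
            sse = two_mode_sse(X, rows, cols, K, L)
            if best is None or sse < best[0]:
                best = (sse, rows, cols)
    return best


if __name__ == "__main__":
    # Hypothetical 4 x 4 matrix with an obvious 2 x 2 block structure.
    X = [
        [9, 8, 1, 0],
        [8, 9, 0, 1],
        [1, 0, 9, 8],
        [0, 1, 8, 9],
    ]
    sse, rows, cols = brute_force_tmklmp(X, K=2, L=2)
    print(sse, rows, cols)
```

The number of partition pairs grows exponentially with the matrix dimensions, which is why enumeration of this kind stops being practical almost immediately and why bounding strategies or heuristics are needed for realistic data.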


Author information

Correspondence to Michael J. Brusco or Patrick Doreian.

Cite this article

Brusco, M.J., Doreian, P. An Exact Algorithm for the Two-Mode KL-Means Partitioning Problem. J Classif 32, 481–515 (2015). https://doi.org/10.1007/s00357-015-9185-z

  • DOI: https://doi.org/10.1007/s00357-015-9185-z
