Skip to main content
Log in

Affinity Propagation and Uncapacitated Facility Location Problems

  • Published:
Journal of Classification Aims and scope Submit manuscript

Abstract

One of the most important distinctions that must be made in clustering research is the difference between models (or problems) and the methods for solving those problems. Nowhere is this more evident than with the evaluation of the popular affinity propagation algorithm (apcluster.m), which is a MATLAB implementation of a neural clustering method that has received significant attention in the biological sciences and other disciplines. Several authors have undertaken comparisons of apcluster.m with methods designed for models that fall within the class of uncapacitated facility location problems (UFLPs). These comparative models include the p-center (or K-center) model and, more importantly, the p-median (or K-median) model. The results across studies are conflicting and clouded by the fact that, although similar, the optimization model underlying apcluster.m is slightly different from the p-median model and appreciably different from the pcenter model. In this paper, we clarify that apcluster.m is actually a heuristic for a ‘maximization version’ of another model in the class of UFLPs, which is known as the simple plant location problem (SPLP). An exact method for the SPLP is described, and the apcluster.m program is compared to a fast heuristic procedure (sasplp.m) in both a simulation experiment and across numerous datasets from the literature. Although the exact method is the preferred approach when computationally feasible, both apcluster.m and sasplp.m are efficient and effective heuristic approaches, with the latter slightly outperforming the former in most instances.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • AGMON, S. (1954), “The Relaxation Method for Linear Inequalities”, Canadian Journal of Mathematics, 6, 382–392.

  • ALBA, E., and DOMINGUEZ, E. (2006), “Comparative Analysis of Modern Optimization Tools for the p-median Problem”, Statistics and Computing, 16, 251–260.

    Article  MathSciNet  Google Scholar 

  • ALP, O., ERKUT, E., and DREZNER, Z. (2003), “An Efficient Genetic Algorithm for the p-Median Problem”, Annals of Operations Research, 122, 21–42.

    Article  MATH  MathSciNet  Google Scholar 

  • ANDERSON, E. (1935), “The Irises of the Gaspé Peninsula”, Bulletin of the American Iris Society, 59, 2–5.

  • APELTSIN, L., MORRIS, J.H., BABBITT, P.C., and FERRIN, T.E. (2011), “Improving the Quality of Protein Similarity Network Clustering Algorithms Using the Network Edge Weight Distribution”, Bioinformatics, 27, 326–333.

    Article  Google Scholar 

  • AVELLA, P., SASSANO, A., and VASIL’EV, I. (2007), “Computational Study of Large-Scale p-Median Problems”, Mathematical Programming A, 109, 89–114.

  • BALINSKI, M.L. (1965), “Integer Programming: Methods, Uses, Computation”, Management Science, 12, 253–313.

    Article  MATH  MathSciNet  Google Scholar 

  • BELTRAN, C., TADONKI, C., and VIAL, J. (2006), “Solving the p-Median Problem with a Semi-Lagrangian Relaxation”, Computational Optimization and Applications, 35(2), 239–260.

    Article  MATH  MathSciNet  Google Scholar 

  • BERROU, C., GLAVIEUX, A., and THITIMAJSHIMA, P. (1993), “Near Shannon Limit Error-Correcting Coding and Decoding: Turbo-Codes (1)”, in Proceedings of the IEEE International Conference on Communications, ICC 93, Vol. 2, pp. 1064–1070.

  • BILDE, O., and KRARUP, J. (1977), “Sharp Lower Bounds and Efficient Algorithms for the Simple Plant Location Problem”, Annals of Discrete Mathematics, 1, 79–88.

    Article  MathSciNet  Google Scholar 

  • BLANCHARD, S.J., ALOISE, D., and DESARBO, W.S. (2012), “The Heterogeneous pmedian Problem for Categorization Based Clustering”, Psychometrika, 77, 741–762.

    Article  MATH  MathSciNet  Google Scholar 

  • BRUSCO, M.J., and CRADIT, J.D. (2001), “A Variable Selection Heuristic for K-means Clustering”, Psychometrika, 66, 249–270.

    Article  MATH  MathSciNet  Google Scholar 

  • BRUSCO, M.J., and KÖHN, H.-F. (2008a), “Comment on ‘Clustering by Passing Messages Between Data Points’”, Science, 319, 726c.

  • BRUSCO, M.J., and KÖHN, H.-F. (2008b), “Optimal Partitioning of a Data Set Based on the p-median Model”, Psychometrika, 73, 89–105.

    Article  MATH  MathSciNet  Google Scholar 

  • BRUSCO, M.J., and KÖHN, H.-F. (2009), “Exemplar-Based Clustering via Simulated Annealing”, Psychometrika, 74, 457–475.

    Article  MATH  MathSciNet  Google Scholar 

  • BRUSCO, M.J., and STEINLEY, D. (2007), “A Comparison of Heuristic Procedures for Minimum Within-Cluster Sums of Squares Partitionin”, Psychometrika, 72, 583–600.

    Article  MATH  MathSciNet  Google Scholar 

  • CHANG, J.T. (2012), “Deriving Transcriptional Programs and Functional Processes from Gene Expression Databases”, Bioinformatics, 28, 1122–1129.

    Article  Google Scholar 

  • CHEN, L., CHAN, T.-H., CHOYKE, P.L., HILLMAN, E.M.C., CHI, C.-Y., BHUJWALLA, Z.M., WANG, G., WANG, S.S., SZABO, Z., and WANG, Y. (2011), “CAM-CM: A Signal Deconvolution Tool for in vivo Dynamic Contrast-Enhanced Imaging of Complex Tissues”, Bioinformatics, 27, 2607–2609.

    Google Scholar 

  • CHIYOSHI, F., and GALVÃO, R.D. (2000), “A Statistical Analysis of Simulated Annealing Applied to the p-Median Problem”, Annals of Operations Research, 96, 61–74.

    Article  MATH  Google Scholar 

  • CHRISTOFIDES, N., and BEASLEY, J. (1982), “A Tree Search Algorithm for the pmedian Problem”, European Journal of Operational Research, 10, 196–204.

    Article  MATH  MathSciNet  Google Scholar 

  • CORNUEJOLS, G., FISHER, M.L., and NEMHAUSER, G.L. (1977), “Location of Bank Accounts to Optimize Float: An Analytic Study of Exact and Approximate Algorithms”, Management Science, 23, 789–810.

    Article  MATH  MathSciNet  Google Scholar 

  • DUECK, D. (2009), “Affinity Propagation: Clustering Data by Passing Messages”, Unpublished Doctoral Dissertation, Graduate Department of Electrical and Computer Engineering, University of Toronto.

  • EFROYMSON, M.A., and RAY, T.L. (1966), “A Branch-and-Bound Algorithm for Plant Location”, Operations Research, 14, 361–375.

    Article  Google Scholar 

  • EL-SHAIEB, A.M. (1973), “A New Algorithm for Locating Sources Among Destinations”, Management Science, 20, 221–231.

  • ERLENKOTTER, D. (1978), “A Dual Procedure for Uncapacitated Facility Location”, Operations Research, 26, 992–1009.

  • FISHER, R.A. (1936), “The Use of Multiple Measurements in Taxonomic Problems”, Annals of Eugenics, 7, 179–188.

    Article  Google Scholar 

  • FREY, B., and DUECK, D. (2007), “Clustering by Passing Messages Between Data Points”, Science, 315, 972–976.

    Article  MATH  MathSciNet  Google Scholar 

  • FREY, B., and DUECK, D. (2008), “Response to Comment on ‘Clustering by Passing Messages Between Data Points’”, Science, 319, 726d.

  • GALVÃO, R.D. (1980), “A Dual-Bounded Algorithm for the p-median Problem”, Operations Research, 28, 1112–1121.

    Article  MATH  MathSciNet  Google Scholar 

  • GALVÃO, R.D. (2004), “Uncapacitated Facility Location Problems: Contributions”, Pesquisa Operacional, 24, 7–38.

    Article  Google Scholar 

  • GALVÃO, R.D., and RAGGI, L.A. (1989), “A Method for Solving to Optimality Uncapacitated Location Problems”, Annals of Operations Research, 18, 225–244.

    Article  MATH  MathSciNet  Google Scholar 

  • GRÖTSCHEL, M., and HOLLAND, O. (1991), “Solution of Large-Scale Symmetric Traveling Salesman Problems”, Mathematical Programming, 51, 141–202.

    Article  MATH  MathSciNet  Google Scholar 

  • HAIR, J.F., ANDERSON, R.E., TATHAM, R.L., and BLACK, W.C. (1998), Multivariate Data Analysis (5th ed.), Upper Saddle River, NJ: Prentice Hall.

  • HAKIMI, S.L. (1964), “Optimum Locations of Switching Centers and the Absolute Centers and Medians of a Graph”, Operations Research, 12, 450–459.

    Article  MATH  Google Scholar 

  • HAKIMI, S.L. (1965), “Optimum Distribution of Switching Centers in a Communication Network and Some Related Graph Theory Problems”, Operations Research, 123, 462–475.

    Article  MathSciNet  Google Scholar 

  • HANJOUL, P., and PEETERS, D. (1985), “A Comparison of Two Dual-Based Procedures for Solving the p-Median Problem”, European Journal of Operational Research, 20, 387–396.

    Article  MATH  MathSciNet  Google Scholar 

  • HANSEN, P., and JAUMARD, B. (1997), “Cluster Analysis and Mathematical Programming”, Mathematical Programming, 79, 191–215.

    MATH  MathSciNet  Google Scholar 

  • HANSEN, P., and MLADENOVIĆ, N. (1997), “Variable Neighborhood Search for the p-Median”, Location Science, 5, 207–226.

    Article  MATH  Google Scholar 

  • HANSEN, P., and MLADENOVIĆ, N. (2008), “Complement to a Comparative Analysis of Heuristics for the p-Median Problem”, Statistics and Computing, 18, 41–46.

    Article  MathSciNet  Google Scholar 

  • HANSEN, P., MLADENOVIĆ, N., and PEREZ-BRITO, D. (2001), “Variable Neighborhood Decomposition Search”, Journal of Heuristics, 7, 335–350.

    Article  MATH  Google Scholar 

  • HELD, M., and KARP, R.M. (1970), “The Traveling Salesman Problem and Minimum Spanning Trees”, Operations Research, 18, 1138–1162.

    Article  MATH  MathSciNet  Google Scholar 

  • HELD, M., WOLFE, P., and CROWDER, H.P. (1974), “Validation of Subgradient Optimization”, Mathematical Programming, 6, 62–88.

    Article  MATH  MathSciNet  Google Scholar 

  • HEINZ, G., PETERSON, L.J., JOHNSON, R.W., and KERK, C.J. (2003), “Exploring Relationships in Body Dimensions”, Journal of Statistical Education, 11, www.amstat.org/publications/jse/v11n2/datasets.heinz.html.

  • HUBERT, L., and ARABIE, P. (1985), “Comparing Partitions”, Journal of Classification, 2, 193–218.

    Article  Google Scholar 

  • JARVINEN, P., RAJALA, J., and SINERVO, H. (1972), “A Branch-and-Bound Algorithm for Seeking the p-median”, Operations Research, 20, 173–178.

    Article  Google Scholar 

  • KARALETSOS, T., STEGLE, O., DREYER, D., WINN, J., and BORGWARDT, K.M. (2012), “ShapePheno: Unsupervised Extraction of Shape Phenotypes from biological Image Collections”, Bioinformatics, 28, 1001–1008.

    Article  Google Scholar 

  • KAUFMAN, L., and ROUSSEEUW, P.J. (2005), Finding Groups in Data: An Introduction to Cluster Analysis (2nd ed.), New York: Wiley.

  • KIDDLE, S.J., WINDRAM, O.P.F., MCHATTIE, S., MEAD, A., BEYNON, J., BUCHANAN-WOLLASTON, V., DENBY, K.J., and MUKHERJEE, S. (2010), “Temporal Clustering by Affinity Propagation Reveals Transcriptional Modules in Arabidopsis Thaliana”, Bioinformatics, 26, 355–362.

    Article  Google Scholar 

  • KLASTORIN, T. (1985), “The p-median Problem for Cluster Analysis: A Comparative Test Using the Mixture Model Approach”, Management Science, 31, 84–95.

  • KÖHN, H.-F., STEINLEY, D., and BRUSCO, M.J. (2010), “The p-median Model as a Tool for Clustering Psychological Data”, Psychological Methods, 15, 87–95.

    Article  Google Scholar 

  • KUEHN, A.A., and HAMBURGER, M.J. (1963), “A Heuristic Program for Locating Warehouses”, Management Science, 9, 643–666.

    Article  Google Scholar 

  • LEVANOVA, T., and LORESH, M.A. (2004), “Algorithms of Ant System and Simulated Annealing for the p-median Problem”, Automation and Remote Control, 65, 431–438.

    Article  MATH  Google Scholar 

  • LIN, S., and KERNIGHAN, B.W. (1973), “An Effective Heuristic Algorithm for the Traveling Salesman Problem”, Operations Research, 21, 498–516.

    Article  MATH  MathSciNet  Google Scholar 

  • MARANZANA, F.E. (1964), “On the Location of Supply Points to Minimize Transportation Costs”, Operations Research, 12, 138–139.

    Google Scholar 

  • MATHWORKS, INC. (2006), Using MATLAB (Version 7), Natick MA: The MathWorks, Inc.

  • MÉZARD, M., PARISI, G., and ZECCHINA, R. (2002), “Analytic and Algorithmic Solution of Random Satisfiability Problems”, Science, 297, 812–815.

    Article  Google Scholar 

  • MILLIGAN, G.W. (1980), “An Examination of the Effects of Six Types of Error Perturbation on Fifteen Clustering Algorithms”, Psychometrika, 45, 325–342.

    Article  Google Scholar 

  • MILLIGAN, G.W. (1996), “Clustering Validation: Results and Implications for Applied Analyses.” in Clustering and Classification, eds. P. Arabie, L.J. Hubert, and G. De Soete, River Edge NJ: World Scientific Publishing, pp. 321–375.

  • MILLIGAN, G.W., and COOPER, M.C. (1985), “An Examination of Procedures for Determining the Number of Clusters in a Data Set”, Psychometrika, 50, 159–179.

    Article  Google Scholar 

  • MILLIGAN, G.W., and COOPER, M.C. (1988), “A Study of the Standardization of Variables in Cluster Analysis”, Journal of Classification, 5, 181–204.

    Article  MathSciNet  Google Scholar 

  • MLADENOVIĈ, N., BRIMBERG, J., HANSEN, P., and MORENO-PÉREZ, J.A. (2007), “The p-Median Problem: A Survey of Metaheuristic Approaches”, European Journal of Operational Research, 179, 927–939.

    Article  MATH  MathSciNet  Google Scholar 

  • MOTZKIN, T., and SCHOENBERG, I.J. (1954), “The Relaxation Method for Linear Inequalities”, Canadian Journal of Mathematics, 6, 393–404.

    Article  MATH  MathSciNet  Google Scholar 

  • MUKHERJEE, S., and HILL, S.M. (2011), “Network Clustering: Probing Biological Heterogeneity by Sparse Graphical Models”, Bioinformatics, 27, 994–1000.

    Article  Google Scholar 

  • MULVEY, J.M., and CROWDER, H.P. (1979), “Cluster Analysis: An Application of Lagrangian Relaxation”, Management Science, 25, 329–340.

    Article  MATH  Google Scholar 

  • NARULA, S.C., OGBU, U.I., and SAMUELSON, H.M. (1977), “An Algorithm for the pmedian Problem”, Operations Research, 25, 709–713.

    Article  MATH  MathSciNet  Google Scholar 

  • RAO, M. R. (1971), “Cluster Analysis and Mathematical Programming”, Journal of the American Statistical Association, 66, 622–626.

  • REINELT, G. (2001), TSPLIB, http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95.

  • RESENDE, M.G.C., and WERNECK, R.F. (2004), “A Hybrid Heuristic for the p-median Problem”, Journal of Heuristics, 10, 59–88.

    Article  MATH  Google Scholar 

  • REVELLE, C.S., and SWAIN, R. (1970), “Central Facilities Location”, Geographical Analysis, 2, 30–42.

    Article  Google Scholar 

  • ROLLAND, E., SCHILLING, D.A. and CURRENT, J.R. (1996), “An Efficient Tabu Search Procedure for the p-Median Problem”, European Journal of Operational Research, 96, 329–342.

    Article  Google Scholar 

  • SPÄTH, H. (1980), Cluster Analysis Algorithms for Data Reduction and Classification of Objects, New York: Wiley.

  • SPELLMAN, P.T., SHERLOK, G., ZHANG, M.Q., IYER, V.R., ANDERS, K., EISEN, M.B., BROWN, P.O., BOTSTEIN, D., and FUTCHER, B. (1998), “Comprehensive Identification of Cell Cycle-Regulated Genes of the Yeast Saccharomyces Cerevisiae by Microarray Hybridization”, Molecular Biology of the Cell, 9, 3273–3297.

    Article  Google Scholar 

  • STEINLEY, D. (2003), “Local Optima in K-means Clustering: What You Don’t Know May Hurt You”, Psychological Methods, 8, 294–304.

  • STEINLEY, D. (2004), “Properties of the Hubert-Arabie Adjusted Rand Index”, Psychological Methods, 9, 386–396.

  • STEINLEY, D. (2006), “Profiling Local Optima in K-means Clustering: Developing a Diagnostic Technique”, Psychological Methods, 11, 178–192.

  • STEINLEY, D., and BRUSCO, M.J. (2007), “Initializing K-means Batch Clustering: A Critical Analysis of Several Techniques”, Journal of Classification, 24, 99–121.

    Article  MATH  MathSciNet  Google Scholar 

  • STEINLEY, D., and BRUSCO, M.J. (2008a), “A New Variable Weighting and Selection Procedure for K-means Cluster Analysis”, Multivariate Behavioral Research, 43, 77–108.

    Article  Google Scholar 

  • STEINLEY, D., and BRUSCO, M.J. (2008b), “Selection of Variables in Cluster Analysis: An Empirical Comparison of Eight Procedures”, Psychometrika, 73, 125–144.

    Article  MATH  MathSciNet  Google Scholar 

  • STEINLEY, D., and BRUSCO, M.J. (2011a), “Choosing the Number of Clusters in Kmeans Clustering”, Psychological Methods, 16, 285–297.

    Article  Google Scholar 

  • STEINLEY, D., and BRUSCO, M.J. (2011b), “Evaluating Mixture-Modeling for Clustering: Recommendations and Cautions”, Psychological Methods, 16, 63–79.

    Article  Google Scholar 

  • STEINLEY, D., and HENSON, R. (2005), “An Analytic Method for Generating Clusters with Known Overlap”, Journal of Classification, 22, 221–250.

    Article  MathSciNet  Google Scholar 

  • TANG, D., ZHU, Q., and YANG, F. (2010), “A Poisson-Based Adaptive Affinity Propagation Clustering for SAGE Data”, Computational Biology and Chemistry, 34, 63–70.

    Article  MathSciNet  Google Scholar 

  • TEITZ, M.B. and BART, P. (1968), “Heuristic Methods for Estimating the Generalized Vertex Median of a Weighted Graph”, Operations Research, 16, 955–961.

    Article  MATH  Google Scholar 

  • THIZY, J.-M., VAN WASSENHOVE, L., and KHUMAWALA, B. (1985), “Comparison of Exact and Approximate Methods of Solving the Uncapacitated Plant Location Problem”, Journal of Operations Management, 6, 23–34.

    Article  Google Scholar 

  • VINOD, H. (1969), “Integer Programming and the Theory of Grouping”, Journal of the American Statistical Association, 64, 506–517.

  • VLASBLOM, J. and WODAK, S.J. (2009), “Markov Clustering versus Affinity Propagation for the Partitioning of Protein Interaction Graphs”, BMC Bioinformatics, 10, 99.

  • WHITAKER, R. (1983), “A Fast Algorithm for the Greedy Interchange of Large-Scale Clustering and Median Location Problems”, INFOR, 21, 95–108.

  • WOŹNIAK, M., TIURYN, J., and DUTKOWSKI, J. (2010), “MODEVO: Exploring Modularity and Evolution of Protein Interaction Networks”, Bioinformatics, 26, 1790–1791.

    Article  Google Scholar 

  • ZHANG, J., LI, D., CHEN, H. and FANG, F. (2011), “Analysis of Activity in fMRI Data Using Affinity Propagation Clustering”, Computer Methods in Biomechanics and Biomedical Engineering, 14, 271–281.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michael J. Brusco.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Brusco, M.J., Steinley, D. Affinity Propagation and Uncapacitated Facility Location Problems. J Classif 32, 443–480 (2015). https://doi.org/10.1007/s00357-015-9187-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00357-015-9187-x

Keywords

Navigation