Approximation Algorithms for Min-Sum k-Clustering and Balanced k-Median

  • Conference paper
Automata, Languages, and Programming (ICALP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9134)

Abstract

We consider two closely related fundamental clustering problems in this paper. In the Min-Sum k-Clustering problem, one is given a metric space and has to partition the points into k clusters while minimizing the total pairwise distance between points assigned to the same cluster. In the Balanced k-Median problem, the instance is the same and one has to obtain a partition into k clusters \(C_1,\ldots ,C_k\), where each cluster \(C_i\) has a center \(c_i\), while minimizing the total assignment cost of the points in the metric; here the cost of assigning a point j to a cluster \(C_i\) is equal to \(|C_i|\) times the distance between j and \(c_i\) in the metric.
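To make the two objectives concrete, the following minimal sketch evaluates both costs for a given clustering. It is an illustration only, not the paper's algorithm. The function names, the distance-matrix representation, and the toy instance are assumptions, and the Min-Sum cost is counted here over unordered pairs (counting ordered pairs would double it).

```python
from itertools import combinations


def min_sum_cost(dist, clusters):
    """Total distance over unordered pairs of points sharing a cluster."""
    return sum(dist[u][v] for c in clusters for u, v in combinations(c, 2))


def balanced_k_median_cost(dist, clusters, centers):
    """Sum over clusters C_i of |C_i| times the distances from its points to c_i."""
    return sum(
        len(c) * sum(dist[j][ctr] for j in c)
        for c, ctr in zip(clusters, centers)
    )


# Hypothetical toy instance: four points on a line at positions 0, 1, 5, 6.
xs = [0, 1, 5, 6]
dist = [[abs(a - b) for b in xs] for a in xs]

clusters = [[0, 1], [2, 3]]   # point indices grouped into k = 2 clusters
centers = [0, 2]              # index of the chosen center for each cluster

print(min_sum_cost(dist, clusters))                      # 2
print(balanced_k_median_cost(dist, clusters, centers))   # 4
```

The factor \(|C_i|\) in the second function is what distinguishes Balanced k-Median from ordinary k-Median: large clusters are penalized, which mirrors the quadratic growth of pairwise distances in the Min-Sum objective.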

In this paper, we present an \(O(\log n)\)-approximation for both these problems where n is the number of points in the metric that are to be served. This is an improvement over the \(O(\epsilon ^{-1}\log ^{1 + \epsilon } n)\)-approximation (for any constant \(\epsilon > 0\)) obtained by Bartal, Charikar, and Raz [STOC ’01]. We also obtain a quasi-PTAS for Balanced k-Median in metrics with constant doubling dimension.
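One standard way to see why a guarantee for one problem carries over to the other (a sketch based on the triangle inequality, not quoted from this abstract) is to compare the two costs cluster by cluster. Here \(d\) denotes the metric, \(C\) a single cluster, and the Min-Sum cost of \(C\) is written over ordered pairs:

```latex
\[
\min_{c \in C} \; |C| \sum_{j \in C} d(j,c)
\;\le\;
\sum_{u \in C} \sum_{v \in C} d(u,v)
\;\le\;
2\,|C| \sum_{j \in C} d(j,c') \qquad \text{for every point } c'.
\]
```

The left inequality holds because the double sum is \(|C|\) times an average of the quantities \(\sum_{v \in C} d(u,v)\) over \(u \in C\), and an average is at least the minimum. The right inequality follows from \(d(u,v) \le d(u,c') + d(c',v)\) summed over all ordered pairs. The two objectives are therefore within a constant factor of each other, so an \(\alpha\)-approximation for one yields an \(O(\alpha)\)-approximation for the other.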

As in the work of Bartal et al., our approximation for general metrics uses embeddings into tree metrics. The main technical contribution in this paper is an O(1)-approximation for Balanced k-Median in hierarchically separated trees (HSTs). Our improvement comes from a more direct dynamic programming approach that heavily exploits properties of standard HSTs. In this way, we avoid the reduction to special types of HSTs that were considered by Bartal et al., thereby avoiding an additional \(O(\epsilon ^{-1} \log ^\epsilon n)\) loss.
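The way the two ingredients combine can be sketched as follows. This is the usual tree-embedding argument under stated assumptions, not a reproduction of the paper's proof: \(d\) is the input metric, \(d_T\) is a random tree metric with \(d(u,v) \le d_T(u,v)\) and \(\mathbb{E}[d_T(u,v)] \le O(\log n)\, d(u,v)\) as in [8], \(S^*\) is an optimal solution under \(d\), and \(S\) is the solution returned by an O(1)-approximation run on the tree.

```latex
\[
\mathbb{E}\big[\mathrm{cost}_{d}(S)\big]
\;\le\; \mathbb{E}\big[\mathrm{cost}_{d_T}(S)\big]
\;\le\; O(1)\cdot \mathbb{E}\big[\mathrm{cost}_{d_T}(S^*)\big]
\;\le\; O(\log n)\cdot \mathrm{cost}_{d}(S^*).
\]
```

The last step uses linearity of expectation: for a fixed clustering and fixed centers, the Balanced k-Median cost is a nonnegative linear combination of distances, so its expectation under \(d_T\) grows by at most the expected stretch.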

M.R. Salavatipour—Supported by NSERC.

References

  1. Agarwal, A., Charikar, M., Makarychev, K., Makarychev, Y.: \(O(\sqrt{\log n})\)-approximation algorithms for Min UnCut, Min-2CNF deletion, and directed cut problems. In: Proc. of STOC (2005)

  2. Arya, V., Garg, N., Khandekar, R., Meyerson, A., Munagala, K., Pandit, V.: Local Search Heuristics for \(k\)-Median and Facility Location Problem. SIAM Journal on Computing 33, 544–562 (2004)

  3. Bartal, Y.: Probabilistic approximation of metric spaces and its algorithmic application. In: Proc. of FOCS (1996)

  4. Bartal, Y., Charikar, M., Raz, D.: Approximating min-sum \(k\)-Clustering in metric spaces. In: Proc. of STOC (2001)

  5. Byrka, J., Pensyl, T., Rybicki, B., Srinivasan, A., Trinh, K.: An improved approximation for \(k\)-median, and positive correlation in budgeted optimization. In: Proc. of SODA (2015)

  6. Chuzhoy, J., Rabani, Y.: Approximating k-median with non-uniform capacities. In: Proc. of SODA (2005)

  7. Czumaj, A., Sohler, C.: Small space representations for metric min-sum k-clustering and their applications. In: Thomas, W., Weil, P. (eds.) STACS 2007. LNCS, vol. 4393, pp. 536–548. Springer, Heidelberg (2007)

  8. Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: Proc. of STOC (2003)

  9. de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: Proc. of STOC (2003)

  10. Guttman-Beck, N., Hassin, R.: Approximation algorithms for min-sum \(p\)-clustering. Discrete Applied Mathematics 89, 125–142 (1998)

  11. Indyk, P.: A sublinear time approximation scheme for clustering in metric spaces. In: Proc. of FOCS (1999)

  12. Kann, V., Khanna, S., Lagergren, J., Panconessi, A.: On the hardness of max \(k\)-cut and its dual. In: Israeli Symposium on Theoretical Computer Science (1996)

  13. Li, S., Svensson, O.: Approximating \(k\)-median via pseudo-approximation. In: Proc. of STOC (2013)

  14. Sahni, S., Gonzalez, T.: \(P\)-Complete Approximation Problems. J. of the ACM (JACM) 23(3), 555–565 (1976)

  15. Schulman, L.J.: Clustering for edge-cost minimization. In: Proc. of STOC (2000)

  16. Talwar, K.: Bypassing the embedding: algorithms for low dimensional metrics. In: Proc. of STOC (2004)

  17. Wu, C., Xu, D., Du, D., Wang, Y.: An improved approximation algorithm for \(k\)-median problem using a new factor-revealing LP. http://arxiv.org/abs/1410.4161

Author information

Corresponding author

Correspondence to Zachary Friggstad.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Behsaz, B., Friggstad, Z., Salavatipour, M.R., Sivakumar, R. (2015). Approximation Algorithms for Min-Sum k-Clustering and Balanced k-Median. In: Halldórsson, M., Iwama, K., Kobayashi, N., Speckmann, B. (eds) Automata, Languages, and Programming. ICALP 2015. Lecture Notes in Computer Science, vol 9134. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47672-7_10

  • DOI: https://doi.org/10.1007/978-3-662-47672-7_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-47671-0

  • Online ISBN: 978-3-662-47672-7

  • eBook Packages: Computer Science, Computer Science (R0)
