Abstract
We consider two closely related fundamental clustering problems. In the Min-Sum k-Clustering problem, one is given a metric space and must partition its points into k clusters so as to minimize the sum of pairwise distances between points assigned to the same cluster. In the Balanced k-Median problem, the instance is the same and one must partition the points into k clusters \(C_1,\ldots ,C_k\), each cluster \(C_i\) having a center \(c_i\), so as to minimize the total assignment cost; here the cost of assigning a point j to a cluster \(C_i\) is \(|C_i|\) times the distance between j and \(c_i\) in the metric.
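To make the two objectives concrete, the following is a minimal sketch that evaluates each cost for a given clustering. The function names, the representation of clusters as lists, and the convention of counting each unordered pair once are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import combinations
from typing import Callable, Hashable, Sequence

Point = Hashable
Metric = Callable[[Point, Point], float]


def min_sum_cost(clusters: Sequence[Sequence[Point]], dist: Metric) -> float:
    """Min-Sum k-Clustering objective: sum of distances over all
    unordered pairs of points that share a cluster (assumed convention)."""
    return sum(
        dist(u, v)
        for cluster in clusters
        for u, v in combinations(cluster, 2)
    )


def balanced_k_median_cost(
    clusters: Sequence[Sequence[Point]],
    centers: Sequence[Point],
    dist: Metric,
) -> float:
    """Balanced k-Median objective: each point j in cluster C_i pays
    |C_i| * d(j, c_i), where c_i is the center of C_i."""
    return sum(
        len(cluster) * sum(dist(j, center) for j in cluster)
        for cluster, center in zip(clusters, centers)
    )


# Hypothetical toy instance: five points on a line with d(x, y) = |x - y|.
def line_metric(x: int, y: int) -> int:
    return abs(x - y)


example_clusters = [[0, 1, 2], [10, 11]]
example_centers = [1, 10]

print(min_sum_cost(example_clusters, line_metric))
print(balanced_k_median_cost(example_clusters, example_centers, line_metric))
```

In this toy instance the larger cluster pays a multiplier of 3 on every assignment distance, which is how the Balanced k-Median objective penalizes large clusters in the same spirit as the quadratic number of pairs in the Min-Sum objective.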
In this paper, we present an \(O(\log n)\)-approximation for both these problems where n is the number of points in the metric that are to be served. This is an improvement over the \(O(\epsilon ^{-1}\log ^{1 + \epsilon } n)\)-approximation (for any constant \(\epsilon > 0\)) obtained by Bartal, Charikar, and Raz [STOC ’01]. We also obtain a quasi-PTAS for Balanced k-Median in metrics with constant doubling dimension.
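One standard way to see why a single guarantee can cover both problems is the following short calculation. It is a sketch under two assumptions that are not stated in the abstract: the center \(c\) of a cluster \(C\) is one of its points, and the Min-Sum cost counts each pair in both orders (under the unordered convention the quantity \(\mathrm{MS}(C)\) below is halved). Write \(\mathrm{MS}(C) = \sum_{i \in C}\sum_{j \in C} d(i,j)\) and \(\mathrm{BM}(C,c) = |C| \sum_{j \in C} d(j,c)\). By the triangle inequality, for any \(c\),
\[
\mathrm{MS}(C) \le \sum_{i \in C}\sum_{j \in C} \bigl(d(i,c) + d(c,j)\bigr) = 2\,|C| \sum_{j \in C} d(j,c) = 2\,\mathrm{BM}(C,c),
\]
and, since a minimum is at most an average over \(c \in C\),
\[
\min_{c \in C} \mathrm{BM}(C,c) \le \frac{1}{|C|} \sum_{c \in C} |C| \sum_{j \in C} d(j,c) = \mathrm{MS}(C).
\]
Hence \(\min_{c \in C} \mathrm{BM}(C,c) \le \mathrm{MS}(C) \le 2 \min_{c \in C} \mathrm{BM}(C,c)\) for every cluster, so under these assumptions the two objectives differ by at most a factor of 2 on any clustering.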
As in the work of Bartal et al., our approximation for general metrics uses embeddings into tree metrics. The main technical contribution in this paper is an O(1)-approximation for Balanced k-Median in hierarchically separated trees (HSTs). Our improvement comes from a more direct dynamic programming approach that heavily exploits properties of standard HSTs. In this way, we avoid the reduction to special types of HSTs that were considered by Bartal et al., thereby avoiding an additional \(O(\epsilon ^{-1} \log ^\epsilon n)\) loss.
M.R. Salavatipour—Supported by NSERC.
References
Agarwal, A., Charikar, M., Makarychev, K., Makarychev, Y.: \(O(\sqrt{\log n})\)-approximation algorithms for Min UnCut, Min-2CNF deletion, and directed cut problems. In: Proc. of STOC (2005)
Arya, V., Garg, N., Khandekar, R., Meyerson, A., Munagala, K., Pandit, V.: Local Search Heuristics for \(k\)-Median and Facility Location Problem. SIAM Journal on Computing 33, 544–562 (2004)
Bartal, Y.: Probabilistic approximation of metric spaces and its algorithmic application. In: Proc. of FOCS (1996)
Bartal, Y., Charikar, M., Raz, D.: Approximating min-sum \(k\)-Clustering in metric spaces. In: Proc. of STOC (2001)
Byrka, J., Pensyl, T., Rybicki, B., Srinivasan, A., Trinh, K.: An improved approximation for \(k\)-median, and positive correlation in budgeted optimization. In: Proc. of SODA (2015)
Chuzhoy, J., Rabani, Y.: Approximating k-median with non-uniform capacities. In: Proc. of SODA (2005)
Czumaj, A., Sohler, C.: Small space representations for metric min-sum k-clustering and their applications. In: Thomas, W., Weil, P. (eds.) STACS 2007. LNCS, vol. 4393, pp. 536–548. Springer, Heidelberg (2007)
Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: Proc. of STOC (2003)
de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: Proc. of STOC (2003)
Guttman-Beck, N., Hassin, R.: Approximation algorithms for min-sum \(p\)-clustering. Discrete Applied Mathematics 89, 125–142 (1998)
Indyk, P.: A sublinear time approximation scheme for clustering in metric spaces. In: Proc. of FOCS (1999)
Kann, V., Khanna, S., Lagergren, J., Panconesi, A.: On the hardness of max \(k\)-cut and its dual. In: Israeli Symposium on Theoretical Computer Science (1996)
Li, S., Svensson, O.: Approximating \(k\)-median via pseudo-approximation. In: Proc. of STOC (2013)
Sahni, S., Gonzalez, T.: \(P\)-Complete Approximation Problems. J. of the ACM (JACM) 23(3), 555–565 (1976)
Schulman, L.J.: Clustering for edge-cost minimization. In: Proc. of STOC (2000)
Talwar, K.: Bypassing the embedding: algorithms for low dimensional metrics. In: Proc. of STOC (2004)
Wu, C., Xu, D., Du, D., Wang, Y.: An improved approximation algorithm for \(k\)-median problem using a new factor-revealing LP. http://arxiv.org/abs/1410.4161
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Behsaz, B., Friggstad, Z., Salavatipour, M.R., Sivakumar, R. (2015). Approximation Algorithms for Min-Sum k-Clustering and Balanced k-Median. In: Halldórsson, M., Iwama, K., Kobayashi, N., Speckmann, B. (eds) Automata, Languages, and Programming. ICALP 2015. Lecture Notes in Computer Science, vol 9134. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47672-7_10
DOI: https://doi.org/10.1007/978-3-662-47672-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47671-0
Online ISBN: 978-3-662-47672-7
eBook Packages: Computer Science (R0)