An approximation algorithm for the uniform capacitated k-means problem

Journal of Combinatorial Optimization

Abstract

In this paper, we consider the uniform capacitated k-means problem (UC-k-means), an extension of the classical k-means problem (k-means) in machine learning. In UC-k-means, we are given a set \(\mathcal {D}\) of n points in d-dimensional space and an integer k. Every point in the d-dimensional space has a uniform capacity, which is an upper bound on the number of points in \(\mathcal {D}\) that can be connected to this point. Every pair of points in the space has an associated connecting cost, which is equal to the square of the distance between the two points. We want to find at most k points in the space as centers and connect every point in \(\mathcal {D}\) to some center without violating the capacity constraint, such that the total connecting cost is minimized. Based on the technique of local search, we present a bi-criteria approximation algorithm for UC-k-means, which has a constant approximation guarantee and violates the cardinality constraint within a constant factor.
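
To make the problem statement concrete, the following is a minimal sketch of how a candidate UC-k-means solution could be checked and scored. It is not the authors' algorithm: it only evaluates the objective described above for a given set of centers and a given assignment. The names `squared_distance`, `uc_kmeans_cost`, `assignment`, and `capacity` are hypothetical and chosen for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Connecting cost of a pair: squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def uc_kmeans_cost(
    points: Sequence[Point],
    centers: Sequence[Point],
    assignment: Sequence[int],
    capacity: int,
    k: int,
) -> float:
    """Total connecting cost of a candidate solution.

    assignment[i] is the index (into centers) of the center that
    points[i] is connected to. Raises ValueError if the solution
    opens more than k centers, leaves a point unconnected, or
    connects more than `capacity` points to one center.
    """
    if len(centers) > k:
        raise ValueError("more than k centers opened")
    if len(assignment) != len(points):
        raise ValueError("every point must be connected to a center")

    # Count how many points each center serves.
    load = [0] * len(centers)
    for j in assignment:
        load[j] += 1
    if any(count > capacity for count in load):
        raise ValueError("capacity constraint violated")

    return sum(
        squared_distance(p, centers[j]) for p, j in zip(points, assignment)
    )


# Four points, two centers, uniform capacity 2: each point is at
# squared distance 1 from its center.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
centers = [(0.0, 1.0), (4.0, 1.0)]
total = uc_kmeans_cost(points, centers, [0, 0, 1, 1], capacity=2, k=2)
print(total)
```

A bi-criteria solution of the kind the abstract describes may open more than k centers, so it would be checked against a relaxed bound (k times some constant) rather than k itself. The abstract does not state that constant, so it is left as a caller-supplied value here.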

Acknowledgements

The first two authors are supported by the Natural Science Foundation of China (No. 11531014). The third author is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) grant 06446, and by the Natural Science Foundation of China (Nos. 11771386, 11728104). The fourth author is supported by the Natural Science Foundation of China (No. 11871081).

Author information

Corresponding author

Correspondence to Dongmei Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Han, L., Xu, D., Du, D. et al. An approximation algorithm for the uniform capacitated k-means problem. J Comb Optim 44, 1812–1823 (2022). https://doi.org/10.1007/s10878-020-00550-y

  • DOI: https://doi.org/10.1007/s10878-020-00550-y

Keywords