Local search approximation algorithms for the k-means problem with penalties

Journal of Combinatorial Optimization

Abstract

In this paper, we study the k-means problem with (nonuniform) penalties (k-MPWP), a natural generalization of the classic k-means problem. In the k-MPWP, we are given a set of n clients \( {\mathcal {D}} \subset {\mathbb {R}}^d\), a penalty cost \(p_j>0\) for each \(j \in {\mathcal {D}}\), and an integer \(k \le n\). The goal is to open a set of centers \(F \subset {\mathbb {R}}^d\) with \( |F| \le k\) and to choose a subset of clients \(P \subseteq {\mathcal {D}} \) to be penalized such that the total cost is minimized, where the total cost is the sum of the squared distances from each client in \( {\mathcal {D}} \backslash P \) to its nearest open center plus the sum of the penalty costs of the clients in P. We present a local search \(( 81+ \varepsilon )\)-approximation algorithm for the k-MPWP using single-swap operations. We further improve the approximation ratio to \(( 25+ \varepsilon )\) using multi-swap operations.
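
The objective and the single-swap idea described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes that centers are chosen from a finite, user-supplied candidate list (the abstract allows any points of \({\mathbb {R}}^d\)), that the search starts from the first k candidates, and that a swap is accepted only if it reduces the cost by a factor of at least \(1 - \varepsilon / k\). The function names `kmpwp_cost` and `single_swap_local_search` are hypothetical. The sketch relies on one observation that follows directly from the problem definition: once F is fixed, each client independently pays the smaller of its penalty and its squared distance to the nearest open center.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmpwp_cost(clients, penalties, centers):
    """Total k-MPWP cost for fixed centers, plus the indices of penalized clients.

    For fixed centers, client j is penalized exactly when its penalty is
    smaller than its squared distance to the nearest open center.
    """
    total = 0.0
    penalized = []
    for j, (point, penalty) in enumerate(zip(clients, penalties)):
        nearest = min(sq_dist(point, c) for c in centers)
        if penalty < nearest:
            penalized.append(j)
            total += penalty
        else:
            total += nearest
    return total, penalized


def single_swap_local_search(clients, penalties, candidates, k, eps=0.01):
    """Single-swap local search over a finite candidate set (an assumption).

    Repeatedly replaces one open center with one closed candidate whenever
    the swap lowers the cost below (1 - eps / k) times the current cost.
    The acceptance threshold is an illustrative choice, not taken from the paper.
    """
    centers = list(candidates[:k])  # arbitrary feasible start (assumption)
    cost, _ = kmpwp_cost(clients, penalties, centers)
    improved = True
    while improved:
        improved = False
        for i in range(len(centers)):
            for c_in in candidates:
                if c_in in centers:
                    continue
                trial = centers[:i] + [c_in] + centers[i + 1:]
                trial_cost, _ = kmpwp_cost(clients, penalties, trial)
                if trial_cost < (1 - eps / k) * cost:
                    centers, cost = trial, trial_cost
                    improved = True
                    break
            if improved:
                break
    cost, penalized = kmpwp_cost(clients, penalties, centers)
    return centers, cost, penalized


if __name__ == "__main__":
    # Two tight pairs of points and one far-away point that is cheaper to penalize.
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (100, 100)]
    pens = [5, 5, 5, 5, 5]
    print(single_swap_local_search(pts, pens, candidates=pts, k=2))
```

Restricting centers to a finite candidate list keeps the swap neighborhood enumerable; a multi-swap variant would exchange several centers at once at the price of a larger neighborhood.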

Acknowledgements

The research of the first author is supported by the Higher Educational Science and Technology Program of Shandong Province (No. J15LN23) and the Science and Technology Development Plan Project of Jinan City (No. 201401211). The second author is supported by the Ri-Xin Talents Project of Beijing University of Technology. The third author is supported by the Natural Science Foundation of China (No. 11501412). The fourth author is supported by the Natural Science Foundation of China (No. 11531014). The fifth author is supported by Beijing Excellent Talents Funding (No. 2014000020124G046).

Author information

Corresponding author

Correspondence to Dachuan Xu.

Additional information

A preliminary version of this paper appeared in Proceedings of the 23rd International Computing and Combinatorics Conference, pp. 568–574, 2017.

About this article

Cite this article

Zhang, D., Hao, C., Wu, C. et al. Local search approximation algorithms for the k-means problem with penalties. J Comb Optim 37, 439–453 (2019). https://doi.org/10.1007/s10878-018-0278-6

  • DOI: https://doi.org/10.1007/s10878-018-0278-6
