An Improved Approximation Algorithm for the k-Means Problem with Penalties

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11458)

Abstract

Clustering has received considerable attention in various fields of computer science. In many applications, however, noisy data poses a major challenge for clustering. One way to handle noisy data is clustering with penalties, which has been studied extensively, for example in the k-median problem with penalties and the facility location problem with penalties. As far as we know, there is only one approximation algorithm for the k-means problem with penalties, with ratio \(25+\epsilon \). All previous results for clustering problems with penalties were based on local search, LP-rounding, or primal-dual techniques, which cannot be applied directly to the k-means problem with penalties to obtain an approximation ratio better than \(25+\epsilon \). In this paper, we apply the primal-dual technique to the k-means problem with penalties with a different rounding method: a deterministic rounding algorithm in place of the randomized rounding algorithm used in previous approximation schemes. Based on this method, we present an approximation algorithm with ratio \(19.849+\epsilon \) for the k-means problem with penalties.
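To make the objective concrete, the following minimal sketch evaluates the cost of a candidate solution. It assumes the standard formulation of the k-means problem with penalties, which the abstract does not spell out: each point either pays its squared Euclidean distance to the nearest chosen center or pays its own penalty, whichever is smaller. The function names and the sample data are hypothetical, and the sketch only evaluates a given set of centers; it is not the approximation algorithm of the paper.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def penalized_kmeans_cost(
    points: Sequence[Point],
    penalties: Sequence[float],
    centers: Sequence[Point],
) -> float:
    """Cost of a solution under the assumed penalty formulation.

    Each point contributes the smaller of (a) its squared distance to the
    nearest center and (b) its penalty, i.e. it is either served or discarded.
    """
    total = 0.0
    for point, penalty in zip(points, penalties):
        # With no centers at all, every point has to pay its penalty.
        nearest = min(
            (squared_distance(point, c) for c in centers),
            default=float("inf"),
        )
        total += min(nearest, penalty)
    return total


# Hypothetical data: two nearby points and one far-away noisy point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
penalties = [5.0, 5.0, 3.0]
centers = [(0.5, 0.0)]

cost = penalized_kmeans_cost(points, penalties, centers)
print(cost)
```

In this example the two nearby points are each served at squared distance 0.25, while the far-away point is cheaper to discard at penalty 3.0 than to serve, which is how the penalty term absorbs noisy data.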

This work is supported by the National Natural Science Foundation of China under Grants 61672536, 61420106009, 61872450, and 61828205, and by the Hunan Provincial Science and Technology Program (2018WK4001).

Corresponding author

Correspondence to Jianxin Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Feng, Q., Zhang, Z., Shi, F., Wang, J. (2019). An Improved Approximation Algorithm for the k-Means Problem with Penalties. In: Chen, Y., Deng, X., Lu, M. (eds) Frontiers in Algorithmics. FAW 2019. Lecture Notes in Computer Science, vol 11458. Springer, Cham. https://doi.org/10.1007/978-3-030-18126-0_15

  • DOI: https://doi.org/10.1007/978-3-030-18126-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18125-3

  • Online ISBN: 978-3-030-18126-0

  • eBook Packages: Computer Science, Computer Science (R0)
