The bi-criteria seeding algorithms for two variants of k-means problem

Li, Min

doi:10.1007/s10878-020-00537-9

Published: 12 February 2020

Volume 44, pages 1693–1704, (2022)

Journal of Combinatorial Optimization

Min Li (ORCID: orcid.org/0000-0003-2784-5073)

Abstract

The k-means problem is a classic and important problem in computer science and machine learning, and many variants have been proposed for different application settings, such as the k-means problem with penalties and spherical k-means clustering. Since the k-means problem is NP-hard, approximation algorithms for it are an active research topic. In this paper, we apply a bi-criteria seeding algorithm to both the k-means problem with penalties and the spherical k-means problem, and improve upon the performance guarantees given by the k-means++ algorithm for these two problems.
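The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a bi-criteria seeding step might look like, not the paper's own procedure. It assumes the usual k-means++ style sampling (each new center is drawn with probability proportional to a point's current cost), opens ceil(beta * k) centers instead of k, and, for the penalty variant, caps each point's cost at a uniform penalty. The function name `bicriteria_seeding`, the parameters `beta` and `penalty`, and the uniform choice of the first center are all assumptions made for illustration.

```python
import math
import random


def bicriteria_seeding(points, k, beta, penalty=None, rng=None):
    """Cost-proportional seeding that opens ceil(beta * k) centers (sketch).

    points  : list of equal-length tuples of floats
    beta    : oversampling factor (assumed >= 1), hypothetical parameter name
    penalty : optional uniform cap on each point's cost (assumption);
              None gives plain k-means++ style sampling
    """
    rng = rng or random.Random(0)
    num_centers = min(len(points), math.ceil(beta * k))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def capped(d2):
        # With penalties, a point never pays more than its penalty.
        return d2 if penalty is None else min(d2, penalty)

    # First center: chosen uniformly at random (an assumption of this sketch).
    centers = [rng.choice(points)]
    cost = [capped(sq_dist(p, centers[0])) for p in points]

    while len(centers) < num_centers:
        total = sum(cost)
        if total == 0:  # every point already coincides with a center
            break
        # Draw the next center with probability proportional to current cost.
        idx = rng.choices(range(len(points)), weights=cost, k=1)[0]
        centers.append(points[idx])
        cost = [min(c, capped(sq_dist(p, points[idx])))
                for p, c in zip(points, cost)]

    return centers, sum(cost)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
    seeds, total_cost = bicriteria_seeding(data, k=2, beta=1.5, penalty=4.0)
    print(len(seeds), total_cost)
```

For the spherical variant, one option under the same assumptions is to normalize every input to unit length first: for unit vectors, the squared Euclidean distance equals 2(1 − cos θ), so the same sampling weights are proportional to the cosine-based cost.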

Acknowledgements

The author is supported by Higher Educational Science and Technology Program of Shandong Province (No. J17KA171) and Shandong Provincial Natural Science Foundation (No. ZR2019MA032) of China.

Author information

Authors and Affiliations

School of Mathematics and Statistics, Shandong Normal University, Jinan, 250014, People’s Republic of China
Min Li

Corresponding author

Correspondence to Min Li.

Additional information

Dedicated to Professor Minyi Yue on the Occasion of His 100th Birthday.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, M. The bi-criteria seeding algorithms for two variants of k-means problem. J Comb Optim 44, 1693–1704 (2022). https://doi.org/10.1007/s10878-020-00537-9

Issue Date: October 2022
