The seeding algorithm for spherical k-means clustering with penalties

Ji, Sai; Xu, Dachuan; Guo, Longkun; Li, Min; Zhang, Dongmei

doi:10.1007/s10878-020-00569-1

Published: 15 April 2020

Volume 44, pages 1977–1994, (2022)

Journal of Combinatorial Optimization

Sai Ji¹, Dachuan Xu¹, Longkun Guo² (ORCID: orcid.org/0000-0003-2891-4253), Min Li³ & Dongmei Zhang⁴

Abstract

Spherical k-means clustering, a known NP-hard variant of the k-means problem, has broad applications in data mining. In contrast to k-means, it aims to partition a collection of given data distributed on a spherical surface into k sets so as to minimize the within-cluster sum of cosine dissimilarity. In this paper, we introduce spherical k-means clustering with penalties and give a $2\max \{2,M\}(1+M)(\ln k+2)$-approximation algorithm. Moreover, we prove that on separable instances of spherical k-means clustering with penalties, our algorithm achieves an approximation ratio of $2\max \{3,M+1\}$ with high probability, where M is the ratio of the maximal to the minimal penalty cost of the given data set.
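To make the objective in the abstract concrete, the following Python sketch may help. It is not the authors' algorithm, which the abstract does not spell out. It assumes that every data point is a unit vector, that a point is either served by its closest center at cost one minus their inner product or left out at its penalty cost, and that seeding follows the k-means++ pattern: the first center is drawn uniformly and each later one with probability proportional to the point's current cost. The function names (`point_cost`, `total_cost`, `seed_centers`, `ratio_bound`) are hypothetical and chosen only for illustration.

```python
import math
import random


def cosine_dissimilarity(x, c):
    """Return 1 - <x, c> for unit vectors, clamped at 0 against rounding."""
    return max(0.0, 1.0 - sum(a * b for a, b in zip(x, c)))


def point_cost(x, penalty, centers):
    """Cheaper of two options (assumed model): serve x or pay its penalty."""
    if not centers:
        return penalty
    nearest = min(cosine_dissimilarity(x, c) for c in centers)
    return min(nearest, penalty)


def total_cost(points, penalties, centers):
    """Objective value of a center set under the assumed penalty model."""
    return sum(point_cost(x, p, centers) for x, p in zip(points, penalties))


def seed_centers(points, penalties, k, rng=None):
    """k-means++-style seeding sketch (an assumption, not the paper's text)."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        weights = [point_cost(x, p, centers) for x, p in zip(points, penalties)]
        if sum(weights) <= 0.0:
            break  # every point already has zero cost
        # later centers: probability proportional to the current cost
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def ratio_bound(k, penalties):
    """Evaluate 2*max{2, M}*(1 + M)*(ln k + 2) from the abstract."""
    m = max(penalties) / min(penalties)  # M: max penalty over min penalty
    return 2 * max(2, m) * (1 + m) * (math.log(k) + 2)
```

With uniform penalties, M equals 1 and `ratio_bound` reduces to 8(ln k + 2). Capping each sampling weight at the point's penalty keeps points that are cheaper to reject than to serve from dominating the choice of later centers, which is the motivation for this assumed sampling rule.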

Acknowledgements

Sai Ji, Dachuan Xu and Dongmei Zhang are supported by the National Natural Science Foundation of China (No. 11871081). The third author, Longkun Guo, is supported by the National Natural Science Foundation of China (No. 61772005) and the Natural Science Foundation of Fujian Province (No. 2017J01753). The fourth author, Min Li, is supported by the Higher Educational Science and Technology Program of Shandong Province (No. J17KA171) and the Natural Science Foundation of Shandong Province (No. ZR2019MA032) of China.

Author information

Authors and Affiliations

Department of Operations Research and Scientific Computing, Beijing University of Technology, Beijing, 100124, People’s Republic of China
Sai Ji & Dachuan Xu
School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250353, People’s Republic of China
Longkun Guo
School of Mathematics and Statistics, Shandong Normal University, Jinan, 250014, People’s Republic of China
Min Li
School of Computer Science and Technology, Shandong Jianzhu University, Jinan, 250101, People’s Republic of China
Dongmei Zhang

Corresponding author

Correspondence to Longkun Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appeared in Proceedings of the 13th International Conference on Algorithmic Aspects in Information and Management, pp. 149–158, 2019.

About this article

Cite this article

Ji, S., Xu, D., Guo, L. et al. The seeding algorithm for spherical k-means clustering with penalties. J Comb Optim 44, 1977–1994 (2022). https://doi.org/10.1007/s10878-020-00569-1

Published: 15 April 2020
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10878-020-00569-1
