Abstract
Spherical k-means clustering is a generalization of the k-means problem, which is NP-hard and has wide applications in data mining. It aims to partition a given collection of unit-length data points into k sets so as to minimize the within-cluster sum of cosine dissimilarities. In this paper, we introduce spherical k-means clustering with penalties and give a \(2\max \{2,M\}(1+M)(\ln k+2)\)-approximation algorithm, where M is the ratio of the maximal to the minimal penalty value over the given data set.
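To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It evaluates the stated approximation ratio and one plausible form of the penalized objective. The abstract does not spell out the objective, so the code assumes that each point either pays its cosine dissimilarity \(1-\langle x,c\rangle\) to the nearest center or pays its penalty, whichever is smaller. The function names `approximation_ratio`, `penalty_ratio`, `cosine_dissimilarity`, and `penalized_cost` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def approximation_ratio(k: int, M: float) -> float:
    """Evaluate 2*max{2, M}*(1 + M)*(ln k + 2), the ratio stated in the abstract."""
    return 2 * max(2.0, M) * (1 + M) * (math.log(k) + 2)


def penalty_ratio(penalties: Sequence[float]) -> float:
    """M: ratio of the maximal to the minimal penalty value."""
    return max(penalties) / min(penalties)


def cosine_dissimilarity(x: Vector, c: Vector) -> float:
    """Assumed form 1 - <x, c>, valid when both vectors have unit length."""
    return 1.0 - sum(a * b for a, b in zip(x, c))


def penalized_cost(points: Sequence[Vector],
                   penalties: Sequence[float],
                   centers: Sequence[Vector]) -> float:
    """Assumed objective: each point pays the smaller of its dissimilarity
    to the nearest center and its own penalty."""
    total = 0.0
    for x, p in zip(points, penalties):
        nearest = min(cosine_dissimilarity(x, c) for c in centers)
        total += min(nearest, p)
    return total


# Three unit vectors in the plane, one center, uniform penalties (so M = 1).
points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
penalties = [0.5, 0.5, 0.5]
centers = [(1.0, 0.0)]

M = penalty_ratio(penalties)
cost = penalized_cost(points, penalties, centers)
ratio = approximation_ratio(len(centers), M)
```

In this toy instance the first point coincides with the center and costs nothing, while the other two are cheaper to penalize than to serve. With uniform penalties M equals 1, so the ratio reduces to \(8(\ln k+2)\).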
Acknowledgements
The first and second authors are supported by the National Natural Science Foundation of China (No. 11531014). The third author is supported by the National Natural Science Foundation of China (No. 61772005) and the Natural Science Foundation of Fujian Province (No. 2017J01753). The fourth author is supported by the Higher Educational Science and Technology Program of Shandong Province (No. J17KA171). The fifth author is supported by the National Natural Science Foundation of China (No. 11871081).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, S., Xu, D., Guo, L., Li, M., Zhang, D. (2019). The Seeding Algorithm for Spherical k-Means Clustering with Penalties. In: Du, DZ., Li, L., Sun, X., Zhang, J. (eds) Algorithmic Aspects in Information and Management. AAIM 2019. Lecture Notes in Computer Science, vol 11640. Springer, Cham. https://doi.org/10.1007/978-3-030-27195-4_14
DOI: https://doi.org/10.1007/978-3-030-27195-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27194-7
Online ISBN: 978-3-030-27195-4
eBook Packages: Computer Science, Computer Science (R0)