Abstract
We consider the robust version of the classic k-center clustering problem, where we wish to remove up to z points (outliers), so as to be able to cluster the remaining points in k clusters with minimum maximum radius. We study such a problem under the fully dynamic adversarial model, where points can be inserted or deleted arbitrarily. In this setting, the main goal is to design algorithms that maintain a high quality solution at any point in time, while requiring a “small” amortized cost, i.e. a “small” number of operations per insertion or deletion, on average. In our work, we provide the first constant bi-criteria approximation algorithm for such a problem with its amortized cost being independent of both z and the size of the current input. We also complement our positive result with a lower bound showing that any constant (non bi-criteria) approximation algorithm has amortized cost at least linear in z. Finally, we conduct an in-depth experimental analysis of our algorithm on Twitter, Flickr, and Air-Quality datasets showing the effectiveness of our approach.
Similar content being viewed by others
Notes
We note here that the known coresets for k-center with outliers do not imply an algorithm with bounded amortized update time for such a problem, in general metric.
However, for practical implementation, we could possibly re-use the underlying intermediate data structures.
References
Hou, B.J., Zhang, L., Zhou, Z.H.: Learning with feature evolvable streams. In: Advances in Neural Information Processing Systems, pp. 1417–1427 (2017)
Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: SIGKDD, pp. 226–235 (2003)
Zhang, P., Li, J., Wang, P., Gao, B.J., Zhu, X., Guo, L.: Enabling fast prediction for ensemble models on data streams. In: SIGKDD, pp. 177–185 (2011)
Chan, T.H.H., Guerqin, A., Sozio, M.: Fully Dynamic k-Center Clustering. In: WWW, pp. 579–587 (2018)
Goranci, G., Henzinger, M., Leniowski, D., Svozil, A.: Fully Dynamic k-Center Clustering in Doubling Metrics (2019). arXiv:1908.03948
Schmidt, M., Sohler, C.: Fully dynamic hierarchical diameter k-clustering and k-center (2019). arXiv:1908.02645
Cohen-Addad, V., Hjuler, N.O.D., Parotsidis, N., Saulpic, D., Schwiegelshohn, C.: Fully Dynamic Consistent Facility Location. In: NeurIPS, pp. 3250–3260 (2019)
Henzinger, M., Kale, S.: Fully-Dynamic Coresets. In: ESA 2020. vol. 173 of LIPIcs, pp. 57:1–57:21 (2020)
Bateni, M., Esfandiari, H., Jayaram, R., Mirrokni, V.S.: Optimal Fully Dynamic k-Centers Clustering (2021) arXiv:2112.07050
Ceccarello, M., Pietracaprina, A., Pucci, G.: Solving k-center Clustering (with Outliers) in MapReduce and Streaming, almost as Accurately as Sequentially. Proceedings of the VLDB Endowment. 12(7), 766–778 (2019)
de Berg, M., Monemizadeh, M., Zhong, Y.: k-Center Clustering with Outliers in the Sliding-Window Model. In: ESA. vol. 204. Schloss Dagstuhl, pp. 13:1–13:13 (2021)
Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: SODA, pp. 642–651 (2001)
Bhaskara, A., Vadgama, S., Xu, H.: Greedy sampling for approximate clustering in the presence of outliers. In: NeurIPS, pp. 11146–11155 (2019)
Harris, D.G., Pensyl, T.W., Srinivasan, A., Trinh, K.: A lottery model for center-type problems with outliers. ACM Trans Algor. 15(3), 36:1-36:25 (2019)
Malkomes, G., Kusner, M.J., Chen, W., Weinberger, K.Q., Moseley, B.: Fast distributed k-center clustering with outliers on massive data. In: Advances in Neural Information Processing Systems, pp. 1063–1071 (2015)
McCutchen, R.M., Khuller, S.: Streaming algorithms for k-center clustering with outliers and with anonymity. In: APPROX-RANDOM. Springer, pp. 165–178 (2008)
Cohen-Addad, V., Schwiegelshohn, C., Sohler, C.: Diameter and k-Center in Sliding Windows. In: ICALP, pp. 19:1–19:12 (2016)
Ding, H., Yu, H., Wang, Z.: Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction. In: ESA (2019)
Huang, L., Jiang, S., Li, J., Wu, X.: Epsilon-coresets for clustering (with outliers) in doubling metrics. In: FOCS, pp. 814–825 (2018)
Putina, A., Sozio, M., Rossi, D., Navarro, J.M.: Random histogram forest for unsupervised anomaly detection. In: ICDM. IEEE, pp. 1226–1231 (2020)
Alon, N., Krivelevich, M., Newman, I., Szegedy, M.: Regular Languages Are Testable with a Constant Number of Queries. In: FOCS, pp. 645–655 (1999)
Epasto, A., Lattanzi, S., Sozio, M.: Efficient densest subgraph computation in evolving graphs. In: Proceedings of the 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, May 18–22, 2015, pp. 300–310 (2015)
Chan, T.H.H., Lattanzi, S., Sozio, M., Wang, B.: Fully dynamic k-center clustering with outliers. In: COCOON. Vol. 13595 of Lecture Notes in Computer Science, pp. 150–161. Springer (2022)
Acknowledgements
T-H. Hubert Chan was partially supported by the Hong Kong RGC under the grants 17201220, 17202121 and 17203122. The conference version of this paper has appeared in COCOON 2022 [23].
Author information
Authors and Affiliations
Contributions
Sozio had the initial ideas for the project. Sozio and Chan wrote the algorithms and their analysis. Lattanzi wrote the lower bound proof. Wang conducted the experiments. All authors have contributed to the editing and reviewing of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chan, TH.H., Lattanzi, S., Sozio, M. et al. Fully Dynamic k-Center Clustering with Outliers. Algorithmica 86, 171–193 (2024). https://doi.org/10.1007/s00453-023-01159-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00453-023-01159-3