
Fully Dynamic k-Center Clustering with Outliers

Algorithmica

Abstract

We consider the robust version of the classic k-center clustering problem, in which up to z points (outliers) may be removed so that the remaining points can be clustered into k clusters with minimum maximum radius. We study this problem in the fully dynamic adversarial model, where points can be inserted or deleted arbitrarily. In this setting, the main goal is to design algorithms that maintain a high-quality solution at every point in time while incurring a small amortized cost, i.e., a small number of operations per insertion or deletion on average. We provide the first constant bi-criteria approximation algorithm for this problem whose amortized cost is independent of both z and the size of the current input. We complement this positive result with a lower bound showing that any constant (non-bi-criteria) approximation algorithm has amortized cost at least linear in z. Finally, we conduct an in-depth experimental analysis of our algorithm on Twitter, Flickr, and Air-Quality datasets, showing the effectiveness of our approach.
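To make the objective concrete, the following is a minimal sketch that only evaluates a candidate set of centers under the k-center-with-outliers objective; it is not the dynamic algorithm of the paper. Each point is assigned to its nearest center, the z farthest points are discarded as outliers, and the cost is the largest remaining distance. The function name `robust_radius`, the Euclidean metric, and the toy data are assumptions made for illustration.

```python
import math
from typing import Sequence

Point = Sequence[float]


def robust_radius(points: Sequence[Point], centers: Sequence[Point], z: int) -> float:
    """Largest point-to-nearest-center distance after dropping the z farthest points."""
    if not centers:
        raise ValueError("at least one center is required")
    # Distance from every point to its closest center (Euclidean metric assumed).
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Treat the z farthest points as outliers and ignore them.
    kept = dists[: max(len(dists) - z, 0)]
    return kept[-1] if kept else 0.0


# Hypothetical toy instance: two tight groups plus one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (100.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]  # k = 2

print(robust_radius(points, centers, z=0))  # 90.0
print(robust_radius(points, centers, z=1))  # 1.0
```

With z = 0 the single far-away point forces a radius of 90, whereas allowing one outlier brings the radius down to 1, which is the effect the robust formulation is meant to capture.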



Notes

  1. We note that the known coresets for k-center with outliers do not, in general metric spaces, imply an algorithm with bounded amortized update time for this problem.

  2. However, for practical implementation, we could possibly re-use the underlying intermediate data structures.

  3. https://github.com/fe6Bc5R4JvLkFkSeExHM/k-center/tree/master/dataset.

  4. http://yfcc100m.appspot.com/.

  5. https://archive.ics.uci.edu/ml/datasets/Beijing+Multi-Site+Air-Quality+Data.

References

  1. Hou, B.J., Zhang, L., Zhou, Z.H.: Learning with feature evolvable streams. In: Advances in Neural Information Processing Systems, pp. 1417–1427 (2017)

  2. Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: SIGKDD, pp. 226–235 (2003)

  3. Zhang, P., Li, J., Wang, P., Gao, B.J., Zhu, X., Guo, L.: Enabling fast prediction for ensemble models on data streams. In: SIGKDD, pp. 177–185 (2011)

  4. Chan, T.H.H., Guerqin, A., Sozio, M.: Fully Dynamic k-Center Clustering. In: WWW, pp. 579–587 (2018)

  5. Goranci, G., Henzinger, M., Leniowski, D., Svozil, A.: Fully Dynamic k-Center Clustering in Doubling Metrics (2019). arXiv:1908.03948

  6. Schmidt, M., Sohler, C.: Fully dynamic hierarchical diameter k-clustering and k-center (2019). arXiv:1908.02645

  7. Cohen-Addad, V., Hjuler, N.O.D., Parotsidis, N., Saulpic, D., Schwiegelshohn, C.: Fully Dynamic Consistent Facility Location. In: NeurIPS, pp. 3250–3260 (2019)

  8. Henzinger, M., Kale, S.: Fully-Dynamic Coresets. In: ESA 2020. vol. 173 of LIPIcs, pp. 57:1–57:21 (2020)

  9. Bateni, M., Esfandiari, H., Jayaram, R., Mirrokni, V.S.: Optimal Fully Dynamic k-Centers Clustering (2021). arXiv:2112.07050

  10. Ceccarello, M., Pietracaprina, A., Pucci, G.: Solving k-center Clustering (with Outliers) in MapReduce and Streaming, almost as Accurately as Sequentially. Proceedings of the VLDB Endowment. 12(7), 766–778 (2019)


  11. de Berg, M., Monemizadeh, M., Zhong, Y.: k-Center Clustering with Outliers in the Sliding-Window Model. In: ESA. vol. 204. Schloss Dagstuhl, pp. 13:1–13:13 (2021)

  12. Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: SODA, pp. 642–651 (2001)

  13. Bhaskara, A., Vadgama, S., Xu, H.: Greedy sampling for approximate clustering in the presence of outliers. In: NeurIPS, pp. 11146–11155 (2019)

  14. Harris, D.G., Pensyl, T.W., Srinivasan, A., Trinh, K.: A lottery model for center-type problems with outliers. ACM Trans. Algorithms 15(3), 36:1–36:25 (2019)


  15. Malkomes, G., Kusner, M.J., Chen, W., Weinberger, K.Q., Moseley, B.: Fast distributed k-center clustering with outliers on massive data. In: Advances in Neural Information Processing Systems, pp. 1063–1071 (2015)

  16. McCutchen, R.M., Khuller, S.: Streaming algorithms for k-center clustering with outliers and with anonymity. In: APPROX-RANDOM. Springer, pp. 165–178 (2008)

  17. Cohen-Addad, V., Schwiegelshohn, C., Sohler, C.: Diameter and k-Center in Sliding Windows. In: ICALP, pp. 19:1–19:12 (2016)

  18. Ding, H., Yu, H., Wang, Z.: Greedy Strategy Works for k-Center Clustering with Outliers and Coreset Construction. In: ESA (2019)

  19. Huang, L., Jiang, S., Li, J., Wu, X.: Epsilon-coresets for clustering (with outliers) in doubling metrics. In: FOCS, pp. 814–825 (2018)

  20. Putina, A., Sozio, M., Rossi, D., Navarro, J.M.: Random histogram forest for unsupervised anomaly detection. In: ICDM. IEEE, pp. 1226–1231 (2020)

  21. Alon, N., Krivelevich, M., Newman, I., Szegedy, M.: Regular Languages Are Testable with a Constant Number of Queries. In: FOCS, pp. 645–655 (1999)

  22. Epasto, A., Lattanzi, S., Sozio, M.: Efficient densest subgraph computation in evolving graphs. In: Proceedings of the 24th International Conference on World Wide Web, WWW 2015, Florence, Italy, May 18–22, 2015, pp. 300–310 (2015)

  23. Chan, T.H.H., Lattanzi, S., Sozio, M., Wang, B.: Fully dynamic k-center clustering with outliers. In: COCOON. Vol. 13595 of Lecture Notes in Computer Science, pp. 150–161. Springer (2022)


Acknowledgements

T.-H. Hubert Chan was partially supported by the Hong Kong RGC under grants 17201220, 17202121 and 17203122. A conference version of this paper appeared in COCOON 2022 [23].

Author information

Contributions

Sozio had the initial ideas for the project. Sozio and Chan wrote the algorithms and their analysis. Lattanzi wrote the lower bound proof. Wang conducted the experiments. All authors have contributed to the editing and reviewing of the manuscript.

Corresponding author

Correspondence to T.-H. Hubert Chan.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chan, TH.H., Lattanzi, S., Sozio, M. et al. Fully Dynamic k-Center Clustering with Outliers. Algorithmica 86, 171–193 (2024). https://doi.org/10.1007/s00453-023-01159-3

  • DOI: https://doi.org/10.1007/s00453-023-01159-3
