Skip to main content

Improved Analysis of Complete-Linkage Clustering

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9294))

Abstract

Complete-linkage clustering is a very popular method for computing hierarchical clusterings in practice, which is not fully understood theoretically. Given a finite set P ⊆ ℝd of points, the complete-linkage method starts with each point from P in a cluster of its own and then iteratively merges two clusters from the current clustering that have the smallest diameter when merged into a single cluster.

We study the problem of partitioning P into k clusters such that the largest diameter of the clusters is minimized and we prove that the complete-linkage method computes an O(1)-approximation for this problem for any metric that is induced by a norm, assuming that the dimension d is a constant. This improves the best previously known bound of O(logk) due to Ackermann et al. (Algorithmica, 2014). Our improved bound also carries over to the k-center and the discrete k-center problem.

This research was supported by ERC Starting Grant 306465 (BeyondWorstCase).

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ackermann, M.R., Blömer, J., Kuntze, D., Sohler, C.: Analysis of agglomerative clustering. Algorithmica 69(1), 184–215 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  2. Cole, J.R., Wang, Q., Fish, J.A., Chai, B., McGarrell, D.M., Sun, Y., Brown, C.T., Porras-Alfaro, A., Kuske, C.R., Tiedje, J.M.: Ribosomal database project: data and tools for high throughput rrna analysis. Nucleic Acids Research (2013)

    Google Scholar 

  3. Dasgupta, S., Long, P.M.: Performance guarantees for hierarchical clustering. Journal of Computer and System Sciences 70(4), 555–569 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  4. Defays, D.: An efficient algorithm for a complete link method. The Computer Journal 20(4), 364–366 (1977)

    Article  MathSciNet  MATH  Google Scholar 

  5. Feder, T., Greene, D.H.: Optimal algorithms for approximate clustering. In: Proc. of the 20th Annual ACM Symposium on Theory of Computing (STOC), pp. 434–444 (1988)

    Google Scholar 

  6. Ghaemmaghami, H., Dean, D., Vogt, R., Sridharan, S.: Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach. In: Proc. of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4185–4188 (2012)

    Google Scholar 

  7. Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. Journal of Computer Security 19(4), 639–668 (2011)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Anna Großwendt .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Großwendt, A., Röglin, H. (2015). Improved Analysis of Complete-Linkage Clustering. In: Bansal, N., Finocchi, I. (eds) Algorithms - ESA 2015. Lecture Notes in Computer Science(), vol 9294. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48350-3_55

Download citation

  • DOI: https://doi.org/10.1007/978-3-662-48350-3_55

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48349-7

  • Online ISBN: 978-3-662-48350-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics