Skip to main content

Fully Dynamic Clustering of Metric Data Sets

  • Conference paper
  • First Online:
Advances in Databases (BNCOD 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2405))

Included in the following conference series:

  • 239 Accesses

Abstract

The goal of cluster analysis [10] is to find homogeneous groups, or clusters, in data. Homogeneity is often made precise by means of a dissimilarity function on objects, having low values at pairs of objects in one cluster. Cluster analysis has also been investigated in data mining [5], emphasising efficiency on data sets larger than main memory [4,6,8,9,16]. More recently, the growing importance of multimedia and transactional databases has stimulated interest in metric clustering, i.e. when dissimilarity satisfies the triangular inequality.

This work has been partially supported by project D2I of the Italian MIUR

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Yi-Jen Chiang, Michael T. Goodrich, Edward F. Grove, Roberto Tamassia, Darren Erik Vengroff, and Jeffrey Scott Vitter. External-memory graph algorithms. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 139–149, San Francisco, California, 22–24 January 1995.

    Google Scholar 

  2. Paolo Ciaccia, Marco Patella, and Pavel Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB’97, Proceedings of 23rd International Conference on Very Large Data Bases, pages 426–435, 1997.

    Google Scholar 

  3. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Michael Wimmer, and Xiaowei Xu. Incremental clustering for mining in a data warehousing environment. In Proc. 24th Int. Conf. Very Large Data Bases (VLDB), pages 323–333, 24–27 August 1998.

    Google Scholar 

  4. Martin Ester, Hans-Peter Kriegel, Jorg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Evangelos Simoudis, Jia Wei Han, and Usama Fayyad, editors, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), page 226. AAAI Press, 1996.

    Google Scholar 

  5. Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy, editors. Advances in Knowledge Discovery and Data Mining. AAAI Press/MIT Press, March 1996.

    Google Scholar 

  6. Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering (ICDE), 23–26 March 1999.

    Google Scholar 

  7. K. Chidananda Gowda and G. Krishna. Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition, 10(2):105–112, 1978.

    Article  MATH  Google Scholar 

  8. S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD-98), volume 27,2 of ACM SIGMOD Record, pages 73–84, New York, June 1–4 1998. ACM Press.

    Google Scholar 

  9. Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. ROCK: A robust clustering algorithm for categorical attributes. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Austrialia, pages 512–521. IEEE Computer Society, 1999.

    Google Scholar 

  10. John A. Hartigan. Clustering algorithms. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons., New York, 1975.

    MATH  Google Scholar 

  11. Jacob Holm, Kristian de Lichtenberg, and Mikkel Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, pages 79–89, Dallas, Texas, USA, May 23–26 1998. ACM.

    Google Scholar 

  12. R. Jarvis and E. Patrick. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, 22(11):1025–1034, November 1973.

    Google Scholar 

  13. Peter Bro Miltersen, Sairam Subramanian, Jeffrey Scott Vitter, and Roberto Tamassia. Complexity models for incremental computation. Theoretical Computer Science, 130(1):203–236, August 1994.

    Google Scholar 

  14. Jeffrey Scott Vitter. External memory algorithms and data structures. In James Abello and Jeffrey Scott Vitter, editors, External Memory Algorithms and Visualization, DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society Press, Providence, RI, 1999.

    Google Scholar 

  15. M. Anthony Wong and Tom Lane. A kth nearest neighbour clustering procedure. J. R. Statist. Soc. B, 45(3):362–368, 1983.

    MATH  MathSciNet  Google Scholar 

  16. Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: An efficient data clustering method for very large databases. In H. V Jagadish and Inderpal Singh Mumick, editors, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 103–114, Montreal, Quebec, Canada, 4–6 June 1996. SIGMOD Record 25(2), June 1996.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lodi, S. (2002). Fully Dynamic Clustering of Metric Data Sets. In: Eaglestone, B., North, S., Poulovassilis, A. (eds) Advances in Databases. BNCOD 2002. Lecture Notes in Computer Science, vol 2405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45495-0_12

Download citation

  • DOI: https://doi.org/10.1007/3-540-45495-0_12

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43905-9

  • Online ISBN: 978-3-540-45495-3

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics