Fully Dynamic Clustering of Metric Data Sets

Lodi, Stefano

doi:10.1007/3-540-45495-0_12

Stefano Lodi⁷

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2405))

Included in the following conference series:

British National Conference on Databases

239 Accesses

Abstract

The goal of cluster analysis [10] is to find homogeneous groups, or clusters, in data. Homogeneity is often made precise by means of a dissimilarity function on objects, having low values at pairs of objects in one cluster. Cluster analysis has also been investigated in data mining [5], emphasising efficiency on data sets larger than main memory [4,6,8,9,16]. More recently, the growing importance of multimedia and transactional databases has stimulated interest in metric clustering, i.e. when dissimilarity satisfies the triangular inequality.

This work has been partially supported by project D2I of the Italian MIUR

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Yi-Jen Chiang, Michael T. Goodrich, Edward F. Grove, Roberto Tamassia, Darren Erik Vengroff, and Jeffrey Scott Vitter. External-memory graph algorithms. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 139–149, San Francisco, California, 22–24 January 1995.
Google Scholar
Paolo Ciaccia, Marco Patella, and Pavel Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB’97, Proceedings of 23rd International Conference on Very Large Data Bases, pages 426–435, 1997.
Google Scholar
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Michael Wimmer, and Xiaowei Xu. Incremental clustering for mining in a data warehousing environment. In Proc. 24th Int. Conf. Very Large Data Bases (VLDB), pages 323–333, 24–27 August 1998.
Google Scholar
Martin Ester, Hans-Peter Kriegel, Jorg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Evangelos Simoudis, Jia Wei Han, and Usama Fayyad, editors, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), page 226. AAAI Press, 1996.
Google Scholar
Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy, editors. Advances in Knowledge Discovery and Data Mining. AAAI Press/MIT Press, March 1996.
Google Scholar
Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering (ICDE), 23–26 March 1999.
Google Scholar
K. Chidananda Gowda and G. Krishna. Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition, 10(2):105–112, 1978.
Article MATH Google Scholar
S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD-98), volume 27,2 of ACM SIGMOD Record, pages 73–84, New York, June 1–4 1998. ACM Press.
Google Scholar
Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. ROCK: A robust clustering algorithm for categorical attributes. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Austrialia, pages 512–521. IEEE Computer Society, 1999.
Google Scholar
John A. Hartigan. Clustering algorithms. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons., New York, 1975.
MATH Google Scholar
Jacob Holm, Kristian de Lichtenberg, and Mikkel Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, pages 79–89, Dallas, Texas, USA, May 23–26 1998. ACM.
Google Scholar
R. Jarvis and E. Patrick. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, 22(11):1025–1034, November 1973.
Google Scholar
Peter Bro Miltersen, Sairam Subramanian, Jeffrey Scott Vitter, and Roberto Tamassia. Complexity models for incremental computation. Theoretical Computer Science, 130(1):203–236, August 1994.
Google Scholar
Jeffrey Scott Vitter. External memory algorithms and data structures. In James Abello and Jeffrey Scott Vitter, editors, External Memory Algorithms and Visualization, DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society Press, Providence, RI, 1999.
Google Scholar
M. Anthony Wong and Tom Lane. A kth nearest neighbour clustering procedure. J. R. Statist. Soc. B, 45(3):362–368, 1983.
MATH MathSciNet Google Scholar
Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: An efficient data clustering method for very large databases. In H. V Jagadish and Inderpal Singh Mumick, editors, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 103–114, Montreal, Quebec, Canada, 4–6 June 1996. SIGMOD Record 25(2), June 1996.
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Electronics, Computer Science, and Systems, University of Bologna, CSITE-CNR, viale Risorgimento 2, 40136, Bologna, Italy
Stefano Lodi

Authors

Stefano Lodi
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Information Studies, University of Sheffield, Sheffield, S10 2TN, UK
Barry Eaglestone
Department of Computer Science, University of Sheffield, Sheffield, S10 2TN, UK
Siobhán North
Computer Science and Information Systems, Birkbeck College, Malet Street, London, WC1E 7HX, UK
Alexandra Poulovassilis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Lodi, S. (2002). Fully Dynamic Clustering of Metric Data Sets. In: Eaglestone, B., North, S., Poulovassilis, A. (eds) Advances in Databases. BNCOD 2002. Lecture Notes in Computer Science, vol 2405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45495-0_12

Download citation

DOI: https://doi.org/10.1007/3-540-45495-0_12
Published: 04 July 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43905-9
Online ISBN: 978-3-540-45495-3
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics