Abstract
Databases are increasingly important for storing complex objects from scientific, engineering, or multimedia applications. Examples of such data are chemical compounds, CAD drawings, and XML data. Efficient search for similar objects in such databases is a key feature. However, many similarity measures for complex objects are computationally expensive, which makes them unusable for large databases. In this paper, we combine and extend two techniques, metric index structures and multi-step query processing, to improve the performance of range query processing. The efficiency of our methods is demonstrated in extensive experiments on real-world data including graphs, trees, and vector sets.
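To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a multi-step (filter and refine) range query that uses the triangle inequality of a metric as its filter. It is not the algorithm proposed in the paper: the single-pivot table, the function names (`build_pivot_table`, `range_query`), and the use of Euclidean distance as a stand-in for an expensive exact distance are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Stand-in for an expensive exact distance (assumed to be a metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_pivot_table(objects, pivot, dist):
    """Precompute d(o, pivot) for every object; done once, before any query."""
    return [dist(o, pivot) for o in objects]

def range_query(objects, pivot, pivot_dists, query, eps, dist):
    """Return (objects within eps of query, number of refinement steps)."""
    d_qp = dist(query, pivot)  # one exact distance per query
    results, refinements = [], 0
    for obj, d_op in zip(objects, pivot_dists):
        # Filter step: for a metric, |d(q,p) - d(o,p)| <= d(q,o),
        # so a lower bound above eps rules the object out safely.
        if abs(d_qp - d_op) > eps:
            continue
        # Refinement step: compute the exact distance for survivors only.
        refinements += 1
        if dist(query, obj) <= eps:
            results.append(obj)
    return results, refinements

# Hypothetical toy data: 2-D points instead of graphs, trees, or vector sets.
objects = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
pivot = (0.0, 0.0)
pivot_dists = build_pivot_table(objects, pivot, euclidean)
hits, refinements = range_query(
    objects, pivot, pivot_dists, (1.0, 1.0), 1.0, euclidean
)
print(hits, refinements)
```

Because the filter value never exceeds the exact distance, the filter cannot discard a true result; it only reduces how often the exact distance is evaluated. In this toy run three of the five objects are pruned without an exact distance computation. With floating-point distances, a small tolerance on the filter comparison may be advisable for borderline cases.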
Author information
Karin Kailing received her Ph.D. from the University of Munich, where she is working as a research and teaching assistant. She is currently on leave of absence at the IBM Almaden Research Center. Her research interests are in query processing and knowledge discovery in databases. One of her focus areas is the development of new techniques for mining complex objects.
Hans-Peter Kriegel is a full professor of database systems in the Department of Computer Science at the University of Munich and has been the department head since 2003. His research interests are in spatial and multimedia database systems, particularly in query processing, performance issues, similarity search, and high-dimensional indexing, as well as in knowledge discovery and data mining. He received his MS and Ph.D. in 1973 and 1976, respectively, from the University of Karlsruhe, Germany. Hans-Peter Kriegel has been chairman and program committee member of many international database conferences. He has published over 200 refereed conference and journal papers. In 1997 Hans-Peter Kriegel received the “SIGMOD Best Paper Award” for the publication and prototype implementation “Fast Parallel Similarity Search in Multimedia Databases” together with four members of his research team.
Martin Pfeifle works as a research and teaching assistant in the group of Prof. Dr. Hans-Peter Kriegel. The research interests of Martin Pfeifle include database support for virtual engineering, with a strong emphasis on spatial index structures and similarity search in spatial databases. Furthermore, he is interested in the area of knowledge discovery in databases, especially in density-based clustering.
Stefan Schönauer is currently a postdoc at the IBM Almaden Research Center in the research group of Rakesh Agrawal. He received his MS and Ph.D. in 1999 and 2004, respectively, from the University of Munich, Germany. His research interests are in similarity search and data mining in complex objects, content-based image retrieval, and bioinformatics. He is a member of ACM SIGMOD.
Cite this article
Kailing, K., Kriegel, HP., Pfeifle, M. et al. Extending metric index structures for efficient range query processing. Knowl Inf Syst 10, 211–227 (2006). https://doi.org/10.1007/s10115-006-0018-6
DOI: https://doi.org/10.1007/s10115-006-0018-6