A Relaxed Algorithm for Similarity Queries Performed with High-Dimensional Access Structures

Henrich, Andreas

doi:10.1007/3-540-36128-6_22

Andreas Henrich

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2490)

Included in the following conference series:

International Conference on Extending Database Technology

Abstract

Similarity queries, which search a database for the objects most similar to a given sample object, are an important requirement for multimedia databases. However, strict mathematical correctness is not essential in many applications of similarity queries. For example, in image retrieval based on color and texture similarity, slight mathematical inaccuracies will hardly be noticed by a human observer. We therefore present a relaxed algorithm for performing similarity queries on multidimensional index structures. For a result list containing n elements, this algorithm guarantees only that a user-defined portion of the list actually belongs to the n most similar objects; the remaining elements are subject to best-effort semantics. As we will demonstrate, this improves the performance of similarity queries by about 25% with only marginal inaccuracies in the result.
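
The guarantee described in the abstract can be made concrete with a small check. The following is a minimal sketch, not the paper's algorithm: it compares a candidate result list against a brute-force exact answer and reports which portion of the list really belongs to the n most similar objects. The function names, the parameter `alpha` (the user-defined portion), the Euclidean distance, and the sample points are all assumptions made for illustration.

```python
import heapq
import math


def exact_top_n(points, query, n):
    """Brute-force reference: the n points closest to `query` (Euclidean)."""
    return heapq.nsmallest(n, points, key=lambda p: math.dist(p, query))


def correct_fraction(result, points, query, n):
    """Fraction of the n-element `result` that is in the true n nearest."""
    exact = set(exact_top_n(points, query, n))
    return sum(1 for p in result if p in exact) / n


def satisfies_relaxation(result, points, query, n, alpha):
    """True if `result` has n elements and at least a portion `alpha`
    of them belongs to the true n most similar points (assumed criterion)."""
    return len(result) == n and correct_fraction(result, points, query, n) >= alpha


# Hypothetical data: five points on a line, query at the origin.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
query = (0.0, 0.0)

# A relaxed answer that misses (3, 0) and returns (4, 0) instead.
relaxed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]

print(correct_fraction(relaxed, points, query, 4))            # 0.75
print(satisfies_relaxation(relaxed, points, query, 4, 0.75))  # True
print(satisfies_relaxation(relaxed, points, query, 4, 0.9))   # False
```

In this sketch the exact answer is computed by a full scan, which is only practical for verification; a relaxed algorithm on an index structure would aim to meet the same criterion without visiting every object.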

Author information

Authors and Affiliations

University of Bayreuth, D-95440, Bayreuth, Germany
Andreas Henrich

Editor information

Editors and Affiliations

IBM developerWorks, 6 New Square, Bedfont Lakes, Feltham, Middlesex, TW14 8HA, UK
Akmal B. Chaudhri
Institute for Computer Science, University of Essen, Schützenbahn 70, 45117, Essen, Germany
Rainer Unland
IRIN, Nantes University, 2, rue de la Houssinière, 44322, Nantes, France
Chabane Djeraba
Department of Computer Science DBIS Research Group, University of Rostock, 18051, Rostock, Germany
Wolfgang Lindner

About this paper

Cite this paper

Henrich, A. (2002). A Relaxed Algorithm for Similarity Queries Performed with High-Dimensional Access Structures. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds) XML-Based Data Management and Multimedia Engineering — EDBT 2002 Workshops. EDBT 2002. Lecture Notes in Computer Science, vol 2490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36128-6_22

DOI: https://doi.org/10.1007/3-540-36128-6_22
Published: 08 November 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00130-0
Online ISBN: 978-3-540-36128-2
eBook Packages: Springer Book Archive