A Relaxed Algorithm for Similarity Queries Performed with High-Dimensional Access Structures

  • Conference paper
XML-Based Data Management and Multimedia Engineering — EDBT 2002 Workshops (EDBT 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2490))

Abstract

Similarity queries, which search a database for the objects most similar to a given sample object, are an important requirement for multimedia databases. However, strict mathematical correctness is not essential in many applications of similarity queries. For example, in image retrieval based on color and texture similarity, slight mathematical inaccuracies will hardly be noticed by a human observer. We therefore present a relaxed algorithm for performing similarity queries on multidimensional index structures. This algorithm guarantees only that a user-defined portion of the result list of n elements actually belongs to the n most similar objects; the remaining elements are subject to best-effort semantics. As we will demonstrate, this improves the performance of similarity queries by about 25% with only marginal inaccuracies in the result.
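The guarantee described in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose details are not given here. It assumes a simplified index in which the data is split into buckets, each covered by a bounding box, and buckets are visited in order of their minimum distance to the query. The names `relaxed_knn`, `mindist`, and the parameter `alpha` (the user-defined guaranteed portion) are hypothetical. A candidate is counted as guaranteed when its distance does not exceed the lower bound of every unvisited bucket, since no unseen object can then outrank it.

```python
import heapq
import math


def mindist(query, lo, hi):
    """Lower bound on the Euclidean distance from query to any point in the box [lo, hi]."""
    return math.sqrt(sum(max(l - q, 0.0, q - h) ** 2 for q, l, h in zip(query, lo, hi)))


def relaxed_knn(buckets, query, n, alpha):
    """Return (result, visited): up to n points and the number of buckets read.

    Hypothetical sketch: at least ceil(alpha * n) of the returned points are
    guaranteed to belong to the n nearest neighbors; the rest are best effort.
    With alpha = 1.0 the search is exact.
    """
    # Priority queue of (lower bound, bucket index), smallest bound first.
    queue = []
    for i, points in enumerate(buckets):
        if not points:
            continue
        lo = tuple(min(c) for c in zip(*points))
        hi = tuple(max(c) for c in zip(*points))
        heapq.heappush(queue, (mindist(query, lo, hi), i))

    required = math.ceil(alpha * n)
    best = []  # sorted list of (distance, point), at most n entries
    visited = 0
    while queue:
        bound, i = queue[0]
        # Candidates no farther than the closest unvisited bucket cannot be
        # displaced by any unseen point, so they are certain results.
        guaranteed = sum(1 for d, _ in best if d <= bound)
        if len(best) == n and guaranteed >= required:
            break
        heapq.heappop(queue)
        visited += 1
        best.extend((math.dist(query, p), p) for p in buckets[i])
        best.sort()
        del best[n:]
    return [p for _, p in best], visited
```

With `alpha = 1.0` the loop only stops once every returned point is certain, so lowering `alpha` trades accuracy in the tail of the result list for fewer bucket reads. How large the saving is depends on the data and the index; this sketch makes no claim about reproducing the reported figure.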

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Henrich, A. (2002). A Relaxed Algorithm for Similarity Queries Performed with High-Dimensional Access Structures. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds) XML-Based Data Management and Multimedia Engineering — EDBT 2002 Workshops. EDBT 2002. Lecture Notes in Computer Science, vol 2490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36128-6_22

  • DOI: https://doi.org/10.1007/3-540-36128-6_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00130-0

  • Online ISBN: 978-3-540-36128-2

  • eBook Packages: Springer Book Archive
