Abstract
Similarity queries, which search a database for the objects most similar to a given sample object, are an important requirement for multimedia databases. However, strict mathematical correctness is not essential in many applications of similarity queries. For example, in image retrieval based on color and texture similarity, slight mathematical inaccuracies will hardly be noticed by a human observer. We therefore present a relaxed algorithm for performing similarity queries on multidimensional index structures. For a result list of n elements, this algorithm guarantees only that a user-defined portion of the list actually belongs to the n most similar objects; the remaining elements are subject to best-effort semantics. As we will demonstrate, this improves the performance of similarity queries by about 25% with only marginal inaccuracies in the result.
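The relaxed correctness criterion stated in the abstract can be illustrated with a small check. The following is a minimal sketch, not the algorithm of the paper: it assumes Euclidean distance over feature vectors stored as tuples, uses a brute-force scan in place of an index structure, and the names `guaranteed_fraction`, `satisfies_relaxation`, and `alpha` (the user-defined portion) are hypothetical.

```python
import math


def guaranteed_fraction(result, database, query):
    """Return the fraction of `result` that lies among the n most similar
    objects in `database`, where n = len(result).

    Assumption: similarity is Euclidean distance. An element counts as one
    of the n most similar if its distance does not exceed the n-th smallest
    distance in the database (this treats ties generously).
    """
    n = len(result)
    if n == 0:
        return 1.0
    # Brute-force reference: distance of the n-th nearest database object.
    distances = sorted(math.dist(p, query) for p in database)
    threshold = distances[min(n, len(distances)) - 1]
    hits = sum(1 for p in result if math.dist(p, query) <= threshold)
    return hits / n


def satisfies_relaxation(result, database, query, alpha):
    """True if at least a portion `alpha` of `result` is exactly correct."""
    return guaranteed_fraction(result, database, query) >= alpha


# Toy data: five points on a line, query at the origin, n = 4.
database = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
query = (0.0, 0.0)
# A relaxed result that misses (3, 0) and returns the far point instead.
relaxed_result = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]

print(guaranteed_fraction(relaxed_result, database, query))
print(satisfies_relaxation(relaxed_result, database, query, alpha=0.75))
```

In this toy case three of the four returned elements are among the four nearest objects, so the result would be acceptable for a user-defined portion of 0.75 but not for 0.9.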
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Henrich, A. (2002). A Relaxed Algorithm for Similarity Queries Performed with High-Dimensional Access Structures. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds) XML-Based Data Management and Multimedia Engineering — EDBT 2002 Workshops. EDBT 2002. Lecture Notes in Computer Science, vol 2490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36128-6_22
DOI: https://doi.org/10.1007/3-540-36128-6_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00130-0
Online ISBN: 978-3-540-36128-2
eBook Packages: Springer Book Archive