Abstract
Searching in metric spaces is a very active field since it offers methods for indexing and searching by similarity in collections of unstructured data. These methods select some objects of the collection as reference objects to build the indexes. It has been shown that the way the references are selected affects the search performance, and several algorithms for good reference selection have been proposed. Most of them assume the space to have a reasonably regular distribution. However, in some spaces the objects are grouped in small dense clusters that can make these methods perform worse than a random selection. In this paper, we propose a new method able to detect these situations and adapt the structure of the index to them. Our experimental evaluation shows that our proposal is more efficient than previous approaches when using the same amount of memory.
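To make the role of reference objects concrete, the following is a minimal sketch of a generic pivot-table index, not the method proposed in this paper. It assumes Euclidean points and randomly chosen references; the names `PivotIndex`, `range_query`, and `euclidean` are hypothetical and introduced only for illustration. Distances from every object to each reference are precomputed, and at query time the triangle inequality discards objects without evaluating their distance to the query.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class PivotIndex:
    """Illustrative pivot table: stores d(object, pivot) for every pair."""

    def __init__(
        self,
        objects: Sequence[Point],
        pivots: Sequence[Point],
        dist: Callable[[Point, Point], float] = euclidean,
    ) -> None:
        self.objects = list(objects)
        self.pivots = list(pivots)
        self.dist = dist
        # One row per object, one column per reference object (pivot).
        self.table = [[dist(o, p) for p in self.pivots] for o in self.objects]

    def range_query(self, q: Point, r: float) -> Tuple[List[Point], int]:
        """Return objects within distance r of q and the number of
        distance evaluations performed at query time."""
        dq = [self.dist(q, p) for p in self.pivots]
        evaluations = len(dq)
        results = []
        for o, row in zip(self.objects, self.table):
            # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
            # If the lower bound exceeds r for any pivot, o cannot match.
            if any(abs(dqp - dop) > r for dqp, dop in zip(dq, row)):
                continue
            evaluations += 1
            if self.dist(q, o) <= r:
                results.append(o)
        return results, evaluations


def demo() -> Tuple[List[Point], int]:
    """Build an index over random 2-D points with 4 random pivots."""
    rng = random.Random(0)
    data = [(rng.random(), rng.random()) for _ in range(200)]
    index = PivotIndex(data, rng.sample(data, 4))
    return index.range_query((0.5, 0.5), 0.1)
```

The number of evaluations returned by `range_query` is what a reference selection strategy tries to reduce: well-placed references discard more objects in the filtering step, while references that all fall inside one dense cluster tend to give similar lower bounds and discard few.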
This work has been partially supported by “Ministerio de Educación y Ciencia” (PGE and FEDER) ref. TIN2006-16071-C03-03, by “Xunta de Galicia” ref. PGIDIT05SIN10502PR, and by “Dirección Xeral de Ordenación e Calidade do Sistema Universitario de Galicia, da Consellería de Educación e Ordenación Universitaria-Xunta de Galicia” for Diego Seco.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brisaboa, N.R., Luaces, M.R., Pedreira, O., Places, Á.S., Seco, D. (2010). Indexing Dense Nested Metric Spaces for Efficient Similarity Search. In: Pnueli, A., Virbitskaite, I., Voronkov, A. (eds) Perspectives of Systems Informatics. PSI 2009. Lecture Notes in Computer Science, vol 5947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11486-1_9
DOI: https://doi.org/10.1007/978-3-642-11486-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11485-4
Online ISBN: 978-3-642-11486-1
eBook Packages: Computer Science, Computer Science (R0)