On the Expected Exclusion Power of Binary Partitions for Metric Search

Vadicamo, Lucia; Dearle, Alan; Connor, Richard

doi:10.1007/978-3-031-17849-8_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13590)

Included in the following conference series:

International Conference on Similarity Search and Applications

Abstract

The entire history and, we dare say, future of similarity search is governed by the underlying notion of partition. A partition is an equivalence relation defined over the space; each element of the space is therefore contained within precisely one of the equivalence classes of the partition. All attempts to search a finite space efficiently, whether exactly or approximately, rely on some set of principles which imply that if the query is within one equivalence class, then one or more other classes either cannot, or probably do not, contain any of its solutions.

In most early research, partitions relied only on the metric postulates, and logarithmic search time could be obtained on low-dimensional spaces. In these cases, it was straightforward to identify multiple partitions, each of which gave a relatively high probability of identifying subsets of the space which could not contain solutions. Over time, the datasets being searched have become more complex, leading to higher-dimensional spaces. It is now understood that even an approximate search in a very high-dimensional space is destined to require $\mathcal{O}(n)$ time and space.

Almost entirely missing from the research literature, however, is any analysis of exactly when this effect takes over. In this paper, we make a start on tackling this important issue. Using a quantitative approach, we aim to shed some light on the notion of the exclusion power of partitions, in an attempt to better understand their nature with respect to increasing dimensionality.
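To make the idea of exclusion concrete, the following is a minimal sketch and not the authors' code or their definition of exclusion power. It assumes a ball partition (one pivot, with the median pivot distance as radius), uniformly distributed Euclidean data in the unit cube, and a query threshold set to the distance from each query to its nearest neighbour. The function name `ball_exclusion_rate` and all parameter values are hypothetical. By the triangle inequality, a query `q` with threshold `t` can exclude one side of the partition whenever `|d(q, pivot) - radius| > t`.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def uniform_points(n, dim, rng):
    """n points drawn uniformly from the unit cube [0, 1)^dim."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def ball_exclusion_rate(dim, n_data=300, n_queries=200, seed=0):
    """Fraction of queries for which one side of a ball partition is excluded.

    Assumptions (illustrative only): a single random pivot, the median
    pivot distance as the ball radius, and a per-query threshold equal
    to the distance to the query's nearest neighbour in the data.
    """
    rng = random.Random(seed)
    data = uniform_points(n_data, dim, rng)
    pivot = uniform_points(1, dim, rng)[0]

    # The median distance splits the data into two halves of equal size.
    pivot_dists = sorted(euclidean(pivot, x) for x in data)
    radius = pivot_dists[len(pivot_dists) // 2]

    excluded = 0
    for q in uniform_points(n_queries, dim, rng):
        threshold = min(euclidean(q, x) for x in data)
        # The query ball lies wholly inside or wholly outside the
        # partition ball, so the other side cannot hold a solution.
        if abs(euclidean(q, pivot) - radius) > threshold:
            excluded += 1
    return excluded / n_queries


if __name__ == "__main__":
    for dim in (2, 5, 10, 20):
        print(dim, ball_exclusion_rate(dim))
```

Under these assumptions, the measured rate falls as the dimension grows, because the nearest-neighbour distance becomes comparable to the spread of distances from the pivot.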

Notes

1. A nearest neighbour query can be formulated as a range query where the query threshold is not known in advance but it is set iteratively as the distance to the current k-th nearest neighbour [16].
2. All results in this article are derived using randomly generated uniformly distributed Euclidean data in different dimensions as stated. All code is available on request from the authors.
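The formulation in the first note can be illustrated with a short sketch. It is a plain linear scan written for clarity, not the authors' implementation; the function name `knn_as_shrinking_range_query` is hypothetical. The threshold starts at infinity and, once k candidates have been seen, is reset to the distance of the current k-th nearest neighbour each time a closer object is found.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_as_shrinking_range_query(query, data, k, distance=euclidean):
    """Return the k nearest objects as sorted (distance, index) pairs.

    The search behaves like a range query whose threshold is unknown
    at the start and shrinks as better candidates are found.
    """
    threshold = math.inf
    heap = []  # max-heap emulated with negated distances: (-d, index)
    for index, obj in enumerate(data):
        d = distance(query, obj)
        if d < threshold:  # range test against the current threshold
            heapq.heappush(heap, (-d, index))
            if len(heap) > k:
                heapq.heappop(heap)  # drop the farthest candidate
            if len(heap) == k:
                # New threshold: distance to the current k-th neighbour.
                threshold = -heap[0][0]
    return sorted((-neg_d, index) for neg_d, index in heap)


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.0]]
    print(knn_as_shrinking_range_query([0.0, 0.0], points, k=2))
    # → [(0.0, 0), (0.5, 4)]
```

In an indexed search, the same shrinking threshold would be used in the exclusion test for each partition, so later partitions are more likely to be excluded than earlier ones.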

References

Chávez, E., Navarro, G.: A compact space decomposition for effective metric indexing. Pattern Recogn. Lett. 26(9), 1363–1376 (2005). https://doi.org/10.1016/j.patrec.2004.11.014
Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001). https://doi.org/10.1145/502807.502808
Chávez, E., Navarro, G.: A compact space decomposition for effective metric indexing. Pattern Recogn. Lett. 26(9), 1363–1376 (2005). https://doi.org/10.1016/j.patrec.2004.11.014. https://linkinghub.elsevier.com/retrieve/pii/S0167865504003733
Connor, R., Cardillo, F.A., Vadicamo, L., Rabitti, F.: Hilbert exclusion: improved metric search through finite isometric embeddings. ACM Trans. Inf. Syst. (TOIS) 35(3), 17:1–17:27 (2016). https://doi.org/10.1145/3001583
Connor, R., Vadicamo, L., Rabitti, F.: High-dimensional simplexes for supermetric search. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds.) SISAP 2017. LNCS, vol. 10609, pp. 96–109. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68474-1_7
Connor, R., Dearle, A., Vadicamo, L.: Investigating binary partition power in metric query. In: Proceedings of the 30th Italian Symposium on Advanced Database Systems, SEBD 2022, CEUR Workshop Proceedings, vol. 3194, pp. 415–426. Tirrenia (PI), Italy, 19–22 June 2022. http://ceur-ws.org/Vol-3194/paper49.pdf, http://CEUR-WS.org
Connor, R., Vadicamo, L., Cardillo, F.A., Rabitti, F.: Supermetric search. Inf. Syst. 80, 108–123 (2019). https://doi.org/10.1016/j.is.2018.01.002
Hetland, M.L.: Comparison-based indexing from first principles. http://arxiv.org/abs/1908.06318
Hetland, M.L.: Metrics and ambits and sprawls, oh my. In: Satoh, S., et al. (eds.) SISAP 2020. LNCS, vol. 12440, pp. 126–139. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-60936-8_10. http://arxiv.org/abs/2008.09654
Naidan, B., Boytsov, L., Nyberg, E.: Permutation search methods are efficient, yet faster search is possible. Proc. Int. Conf. Very Large Data Bases 8(12), 1618–1629 (2015)
Pestov, V., Stojmirović, A.: Indexing schemes for similarity search: an illustrated paradigm. Fund. Inform. 70(4), 367–385 (2006)
Sadit Tellez, E., Chávez, E.: The list of clusters revisited. In: Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Olvera López, J.A., Boyer, K.L. (eds.) MCPR 2012. LNCS, vol. 7329, pp. 187–196. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31149-9_19
Uhlmann, J.K.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40(4), 175–179 (1991)
Weber, R., Schek, H.J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings International Conference on Very Large Data Bases, vol. 98, pp. 194–205 (1998)
Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1993, pp. 311–321. Society for Industrial and Applied Mathematics (1993)
Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach, vol. 32. Springer, New York (2006). https://doi.org/10.1007/0-387-29151-2

Acknowledgements

This work was partially funded by AI4Media - A European Excellence Centre for Media, Society, and Democracy (EC, H2020 n. 951911) and by Economic & Social Research Council, ADR UK Programme ES/W010321/1.

Author information

Authors and Affiliations

Institute of Information Science and Technologies (ISTI), CNR, Pisa, Italy
Lucia Vadicamo
University of St Andrews, St Andrews, Scotland, UK
Alan Dearle & Richard Connor

Corresponding author

Correspondence to Lucia Vadicamo.

Editor information

Editors and Affiliations

Charles University, Prague, Czech Republic
Tomáš Skopal
ISTI-CNR, Pisa, Italy
Fabrizio Falchi
Charles University, Prague, Czech Republic
Jakub Lokoč
University of Torino, Torino, Italy
Maria Luisa Sapino
University of Bologna, Bologna, Italy
Ilaria Bartolini
University of Bologna, Bologna, Italy
Marco Patella

About this paper

Cite this paper

Vadicamo, L., Dearle, A., Connor, R. (2022). On the Expected Exclusion Power of Binary Partitions for Metric Search. In: Skopal, T., Falchi, F., Lokoč, J., Sapino, M.L., Bartolini, I., Patella, M. (eds) Similarity Search and Applications. SISAP 2022. Lecture Notes in Computer Science, vol 13590. Springer, Cham. https://doi.org/10.1007/978-3-031-17849-8_9

DOI: https://doi.org/10.1007/978-3-031-17849-8_9
Published: 28 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17848-1
Online ISBN: 978-3-031-17849-8
eBook Packages: Computer Science, Computer Science (R0)
