On the Expected Exclusion Power of Binary Partitions for Metric Search

  • Conference paper
Similarity Search and Applications (SISAP 2022)

Abstract

The entire history and, we dare say, future of similarity search is governed by the underlying notion of partition. A partition is an equivalence relation defined over the space; therefore, each element of the space is contained within precisely one of the equivalence classes of the partition. All attempts to search a finite space efficiently, whether exactly or approximately, rely on some set of principles which imply that if the query is within one equivalence class, then one or more other classes either cannot, or probably do not, contain any of its solutions.

In most early research, partitions relied only on the metric postulates, and logarithmic search time could be obtained on low-dimensional spaces. In these cases, it was straightforward to identify multiple partitions, each of which gave a relatively high probability of identifying subsets of the space which could not contain solutions. Over time, the datasets being searched have become more complex, leading to higher-dimensional spaces. It is now understood that even an approximate search in a very high-dimensional space is destined to require \(\mathcal {O}(n)\) time and space.

Almost entirely missing from the research literature, however, is any analysis of exactly when this effect takes over. In this paper, we make a start on tackling this important issue. Using a quantitative approach, we aim to shed some light on the notion of the exclusion power of partitions, in an attempt to better understand their nature with respect to increasing dimensionality.
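
To make the idea of exclusion concrete, the following is a minimal sketch and not the authors' code or measure. It assumes one particular binary partition (a ball partition defined by a pivot and its median distance to the data), uniformly distributed points in the unit hypercube, and a query threshold equal to the query's nearest-neighbour distance. The function name `ball_exclusion_rate` and all parameter values are hypothetical choices made for illustration. By the triangle inequality, one side of the partition can be excluded whenever the query's distance to the pivot differs from the radius by more than the threshold.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def uniform_points(n, dim, rng):
    """n points drawn uniformly from the unit hypercube [0, 1)^dim."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def ball_exclusion_rate(dim, n_data=500, n_queries=200, seed=0):
    """Fraction of queries for which one side of a ball partition is excluded.

    Assumptions (illustrative only): the pivot is the first data point, the
    radius is the median pivot-to-data distance, and each query's threshold
    is the distance to its nearest neighbour in the data.
    """
    rng = random.Random(seed)
    data = uniform_points(n_data, dim, rng)
    queries = uniform_points(n_queries, dim, rng)

    pivot = data[0]
    pivot_dists = sorted(euclidean(pivot, x) for x in data[1:])
    radius = pivot_dists[len(pivot_dists) // 2]  # roughly balanced split

    excluded = 0
    for q in queries:
        threshold = min(euclidean(q, x) for x in data)
        # Triangle inequality: if |d(q, pivot) - radius| > threshold, no
        # solution can lie on the far side of the ball boundary.
        if abs(euclidean(q, pivot) - radius) > threshold:
            excluded += 1
    return excluded / n_queries


if __name__ == "__main__":
    for dim in (2, 5, 10, 20):
        print(dim, ball_exclusion_rate(dim))
```

Under these assumptions, the printed rate gives a rough empirical picture of how often a single binary partition lets a query discard half of the data as the dimension grows.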

Notes

  1. A nearest neighbour query can be formulated as a range query where the query threshold is not known in advance but is set iteratively as the distance to the current k-th nearest neighbour [16].

  2. All results in this article are derived using randomly generated, uniformly distributed Euclidean data in different dimensions as stated. All code is available on request from the authors.

References

  1. Chávez, E., Navarro, G.: A compact space decomposition for effective metric indexing. Pattern Recogn. Lett. 26(9), 1363–1376 (2005). https://doi.org/10.1016/j.patrec.2004.11.014

  2. Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001). https://doi.org/10.1145/502807.502808

  3. Chávez, E., Navarro, G.: A compact space decomposition for effective metric indexing. Pattern Recogn. Lett. 26(9), 1363–1376 (2005). https://doi.org/10.1016/j.patrec.2004.11.014. https://linkinghub.elsevier.com/retrieve/pii/S0167865504003733

  4. Connor, R., Cardillo, F.A., Vadicamo, L., Rabitti, F.: Hilbert exclusion: improved metric search through finite isometric embeddings. ACM Trans. Inf. Syst. (TOIS) 35(3), 17:1–17:27 (2016). https://doi.org/10.1145/3001583

  5. Connor, R., Vadicamo, L., Rabitti, F.: High-dimensional simplexes for supermetric search. In: Beecks, C., Borutta, F., Kröger, P., Seidl, T. (eds.) SISAP 2017. LNCS, vol. 10609, pp. 96–109. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68474-1_7

  6. Connor, R., Dearle, A., Vadicamo, L.: Investigating binary partition power in metric query. In: Proceedings of the 30th Italian Symposium on Advanced Database Systems, SEBD 2022, CEUR Workshop Proceedings, vol. 3194, pp. 415–426. Tirrenia (PI), Italy, 19–22 June 2022. http://ceur-ws.org/Vol-3194/paper49.pdf, http://CEUR-WS.org

  7. Connor, R., Vadicamo, L., Cardillo, F.A., Rabitti, F.: Supermetric search. Inf. Syst. 80, 108–123 (2019). https://doi.org/10.1016/j.is.2018.01.002

  8. Hetland, M.L.: Comparison-based indexing from first principles. http://arxiv.org/abs/1908.06318

  9. Hetland, M.L.: Metrics and ambits and sprawls, oh my. In: Satoh, S., et al. (eds.) SISAP 2020. LNCS, vol. 12440, pp. 126–139. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-60936-8_10. http://arxiv.org/abs/2008.09654

  10. Naidan, B., Boytsov, L., Nyberg, E.: Permutation search methods are efficient, yet faster search is possible. Proc. Int. Conf. Very Large Data Bases 8(12), 1618–1629 (2015)

  11. Pestov, V., Stojmirović, A.: Indexing schemes for similarity search: an illustrated paradigm. Fund. Inform. 70(4), 367–385 (2006)

  12. Sadit Tellez, E., Chávez, E.: The list of clusters revisited. In: Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Olvera López, J.A., Boyer, K.L. (eds.) MCPR 2012. LNCS, vol. 7329, pp. 187–196. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31149-9_19

  13. Uhlmann, J.K.: Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40(4), 175–179 (1991)

  14. Weber, R., Schek, H.J., Blott, S.: A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In: Proceedings International Conference on Very Large Data Bases, vol. 98, pp. 194–205 (1998)

  15. Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1993, pp. 311–321. Society for Industrial and Applied Mathematics (1993)

  16. Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search: The Metric Space Approach, vol. 32. Springer, New York (2006). https://doi.org/10.1007/0-387-29151-2

Acknowledgements

This work was partially funded by AI4Media - A European Excellence Centre for Media, Society, and Democracy (EC, H2020 n. 951911) and by the Economic & Social Research Council, ADR UK Programme ES/W010321/1.

Author information

Corresponding author

Correspondence to Lucia Vadicamo.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Vadicamo, L., Dearle, A., Connor, R. (2022). On the Expected Exclusion Power of Binary Partitions for Metric Search. In: Skopal, T., Falchi, F., Lokoč, J., Sapino, M.L., Bartolini, I., Patella, M. (eds) Similarity Search and Applications. SISAP 2022. Lecture Notes in Computer Science, vol 13590. Springer, Cham. https://doi.org/10.1007/978-3-031-17849-8_9

  • DOI: https://doi.org/10.1007/978-3-031-17849-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17848-1

  • Online ISBN: 978-3-031-17849-8

  • eBook Packages: Computer Science, Computer Science (R0)
