Similarity Grid for Searching in Metric Spaces

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3664)

Abstract

Similarity search in metric spaces is an important paradigm for content-based retrieval in many applications. Existing centralized search structures can speed up retrieval, but they do not scale to large volumes of data because the response time increases linearly with the size of the searched file. The proposed GHT* index is a scalable, distributed structure. By exploiting parallelism in a dynamic network of computers, the GHT* achieves practically constant search time for similarity range queries in data sets of arbitrary size. The structure also scales well with respect to the growing volume of retrieved data. Moreover, the small amount of routing information replicated on each server grows only logarithmically. At the same time, the potential for interquery parallelism increases as the data sets grow, because the relative number of servers used by an individual query decreases. All these properties are verified by experiments on a prototype system using real-life data sets.
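
The abstract does not spell out how a similarity range query is evaluated, so the following is only a minimal sketch under stated assumptions. It assumes that a range query returns every object within distance r of a query object q, and that data is split by a generalized-hyperplane rule (each object goes to the closer of two pivots), which the name GHT* suggests but which is not confirmed here. All function names, the pivots `p1` and `p2`, and the use of Euclidean distance are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


def range_query(data: Sequence[Point], q: Point, r: float,
                d: Metric = math.dist) -> List[Point]:
    """Linear scan: return every object within distance r of q."""
    return [o for o in data if d(q, o) <= r]


def split(data: Sequence[Point], p1: Point, p2: Point,
          d: Metric = math.dist) -> Tuple[List[Point], List[Point]]:
    """Assumed partitioning rule: objects at least as close to p1 go left."""
    left = [o for o in data if d(o, p1) <= d(o, p2)]
    right = [o for o in data if d(o, p1) > d(o, p2)]
    return left, right


def partitions_to_visit(q: Point, r: float, p1: Point, p2: Point,
                        d: Metric = math.dist) -> Tuple[bool, bool]:
    """Decide which partitions the query ball (q, r) can intersect.

    Both conditions follow from the triangle inequality, so skipping a
    partition never loses a qualifying object.
    """
    d1, d2 = d(q, p1), d(q, p2)
    visit_left = d1 - r <= d2 + r
    visit_right = d1 + r >= d2 - r
    return visit_left, visit_right


def pruned_range_query(data: Sequence[Point], q: Point, r: float,
                       p1: Point, p2: Point,
                       d: Metric = math.dist) -> List[Point]:
    """Range query that scans only the partitions that may hold results."""
    left, right = split(data, p1, p2, d)
    visit_left, visit_right = partitions_to_visit(q, r, p1, p2, d)
    result: List[Point] = []
    if visit_left:
        result += range_query(left, q, r, d)
    if visit_right:
        result += range_query(right, q, r, d)
    return result


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
    print(pruned_range_query(data, (0.5, 0.0), 1.0, (0.0, 0.0), (6.0, 5.0)))
```

If each partition were stored on a different server, a test like `partitions_to_visit` would determine which servers a query has to contact. That is one plausible reading of why a query may touch only a fraction of the servers, offered here as an interpretation rather than a description of the GHT* protocol.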

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Batko, M., Gennaro, C., Zezula, P. (2005). Similarity Grid for Searching in Metric Spaces. In: Türker, C., Agosti, M., Schek, HJ. (eds) Peer-to-Peer, Grid, and Service-Orientation in Digital Library Architectures. Lecture Notes in Computer Science, vol 3664. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549819_3

  • DOI: https://doi.org/10.1007/11549819_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28711-7

  • Online ISBN: 978-3-540-28712-4

  • eBook Packages: Computer Science, Computer Science (R0)
