SELSH: A Hashing Scheme for Approximate Similarity Search with Early Stop Condition

  • Conference paper
MultiMedia Modeling (MMM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9517)

Abstract

Similarity search is a fundamental problem in various multimedia database applications. Due to the “curse of dimensionality”, the performance of many access methods degrades significantly as the dimensionality increases. Approximate similarity search is an alternative solution, and Locality Sensitive Hashing (LSH) is the most popular method for it. Nevertheless, LSH needs to verify a large number of points to get good-enough results, which incurs a high I/O cost. In this paper, we propose a new scheme called SortedKey and Early stop LSH (SELSH), which extends the previous SortingKeys-LSH (SK-LSH). SELSH uses a linear order to sort all the compound hash keys. Moreover, during query processing, an early stop condition and a limit on the number of pages are used to determine whether a page needs to be accessed. Our experiments demonstrate the superiority of the proposed method over two state-of-the-art methods, C2LSH and SK-LSH.
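
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact linear order, the early stop condition, or the page layout, so the sketch assumes plain lexicographic order on compound keys, p-stable (Gaussian) hash functions, and a simple distance threshold as the stop rule. All function and parameter names (`make_hash_family`, `build_index`, `query`, `max_pages`, `stop_dist`) are hypothetical.

```python
import bisect
import math
import random


def make_hash_family(dim, m, w, seed=0):
    """Draw m hash functions h(v) = floor((a.v + b) / w) with Gaussian a."""
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(m)]


def compound_key(family, w, v):
    """Concatenate the m hash values of v into one compound key (a tuple)."""
    return tuple(
        math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / w)
        for a, b in family
    )


def build_index(points, family, w, page_size):
    """Sort point ids by compound key and cut the sorted run into pages.

    Assumption: lexicographic tuple order stands in for the paper's
    linear order on compound keys.
    """
    entries = sorted((compound_key(family, w, p), i)
                     for i, p in enumerate(points))
    return [entries[s:s + page_size]
            for s in range(0, len(entries), page_size)]


def query(pages, points, family, w, q, k, max_pages, stop_dist):
    """Return (k best (distance, id) pairs, number of pages accessed)."""
    qkey = compound_key(family, w, q)
    first_keys = [page[0][0] for page in pages]
    # Last page whose first key is <= the query key (or page 0).
    start = max(bisect.bisect_right(first_keys, qkey) - 1, 0)
    # Visit pages in order of increasing offset from the start page.
    order = sorted(range(len(pages)), key=lambda j: (abs(j - start), j))
    best, accessed = [], 0
    for j in order:
        if accessed >= max_pages:          # page limit reached
            break
        # Assumed early stop rule: k candidates already within stop_dist.
        if len(best) >= k and best[k - 1][0] <= stop_dist:
            break
        for _, i in pages[j]:
            best.append((math.dist(points[i], q), i))
        best.sort()
        best = best[:k]
        accessed += 1
    return best, accessed
```

With `stop_dist` set below zero and `max_pages` equal to the number of pages, the scan degenerates to an exact linear search, which is a convenient sanity check. Tightening either parameter trades accuracy for fewer page accesses, which is the trade-off the abstract describes.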

Notes

  1. http://kdd.ics.uci.edu/databases/CorelFeatures/.

  2. http://www.cs.princeton.edu/cass/audio.tar.gz.

  3. http://corpus-texmex.irisa.fr/.

Acknowledgments

This work is partially supported by the Fundamental Research Funds for the Central Universities of China under grant No.ZYGX2014Z007.

Author information

Corresponding author

Correspondence to Jie Shao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., He, C., Hu, G., Shao, J. (2016). SELSH: A Hashing Scheme for Approximate Similarity Search with Early Stop Condition. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9517. Springer, Cham. https://doi.org/10.1007/978-3-319-27674-8_10

  • DOI: https://doi.org/10.1007/978-3-319-27674-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27673-1

  • Online ISBN: 978-3-319-27674-8

  • eBook Packages: Computer Science (R0)
