Distribution-Aware Locality Sensitive Hashing

Zhang, Lei; Zhang, Yongdong; Zhang, Dongming; Tian, Qi

doi:10.1007/978-3-642-35728-2_38

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Locality Sensitive Hashing (LSH) is widely used in content-based search systems. LSH methods fall into two main categories: the first indexes the original data in a way that accelerates the search process; the second embeds the high-dimensional data into Hamming space and uses bit-wise operations to find similar objects. In this paper, we propose a new LSH scheme, called Distribution-Aware LSH (DALSH), to address the lack of adaptation to real data, which is an intrinsic limitation of most LSH methods in the former category. In DALSH, a given dataset is embedded into a low-dimensional space using projection vectors learned from the data, and hash functions are then derived from the distribution of the dimension-reduced data. We also present a multi-probe strategy to improve query performance. Experimental comparisons with state-of-the-art LSH methods on two high-dimensional datasets demonstrate the efficacy of DALSH.
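
The pipeline described in the abstract (learn projections from the data, derive hash functions from the distribution of the projected data, then probe several buckets at query time) can be illustrated with a minimal sketch. The abstract does not say how the projections or hash functions are computed, so everything concrete below is an assumption made for illustration and is not the authors' method: PCA stands in for the learned projection, per-dimension quantile cut points stand in for the distribution-derived hash functions, and probing buckets that differ by one step in a single dimension stands in for the multi-probe strategy. All function names are hypothetical.

```python
import numpy as np
from collections import defaultdict

def fit_projection(X, m):
    """Learn m projection vectors from the data (assumed here: PCA)."""
    mean = X.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:m].T  # projection matrix of shape (d, m)

def fit_boundaries(Z, n_buckets):
    """Assumed hash design: per-dimension quantile cut points, so that each
    bucket along a dimension holds roughly the same share of the data."""
    qs = np.linspace(0, 1, n_buckets + 1)[1:-1]
    return np.quantile(Z, qs, axis=0)  # shape (n_buckets - 1, m)

def hash_codes(Z, bounds):
    """Map each projected point to one bucket index per dimension."""
    cols = [np.searchsorted(bounds[:, j], Z[:, j]) for j in range(Z.shape[1])]
    return np.stack(cols, axis=1)

def build_index(X, m=4, n_buckets=4):
    mean, W = fit_projection(X, m)
    Z = (X - mean) @ W
    bounds = fit_boundaries(Z, n_buckets)
    table = defaultdict(list)
    for i, code in enumerate(hash_codes(Z, bounds)):
        table[tuple(code)].append(i)
    return mean, W, bounds, table

def query(q, index, X, k=5):
    """Return up to k candidate row indices of X, nearest first."""
    mean, W, bounds, table = index
    z = (q - mean) @ W
    code = hash_codes(z[None, :], bounds)[0]
    n_buckets = bounds.shape[0] + 1
    # Simplified multi-probe: the home bucket plus every bucket that
    # differs from it by one step in exactly one dimension.
    probes = [tuple(code)]
    for j in range(len(code)):
        for step in (-1, 1):
            c = code.copy()
            c[j] += step
            if 0 <= c[j] < n_buckets:
                probes.append(tuple(c))
    cand = [i for p in probes for i in table.get(p, [])]
    if not cand:
        return []
    cand = np.array(cand)
    # Rank the candidates by exact Euclidean distance in the original space.
    dist = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(dist)[:k]].tolist()
```

With `index = build_index(X)` for a data matrix `X` of shape `(n, d)`, `query(X[0], index, X)` returns candidate neighbours of the first row. The quantile cut points are what make this toy version sensitive to the data distribution: dense regions of the projected space get narrower buckets than sparse ones, whereas a fixed-width quantizer would ignore the data entirely.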

Author information

Authors and Affiliations

Advanced Computing Research Laboratory, Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Lei Zhang, Yongdong Zhang & Dongming Zhang
University of Chinese Academy of Sciences, Beijing, China
Lei Zhang
University of Texas at San Antonio, San Antonio, USA
Qi Tian

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Street, 100080, Beijing, China
Shipeng Li & Tao Mei
School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward, K1N 6N5, Ottawa, ON, Canada
Abdulmotaleb El Saddik
School of Computer and Information, Hefei University of Technology, Road Tunxi 193#, 230009, Hefei, Anhui, China
Meng Wang & Richang Hong
Department of Information Engineering and Computer Science, University of Trento, ommarive 14, 38100, Trento, Italy
Nicu Sebe
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan
School of Computing, CLARITY: Centre for Sensor Web Technologies, Dublin City University, Glasnevin, 9, Dublin, Ireland
Cathal Gurrin

About this paper

Cite this paper

Zhang, L., Zhang, Y., Zhang, D., Tian, Q. (2013). Distribution-Aware Locality Sensitive Hashing. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_38

DOI: https://doi.org/10.1007/978-3-642-35728-2_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science, Computer Science (R0)
