Accelerated Manhattan hashing via bit-remapping with location information

Abstract

Hashing is a binary-code encoding method that aims to preserve the neighborhood structure of the original feature space so as to enable efficient approximate nearest neighbor search in large-scale databases. Existing hashing methods usually adopt a two-stage strategy (a projection stage and a quantization stage) to encode data points, with threshold-based single-bit quantization (SBQ) used to binarize each projected dimension into 0 or 1. Similarity between hash codes is then measured by their Hamming distance. However, SBQ may destroy the original neighborhood structure by quantizing neighboring points near the threshold into different binary values. Double-bit quantization (DBQ) and its derivative, Manhattan hashing, have been proposed to address this problem. Experimental results showed that Manhattan hashing outperformed state-of-the-art methods in effectiveness, but lost the efficiency advantage because it measured similarity between hash codes with decimal arithmetic instead of fast bitwise operations. In this paper, we propose an accelerated strategy for Manhattan hashing that makes full use of bitwise operations. Our main contributions are: 1) a new encoding method that assigns location information to each binary digit, avoiding time-consuming decimal arithmetic; and 2) a novel hash-code distance measure that accelerates the calculation of Manhattan distance and thereby improves query efficiency. Extensive experiments on three benchmark datasets show that our approach speeds up querying on 2-bit, 3-bit, and 4-bit quantized hash codes by at least one order of magnitude on average, without any loss of precision.
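
As a rough illustration of the efficiency gap described above (not the paper's bit-remapping method, which is detailed in the full text), the sketch below contrasts Hamming distance on single-bit-quantized codes, computed with one XOR and a bit count, against Manhattan distance on 2-bit-quantized codes computed dimension by dimension with decimal arithmetic. The code packing layout, the function names, and the use of the GCC/Clang built-in __builtin_popcountll are illustrative assumptions.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hamming distance between two 64-bit SBQ hash codes:
     * a single XOR followed by a bit count (Hamming weight). */
    static int hamming_distance(uint64_t a, uint64_t b)
    {
        return __builtin_popcountll(a ^ b);  /* GCC/Clang built-in */
    }

    /* Manhattan distance between two 64-bit codes holding 32 dimensions
     * quantized to 2 bits each: every dimension is decoded back to a decimal
     * value in {0,1,2,3} and the absolute differences are summed -- the
     * per-dimension decimal arithmetic that costs Manhattan hashing its speed. */
    static int manhattan_distance_2bit(uint64_t a, uint64_t b, int dims)
    {
        int dist = 0;
        for (int d = 0; d < dims; ++d) {
            int va = (int)((a >> (2 * d)) & 0x3u);
            int vb = (int)((b >> (2 * d)) & 0x3u);
            dist += abs(va - vb);
        }
        return dist;
    }

    int main(void)
    {
        uint64_t x = 0x123456789ABCDEF0ULL;
        uint64_t y = 0x0FEDCBA987654321ULL;
        printf("Hamming distance  : %d\n", hamming_distance(x, y));
        printf("Manhattan distance: %d\n", manhattan_distance_2bit(x, y, 32));
        return 0;
    }

The paper's contribution, as summarized in the abstract, is to remap the multi-bit codes with location information so that the Manhattan distance itself can be computed with bitwise operations; that remapping is not reproduced in this sketch.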


Notes

  1. The method bit-count(n) counts the number of '1' bits in the binary representation of n, which is also known as computing n's Hamming weight (a minimal sketch of this operation appears after these notes).

  2. Code is provided at http://ise.thss.tsinghua.edu.cn/MIG/resources.jsp
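
For Note 1, the following is a minimal portable sketch of the bit-count operation (Hamming weight) using Brian Kernighan's method; the function name and the choice of a 64-bit operand are assumptions, and compilers such as GCC and Clang also expose this operation as the built-in __builtin_popcountll.

    #include <stdint.h>

    /* bit-count(n): returns the number of '1' bits in n (the Hamming weight).
     * Kernighan's trick clears the lowest set bit on each iteration, so the
     * loop body runs once per set bit. */
    static int bit_count(uint64_t n)
    {
        int count = 0;
        while (n) {
            n &= n - 1;  /* clear the lowest set bit */
            ++count;
        }
        return count;
    }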


Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61271394 and 61571269). The authors would like to thank the anonymous reviewers for their valuable comments.

Author information

Corresponding author

Correspondence to Guiguang Ding.

About this article

Cite this article

Chen, W., Ding, G., Lin, Z. et al. Accelerated Manhattan hashing via bit-remapping with location information. Multimed Tools Appl 76, 2441–2466 (2017). https://doi.org/10.1007/s11042-015-3217-x
