Deep Supervised Hashing for Fast Image Retrieval

International Journal of Computer Vision

Abstract

In this paper, we present a new hashing method that learns compact binary codes for highly efficient image retrieval on large-scale datasets. Complex variations in image appearance still pose a great challenge to reliable retrieval. In light of the recent progress of Convolutional Neural Networks (CNNs) in learning robust image representations for various vision tasks, we propose a novel Deep Supervised Hashing (DSH) method that learns compact similarity-preserving binary codes for large collections of image data. Specifically, we devise a CNN architecture that takes pairs or triplets of images as training inputs and encourages the output for each image to approximate discrete values (e.g. \(+\,1\)/\(-\,1\)). To this end, the loss functions are designed to maximize the discriminability of the output space by encoding the supervised information from the input image pairs or triplets, while simultaneously imposing a regularization on the real-valued outputs so that they approximate the desired discrete values. For image retrieval, new query images can be encoded by forward propagating them through the network and then quantizing the network outputs to binary codes. Extensive experiments on three large-scale datasets, CIFAR-10, NUS-WIDE, and SVHN, show the promising performance of our method compared with the state of the art.
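The pairwise idea described in the abstract can be illustrated with a small sketch. The abstract does not state the exact loss, so the form below is an assumption: a contrastive-style term that pulls similar pairs together and pushes dissimilar pairs apart up to a margin, plus a regularizer that draws each real-valued output toward \(+\,1\)/\(-\,1\). The function names and the `margin` and `alpha` parameters are hypothetical, and plain Python lists stand in for the network outputs.

```python
from typing import List, Sequence


def pairwise_hash_loss(u1: Sequence[float], u2: Sequence[float], similar: bool,
                       margin: float = 4.0, alpha: float = 0.01) -> float:
    """Assumed contrastive-style loss on one pair of real-valued outputs.

    `margin` and `alpha` are illustrative hyperparameters, not values
    taken from the article.
    """
    # Squared Euclidean distance between the two real-valued outputs.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u1, u2))
    if similar:
        # Similar pairs are penalized for being far apart.
        fit = 0.5 * sq_dist
    else:
        # Dissimilar pairs are penalized only while closer than the margin.
        fit = 0.5 * max(margin - sq_dist, 0.0)
    # Regularizer: push every output component toward +1 or -1.
    reg = sum(abs(abs(a) - 1.0) for a in u1) + sum(abs(abs(b) - 1.0) for b in u2)
    return fit + alpha * reg


def quantize(u: Sequence[float]) -> List[int]:
    """Quantize real-valued outputs to a +1/-1 code by taking the sign."""
    return [1 if x >= 0 else -1 for x in u]


def hamming(c1: Sequence[int], c2: Sequence[int]) -> int:
    """Number of positions where two binary codes differ."""
    return sum(1 for a, b in zip(c1, c2) if a != b)


if __name__ == "__main__":
    query = quantize([0.9, -1.1, 0.2, -0.8])
    database_item = quantize([1.0, -0.7, -0.3, -1.2])
    print(query, database_item, hamming(query, database_item))
```

At retrieval time, only `quantize` and `hamming` would be needed: a query is encoded once and compared against stored codes by Hamming distance, which is what makes binary codes cheap to search.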

Notes

  1. The source code of our DSH, with running samples, is available at http://vipl.ict.ac.cn/resources/codes or https://github.com/lhmRyan/deep-supervised-hashing-DSH.

Acknowledgements

This work is partially supported by the 973 Program under Contract No. 2015CB351802, the Natural Science Foundation of China under Contract Nos. 61390511 and 61772500, the Frontier Science Key Research Project of CAS under No. QYZDJ-SSW-JSC009, and the Youth Innovation Promotion Association of CAS under No. 2015085.

Corresponding author

Correspondence to Ruiping Wang.

Additional information

Communicated by A. W. M. Smeulders.

Cite this article

Liu, H., Wang, R., Shan, S. et al. Deep Supervised Hashing for Fast Image Retrieval. Int J Comput Vis 127, 1217–1234 (2019). https://doi.org/10.1007/s11263-019-01174-4
