Unsupervised Binary Representation Learning with Deep Variational Networks

Shen, Yuming; Liu, Li; Shao, Ling

doi:10.1007/s11263-019-01166-4

Published: 21 February 2019

International Journal of Computer Vision, Volume 127, pages 1614–1628 (2019)

Abstract

Learning to hash is regarded as an efficient approach for image retrieval and many other big-data applications. Deep learning frameworks have recently been adopted for image hashing, offering an alternative to conventional projections for formulating the encoding function. Although deep learning has proven successful in supervised hashing, existing unsupervised deep hashing techniques still do not achieve leading performance compared with non-deep methods, because it is hard to unveil the intrinsic structure of the whole sample space by simply regularizing the output codes within each single training batch. To tackle this problem, we propose a novel unsupervised deep hashing model, named deep variational binaries (DVB). Conditional auto-encoding variational Bayesian networks are introduced to exploit the feature space structure of the training data using latent variables. By integrating the probabilistic inference process with hashing objectives, the proposed DVB model estimates the statistics of data representations and thus produces compact binary codes. Experimental results on three benchmark datasets, i.e., CIFAR-10, SUN-397 and NUS-WIDE, demonstrate that DVB outperforms state-of-the-art unsupervised hashing methods by significant margins.
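
As a rough illustration of why compact binary codes make retrieval efficient, the sketch below binarizes real-valued representations by sign and ranks database items by Hamming distance to a query. It is not the DVB model: the feature vectors are invented, the sign threshold is an assumed stand-in for a learned encoder, and the helper names (`binarize`, `hamming`, `rank_by_hamming`) are hypothetical.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack sign bits into an int: bit i is 1 when features[i] > 0."""
    code = 0
    for i, value in enumerate(features):
        if value > 0:
            code |= 1 << i
    return code


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two packed codes."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Sequence[int]) -> List[int]:
    """Return database indices ordered by increasing Hamming distance.

    sorted() is stable, so ties keep their original index order.
    """
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical 4-dimensional encoder outputs, invented for illustration only.
database_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.6, -0.3, -0.2],
]
query_features = [0.8, -0.1, 0.5, -0.9]

database_codes = [binarize(f) for f in database_features]
query_code = binarize(query_features)
ranking = rank_by_hamming(query_code, database_codes)
print(ranking)  # [0, 2, 1]
```

Each comparison reduces to an XOR and a bit count on small integers, which is what keeps search over binary codes cheap in both time and memory.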


Author information

Authors and Affiliations

Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
Yuming Shen, Li Liu & Ling Shao

Corresponding author

Correspondence to Ling Shao.

Additional information

Communicated by Dr. Tae-Kyun Kim, Dr. Stefanos Zafeiriou, Dr. Ben Glocker and Dr. Stefan Leutenegger.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shen, Y., Liu, L. & Shao, L. Unsupervised Binary Representation Learning with Deep Variational Networks. Int J Comput Vis 127, 1614–1628 (2019). https://doi.org/10.1007/s11263-019-01166-4

Received: 17 February 2018
Accepted: 13 February 2019
Published: 21 February 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11263-019-01166-4
