A multi-level descriptor using ultra-deep feature for image retrieval

Wu, Zebin; Yu, Junqing

doi:10.1007/s11042-019-07771-2

Published: 30 May 2019

Volume 78, pages 25655–25672, (2019)

Multimedia Tools and Applications

Abstract

Descriptor generation based on convolutional neural networks (CNNs) has been studied extensively in recent years for image retrieval. Deep CNN features trained for image classification have been shown to transfer well to the image retrieval task. However, building a highly discriminative descriptor from CNN features remains an important issue: the features of the fully-connected layer are usually used, while the shallow features of an image are ignored. In this paper, we propose a simple and effective multi-level descriptor. First, we propose a multi-level feature fusion (MFF) method that captures low-level color/texture information and high-level semantic information simultaneously. MFF replaces the commonly used “object-level” representation with a “part-level” one: the filters of the convolution layers are treated as part detectors, so no explicit object detector is needed. MFF benefits greatly from the complementary nature of low-level and high-level features. Second, we train a neural network with class information to further improve the discriminative power of MFF. MFF achieves good performance on public image retrieval datasets. Finally, we propose a compressed version whose performance is close to that of the uncompressed version.
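
The abstract does not specify how features from different levels are pooled or combined, so the following is only a minimal sketch of the general idea of multi-level fusion, not the authors' MFF method. It assumes that each convolution filter acts as a part detector whose strongest spatial response is kept (global max pooling), and that per-layer vectors are L2-normalized and concatenated. The function names, layer shapes, and random activations are all hypothetical placeholders for real CNN outputs.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    # Scale a vector to unit length; eps guards against division by zero.
    return v / (np.linalg.norm(v) + eps)

def pool_layer(feature_map):
    # feature_map has shape (channels, height, width).
    # Keep the strongest spatial response of each filter, treating
    # every filter as a "part detector" (an assumption of this sketch).
    channels = feature_map.shape[0]
    return feature_map.reshape(channels, -1).max(axis=1)

def fuse_levels(feature_maps):
    # Pool and normalize each layer separately so that no single layer
    # dominates, then concatenate and normalize the final descriptor.
    parts = [l2_normalize(pool_layer(fm)) for fm in feature_maps]
    return l2_normalize(np.concatenate(parts))

rng = np.random.default_rng(0)

# Hypothetical activations standing in for a shallow and a deep layer.
low = rng.random((64, 56, 56))
high = rng.random((512, 7, 7))

query = fuse_levels([low, high])
same = fuse_levels([low, high])
other = fuse_levels([rng.random((64, 56, 56)), rng.random((512, 7, 7))])

# With unit-length descriptors, the dot product is the cosine similarity.
sim_same = float(query @ same)
sim_other = float(query @ other)
```

Because the fused descriptors have unit length, ranking database images by dot product with the query is equivalent to ranking by cosine similarity. A compressed variant could reduce the dimensionality of the concatenated vector, for example with PCA, although the abstract does not say which compression the authors use.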

Acknowledgements

The work was supported by the National Natural Science Foundation of China (No. 61572211).

Author information

Authors and Affiliations

Department of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Zebin Wu & Junqing Yu
Center of Network and Computation, Huazhong University of Science and Technology, Wuhan, 430074, China
Junqing Yu

Authors

Zebin Wu
Junqing Yu

Corresponding author

Correspondence to Junqing Yu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, Z., Yu, J. A multi-level descriptor using ultra-deep feature for image retrieval. Multimed Tools Appl 78, 25655–25672 (2019). https://doi.org/10.1007/s11042-019-07771-2

Received: 07 April 2018
Revised: 11 May 2019
Accepted: 15 May 2019
Published: 30 May 2019
Issue Date: 30 September 2019
DOI: https://doi.org/10.1007/s11042-019-07771-2
