Unsupervised semantic-based convolutional features aggregation for image retrieval

Wang, Xinsheng; Pang, Shanmin; Zhu, Jihua; Wang, Jiaxing; Wang, Lin

doi:10.1007/s11042-018-6915-3

Published: 28 November 2018

Volume 79, pages 14465–14489, (2020)

Multimedia Tools and Applications

Xinsheng Wang¹, Shanmin Pang¹, Jihua Zhu¹, Jiaxing Wang¹ & Lin Wang²

Abstract

Deep features extracted from the convolutional layers of pre-trained CNNs are widely used for image retrieval. These features are so numerous, however, that comparing them directly for similarity evaluation is inefficient. It is therefore important to study how to aggregate deep features into a global yet distinctive image vector. This paper first introduces a simple but effective method for selecting informative features based on the semantic content of feature maps. We then propose a channel weighting method (CW) for the selected features, derived by analyzing the relations between discriminative activations and the distribution parameters of feature maps, including standard variance, non-zero responses and sum value. Furthermore, we provide a way to pick semantic detectors that are independent of the gallery images. Combining these three strategies, we derive a method for generating a global image vector and demonstrate its state-of-the-art performance on benchmark datasets.
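
The general shape of such a pipeline (select spatial locations, weight channels by per-map statistics, sum-pool, normalize) can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the abstract does not give the selection rule or the CW formula, so the function name `aggregate_features`, the mean-energy selection mask, and the way the three statistics are combined are all hypothetical placeholders. Standard deviation is used here as a stand-in for what the abstract calls standard variance.

```python
import numpy as np


def aggregate_features(fmap, eps=1e-8):
    """Aggregate a (C, H, W) array of non-negative activations into a C-dim vector.

    Hypothetical sketch: both the selection rule and the weighting
    formula are placeholders, not the rules used in the paper.
    """
    c = fmap.shape[0]
    flat = fmap.reshape(c, -1)              # (C, H*W)

    # Placeholder selection: keep locations whose activation summed
    # over channels is at least the mean over all locations.
    energy = flat.sum(axis=0)
    keep = energy >= energy.mean()
    selected = flat[:, keep]

    # Per-channel statistics of the kind named in the abstract.
    std = flat.std(axis=1)                  # spread of each feature map
    nonzero = (flat > 0).mean(axis=1)       # fraction of non-zero responses
    total = flat.sum(axis=1)                # sum value of each feature map

    # Placeholder combination of the three statistics (an assumption).
    weights = (std / (std.max() + eps)
               + (1.0 - nonzero)
               + total / (total.max() + eps))

    # Weighted sum-pooling over selected locations, then L2 normalization.
    vec = (selected * weights[:, None]).sum(axis=1)
    return vec / (np.linalg.norm(vec) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.random((8, 5, 5))            # stand-in for a conv-layer output
    print(aggregate_features(demo).shape)
```

Because the output vectors are L2-normalized, two images could then be compared with a plain dot product, which is the usual reason for producing a single global vector per image.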

Notes

https://github.com/ShawnWXS/filckr_building

Acknowledgements

This research was funded by the National Natural Science Foundation of China (Grant 61603289), the China Postdoctoral Science Foundation (Grant 2016M602823), and the Fundamental Research Funds for the Central Universities (xjj2017118).

Author information

Authors and Affiliations

School of Software Engineering, Xi’an Jiaotong University, Xi’an, People’s Republic of China
Xinsheng Wang, Shanmin Pang, Jihua Zhu & Jiaxing Wang
School of Information Science and Technology, Northwest University, Xi’an, People’s Republic of China
Lin Wang

Corresponding author

Correspondence to Jihua Zhu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Pang, S., Zhu, J. et al. Unsupervised semantic-based convolutional features aggregation for image retrieval. Multimed Tools Appl 79, 14465–14489 (2020). https://doi.org/10.1007/s11042-018-6915-3

Received: 30 July 2018
Revised: 26 September 2018
Accepted: 19 November 2018
Published: 28 November 2018
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11042-018-6915-3
