Abstract
Deep features extracted from the convolutional layers of pre-trained CNNs have been widely used in the image retrieval task. These features, however, are in a large number and probably cannot be directly used for similarity evaluation due to lack of efficiency. Thus, it is of great importance to study how to aggregate deep features into a global yet distinctive image vector. This paper first introduces a simple but effective method to select informative features based on semantic content of feature maps. Then, we propose an effective channel weighting method (CW) for selected features by analyzing relations between the discriminative activation and distribution parameters of feature maps, including standard variance, non-zero responses and sum value. Furthermore, we provide a solution to pick semantic detectors that are independent on gallery images. Based on the aforementioned three strategies, we derive a global image vector generation method, and demonstrate its state-of-the-art performance on benchmark datasets.
Similar content being viewed by others
References
Azizpour H, Razavian AS, Sullivan J, Maki A, Carlsson S (2015) From generic to specific deep representations for visual recognition. In: Computer vision and pattern recognition workshops. pp 36–45
Babenko A, Lempitsky V (2015) Aggregating deep convolutional features for image retrieval. Computer Science
Babenko A, Slesarev A, Chigorin A, Lempitsky V (2014) Neural Codes for Image Retrieval 8689:584–599
Cao X, Wang P, Meng C, Bai X, Gong G, Liu M, Qi J (2018) Region based CNN for foreign object debris detection on airfield pavement. Sensors 18(3):737
Chen Z, Kuang Z, Wong KYK, Zhang W (2017) Aggregated deep feature from activation clusters for particular object retrieval. In: Thematic workshops of ACM multimedia. pp 44–51
Chu WT, Wu YL (2018) Image style classification based on learnt deep correlation features. IEEE Trans Multimed (99):1–1
Chum O, Philbin J, Sivic J, Isard M, Zisserman A (2007) Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval. 1–8
Do TT, Hoang T, Tan DKL, Cheung NM (2018) From Selective Deep Convolutional Features to Compact Binary Representations for Image Retrieval
Gao L, Guo Z, Zhang H, Xu X, Shen HT (2017) Video captioning with attention-based LSTM and semantic consistency. IEEE Trans Multimed 19(9):2045–2055
Gong Y, Wang L, Guo R, Lazebnik S (2014) Multi-scale Orderless Pooling of Deep Convolutional Activation Features. 8695:392–407
Gordo A, Almazán J, Revaud J, Larlus D (2016) Deep image retrieval: learning global representations for image search. In: European conference on computer vision. pp 241–257
Gordo A, Almazán J, Revaud J, Larlus D (2016) End-to-end learning of deep visual representations for image retrieval. Int J Comput Vis:1–18
He L, Xu X, Lu H, Yang Y, Shen F, Shen HT (2017) Unsupervised cross-modal retrieval through adversarial learning. IEEE Int Conf Multimed Expo: 1153–1158
Jégou H, Chum O (2012) Negative evidences and co-occurences in image retrieval: the benefit of PCA and whitening. Eur Conf Comput Vision: 774–787
Jégou H, Zisserman A (2014) Triangulation Embedding and Democratic aggregation for image search. In: Computer vision and pattern recognition. pp 3310–3317
Jegou H, Douze M, Schmid C (2009) On the burstiness of visual elements. Computer Vision Pattern Recogn 2009. CVPR 2009. IEEE Conf: 1169–1176
Jian X, Chunheng W, Chengzuo Q, Cunzhao S, Baihua X (2018) Unsupervised Semantic-based Aggregation of Deep Convolutional Features. arXiv:10
Kalantidis Y, Mellina C, Osindero S (2016) Cross-dimensional weighting for aggregated deep convolutional features. In: European conference on computer vision. 685–701
Kim DS, Arsalan M, Park KR (2018) Convolutional neural network-based shadow detection in images using visible light camera sensor. Sensors 18 (4)
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Int Conf Neural Inform Process Syst: 1097–1105
Lowe DG (2004) Distinctive image features from scale-invariant Keypoints. Int J Comput Vis 60(2):91–110
Lu H, Li Y, Chen M, Kim H, Serikawa S (2017) Brain intelligence: go beyond artificial intelligence. Mobile Netw Appl 23(2):368–375
Lu H, Li Y, Mu S, Wang D, Kim H, Serikawa S (2017) Motor anomaly detection for unmanned aerial vehicles using reinforcement learning. IEEE Int Things J PP (99):1–1
Lu H, Li Y, Uemura T, Ge Z, Xu X, He L, Serikawa S, Kim H (2017) FDCNet: filtering deep convolutional network for marine organism classification. Multimed Tools Appl (2):1–14
Lu H, Li B, Zhu J, Li Y, Li Y, Xu X, He L, Li X, Li J, Serikawa S (2017) Wound intensity correction and segmentation with convolutional neural networks. Concurr Comput Pract Exper 29 (6)
Lu H, Li Y, Uemura T, Kim H, Serikawa S (2018) Low illumination underwater light field images reconstruction using deep convolutional neural networks. Futur Gener Comput Syst 82
Mao XJ, Shen C, Yang YB (2016) Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections
Pang S, Ma J, Zhu J, Xue J, Tian Q Improving object retrieval quality by integration of similarity propagation and query expansion. IEEE Trans Multimed (99):1–1
Philbin J, Chum O, Isard M, Sivic J, Zisserman A (2007) Object retrieval with large vocabularies and fast spatial matching. In: Computer vision and pattern recognition, 2007. CVPR 2007. IEEE conference on. pp 1–8
Philbin J, Chum O, Isard M, Sivic J (2008) Lost in quantization: improving particular object retrieval in large scale image databases. In: Computer vision and pattern recognition, 2008. CVPR 2008. IEEE conference on. pp 1–8
Radenović F, Tolias G, Chum O CNN (2016) Image retrieval learns from BoW: unsupervised fine-tuning with hard examples. In: European conference on computer vision. 3–20
Ran L, Zhang Y, Wei W, Zhang Q (2017) A hyperspectral image classification framework with spatial pixel pair features. Sensors 17(10):2421
Razavian AS, Azizpour H, Sullivan J, Carlsson S CNN (2014) Features off-the-shelf: an astounding baseline for recognition. In: IEEE conference on computer vision and pattern recognition workshops. 512–519
Razavian AS, Sullivan J, Maki A, Carlsson S (2014) A baseline for visual instance retrieval with deep convolutional networks. Computer Science
Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE transactions on Pattern Analysis & Machine. Intelligence 39(6):1137–1149
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Scott BL, Hardesty LH (2018) Method and apparatus for speech recognition. J Acoust Soc Am 109(3):864
Serikawa S, Lu H (2014) Underwater image dehazing using joint trilateral filter. Pergamon press, Inc
Shelhamer E, Long J, Darrell T (2017) Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(4):640–651
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. Computer Science
Tolias G, Sicre R, Jégou H (2015) Particular object retrieval with integral max-pooling of CNN activations. Computer Science
Tollari S, Detyniecki M, Marsala C, Fakeri-Tabrizi A, Amini MR, Gallinari P (2009) Exploiting visual concepts to improve text-based image retrieval. Eur Conf Ir Res Adv Inform Retriev: 701–705
Tuan H, Thanh-Toan D, Dang-Khoa Le T, Ngai-Man C (2017) Selective deep convolutional features for image retrieval arXiv:9 pp.-9 pp
Wang L, Xu X, Dong H, Gui R, Pu F (2018) Multi-pixel simultaneous classification of PolSAR image using convolutional neural networks. Sensors 18(3):769
Wang X, Pang S, Zhu J, Wang J, Wang L (2018) An efficient aggregation method of convolutional features for image retrieval. In: International symposium on artificial intelligence and robotics, Nanjing, China
Wang J, Zhu J, Pang S, Li Z, Li Y, Qian X (2018) Adaptive Co-weighting Deep Convolutional Features For Object Retrieval
Wei XS, Luo JH, Wu J, Zhou ZH (2016) Selective convolutional descriptor aggregation for fine-grained image retrieval. IEEE Trans Image Proc 99:1–1
Xiu-Shen W, Jian-Hao L, Jianxin W (2016) Selective convolutional descriptor aggregation for fine-grained image retrieval. arXiv:16 pp.-16 pp.
Xu X, He L, Shimada A, Taniguchi RI, Lu H (2016) Learning unified binary codes for cross-modal retrieval via latent semantic hashing. Neurocomputing 213:191–203
Xu X, Shen F, Yang Y, Shen HT, Li X (2017) Learning discriminative binary codes for large-scale cross-modal retrieval. IEEE Trans Image Process (99):1–1
Xu J, Shi C, Qi C, Wang C, Xiao B (2017) Unsupervised Part-based Weighting Aggregation of Deep Convolutional Features for Image Retrieval
Xu X, He L, Lu H, Gao L, Ji Y (2018) Deep adversarial metric learning for cross-modal retrieval. World Wide web-internet & web Inf Syst:1–16
Yandex AB, Lempitsky V (2016) Aggregating local deep features for image retrieval. In: IEEE international conference on computer vision. 1269–1277
Yang J, She D, Sun M, Cheng MM, Rosin P, Wang L (2018) Visual sentiment prediction based on automatic discovery of affective regions. IEEE Trans Multimed (99):1–1
Zeiler MD, Fergus R (2013) Visualizing and Understanding Convolutional Networks 8689:818–833
Zhang X, Xiong H, Zhou W, Lin W, Tian Q (2016) Picking deep filter responses for fine-grained image recognition. In: Computer vision and pattern recognition, –1142
Zhang Y, Wei XS, Wu J, Cai J, Lu J, Nguyen VA, Do MN (2016) Weakly supervised fine-grained categorization with part-based image representation. IEEE Trans Image Process 25(4):1713–1725
Acknowledgements
This research was funded by National Natural Science Foundation of China Grant 61603289, China Postdoctoral Science Foundation Grant 2016 M602823, and Fundamental Research Funds for the Central Universities xjj2017118.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Wang, X., Pang, S., Zhu, J. et al. Unsupervised semantic-based convolutional features aggregation for image retrieval. Multimed Tools Appl 79, 14465–14489 (2020). https://doi.org/10.1007/s11042-018-6915-3
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-018-6915-3