Abstract
In recent years, the rapid growth of multimedia content has made image retrieval a challenging research task. Content-based image retrieval (CBIR) is a technique that uses image features to search a large image dataset for the images a user wants, with the request given in the form of a query image. Effective feature representation and similarity measures are crucial to the retrieval performance of CBIR. The key challenge has been attributed to the well-known semantic gap. Machine learning has been actively investigated as a possible way to bridge this gap, and the recent success of deep learning offers hope for bridging it in CBIR. In this paper, we investigate deep learning approaches used for CBIR tasks under varied settings. From our empirical studies, we find some encouraging conclusions and insights for future research.
Cite this article
Kapoor, R., Sharma, D. & Gulati, T. State of the art content based image retrieval techniques using deep learning: a survey. Multimed Tools Appl 80, 29561–29583 (2021). https://doi.org/10.1007/s11042-021-11045-1
DOI: https://doi.org/10.1007/s11042-021-11045-1