Abstract
With the advent of large-scale databases over the last two decades, content-based image retrieval (CBIR) has been widely investigated. Studies show that the performance of a CBIR system depends mainly on the image descriptors and the similarity measure, so describing image content effectively is a key problem in image retrieval. This study proposes a two-stage CBIR algorithm based on sparse representation and feature fusion, in which global and local features are combined to retrieve images. The system consists of two parts: a rough retrieval stage and a main retrieval stage. In the rough stage, GIST features, a global descriptor of scene content, are used to retrieve images with similar scene information by measuring the Canberra distance. In the main stage, sparse coding and feature pooling are used to obtain a sparse representation of the local features extracted from the rough retrieval results, and the Euclidean distance between the resulting sparse feature vectors is used to rank the final retrieval results. In experiments on the Coil20 and Caltech256 image datasets, the proposed method achieves the best precision (P), recall (R), F1-measure, and mean average precision (MAP) values among the compared single-feature image retrieval algorithms, indicating superior retrieval performance.
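The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the sparse coder, the pooling rule, or the dictionary, so the greedy orthogonal matching pursuit, the max pooling of absolute codes, the random dictionary `D`, and every function and parameter name below are assumptions made for illustration. Random vectors stand in for the GIST features and the local descriptors.

```python
import numpy as np

def canberra(a, b):
    """Canberra distance: sum_i |a_i - b_i| / (|a_i| + |b_i|); 0/0 terms count as 0."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    num = np.abs(a - b)
    den = np.abs(a) + np.abs(b)
    return float(np.sum(num / np.where(den > 0, den, 1.0)))

def omp(D, x, k):
    """Greedy orthogonal matching pursuit (assumed coder): use at most k atoms (columns) of D."""
    residual = x.astype(float).copy()
    support, code = [], np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # atom most correlated with the residual
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)  # refit on the chosen atoms
        residual = x - D[:, support] @ coef
    if support:
        code[support] = coef
    return code

def pooled_code(D, descriptors, k=3):
    """Max-pool the absolute sparse codes of one image's local descriptors (assumed pooling rule)."""
    codes = np.array([omp(D, d, k) for d in descriptors])
    return np.max(np.abs(codes), axis=0)

def two_stage_retrieve(q_global, q_local, db_global, db_local, D, n_rough, n_final):
    # Stage 1 (rough retrieval): keep the n_rough images closest in Canberra distance.
    rough = np.array([canberra(q_global, g) for g in db_global])
    candidates = np.argsort(rough, kind="stable")[:n_rough]
    # Stage 2 (main retrieval): re-rank the candidates by Euclidean distance between pooled codes.
    q_code = pooled_code(D, q_local)
    fine = np.array([np.linalg.norm(q_code - pooled_code(D, db_local[i])) for i in candidates])
    return candidates[np.argsort(fine, kind="stable")[:n_final]]

# Toy data: 6 images, a 10-dim global feature and five 8-dim local descriptors per image.
rng = np.random.default_rng(0)
D = rng.normal(size=(8, 16))
D /= np.linalg.norm(D, axis=0)          # unit-norm dictionary atoms
db_global = rng.random((6, 10))
db_local = rng.normal(size=(6, 5, 8))

# Query with image 2 itself, so it has distance 0 in both stages.
ranked = two_stage_retrieve(db_global[2], db_local[2], db_global, db_local, D, n_rough=4, n_final=2)
print(ranked)  # the query image (index 2) is ranked first
```

The rough stage only has to be cheap enough to shrink the candidate set; the more expensive sparse coding is then applied to `n_rough` images rather than to the whole database.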
References
Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing Overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322. https://doi.org/10.1109/tsp.2006.881199
Alizadeh S, Cemal K (2017) Automatic retrieval of shoeprint images using blocked sparse representation. Forensic Sci Int 277:103–114. https://doi.org/10.1016/j.forsciint.2017.05.025
Celik C, Bilge HS (2017) Content based image retrieval with sparse representations and local feature descriptors: a comparative study. Pattern Recogn 68:1–13. https://doi.org/10.1016/j.patcog.2017.03.006
Chen SS, Donoho DL, Saunders MA (2001) Atomic decomposition by basis pursuit. SIAM Rev 43(1):129–159. https://doi.org/10.1137/s003614450037906x
Elad M, Aharon M (2006) Image Denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736–3745. https://doi.org/10.1109/tip.2006.881969
Han X, Wu Z, Jiang YG et al (2017) Learning fashion compatibility with bidirectional lstms. In: 25th ACM international conference on multimedia. ACM, pp 1078–1086. https://doi.org/10.1145/3123266.3123394
Huang W, Gao Y, Chan KL (2008) A review of region-based image retrieval. Journal of Signal Processing Systems 59(2):143–161. https://doi.org/10.1007/s11265-008-0294-3
Huang Z, Wang R, Shan S et al (2015) Projection metric learning on grassmann manifold with application to video based face recognition. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp 140–149. https://doi.org/10.1109/CVPR.2015.7298609
Husain SS, Bober M (2019) Remap: multi-layer entropy-guided pooling of dense cnn features for image retrieval. IEEE Trans Image Process 28(10):5201–5213. https://doi.org/10.1109/TIP.2019.2917234
Jimenez A, Alvarez JM, Giro-I-Nieto X (2017) Class-weighted convolutional features for visual instance search. arXiv preprint arXiv:1707.02581
Johnson J, Krishna R, Stark M, Li L et al (2015) Image retrieval using scene graphs. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2015.7298990
Kang LW, Hsu CY, Chen HW, Lu CS, Lin CY, Pei SC (2011) Feature-based sparse representation for image similarity assessment. IEEE Transactions on Multimedia 13(5):1019–1030. https://doi.org/10.1109/TMM.2011.2159197
Lai CC, Chen YC (2011) A User-Oriented Image Retrieval System Based on Interactive Genetic Algorithm. IEEE Transactions on Instrumentation and Measurement 60(10):3318–3325. https://doi.org/10.1109/tim.2011.2135010
Lai H, Pan Y, Ye L et al (2015) Simultaneous feature learning and hash coding with deep neural networks. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2015.7298947
Lai ZH, Chen YD, Wu J et al (2018) Jointly sparse hashing for image retrieval. IEEE Trans Image Process 27(12):6147–6158. https://doi.org/10.1109/TIP.2018.2867956
Li H, Wang X, Tang J, Zhao C (2012) Combining global and local matching of multiple features for precise item image retrieval. Multimedia Systems 19(1):37–49. https://doi.org/10.1007/s00530-012-0265-1
Li X, Yang J, Ma J (2021) Recent developments of content-based image retrieval (CBIR). Neurocomputing 452:675–689. https://doi.org/10.1016/j.neucom.2020.07.139
Lin K, Yang HF, Hsiao JH, Chen CS (2015) Deep learning of binary hash codes for fast image retrieval. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops. https://doi.org/10.1109/CVPRW.2015.7301269
Liu Y, Zhang D, Lu G, Ma WY (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282. https://doi.org/10.1016/j.patcog.2006.04.045
Liu H, Wang R, Shan S et al (2016) Deep Supervised Hashing for Fast Image Retrieval. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2016.227
Liu PZ, Guo JM, Wu CY et al (2017) Fusion of deep learning and compressed domain features for content-based image retrieval. IEEE Trans Image Process 26(12):5706–5717. https://doi.org/10.1109/TIP.2017.2736343
Liu H, Wang W, Jiao P (2019) Content Based Image Retrieval via Sparse Representation and Feature Fusion. In: 2019 IEEE 8th data driven control and learning systems conference. https://doi.org/10.1109/DDCLS.2019.8908926
Lowe DG (2004) Distinctive image features from scale-invariant Keypoints. Int J Comput Vis 60(2):91–110. https://doi.org/10.1023/b:visi.0000029664.99615.94
Mairal J, Bach F, Ponce J et al (2010) Online learning for matrix factorization and sparse coding. J Mach Learn Res 11:19–60
Mohamadzadeh S, Farsi H (2016) Content-based image retrieval system via sparse representation. IET Comput Vis 10(1):95–102. https://doi.org/10.1049/iet-cvi.2015.0165
Nakazawa T, Kulkarni DV (2018) Wafer map defect pattern classification and image retrieval using convolutional neural network. IEEE Trans Semicond Manuf 31(2):309–314. https://doi.org/10.1109/TSM.2018.2795466
Ning QN, Zhu JK, Zhong ZY et al (2017) Scalable image retrieval by sparse product quantization. IEEE Transactions on Multimedia 19(3):586–597. https://doi.org/10.1109/TMM.2016.2625260
Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. IJCV 42(3):145–175. https://doi.org/10.1023/A:1011139631724
Olshausen BA, Field DJ (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583):607–609. https://doi.org/10.1038/381607a0
Pati YC, Rezaiifar R, Krishnaprasad PS (1993) Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th Asilomar conference on signals, systems and computers. https://doi.org/10.1109/ACSSC.1993.342465
Shen F, Xu Y, Liu L, Yang Y, Huang Z, Shen HT (2018) Unsupervised deep hashing with similarity-adaptive and discrete optimization. IEEE Trans Pattern Anal Mach Intell 40(12):3034–3044. https://doi.org/10.1109/tpami.2018.2789887
Srinivas M, Naidu RR, Sastry CS, Mohan CK (2015) Content based medical image retrieval using dictionary learning. Neurocomputing 236:880–895. https://doi.org/10.1016/j.neucom.2015.05.036
Tibshirani R, Johnstone I, Hastie T et al (2004) Least angle regression. Ann Stat 32(2):407–499. https://doi.org/10.1214/009053604000000067
Wang RX, Peng GH (2016) Hesse Sparse Representation under n-words Model for Image Retrieval. Journal of Electronics and Information Technology 38(5):1115–1122 (in Chinese). https://doi.org/10.11999/JEIT150617
Wang D, Hoi SCH, He Y et al (2014) Retrieval-based face annotation by weak label regularized local coordinate coding. IEEE Trans Pattern Anal Mach Intell 36(3):550–563. https://doi.org/10.1109/TPAMI.2013.145
Wang YH, Cen YG, Zhao RZ et al (2017) Separable vocabulary and feature fusion for image retrieval based on sparse representation. Neurocomputing 236:14–22. https://doi.org/10.1016/j.neucom.2016.08.106
Wang WW, Zhang HF, Zhang Z et al (2021) Sparse graph based self-supervised hashing for scalable image retrieval. Inf Sci 547:622–640. https://doi.org/10.1016/j.ins.2020.08.092
Wei X, Luo J, Wu J et al (2017) Selective convolutional descriptor aggregation for fine-grained image retrieval. IEEE Trans Image Process 26(6):2868–2881. https://doi.org/10.1109/TIP.2017.2688133
Wei S, Liao L, Li J, Zheng Q, Yang F, Zhao Y (2019) Saliency inside: learning attentive CNNs for content-based image retrieval. IEEE Trans Image Process 28(9):4580–4593. https://doi.org/10.1109/TIP.2019.2913513
Wu P, Hoi SCH, Zhao P, Miao C, Liu ZY (2016) Online multi-modal distance metric learning with application to image retrieval. IEEE Trans Knowl Data Eng 28(2):454–467. https://doi.org/10.1109/TKDE.2015.2477296
Xu P, Zhang L, Yang K, Yao H (2013) Nested-SIFT for efficient image matching and retrieval. IEEE Multimedia 20(3):34–46. https://doi.org/10.1109/mmul.2013.18
Yang Y, Newsam S (2013) Geographic image retrieval using local invariant features. IEEE Trans Geosci Remote Sens 51(2):818–832. https://doi.org/10.1109/tgrs.2012.2205158
Yang J, Wright J, Huang TS et al (2010) Image super-resolution via sparse representation. IEEE Trans Image Process 19(11):2861–2873. https://doi.org/10.1109/tip.2010.2050625
Yang Z, Gao J, Xie Z et al (2013) Scene categorization of local Gist feature match kernel. Journal of Image and Graphics 18(3):264–270. https://doi.org/10.11834/jig.20130303
Yang L, Xu Y, Wang J et al (2017) Ms-rmac: multi-scale regional maximum activation of convolutions for image retrieval. IEEE Signal Processing Letters 24(5):609–613. https://doi.org/10.1109/LSP.2017.2665522
Yasmin M, Sharif M, Mohsin S (2013) Neural networks in medical imaging applications: a survey. World Appl Sci J 22(1):85–96
Zhang Y, Pan P, Zheng Y et al (2018) Visual search at alibaba. In: 24th ACM SIGKDD international conference on Knowledge Discovery & Data Mining. ACM, pp 993–1001. https://doi.org/10.1145/3219819.3219820
Zheng L, Yang Y, Tian Q (2018) Sift meets cnn: a decade survey of instance retrieval. IEEE Trans Pattern Anal Mach Intell 40(5):1224–1244. https://doi.org/10.1109/TPAMI.2017.2709749
Acknowledgments
An earlier version of this paper was presented at the 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS); see [22]. This work was funded by the National Natural Science Foundation of China under Grants 61703334 and 61973248, by the China Postdoctoral Science Foundation under Grant 2016M602942XB, and by the Key Project of the Shaanxi Key Research and Development Program under Grant 2018ZDXM-GY-089.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, W., Jiao, P., Liu, H. et al. Two-stage content based image retrieval using sparse representation and feature fusion. Multimed Tools Appl 81, 16621–16644 (2022). https://doi.org/10.1007/s11042-022-12348-7
DOI: https://doi.org/10.1007/s11042-022-12348-7