Two-stage content based image retrieval using sparse representation and feature fusion

Multimedia Tools and Applications

Abstract

With the advent of large-scale image databases over the last two decades, content based image retrieval (CBIR) has been widely investigated. Studies show that the performance of a CBIR system depends mainly on the image descriptors and the similarity measure, so describing the content of an image effectively is a key problem in image retrieval. This study proposes a two-stage CBIR algorithm using sparse representation and feature fusion, in which global and local features are combined to retrieve images. The CBIR system comprises two parts: a rough retrieval stage and a main retrieval stage. In the rough retrieval stage, GIST features (a holistic global scene descriptor) are used to retrieve images with similar scene information by measuring the Canberra distance. Sparse coding and feature pooling are then used to obtain a sparse representation of the local features extracted from the rough retrieval results. Finally, the Euclidean distance between the sparse feature vectors is used to measure similarity and produce the final retrieval results. In experiments on the Coil20 and Caltech256 image datasets, the proposed method achieves the best precision (P), recall (R), F1-measure and mean average precision (MAP) values among the compared single-feature image retrieval algorithms, indicating superior retrieval performance.
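The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the global (GIST) descriptors and the sparse codes of the local features have already been computed, and every function name and parameter (`rough_retrieval`, `main_retrieval`, `n_candidates`, `top_k`, and so on) is hypothetical. The abstract does not name the pooling operator, so max pooling over absolute coefficients is used here purely as an assumption.

```python
import numpy as np


def canberra(query, feats):
    """Canberra distance between one query vector and each row of feats."""
    num = np.abs(feats - query)
    den = np.abs(feats) + np.abs(query)
    # Terms whose denominator is zero contribute 0 by convention.
    terms = np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)
    return terms.sum(axis=1)


def rough_retrieval(query_global, db_global, n_candidates):
    """Stage 1: keep the n_candidates images closest in Canberra distance."""
    dist = canberra(query_global, db_global)
    return np.argsort(dist, kind="stable")[:n_candidates]


def max_pool(codes):
    """Pool per-descriptor sparse codes (rows) into one image-level vector.

    Assumption: max pooling over absolute values; the abstract only says
    "feature pooling".
    """
    return np.max(np.abs(codes), axis=0)


def main_retrieval(query_sparse, db_sparse, candidates, top_k):
    """Stage 2: re-rank the stage-1 candidates by Euclidean distance."""
    dist = np.linalg.norm(db_sparse[candidates] - query_sparse, axis=1)
    order = np.argsort(dist, kind="stable")[:top_k]
    return candidates[order]


def precision_recall_f1(retrieved_labels, query_label, n_relevant):
    """P, R and F1 for one query, given the labels of the retrieved images."""
    hits = int(np.sum(retrieved_labels == query_label))
    p = hits / len(retrieved_labels)
    r = hits / n_relevant
    f1 = 0.0 if hits == 0 else 2 * p * r / (p + r)
    return p, r, f1
```

Restricting the second stage to the `candidates` returned by the first means the local features only need to be compared for a small subset of the database, which is the usual motivation for a coarse-to-fine design of this kind.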


References

  1. Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing Overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322. https://doi.org/10.1109/tsp.2006.881199


  2. Alizadeh S, Cemal K (2017) Automatic retrieval of shoeprint images using blocked sparse representation. Forensic Sci Int 277:103–114. https://doi.org/10.1016/j.forsciint.2017.05.025


  3. Celik C, Bilge HS (2017) Content based image retrieval with sparse representations and local feature descriptors: a comparative study. Pattern Recogn 68:1–13. https://doi.org/10.1016/j.patcog.2017.03.006


  4. Chen SS, Donoho DL, Saunders MA (2001) Atomic decomposition by basis pursuit. SIAM Rev 43(1):129–159. https://doi.org/10.1137/s003614450037906x


  5. Elad M, Aharon M (2006) Image Denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736–3745. https://doi.org/10.1109/tip.2006.881969


  6. Han X, Wu Z, Jiang YG et al (2017) Learning fashion compatibility with bidirectional LSTMs. In: 25th ACM international conference on multimedia. ACM, pp 1078–1086. https://doi.org/10.1145/3123266.3123394

  7. Huang W, Gao Y, Chan KL (2008) A review of region-based image retrieval. Journal of Signal Processing Systems 59(2):143–161. https://doi.org/10.1007/s11265-008-0294-3


  8. Huang Z, Wang R, Shan S et al (2015) Projection metric learning on grassmann manifold with application to video based face recognition. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp 140–149. https://doi.org/10.1109/CVPR.2015.7298609

  9. Husain SS, Bober M (2019) Remap: multi-layer entropy-guided pooling of dense cnn features for image retrieval. IEEE Trans Image Process 28(10):5201–5213. https://doi.org/10.1109/TIP.2019.2917234

  10. Jimenez A, Alvarez JM, Giro-i-Nieto X (2017) Class-weighted convolutional features for visual instance search. arXiv preprint arXiv:1707.02581


  11. Johnson J, Krishna R, Stark M, Li L et al (2015) Image retrieval using scene graphs. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2015.7298990


  12. Kang LW, Hsu CY, Chen HW, Lu CS, Lin CY, Pei SC (2011) Feature-based sparse representation for image similarity assessment. IEEE Transactions on Multimedia 13(5):1019–1030. https://doi.org/10.1109/TMM.2011.2159197

  13. Lai CC, Chen YC (2011) A User-Oriented Image Retrieval System Based on Interactive Genetic Algorithm. IEEE Transactions on Instrumentation and Measurement 60(10):3318–3325. https://doi.org/10.1109/tim.2011.2135010


  14. Lai H, Pan Y, Ye L et al (2015) Simultaneous feature learning and hash coding with deep neural networks. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2015.7298947


  15. Lai ZH, Chen YD, Wu J et al (2018) Jointly sparse hashing for image retrieval. IEEE Trans Image Process 27(12):6147–6158. https://doi.org/10.1109/TIP.2018.2867956

  16. Li H, Wang X, Tang J, Zhao C (2012) Combining global and local matching of multiple features for precise item image retrieval. Multimedia Systems 19(1):37–49. https://doi.org/10.1007/s00530-012-0265-1


  17. Li X, Yang J, Ma J (2021) Recent developments of content-based image retrieval (CBIR). Neurocomputing 452:675–689. https://doi.org/10.1016/j.neucom.2020.07.139

  18. Lin K, Yang HF, Hsiao JH, Chen CS (2015) Deep learning of binary hash codes for fast image retrieval. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops. https://doi.org/10.1109/CVPRW.2015.7301269

  19. Liu Y, Zhang D, Lu G, Ma WY (2007) A survey of content-based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282. https://doi.org/10.1016/j.patcog.2006.04.045


  20. Liu H, Wang R, Shan S et al (2016) Deep Supervised Hashing for Fast Image Retrieval. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2016.227


  21. Liu PZ, Guo JM, Wu CY et al (2017) Fusion of deep learning and compressed domain features for content-based image retrieval. IEEE Trans Image Process 26(12):5706–5717. https://doi.org/10.1109/TIP.2017.2736343

  22. Liu H, Wang W, Jiao P (2019) Content Based Image Retrieval via Sparse Representation and Feature Fusion. In: 2019 IEEE 8th data driven control and learning systems conference. https://doi.org/10.1109/DDCLS.2019.8908926


  23. Lowe DG (2004) Distinctive image features from scale-invariant Keypoints. Int J Comput Vis 60(2):91–110. https://doi.org/10.1023/b:visi.0000029664.99615.94


  24. Mairal J, Bach F, Ponce J et al (2010) Online learning for matrix factorization and sparse coding. J Mach Learn Res 11:19–60


  25. Mohamadzadeh S, Farsi H (2016) Content-based image retrieval system via sparse representation. IET Comput Vis 10(1):95–102. https://doi.org/10.1049/iet-cvi.2015.0165


  26. Nakazawa T, Kulkarni DV (2018) Wafer map defect pattern classification and image retrieval using convolutional neural network. IEEE Trans Semicond Manuf 31(2):309–314. https://doi.org/10.1109/TSM.2018.2795466

  27. Ning QN, Zhu JK, Zhong ZY et al (2017) Scalable image retrieval by sparse product quantization. IEEE Transactions on Multimedia 19(3):586–597. https://doi.org/10.1109/TMM.2016.2625260

  28. Oliva A, Torralba A (2001) Modeling the shape of the scene: a holistic representation of the spatial envelope. Int J Comput Vis 42(3):145–175. https://doi.org/10.1023/A:1011139631724

  29. Olshausen BA, Field DJ (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381(6583):607–609. https://doi.org/10.1038/381607a0


  30. Pati YC, Rezaiifar R, Krishnaprasad PS (1993) Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th Asilomar conference on signals, systems and computers. https://doi.org/10.1109/ACSSC.1993.342465


  31. Shen F, Xu Y, Liu L, Yang Y, Huang Z, Shen HT (2018) Unsupervised deep hashing with similarity-adaptive and discrete optimization. IEEE Trans Pattern Anal Mach Intell 40(12):3034–3044. https://doi.org/10.1109/tpami.2018.2789887


  32. Srinivas M, Naidu RR, Sastry CS, Mohan CK (2015) Content based medical image retrieval using dictionary learning. Neurocomputing 236:880–895. https://doi.org/10.1016/j.neucom.2015.05.036


  33. Tibshirani R, Johnstone I, Hastie T et al (2004) Least angle regression. Ann Stat 32(2):407–499. https://doi.org/10.1214/009053604000000067


  34. Wang RX, Peng GH (2016) Hesse Sparse Representation under n-words Model for Image Retrieval. Journal of Electronics and Information Technology 38(5):1115–1122 (in Chinese). https://doi.org/10.11999/JEIT150617

  35. Wang D, Hoi SCH, He Y et al (2014) Retrieval-based face annotation by weak label regularized local coordinate coding. IEEE Trans Pattern Anal Mach Intell 36(3):550–563. https://doi.org/10.1109/TPAMI.2013.145

  36. Wang YH, Cen YG, Zhao RZ et al (2017) Separable vocabulary and feature fusion for image retrieval based on sparse representation. Neurocomputing 236:14–22. https://doi.org/10.1016/j.neucom.2016.08.106

  37. Wang WW, Zhang HF, Zhang Z et al (2021) Sparse graph based self-supervised hashing for scalable image retrieval. Inf Sci 547:622–640. https://doi.org/10.1016/j.ins.2020.08.092

  38. Wei X, Luo J, Wu J et al (2017) Selective convolutional descriptor aggregation for fine-grained image retrieval. IEEE Trans Image Process 26(6):2868–2881. https://doi.org/10.1109/TIP.2017.2688133

  39. Wei S, Liao L, Li J, Zheng Q, Yang F, Zhao Y (2019) Saliency inside: learning attentive CNNs for content-based image retrieval. IEEE Trans Image Process 28(9):4580–4593. https://doi.org/10.1109/TIP.2019.2913513

  40. Wu P, Hoi SCH, Zhao P, Miao C, Liu ZY (2016) Online multi-modal distance metric learning with application to image retrieval. IEEE Trans Knowl Data Eng 28(2):454–467. https://doi.org/10.1109/TKDE.2015.2477296

  41. Xu P, Zhang L, Yang K, Yao H (2013) Nested-SIFT for efficient image matching and retrieval. IEEE Multimedia 20(3):34–46. https://doi.org/10.1109/mmul.2013.18


  42. Yang Y, Newsam S (2013) Geographic image retrieval using local invariant features. IEEE Trans Geosci Remote Sens 51(2):818–832. https://doi.org/10.1109/tgrs.2012.2205158


  43. Yang J, Wright J, Huang TS et al (2010) Image super-resolution via sparse representation. IEEE Trans Image Process 19(11):2861–2873. https://doi.org/10.1109/tip.2010.2050625


  44. Yang Z, Gao J, Xie Z et al (2013) Scene categorization of local Gist feature match kernel. Journal of Image and Graphics 18(3):264–270. https://doi.org/10.11834/jig.20130303

  45. Yang L, Xu Y, Wang J et al (2017) Ms-rmac: multi-scale regional maximum activation of convolutions for image retrieval. IEEE Signal Processing Letters 24(5):609–613. https://doi.org/10.1109/LSP.2017.2665522

  46. Yasmin M, Sharif M, Mohsin S (2013) Neural networks in medical imaging applications: a survey. World Appl Sci J 22(1):85–96


  47. Zhang Y, Pan P, Zheng Y et al (2018) Visual search at alibaba. In: 24th ACM SIGKDD international conference on Knowledge Discovery & Data Mining. ACM, pp 993–1001. https://doi.org/10.1145/3219819.3219820

  48. Zheng L, Yang Y, Tian Q (2018) Sift meets cnn: a decade survey of instance retrieval. IEEE Trans Pattern Anal Mach Intell 40(5):1224–1244. https://doi.org/10.1109/TPAMI.2017.2709749


Acknowledgments

An earlier version of this paper was presented at the 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS); see [22]. This work was funded by the National Natural Science Foundation of China under Grants 61703334 and 61973248, by the China Postdoctoral Science Foundation under Grant 2016M602942XB, and by the Key Project of the Shaanxi Key Research and Development Program under Grant 2018ZDXM-GY-089.

Author information


Corresponding author

Correspondence to Han Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, W., Jiao, P., Liu, H. et al. Two-stage content based image retrieval using sparse representation and feature fusion. Multimed Tools Appl 81, 16621–16644 (2022). https://doi.org/10.1007/s11042-022-12348-7



  • DOI: https://doi.org/10.1007/s11042-022-12348-7
