Exploring geometric information in CNN for image retrieval

Multimedia Tools and Applications

Abstract

Convolutional Neural Networks (CNNs) have brought significant improvements to various multimedia tasks. Image retrieval, in contrast, has benefited less, because no suitable training database is available. In this paper, we propose an unsupervised weighting scheme for pre-trained CNN models that adaptively emphasizes the image center. Unlike the common preference for fully connected layers, which represent abstract semantics, we aggregate the activations of convolutional layers on image patches to depict local patterns in detail. Empirically, the search target is usually the focus of an image. We therefore pool the features with respect to their positions, since they inherently preserve the geometric layout of the image. Experimental results on two benchmarks demonstrate the effectiveness of our method.
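The position-dependent pooling described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual weighting function, so the Gaussian center prior below, the function names center_weights and center_weighted_pool, the sigma_ratio parameter, and the 512 x 7 x 7 activation shape are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def center_weights(height, width, sigma_ratio=0.5):
    """Return an (H, W) grid of Gaussian weights that peak at the grid center.

    The Gaussian form and sigma_ratio are illustrative assumptions.
    """
    ys = np.arange(height) - (height - 1) / 2.0  # signed row offsets from center
    xs = np.arange(width) - (width - 1) / 2.0    # signed column offsets from center
    wy = np.exp(-(ys ** 2) / (2.0 * (sigma_ratio * height) ** 2))
    wx = np.exp(-(xs ** 2) / (2.0 * (sigma_ratio * width) ** 2))
    return np.outer(wy, wx)


def center_weighted_pool(feature_map, sigma_ratio=0.5):
    """Pool (C, H, W) convolutional activations into an L2-normalized C-dim vector.

    Each spatial position contributes in proportion to its center weight,
    so activations near the image center dominate the descriptor.
    """
    _, height, width = feature_map.shape
    weights = center_weights(height, width, sigma_ratio)
    pooled = (feature_map * weights[None, :, :]).sum(axis=(1, 2))
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Stand-in for the last convolutional layer's output of a pre-trained CNN.
rng = np.random.default_rng(0)
activations = rng.random((512, 7, 7))
descriptor = center_weighted_pool(activations)
```

Descriptors built this way could then be compared with a dot product (cosine similarity, since they are L2-normalized) to rank database images against a query.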

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61772111, in part by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (NSFC) under Grant 71421001, in part by the National Natural Science Foundation of China (NSFC) under Grant 61502073, and in part by the Fundamental Research Funds for the Central Universities (DUT18JC02).

Author information

Corresponding author

Correspondence to Xiangwei Kong.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Y., Kong, X. & Fu, H. Exploring geometric information in CNN for image retrieval. Multimed Tools Appl 78, 30585–30598 (2019). https://doi.org/10.1007/s11042-018-6414-6

  • DOI: https://doi.org/10.1007/s11042-018-6414-6
