
Multi-center convolutional descriptor aggregation for image retrieval

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Recent works have demonstrated that convolutional descriptor aggregation can provide state-of-the-art performance for image retrieval. In this paper, we propose a multi-center convolutional descriptor aggregation (MCDA) method that produces a global image representation for image retrieval. We first present a feature map center selection method to eliminate background information in the feature maps. We then propose channel weighting and spatial weighting schemes based on these centers to boost the contribution of features on the object. Finally, the weighted convolutional descriptors are aggregated to represent images. Experiments demonstrate that MCDA achieves state-of-the-art retrieval performance, and that the generated activation map is also effective for object localization.
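The exact center-selection rule and weighting formulas are given in the full paper; as a rough, hypothetical sketch of the pipeline the abstract describes (select activation centers, derive spatial and channel weights from them, then aggregate the weighted descriptors), the NumPy snippet below uses assumed rules: centers taken as the strongest peaks of the summed activation map, a Gaussian spatial falloff around the nearest center, and a sparsity-based channel weight. The function name aggregate_mcda and all parameters are illustrative, not the authors' implementation.

```python
# Hypothetical sketch only: the function name, the peak-based center selection,
# the Gaussian spatial weights, and the sparsity-based channel weights are
# illustrative assumptions, not the authors' MCDA implementation.
import numpy as np

def aggregate_mcda(feat, num_centers=3, sigma=None):
    """Aggregate a CNN feature map `feat` of shape (C, H, W) into one global vector."""
    C, H, W = feat.shape
    act_map = feat.sum(axis=0)                                   # (H, W) summed activations

    # Center selection: take the strongest positions of the summed activation map.
    flat_idx = np.argsort(act_map.ravel())[::-1][:num_centers]
    centers = np.stack(np.unravel_index(flat_idx, (H, W)), axis=1)   # (K, 2) of (y, x)

    # Spatial weighting: Gaussian falloff from the nearest center, so descriptors
    # far from every center (likely background) are down-weighted.
    ys, xs = np.mgrid[0:H, 0:W]
    if sigma is None:
        sigma = 0.5 * max(H, W)
    d2 = np.min([(ys - cy) ** 2 + (xs - cx) ** 2 for cy, cx in centers], axis=0)
    spatial_w = np.exp(-d2 / (2.0 * sigma ** 2))                 # (H, W)

    # Channel weighting: channels that fire sparsely get larger weights
    # (a common selectivity heuristic, assumed here rather than taken from the paper).
    q = (feat > 0).reshape(C, -1).mean(axis=1) + 1e-6            # non-zero ratio per channel
    channel_w = np.log(q.sum() / q)                              # (C,)

    # Weighted sum-pooling over spatial positions, then L2 normalisation.
    pooled = (feat * spatial_w[None, :, :]).reshape(C, -1).sum(axis=1)
    desc = pooled * channel_w
    return desc / (np.linalg.norm(desc) + 1e-12)

# Example with a random VGG-style conv5 map (512 channels on a 14x14 grid);
# the resulting 512-D vectors would be compared with cosine similarity.
feat = np.random.rand(512, 14, 14).astype(np.float32)
print(aggregate_mcda(feat, num_centers=3).shape)                 # (512,)
```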



Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grants 61772344 and 61732011), the Natural Science Foundation of SZU (Grants 827-000140, 827-000230, and 2017060), the National Social Science Foundation of China (Grant 17BTQ068), the Youth Foundation of the Education Bureau of Hebei Province (Grant QN2015099), a project funded by the China Postdoctoral Science Foundation (Grant 2017M621078), and the funding project of midwest colleges and universities for promoting the comprehensive strength of Hebei University.

Author information


Corresponding author

Correspondence to Shufang Wu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhu, J., Wu, S., Zhu, H. et al. Multi-center convolutional descriptor aggregation for image retrieval. Int. J. Mach. Learn. & Cyber. 10, 1863–1873 (2019). https://doi.org/10.1007/s13042-018-0898-2

