Image Representation Optimization Based on Locally Aggregated Descriptors

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9652)

Abstract

Aggregating local descriptors into super vectors achieves excellent performance in image classification and retrieval tasks. The vector of locally aggregated descriptors (VLAD), which encodes an image as a compact representation by aggregating the residuals between descriptors and visual words, is a popular super-vector encoding method of this kind. This paper focuses on the biggest difficulty of VLAD, "visual burstiness", revisits the basic assumptions and existing solutions along this line, and then modifies two key steps of the original VLAD pipeline. The main contributions are twofold. First, starting from the local coordinate system (LCS), we propose an aggregated version (aggrLCS) that changes the objective and timing of the coordinate rotation in order to capture bursts better. Second, we adopt an adaptive power-law normalization method that magnifies the positive effect of power-law normalization by weighting each dimension individually. Experiments on image retrieval tasks demonstrate that the proposed modifications outperform the original VLAD and several of its variants.
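
For readers unfamiliar with the baseline, the following is a minimal sketch of plain VLAD encoding with a uniform power-law normalization. It is not the aggrLCS or adaptive normalization proposed in the paper, whose details are not given in the abstract. The function name `vlad_encode`, the default exponent `alpha=0.5`, and the toy data are assumptions made for illustration, and the code uses only the standard library for portability.

```python
import math

def vlad_encode(descriptors, centroids, alpha=0.5):
    """Plain VLAD sketch: residual aggregation, power-law, then L2 normalization.

    descriptors: list of d-dimensional local descriptors (lists of floats).
    centroids:   list of k visual words (lists of floats), assumed precomputed.
    alpha:       power-law exponent (hypothetical default, applied uniformly).
    """
    k, d = len(centroids), len(centroids[0])
    blocks = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest visual word.
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centroids[i])),
        )
        # Accumulate the residual between the descriptor and that visual word.
        for j in range(d):
            blocks[nearest][j] += x[j] - centroids[nearest][j]
    # Concatenate the per-word residual sums into one k*d vector.
    flat = [v for block in blocks for v in block]
    # Power-law normalization: sign(v) * |v|**alpha damps large (bursty) components.
    flat = [math.copysign(abs(v) ** alpha, v) for v in flat]
    # L2-normalize so that images can be compared by inner product.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy example: four identical descriptors (a "burst") near word 0, one near word 1.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0]] * 4 + [[10.0, 11.0]]
damped = vlad_encode(descriptors, centroids, alpha=0.5)
undamped = vlad_encode(descriptors, centroids, alpha=1.0)
```

In this toy case the repeated descriptor dominates the unnormalized vector: its component is about 0.970 with `alpha=1.0` and drops to about 0.894 with `alpha=0.5`, while the component from the single descriptor rises from about 0.243 to about 0.447. This is the burst-damping effect that power-law normalization is meant to provide.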

Notes

  1. http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/.

  2. http://www.robots.ox.ac.uk/~vgg/data/parisbuildings/.

  3. http://lear.inrialpes.fr/~jegou/data.php.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61271394 and 61571269). The authors sincerely thank the reviewers for their valuable comments and advice.

Author information

Corresponding author

Correspondence to Shijiang Chen.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, S., Ding, G., Li, C., Guo, Y. (2016). Image Representation Optimization Based on Locally Aggregated Descriptors. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_9

  • DOI: https://doi.org/10.1007/978-3-319-31750-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31749-6

  • Online ISBN: 978-3-319-31750-2

  • eBook Packages: Computer Science, Computer Science (R0)
