Abstract
Aggregating local descriptors into super vectors achives excellent performance in image classification and retrieval tasks. Vector of locally aggregated descriptors(VLAD), which indexes images to compact representations by aggregating the residuals of descriptors and visual words, is a popular super vector encoding method among this kind. This paper will focus on the biggest difficulty of VLAD, the “visual burstiness”, reviste the basic assumptions and solutions along this line, then make modifications to two key steps of the initial VLAD process. The main contributions are twofold. Firstly, we start from local coordinate system(LCS) and propose the aggregated version(aggrLCS), which changes the objective and timing of coordinate rotation, for better captures of bursts. Secondly, an adaptive power-law normalization method is adopted to magnify the positive effect of power-law normalization by weighting each dimension respectively. Experiments on image retrieval tasks demonstrate that the proposed modifications show superior performance over the original and several variants of VLAD.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Delhumeau, J., Gosselin, P.-H., Jgou, H., Prez, P.: Revisiting the VLAD image representation. In: ACM Multimedia, pp. 653–656 (2013)
Arandjelovic, R., Zisserman, A.: All about VLAD. In: CVPR, pp. 1578–1585 (2013)
Spyromitros-Xioufis, E., Papadopoulos, S., Ginsca, A.L., Popescu, A., Kompatsiaris, Y.: Improving diversity in image search via supervised relevance scoring. In: ICMR (2015)
Ng, J.Y.H., Yang, F., Davis, L.S.: Exploiting local features from deep networks for image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 53–61 (2015)
Jgou, H., Douze, M., Schmid, C., Prez, P.: Aggregating local descriptors into a compact image representation. In: CVPR, pp. 3304–3311 (2010)
Jgou, H., Perronnin, F., Douze, M., Sanchez, J., Perez, P., Schmid, C.: Aggregating local image descriptors into compact codes. In: PAMI, pp. 1704–1716 (2012)
Chatfield, K., Lempitsky, V.S., Vedaldi, A., Zisserman, A.: The devil is in the details: an evaluation of recent feature encoding methods. In: BMVC (2011)
Jgou, H., Douze, M., Schmid, C.: On the burstiness of visual elements. In: CVPR, pp. 1169–1176 (2009)
Sivic, J., Zisserman, A.: Video Google: a text retrieval approach to object matching in videos. In: ICCV, pp. 1470–1477 (2003)
Arandjelovi, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: CVPR, pp. 2911–2918 (2012)
Simonyan, K., Vedaldi, A., Zisserman, A.: Descriptor learning using convex optimisation. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part I. LNCS, vol. 7572, pp. 243–256. Springer, Heidelberg (2012)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR, pp. 1794–1801 (2012)
Jégou, H., Chum, O.: Negative evidences and co-occurences in image retrieval: the benefit of PCA and whitening. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 774–787. Springer, Heidelberg (2012)
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the Fisher Kernel for large-scale image classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 143–156. Springer, Heidelberg (2010)
Perronnin, F., Liu, Y., Snchez, J., Poirier, H.: Large-scale image retrieval with compressed fisher vectors. In: CVPR, pp. 3384–3391 (2010)
Lowe, D.G.: Object recognition from local scale-invariant features. In: ICCV, pp. 1150–1157 (1999)
Bay, H., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006, Part I. LNCS, vol. 3951, pp. 404–417. Springer, Heidelberg (2006)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: CVPR, pp. 1–8 (2007)
Jgou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometry consistency for large scale image search. In: Proceedings of the 10th European Conference on Computer Vision: Part I, ECCV 2008, pp. 304-317 (2008)
Philbin, J., Chum, O., Isard, M., Sivic, J.: Lost in quantization: improving particular object retrieval in large scale image databases. In: CVPR (2008)
Acknowledgments
This research was supported by the National Natural Science Foundation of China (Grant No. 61271394 and 61571269). In the end, the authors would like to sincerely thank the reviewers for their valuable comments and advice.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, S., Ding, G., Li, C., Guo, Y. (2016). Image Representation Optimization Based on Locally Aggregated Descriptors. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science(), vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_9
Download citation
DOI: https://doi.org/10.1007/978-3-319-31750-2_9
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31749-6
Online ISBN: 978-3-319-31750-2
eBook Packages: Computer ScienceComputer Science (R0)