Region-of-Interest Retrieval in Large Image Datasets with Voronoi VLAD

Chadha, Aaron; Andreopoulos, Yiannis

doi:10.1007/978-3-319-20904-3_21

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9163)

Included in the following conference series:

International Conference on Computer Vision Systems

Abstract

We investigate the problem of visual-query based retrieval from large image datasets when the visual queries comprise arbitrary regions of interest (ROI) rather than entire images. We propose a compact image descriptor, termed Voronoi VLAD (VVLAD), that combines the vector of locally aggregated descriptors (VLAD) of Jégou et al. with a multi-level, Voronoi-based spatial partitioning of each dataset image. The proposed multi-level Voronoi partitioning applies a spatial hierarchical K-means to the interest-point locations and computes a VLAD over each cell. To reduce the matching complexity when handling very large datasets, we propose the following modifications. First, we use the tree structure of the spatial hierarchical K-means to perform a top-to-bottom pruning for local similarity maxima, rather than exhaustively matching against all cells (Fast-VVLAD). Second, we aggregate the VLADs of adjacent Voronoi cells to reduce the overall VVLAD storage requirement per image. Finally, we propose a new image similarity score for Fast-VVLAD that combines the relevant information from all partition levels into a single similarity measure. For a range of ROI queries in two standard datasets, Fast-VVLAD achieves comparable or higher mean Average Precision than the state-of-the-art Multi-VLAD framework while offering more than two-fold acceleration.
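The descriptor construction outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (kmeans, vlad, voronoi_vlads, best_cell_similarity), the branching factor, the number of levels, and the dot-product similarity are all assumptions made for illustration, and the matching shown is the exhaustive baseline over all cells rather than the pruned Fast-VVLAD search or the paper's multi-level similarity score.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave a centroid in place if its cluster is empty
                centroids[j] = members.mean(axis=0)
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)

def vlad(descriptors, codebook):
    """Sum residuals to the nearest visual word, flatten, L2-normalise."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    v = np.zeros(codebook.shape, dtype=float)
    for j in range(len(codebook)):
        members = descriptors[nearest == j]
        if len(members):
            v[j] = (members - codebook[j]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def voronoi_vlads(locations, descriptors, codebook, branching=2, levels=2):
    """Hierarchical spatial k-means over interest-point locations.

    Returns one list per level; each entry is (point_indices, cell_vlad).
    branching and levels are illustrative defaults, not values from the paper.
    """
    cells = [np.arange(len(locations))]
    pyramid = []
    for _ in range(levels):
        next_cells = []
        for idx in cells:
            if len(idx) < branching:
                next_cells.append(idx)  # too few points to split; carry the cell down
                continue
            _, labels = kmeans(locations[idx], branching)
            for j in range(branching):
                if np.any(labels == j):
                    next_cells.append(idx[labels == j])
        cells = next_cells
        pyramid.append([(idx, vlad(descriptors[idx], codebook)) for idx in cells])
    return pyramid

def best_cell_similarity(query_vlad, pyramid):
    """Exhaustive matching: highest dot product over every cell at every level."""
    return max(float(query_vlad @ v) for level in pyramid for _, v in level)
```

Assigning each interest point to its nearest spatial centroid is what makes the cells Voronoi regions of those centroids. A pruned search in the spirit of Fast-VVLAD would descend this tree and expand only promising cells instead of scoring all of them.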

This work was funded in part by Innovate UK, project REVQUAL (101855), and EPSRC (Industrial PhD CASE award, co-sponsored by BAFTA).

References

Arandjelovic, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2911–2918 (2012)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. of Comput. Vis. 60(2), 91–110 (2004)
Lazebnik, S., et al.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2169–2178 (2006)
Philbin, J., et al.: Lost in quantization: improving particular object retrieval in large scale image databases. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2008)
Chum, O., et al.: Total recall: automatic query expansion with a generative feature model for object retrieval. In: IEEE International Conference on Computer Vision, pp. 1–8 (2007)
Jégou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3304–3311 (2010)
Perronnin, F., Dance, C.: Fisher kernels on visual vocabularies for image categorization. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
Philbin, J., Chum, O., Isard, M., Sivic, J., Zisserman, A.: Object retrieval with large vocabularies and fast spatial matching. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1–8 (2007)
Arandjelovic, R., Zisserman, A.: All about VLAD. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 1578–1585 (2013)
Jégou, H., Douze, M., Schmid, C.: Hamming embedding and weak geometric consistency for large scale image search. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 304–317. Springer, Heidelberg (2008)
Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: IEEE International Conference on Computer Vision and Pattern Recognition, vol. 2, pp. II-264–II-271 (2003)
Perronnin, F., Liu, Y., Sánchez, J., Poirier, H.: Large-scale image retrieval with compressed fisher vectors. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3384–3391 (2010)
Jégou, H., Perronnin, F., Douze, M., Sánchez, J., Pérez, P., Schmid, C.: Aggregating local image descriptors into compact codes. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 9, pp. 1704–1716 (2012)
Jégou, H., Chum, O.: Negative evidences and co-occurences in image retrieval: the benefit of pca and whitening. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 774–787. Springer, Heidelberg (2012)
Chum, O., Matas, J.: Unsupervised discovery of co-occurrence in sparse high dimensional data. In: IEEE International Conference on Computer Vision and Pattern Recognition, pp. 3416–3423 (2010)
Mikolajczyk, K., et al.: A comparison of affine region detectors. Int. J. of Comput. Vis. 65(1–2), 43–72 (2005)
Mikolajczyk, K., Schmid, C.: An affine invariant interest point detector. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds.) ECCV 2002, Part I. LNCS, vol. 2350, pp. 128–142. Springer, Heidelberg (2002)

Author information

Authors and Affiliations

Department of Electrical and Electronic Engineering, University College London (UCL), London, UK
Aaron Chadha & Yiannis Andreopoulos

Corresponding author

Correspondence to Yiannis Andreopoulos.

Editor information

Editors and Affiliations

Aalborg University, Copenhagen, Denmark
Lazaros Nalpantidis
Aalborg University, Copenhagen, Denmark
Volker Krüger
Royal Institute of Technology - KTH, Stockholm, Sweden
Jan-Olof Eklundh
Democritus University of Thrace, Xanthi, Greece
Antonios Gasteratos

About this paper

Cite this paper

Chadha, A., Andreopoulos, Y. (2015). Region-of-Interest Retrieval in Large Image Datasets with Voronoi VLAD. In: Nalpantidis, L., Krüger, V., Eklundh, JO., Gasteratos, A. (eds) Computer Vision Systems. ICVS 2015. Lecture Notes in Computer Science, vol 9163. Springer, Cham. https://doi.org/10.1007/978-3-319-20904-3_21

DOI: https://doi.org/10.1007/978-3-319-20904-3_21
Published: 19 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20903-6
Online ISBN: 978-3-319-20904-3
eBook Packages: Computer Science, Computer Science (R0)
