A comparative study of irregular pyramid matching in bag-of-bags of words model for image retrieval

Ren, Yi

doi:10.1007/s11760-015-0763-7

Original Paper
Published: 22 March 2015

Volume 10, pages 471–478, (2016)
Signal, Image and Video Processing

Yi Ren¹

Abstract

In this paper, we assess three standard approaches to building irregular pyramid partitions for image retrieval in the bag-of-bags of words (BBoW) model that we recently proposed: kernel $k$-means optimizing multilevel weighted graph cuts, normalized cuts, and graph cuts. The BBoW model is based on irregular pyramid partitions of the image. An image is first represented as a connected graph of local features on a regular grid of pixels. Irregular partitions (subgraphs) of the image are then built using graph partitioning methods, and each subgraph in the partition is represented by its own signature. With the aid of this graph, the BBoW model extends the classical bag-of-words model by embedding color homogeneity and limited spatial information through irregular partitions of an image. Unlike existing image retrieval methods such as spatial pyramid matching (SPM), the BBoW model does not assume that similar parts of a scene always appear at the same location in images of the same category. Extending the proposed model to a pyramid gives rise to a method we name irregular pyramid matching. Experiments on the Caltech-101 benchmark demonstrate that applying kernel $k$-means to the graph clustering process produces better retrieval results for BBoW than other graph partitioning methods such as graph cuts and normalized cuts. Moreover, the proposed method achieves results comparable to SPM on the whole Caltech-101 dataset and outperforms it in 19 object categories.
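The idea of one signature per subgraph can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the use of L1-normalised visual-word histograms as signatures, and the best-match histogram-intersection rule are all assumptions made for illustration, and the partition itself is taken as given (it could come from any graph partitioning method).

```python
from collections import Counter


def region_signatures(words, regions, vocab_size):
    """Build one L1-normalised bag-of-words histogram per region (subgraph).

    words[i]   -- visual-word index assigned to grid node i
    regions[i] -- region label of node i, as produced by some graph partitioner
    """
    counts = {}
    for w, r in zip(words, regions):
        counts.setdefault(r, Counter())[w] += 1
    signatures = {}
    for r, c in counts.items():
        total = sum(c.values())
        signatures[r] = [c[v] / total for v in range(vocab_size)]
    return signatures


def hist_intersection(h1, h2):
    """Histogram intersection of two normalised histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def bag_of_bags_similarity(sigs_a, sigs_b):
    """Assumed matching rule: each region of A is compared with its most
    similar region of B, wherever it lies, and the scores are averaged."""
    best = [max(hist_intersection(ha, hb) for hb in sigs_b.values())
            for ha in sigs_a.values()]
    return sum(best) / len(best)


def fixed_position_similarity(sigs_a, sigs_b):
    """Baseline for contrast: compare only regions that share the same label,
    i.e. the same position, as a fixed spatial grid would."""
    shared = [r for r in sigs_a if r in sigs_b]
    return sum(hist_intersection(sigs_a[r], sigs_b[r]) for r in shared) / len(sigs_a)


# Toy example: the same two "parts" appear in both images, but swapped.
sigs_a = region_signatures(words=[0, 0, 1, 1], regions=[0, 0, 1, 1], vocab_size=2)
sigs_b = region_signatures(words=[1, 1, 0, 0], regions=[0, 0, 1, 1], vocab_size=2)

print(bag_of_bags_similarity(sigs_a, sigs_b))     # location-free matching
print(fixed_position_similarity(sigs_a, sigs_b))  # position-bound matching
```

In this toy case the location-free score is 1.0 while the position-bound score is 0.0, which mirrors the point that the model does not require similar parts to appear at the same location.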

Notes

http://trec.nist.gov/.
http://perso-etis.ensea.fr/yren/thesis/thesis.pdf.
For more detail, see supplementary material online.

References

Agarwal, A., Triggs, B.: Hyperfeatures—multilevel local coding for visual recognition. In: ECCV, pp. 30–43 (2006)
Birchfield, S., Rangarajan, S.: Spatiograms versus histograms for region-based tracking. In: CVPR, pp. 1158–1163 (2005)
Boykov, Y., Kolmogorov, V.: An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell. 26(9), 1124–1137 (2004)
Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)
Bunke, H., Riesen, K.: Towards the unification of structural and statistical pattern recognition. Pattern Recognit. Lett. 33(7), 811–825 (2012)
Chatfield, K., Lempitsky, V., Vedaldi, A., Zisserman, A.: The devil is in the details: an evaluation of recent feature encoding methods. In: BMVC, pp. 1–12 (2011)
Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means: spectral clustering and normalized cuts. In: International Conference on Knowledge Discovery and Data Mining, pp. 551–556 (2004)
Dhillon, I.S., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors: a multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007)
Duchenne, O., Joulin, A., Ponce, J.: A graph-matching kernel for object categorization. In: ICCV (2011)
Gibert, J., Valveny, E., Bunke, H.: Graph embedding in vector spaces by node attribute statistics. Pattern Recognit. 45(9), 3072–3083 (2012)
Jegou, H., Douze, M., Schmid, C., Pérez, P.: Aggregating local descriptors into a compact image representation. In: CVPR (2010)
Jou, F.D., Fan, K.C., Chang, Y.L.: Efficient matching of large-size histograms. Pattern Recognit. Lett. 25(3), 277–286 (2004)
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 147–159 (2004)
Krapac, J., Verbeek, J.J., Jurie, F.: Modeling spatial layout with fisher vectors for image categorization. In: ICCV, pp. 1487–1494 (2011)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
Leibe, B., Leonardis, A., Schiele, B.: Combined object categorization and segmentation with an implicit shape model. In: ECCV Workshop on Statistical Learning in Computer Vision (2004)
Li, F.F., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In: Computer Vision and Pattern Recognition Workshop on Generative-Model Based Vision, p. 178 (2004)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004)
McCann, S., Lowe, D.G.: Spatially local coding for object recognition. In: ACCV, pp. 204–217 (2012)
Perronnin, F., Dance, C.R.: Fisher kernels on visual vocabularies for image categorization. In: CVPR (2007)
Perronnin, F., Sánchez, J., Mensink, T.: Improving the fisher kernel for large-scale image classification. In: ECCV, pp. 143–156 (2010)
Ren, X., Malik, J.: Learning a classification model for segmentation. In: ICCV (2003)
Ren, Y., Bugeau, A., Benois-Pineau, J.: Bag-of-bags of words—irregular graph pyramids vs spatial pyramid matching for image retrieval. In: IPTA, pp. 247–252 (2014)
Rother, C., Kolmogorov, V., Blake, A.: Grabcut: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. 23(3), 309–314 (2004)
Sánchez, J., Perronnin, F., de Campos, T.E.: Modeling the spatial layout of images beyond spatial pyramids. Pattern Recognit. Lett. 33(16), 2216–2223 (2012)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
Sivic, J., Zisserman, A.: Video google: a text retrieval approach to object matching in videos. In: ICCV, pp. 1470–1477 (2003)
Wang, J., Yang, J., Yu, K., Lv, F., Huang, T.S., Gong, Y.: Locality-constrained linear coding for image classification. In: CVPR, pp. 3360–3367 (2010)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR, pp. 1794–1801 (2009)

Acknowledgments

This work was conducted as part of the author's Ph.D. research, supported by the CNRS (Centre national de la recherche scientifique) and a Region of Aquitaine grant.

Author information

Authors and Affiliations

University of Bordeaux, LaBRI, CNRS UMR 5800, Domaine universitaire, 351, cours de la Libération, 33405, Talence, France
Yi Ren

Corresponding author

Correspondence to Yi Ren.

About this article

Cite this article

Ren, Y. A comparative study of irregular pyramid matching in bag-of-bags of words model for image retrieval. SIViP 10, 471–478 (2016). https://doi.org/10.1007/s11760-015-0763-7

Received: 31 October 2014
Revised: 16 February 2015
Accepted: 26 February 2015
Published: 22 March 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s11760-015-0763-7

Electronic supplementary material

Supplementary material 1 (pdf 4097 KB)

Supplementary material 2 (pdf 22129 KB)

Supplementary material 3 (pdf 14397 KB)
