Region Matching Techniques for Spatial Bag of Visual Words Based Image Category Recognition

  • Conference paper
Artificial Neural Networks – ICANN 2010 (ICANN 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6352)

Abstract

Histograms of local features, known as bags of visual words (BoV), have proven to be powerful representations in image categorisation and object detection. BoV representations have been usefully extended in the spatial dimension by taking the spatial distribution of the features into account. In this paper we describe region matching strategies to be used in conjunction with such extensions. Of these, rigid region matching is the most commonly used. Here we present an alternative based on the Integrated Region Matching (IRM) technique, which loosens the constraint of geometrical rigidity of the images. After describing the techniques, we evaluate them in image category detection experiments that use 5000 photographic images taken from the PASCAL VOC Challenge 2007 benchmark. The experiments show that for many image categories rigid region matching performs slightly better. For some categories, however, IRM matching is a significantly more accurate alternative. As a consequence, we did not observe a significant difference on average. The best results were obtained by combining the two schemes.
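To make the contrast between the two matching strategies concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes each image is split into regions, each described by a BoV histogram. Rigid matching compares region i of one image only with region i of the other, whereas the IRM-style variant lets any region match any other, assigning matches greedily from the most similar pair onwards. The L1 histogram distance, the uniform region weights, the toy two-region layout and the function names `rigid_distance` and `irm_distance` are all assumptions made for illustration.

```python
def l1(h1, h2):
    """L1 distance between two histograms of equal length (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rigid_distance(regions_a, regions_b):
    """Rigid matching: region i of image A is compared only with region i of B."""
    assert len(regions_a) == len(regions_b)
    return sum(l1(a, b) for a, b in zip(regions_a, regions_b)) / len(regions_a)


def irm_distance(regions_a, regions_b):
    """IRM-style matching with greedy 'most similar pair first' assignment.

    Each region starts with a uniform significance weight (an assumption).
    Pairs are visited in order of increasing distance, and each pair receives
    as much weight as both of its regions still have available.
    """
    n, m = len(regions_a), len(regions_b)
    weight_a = [1.0 / n] * n
    weight_b = [1.0 / m] * m
    pairs = sorted(
        (l1(regions_a[i], regions_b[j]), i, j)
        for i in range(n)
        for j in range(m)
    )
    total = 0.0
    for dist, i, j in pairs:
        share = min(weight_a[i], weight_b[j])
        if share <= 0.0:
            continue  # one of the two regions is already fully matched
        total += share * dist
        weight_a[i] -= share
        weight_b[j] -= share
    return total


# Toy example: image B contains the same two regions as image A, but swapped.
image_a = [[1.0, 0.0], [0.0, 1.0]]
image_b = [[0.0, 1.0], [1.0, 0.0]]

print("rigid:", rigid_distance(image_a, image_b))
print("IRM:  ", irm_distance(image_a, image_b))
```

In this toy case the rigid scheme reports a large distance because corresponding positions disagree, while the IRM-style scheme finds the swapped regions and reports no difference. This is the kind of geometric flexibility the abstract refers to; whether it helps in practice depends on the image category.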

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Viitaniemi, V., Laaksonen, J. (2010). Region Matching Techniques for Spatial Bag of Visual Words Based Image Category Recognition. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15819-3_69

  • DOI: https://doi.org/10.1007/978-3-642-15819-3_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15818-6

  • Online ISBN: 978-3-642-15819-3

  • eBook Packages: Computer Science (R0)
