Region Matching Techniques for Spatial Bag of Visual Words Based Image Category Recognition

Viitaniemi, Ville; Laaksonen, Jorma

doi:10.1007/978-3-642-15819-3_69

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6352)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Histograms of local features, known as bags of visual words (BoV), have proven to be powerful representations in image categorisation and object detection. BoV representations have been usefully extended in the spatial dimension by taking the spatial distribution of the features into account. In this paper we describe region matching strategies to be used in conjunction with such extensions. Of these, rigid region matching is the most commonly used. Here we present an alternative based on the Integrated Region Matching (IRM) technique, which loosens the constraint of geometrical rigidity of the images. After describing the techniques, we evaluate them in image category detection experiments that use 5000 photographic images taken from the PASCAL VOC Challenge 2007 benchmark. The experiments show that for many image categories rigid region matching performs slightly better. For some categories, however, IRM matching is a significantly more accurate alternative. As a consequence, we did not observe a significant difference on average. The best results were obtained by combining the two schemes.
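
The contrast between the two matching strategies named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that each image is described by a list of per-region BoV histograms, that regions are compared with a chi-squared distance, and that IRM is computed with the greedy "closest pair first" procedure of the original IRM formulation using uniform region weights. The function names `chi2_distance`, `rigid_distance` and `irm_distance` are hypothetical, and the paper may use a different region distance, weighting or solver.

```python
from typing import List, Optional, Sequence

Histogram = Sequence[float]


def chi2_distance(h1: Histogram, h2: Histogram) -> float:
    """Chi-squared distance between two histograms (an assumed, illustrative choice)."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def rigid_distance(regions_a: List[Histogram], regions_b: List[Histogram]) -> float:
    """Rigid matching: region k of one image is compared only with region k of the other."""
    if len(regions_a) != len(regions_b):
        raise ValueError("rigid matching needs the same region layout in both images")
    pair_distances = [chi2_distance(a, b) for a, b in zip(regions_a, regions_b)]
    return sum(pair_distances) / len(pair_distances)


def irm_distance(
    regions_a: List[Histogram],
    regions_b: List[Histogram],
    weights_a: Optional[Sequence[float]] = None,
    weights_b: Optional[Sequence[float]] = None,
) -> float:
    """Greedy IRM: the closest region pairs are matched first, and each region
    can be matched only up to its significance weight (uniform by default)."""
    n, m = len(regions_a), len(regions_b)
    remaining_a = list(weights_a) if weights_a is not None else [1.0 / n] * n
    remaining_b = list(weights_b) if weights_b is not None else [1.0 / m] * m

    # All cross-image region pairs, ordered from most to least similar.
    pairs = sorted(
        (chi2_distance(regions_a[i], regions_b[j]), i, j)
        for i in range(n)
        for j in range(m)
    )

    total = 0.0
    for distance, i, j in pairs:
        # Assign as much matching weight as both regions still have available.
        share = min(remaining_a[i], remaining_b[j])
        if share <= 0.0:
            continue
        total += share * distance
        remaining_a[i] -= share
        remaining_b[j] -= share
    return total


if __name__ == "__main__":
    # Two toy images whose two regions contain the same content in swapped positions.
    image_a = [[1.0, 0.0], [0.0, 1.0]]
    image_b = [[0.0, 1.0], [1.0, 0.0]]
    print("rigid:", rigid_distance(image_a, image_b))
    print("IRM:  ", irm_distance(image_a, image_b))
```

In this toy case the rigid distance is 1.0 because each region is compared with a dissimilar region at the same position, whereas the IRM distance is 0.0 because every region finds its counterpart at the other position. This is the sense in which IRM loosens geometrical rigidity; when the layouts agree, the two measures coincide under these assumptions.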

Author information

Authors and Affiliations

Aalto University School of Science and Technology, P.O. Box 15400, FI-00076, Aalto, Finland
Ville Viitaniemi & Jorma Laaksonen

Editor information

Editors and Affiliations

Department of Informatics, TEI of Thessaloniki, 57400, Sindos, Greece
Konstantinos Diamantaras
School of Physics, Astronomy, and Informatics, Department of Informatics, Nicolaus Copernicus University, ul. Grudziadzka 5, 87-100, Torun, Poland
Wlodek Duch
Department of Forestry and Management of the Environment and Natural Resources, Democritus University of Thrace, Pantazidou 193, 68200, Orestiada, Thrace, Greece
Lazaros S. Iliadis

About this paper

Cite this paper

Viitaniemi, V., Laaksonen, J. (2010). Region Matching Techniques for Spatial Bag of Visual Words Based Image Category Recognition. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15819-3_69

DOI: https://doi.org/10.1007/978-3-642-15819-3_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15818-6
Online ISBN: 978-3-642-15819-3
eBook Packages: Computer Science, Computer Science (R0)
