Abstract
Histograms of local features, known as bags of visual words (BoV), have proven to be powerful representations in image categorisation and object detection. BoV representations have been usefully extended in the spatial dimension by taking the spatial distribution of the features into account. In this paper we describe region matching strategies to be used in conjunction with such extensions. Of these, rigid region matching is the most commonly used. Here we present an alternative based on the Integrated Region Matching (IRM) technique, which loosens the constraint of geometrical rigidity of the images. After describing the techniques, we evaluate them in image category detection experiments that use 5000 photographic images taken from the PASCAL VOC Challenge 2007 benchmark. The experiments show that for many image categories rigid region matching performs slightly better. For some categories, however, IRM matching is a significantly more accurate alternative. As a consequence, we did not observe a significant difference on average. The best results were obtained by combining the two schemes.
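The contrast between the two matching strategies can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the chi-squared region distance, the uniform region weights, and the function names `chi2`, `rigid_distance` and `irm_distance` are all assumptions made for illustration. The IRM variant uses the greedy "most similar, highest priority" allocation usually associated with that technique, so that each region of one image may be matched to any region of the other.

```python
import numpy as np


def chi2(h1, h2, eps=1e-12):
    """Chi-squared distance between two histograms (an assumed region distance)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))


def rigid_distance(regions_a, regions_b):
    """Rigid matching: region i of image A is compared only with region i of image B."""
    n = len(regions_a)
    if n != len(regions_b):
        raise ValueError("rigid matching needs the same spatial partition in both images")
    # Uniform region weights are an assumption of this sketch.
    return sum(chi2(a, b) for a, b in zip(regions_a, regions_b)) / n


def irm_distance(regions_a, regions_b):
    """IRM-style matching: greedily pair the most similar regions first."""
    na, nb = len(regions_a), len(regions_b)
    # Remaining significance of each region (uniform weights assumed).
    pa = np.full(na, 1.0 / na)
    pb = np.full(nb, 1.0 / nb)
    # Pairwise distances between every region of A and every region of B.
    d = np.array([[chi2(a, b) for b in regions_b] for a in regions_a])
    total = 0.0
    # Visit region pairs in order of increasing distance.
    for flat in np.argsort(d, axis=None):
        i, j = divmod(int(flat), nb)
        s = min(pa[i], pb[j])  # significance that can still be assigned to this pair
        if s <= 0.0:
            continue
        total += s * d[i, j]
        pa[i] -= s
        pb[j] -= s
    return total


# Two images whose two regions have swapped places.
image_a = [[1.0, 0.0], [0.0, 1.0]]
image_b = [[0.0, 1.0], [1.0, 0.0]]
rigid = rigid_distance(image_a, image_b)
irm = irm_distance(image_a, image_b)
print(rigid, irm)
```

In this toy case the rigid distance is large because each region is compared only with the region at the same position, whereas the IRM-style distance is zero because the swapped regions are allowed to match each other. This is the sense in which IRM loosens geometrical rigidity.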
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Viitaniemi, V., Laaksonen, J. (2010). Region Matching Techniques for Spatial Bag of Visual Words Based Image Category Recognition. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds) Artificial Neural Networks – ICANN 2010. ICANN 2010. Lecture Notes in Computer Science, vol 6352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15819-3_69
DOI: https://doi.org/10.1007/978-3-642-15819-3_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15818-6
Online ISBN: 978-3-642-15819-3
eBook Packages: Computer Science (R0)