On the Spatial Extents of SIFT Descriptors for Visual Concept Detection

Mühling, Markus; Ewerth, Ralph; Freisleben, Bernd

doi:10.1007/978-3-642-23968-7_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6962)

Included in the following conference series:

International Conference on Computer Vision Systems

Abstract

State-of-the-art systems for visual concept detection typically rely on the Bag-of-Visual-Words representation. While several aspects of this representation have been investigated, such as keypoint sampling strategy, vocabulary size, projection method, weighting scheme or the integration of color, the impact of the spatial extents of local SIFT descriptors has not been studied in previous work. In this paper, the effect of different spatial extents in a state-of-the-art system for visual concept detection is investigated. Based on the observation that SIFT descriptors with different spatial extents yield large performance differences, we propose a concept detection system that combines feature representations for different spatial extents using multiple kernel learning. It is shown experimentally on a large set of 101 concepts from the Mediamill Challenge and on the PASCAL Visual Object Classes Challenge that these feature representations are complementary: superior performance can be achieved on both test sets using the proposed system.
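
The idea of combining feature representations for different spatial extents through kernels can be illustrated with a minimal sketch. The abstract does not name the kernel function, the number of extents, or the multiple kernel learning solver, so everything concrete below is an assumption made for illustration: the exponential chi-square kernel, the extent labels `small` and `large`, the toy histograms, and the fixed weights (which a multiple kernel learning solver would learn instead of taking as given).

```python
import math


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel between two non-negative histograms.

    Assumed kernel choice; the abstract does not specify one.
    """
    dist = 0.0
    for a, b in zip(x, y):
        if a + b > 0:  # skip empty bins to avoid division by zero
            dist += (a - b) ** 2 / (a + b)
    return math.exp(-gamma * dist)


def combined_kernel(hists_a, hists_b, weights, gamma=1.0):
    """Weighted sum of one base kernel per spatial extent.

    hists_a, hists_b: dicts mapping an extent label to the Bag-of-Visual-Words
    histogram of one image built from descriptors with that extent.
    weights: dict with one non-negative weight per extent. These are fixed,
    hypothetical values here; multiple kernel learning would learn them.
    """
    return sum(
        weights[extent] * chi2_kernel(hists_a[extent], hists_b[extent], gamma)
        for extent in weights
    )


# Toy histograms for two images at two hypothetical descriptor extents.
img1 = {"small": [0.5, 0.5, 0.0], "large": [0.2, 0.3, 0.5]}
img2 = {"small": [0.4, 0.4, 0.2], "large": [0.2, 0.3, 0.5]}
weights = {"small": 0.5, "large": 0.5}

k_same = combined_kernel(img1, img1, weights)  # an image compared with itself
k_diff = combined_kernel(img1, img2, weights)  # two different images
```

With these toy values the two images are identical at the `large` extent and differ at the `small` one, so the combined similarity `k_diff` falls below `k_same`. The combined kernel value would then be passed to a kernel classifier such as a support vector machine.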

Author information

Authors and Affiliations

Department of Mathematics & Computer Science, University of Marburg, Hans-Meerwein-Str. 3, D-35032, Marburg, Germany
Markus Mühling, Ralph Ewerth & Bernd Freisleben

Editor information

Editors and Affiliations

INRIA Grenoble Rhône-Alpes Research Centre, 655 Avenue de l’Europe, 38330, Montbonnot, France
James L. Crowley
Department of Computer Science, Colorado State University, 80523, Fort Collins, CO, USA
Bruce A. Draper
INRIA Sophia Antipolis, 2004 route des Lucioles, BP 93, 06902, Sophia Antipolis, France
Monique Thonnat

Cite this paper

Mühling, M., Ewerth, R., Freisleben, B. (2011). On the Spatial Extents of SIFT Descriptors for Visual Concept Detection. In: Crowley, J.L., Draper, B.A., Thonnat, M. (eds) Computer Vision Systems. ICVS 2011. Lecture Notes in Computer Science, vol 6962. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23968-7_8

DOI: https://doi.org/10.1007/978-3-642-23968-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23967-0
Online ISBN: 978-3-642-23968-7
eBook Packages: Computer Science, Computer Science (R0)
