A Hybrid Supervised-Unsupervised Vocabulary Generation Algorithm for Visual Concept Recognition

Binder, Alexander; Wojcikiewicz, Wojciech; Müller, Christina; Kawanabe, Motoaki

doi:10.1007/978-3-642-19318-7_8

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6494)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Vocabulary generation is the essential step in the bag-of-words image representation for visual concept recognition, because its quality substantially affects classification performance. In this paper, we propose a hybrid method for visual word generation that combines unsupervised density-based clustering with the discriminative power of fast support vector machines. We pursue three goals: splitting the vocabulary generation algorithm into two stages, one of which is highly parallelizable; reducing computation times for bag-of-words features; and keeping concept recognition performance at levels comparable to vanilla k-means clustering. On two recent data sets, Pascal VOC2009 and Image-CLEF2010 PhotoAnnotation, the proposed method either outperforms various baseline algorithms for visual word generation at almost the same computation time, or reduces training and test time at on-par classification performance.
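For orientation, the vanilla k-means baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of a generic bag-of-words pipeline, not the hybrid method proposed in the paper: the function names, the toy 8-dimensional random descriptors, and the vocabulary size of 16 are all assumptions chosen for the example.

```python
import numpy as np


def assign_words(descriptors, centres):
    """Return the index of the nearest visual word for every descriptor."""
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_vocabulary(descriptors, k, n_iter=20, seed=0):
    """Cluster local descriptors into k visual words with Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct descriptors drawn at random.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[idx].astype(float)
    for _ in range(n_iter):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres


def bow_histogram(descriptors, centres):
    """L1-normalised bag-of-words histogram for one image."""
    labels = assign_words(descriptors, centres)
    counts = np.bincount(labels, minlength=len(centres)).astype(float)
    return counts / counts.sum()


# Toy data standing in for local descriptors pooled from training images
# (hypothetical values; real descriptors would come from a feature extractor).
rng = np.random.default_rng(0)
train_descriptors = rng.normal(size=(500, 8))
vocabulary = kmeans_vocabulary(train_descriptors, k=16)

# Descriptors of a single (hypothetical) image, mapped to a histogram.
image_descriptors = rng.normal(size=(60, 8))
histogram = bow_histogram(image_descriptors, vocabulary)
```

In such a pipeline the nearest-word assignment dominates the cost of computing bag-of-words features, which is presumably why the abstract lists feature computation time as a separate goal alongside recognition performance.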

Author information

Authors and Affiliations

Machine Learning Group, Berlin Institute of Technology, Franklinstr. 28/29, 10587, Berlin, Germany
Alexander Binder, Wojciech Wojcikiewicz, Christina Müller & Motoaki Kawanabe
Fraunhofer Institute FIRST, Kekuléstr. 7, 12489, Berlin, Germany
Wojciech Wojcikiewicz, Christina Müller & Motoaki Kawanabe

Editor information

Editors and Affiliations

Technion – Israel Institute of Technology, Department of Computer Science, 32000, Haifa, Israel
Ron Kimmel
The University of Auckland, 37 Kohimarama Road, Mission Bay, 1071, Auckland, New Zealand
Reinhard Klette
National Institute of Informatics, Chiyoda, 1018430, Tokyo, Japan
Akihiro Sugimoto

Cite this paper

Binder, A., Wojcikiewicz, W., Müller, C., Kawanabe, M. (2011). A Hybrid Supervised-Unsupervised Vocabulary Generation Algorithm for Visual Concept Recognition. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6494. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19318-7_8

DOI: https://doi.org/10.1007/978-3-642-19318-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19317-0
Online ISBN: 978-3-642-19318-7
eBook Packages: Computer Science, Computer Science (R0)