Skip to main content

A Hybrid Supervised-Unsupervised Vocabulary Generation Algorithm for Visual Concept Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 6494))

Abstract

Vocabulary generation is the essential step in the bag-of-words image representation for visual concept recognition, because its quality affects classification performance substantially. In this paper, we propose a hybrid method for visual word generation which combines unsupervised density-based clustering with the discriminative power of fast support vector machines. We aim at three goals: breaking the vocabulary generation algorithm up into two sections, with one highly parallelizable part, reducing computation times for bag of words features and keeping concept recognition performance at levels comparable to vanilla k-means clustering. On the two recent data sets Pascal VOC2009 and Image-CLEF2010 PhotoAnnotation, our proposed method either outperforms various baseline algorithms for visual word generation with almost same computation time or reduces training/test time with on par classification performance.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Csurka, G., Bray, C., Dance, C., Fan, L.: Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision, ECCV, Prague, Czech Republic, pp. 1–22 (2004)

    Google Scholar 

  2. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge (VOC 2007) (2007), http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

  3. Tahir, M., van de Sande, K., Uijlings, J., Yan, F., Li, X., Mikolajczyk, K., Kittler, J., Gevers, T., Smeulders, A.: Surreyuva srkda method (2008), http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2008/workshop/tahir.pdf

  4. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2009 (VOC 2009) (2009), http://www.pascal-network.org/challenges/VOC/voc2009/workshop/index.html

  5. Nowak, S., Dunker, P.: Overview of the CLEF 2009 large-scale visual concept detection and annotation task. In: Peters, C., Caputo, B., Gonzalo, J., Jones, G.J.F., Kalpathy-Cramer, J., Müller, H., Tsikrika, T. (eds.) CLEF 2009. LNCS, vol. 6242, pp. 94–109. Springer, Heidelberg (2010)

    Chapter  Google Scholar 

  6. Jurie, F., Triggs, B.: Creating efficient codebooks for visual recognition. In: ICCV 2005, vol. I, pp. 604–610 (2005)

    Google Scholar 

  7. Moosmann, F., Triggs, B., Jurie, F.: Fast discriminative visual codebooks using randomized clustering forests. In: Advances in Neural Information Processing Systems (2006)

    Google Scholar 

  8. Moosmann, F., Nowak, E., Jurie, F.: Randomized clustering forests for image classification. IEEE Transactions on Pattern Analysis & Machine Intelligence 30, 1632–1646 (2008)

    Article  Google Scholar 

  9. Uijlings, J., Smeulders, A., Scha, R.: Real-time bag-of-words, approximately. In: CIVR (2009)

    Google Scholar 

  10. Bosch, A., Zisserman, A., Muñoz, X.: Representing shape with a spatial pyramid kernel. In: Proceedings of the 6th ACM International Conference on Image and Video Retrieval (CIVR 2007), pp. 401–408 (2007)

    Google Scholar 

  11. Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In: Workshop on Generative-Model Based Vision (2004)

    Google Scholar 

  12. van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Evaluating color descriptors for object and scene recognition. IEEE Trans. Pat. Anal. & Mach. Intel. (2010)

    Google Scholar 

  13. Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: CVPR 2006: Proceedings of Conference on Computer Vision and Pattern Recognition (2006)

    Google Scholar 

  14. Fulkerson, B., Vedaldi, A., Soatto, S.: Localizing objects with smart dictionaries. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 179–192. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  15. Lowe, D.: Distinctive image features from scale invariant keypoints. International Journal of Computer Vision 60, 91–110 (2004)

    Article  Google Scholar 

  16. Sonnenburg, S., Rätsch, G., Henschel, S., Widmer, C., Behr, J., Zien, A., de Bona, F., Binder, A., Gehl, C., Franc, V.: The shogun machine learning toolbox. Journal of Machine Learning Research (2010)

    Google Scholar 

  17. van Gemert, J.C., Geusebroek, J.-M., Veenman, C.J., Smeulders, A.W.M.: Kernel codebooks for scene categorization. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 696–709. Springer, Heidelberg (2008)

    Chapter  Google Scholar 

  18. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: Liblinear: A library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)

    MATH  Google Scholar 

  19. Wojcikiewicz, W., Binder, A., Kawanabe, M.: Enhancing image classification with class-wise clustered vocabularies. In: Proceedings of the 20th International Conference on Pattern Recognition (ICPR) (2010)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Binder, A., Wojcikiewicz, W., Müller, C., Kawanabe, M. (2011). A Hybrid Supervised-Unsupervised Vocabulary Generation Algorithm for Visual Concept Recognition. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6494. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19318-7_8

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-19318-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19317-0

  • Online ISBN: 978-3-642-19318-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics