Two-Stage Localization for Image Labeling

  • Conference paper
Advances in Multimedia Information Processing - PCM 2010 (PCM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6297)

Abstract

A well-built dataset is a prerequisite for object categorization, but collecting and labeling images is laborious and monotonous. In this paper, we focus on automatically labeling images with a bounding box for each visual object. We propose a two-stage localization approach for image labeling that combines the Efficient Subwindow Search (ESS) scheme with Multiple Instance Learning (MIL). We first detect the object coarsely with ESS and then localize it finely with MIL. The approach has two advantages: it speeds up the object search, and it locates the object in a tighter box than ESS alone. We evaluate image labeling performance by detection precision and by the consistency of detections with the ground-truth labels. The approach is simple and fast in object localization. Experimental results demonstrate that it is more effective and accurate than the BOW model in both the precision and the consistency of detection.
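
The coarse-then-fine structure described in the abstract can be illustrated with a small sketch. It is not the authors' method: the coarse stage below is a plain grid search over a per-pixel score map (a stand-in for the branch-and-bound search of Efficient Subwindow Search), and the fine stage is an exhaustive pixel-level search near the coarse box (a stand-in for Multiple Instance Learning, which would involve a learned classifier). The score map, the function names, the `stride` and `margin` parameters, and the use of intersection-over-union as a consistency measure are all assumptions made for illustration.

```python
import numpy as np

def integral_image(score_map):
    """Summed-area table with a leading zero row and column."""
    return np.pad(score_map, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, box):
    """Sum of scores in box = (top, left, bottom, right); bottom/right exclusive."""
    top, left, bottom, right = box
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def best_box(ii, rows, cols):
    """Exhaustively pick the max-sum box whose edges lie on the given coordinates."""
    best, best_score = None, -np.inf
    for top in rows:
        for bottom in rows:
            if bottom <= top:
                continue
            for left in cols:
                for right in cols:
                    if right <= left:
                        continue
                    score = box_sum(ii, (top, left, bottom, right))
                    if score > best_score:
                        best, best_score = (top, left, bottom, right), score
    return best

def two_stage_localize(score_map, stride=4, margin=4):
    """Return (coarse_box, fine_box) for a 2-D map of per-pixel object scores."""
    h, w = score_map.shape
    ii = integral_image(score_map)
    # Stage 1 (hypothetical stand-in for ESS): box edges restricted to a coarse grid.
    coarse = best_box(ii,
                      list(range(0, h, stride)) + [h],
                      list(range(0, w, stride)) + [w])
    # Stage 2 (hypothetical stand-in for MIL): pixel-level search near the coarse box.
    top, left, bottom, right = coarse
    rows = range(max(0, top - margin), min(h, bottom + margin) + 1)
    cols = range(max(0, left - margin), min(w, right + margin) + 1)
    fine = best_box(ii, rows, cols)
    return coarse, fine

def iou(a, b):
    """Intersection over union of two (top, left, bottom, right) boxes."""
    inter_h = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

# Synthetic example: background scores -1, a planted object rectangle scores +1.
truth = (9, 7, 21, 18)
score_map = -np.ones((32, 32))
score_map[truth[0]:truth[2], truth[1]:truth[3]] = 1.0
coarse, fine = two_stage_localize(score_map)
print(coarse, iou(coarse, truth))
print(fine, iou(fine, truth))
```

On this synthetic map the coarse box is limited to grid-aligned edges, so it only approximates the planted rectangle, while the refined box matches it exactly. Restricting the fine search to a small neighbourhood of the coarse box is what keeps the second stage cheap.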

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qu, Y., Wu, D., Cheng, Y., Chen, C. (2010). Two-Stage Localization for Image Labeling. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_52

  • DOI: https://doi.org/10.1007/978-3-642-15702-8_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15701-1

  • Online ISBN: 978-3-642-15702-8

  • eBook Packages: Computer Science, Computer Science (R0)
