Two-Stage Localization for Image Labeling

Qu, Yanyun; Wu, Diwei; Cheng, Yanyun; Chen, Cheng

doi:10.1007/978-3-642-15702-8_52

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6297)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

A well-built dataset is a prerequisite for object categorization. However, collecting and labeling images is laborious and monotonous. In this paper, we focus on automatically labeling images by placing a bounding box around each visual object. We propose a two-stage localization approach for image labeling that combines the Efficient Subwindow Search scheme with Multiple Instance Learning. We first detect the object coarsely with the Efficient Subwindow Search scheme, and then localize it finely with Multiple Instance Learning. Our approach has two advantages: it speeds up the object search, and it locates the object in a tighter box than the Efficient Subwindow Search scheme. We evaluate image labeling performance by detection precision and by the consistency of detections with the ground-truth labels. Our approach is simple and fast in object localization. The experimental results demonstrate that our approach is more effective and accurate than the BOW model in both the precision and the consistency of detection.
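
As a rough illustration of the coarse localization stage and of comparing a detected box with a ground-truth box, here is a minimal Python sketch. It is not the authors' implementation. It assumes a per-pixel score map is already available, it uses exhaustive search over an integral image as a stand-in for the branch-and-bound search of Efficient Subwindow Search (both aim at the box with the maximum score sum), it omits the Multiple Instance Learning refinement, and it uses intersection over union as one common overlap measure, which may differ from the consistency measure in the paper. The names `best_subwindow`, `iou`, `scores`, and `truth_box` are hypothetical.

```python
import numpy as np


def best_subwindow(score_map):
    """Return the inclusive box (top, left, bottom, right) with the largest
    sum of per-pixel scores, together with that sum.

    Assumption: exhaustive enumeration replaces branch and bound here; it
    reaches the same maximum but is far slower on real images.
    """
    h, w = score_map.shape
    # Integral image with a zero border, so any box sum is four lookups.
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = score_map.cumsum(axis=0).cumsum(axis=1)
    best_score, best_box = -np.inf, None
    for t in range(h):
        for b in range(t, h):
            for l in range(w):
                for r in range(l, w):
                    s = ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l]
                    if s > best_score:
                        best_score, best_box = s, (t, l, b, r)
    return best_box, best_score


def box_area(box):
    """Area in pixels of an inclusive (top, left, bottom, right) box."""
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)


def iou(box_a, box_b):
    """Intersection over union of two inclusive boxes."""
    t, l = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    b, r = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, b - t + 1) * max(0, r - l + 1)
    return inter / (box_area(box_a) + box_area(box_b) - inter)


# Toy score map: negative background, positive 3x3 object region.
scores = -np.ones((6, 6))
scores[1:4, 2:5] = 2.0

pred_box, pred_score = best_subwindow(scores)
truth_box = (1, 2, 4, 4)  # hypothetical ground truth, one row taller
overlap = iou(pred_box, truth_box)
print(pred_box, pred_score, overlap)
```

On this toy map the search returns the positive region exactly, and the overlap with the slightly taller hypothetical ground-truth box shows how a looser or tighter box changes the score.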

Author information

Authors and Affiliations

Computer Science Department, Xiamen University, 361005, P.R. China
Yanyun Qu, Diwei Wu, Yanyun Cheng & Cheng Chen

Editor information

Editors and Affiliations

School of Computer Science, University of Nottingham, Jubilee Campus, NG8 1BB, Nottingham, UK
Guoping Qiu
The Centre for Multimedia Signal Processing, The Hong Kong Polytechnic University, Hong Kong, China
Kin Man Lam
Faculty of System Design, Tokyo Metropolitan University, 6-6, Asahigaoka, 191-0065, Hino-city, Tokyo
Hitoshi Kiya
Shanghai Key Laboratory of Intelligent Information Processing, Department of Computer Science & Engineering, Fudan University, Shanghai, China
Xiang-Yang Xue
Department of Electrical Engineering, University of Southern California, 90089-2564, Los Angeles, CA
C.-C. Jay Kuo
LIACS Media Lab, Leiden University,
Michael S. Lew

About this paper

Cite this paper

Qu, Y., Wu, D., Cheng, Y., Chen, C. (2010). Two-Stage Localization for Image Labeling. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_52

DOI: https://doi.org/10.1007/978-3-642-15702-8_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15701-1
Online ISBN: 978-3-642-15702-8
eBook Packages: Computer Science, Computer Science (R0)
