Weakly-supervised object localization in unlabeled image collection

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Supervised learning requires fully annotated image datasets, but labeling images by hand is laborious and monotonous. In this paper, we focus on automatic image labeling for a class-specified image dataset. We propose a weakly supervised approach that localizes objects in a class of unlabeled images without using any manually labeled examples. First, each image is segmented using a multiple-segmentation algorithm. Second, the segmented regions are mined on the basis of commonality and saliency to discover the category pattern in the image. Third, objects are localized using a weakly supervised learning algorithm. To evaluate the proposed approach, we test it on 12 object classes of the Caltech101 dataset and 2 landmark classes collected from the Internet. The experimental results demonstrate that our approach labels images automatically, effectively, and accurately.
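
The abstract does not give the scoring formulas, so the following is only a minimal illustrative sketch of the region-mining idea (ranking segmented regions by commonality and saliency), not the authors' method. It assumes each region already has a non-zero feature descriptor and a saliency value in [0, 1]. The function names, the cosine-similarity commonality measure, and the mixing weight `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


def commonality_scores(descriptors_per_image):
    """Score each region by how well it recurs across the other images.

    Assumption (not from the paper): commonality of a region is the mean,
    over all other images, of its best cosine similarity to any region there.
    Requires at least two images and non-zero descriptors.
    """
    normed = [d / np.linalg.norm(d, axis=1, keepdims=True)
              for d in descriptors_per_image]
    scores = []
    for i, regions in enumerate(normed):
        best_per_other_image = [
            (regions @ other.T).max(axis=1)   # best match in image j
            for j, other in enumerate(normed) if j != i
        ]
        scores.append(np.mean(best_per_other_image, axis=0))
    return scores


def select_regions(descriptors_per_image, saliency_per_image, alpha=0.5):
    """Pick, per image, the region with the highest combined score.

    `alpha` (hypothetical) weights commonality against saliency.
    """
    common = commonality_scores(descriptors_per_image)
    picks = []
    for c, s in zip(common, saliency_per_image):
        combined = alpha * c + (1.0 - alpha) * np.asarray(s, dtype=float)
        picks.append(int(np.argmax(combined)))
    return picks


# Toy collection: the descriptor [1, 0, 0, 0] recurs in every image.
toy_descriptors = [
    np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]]),
    np.array([[0, 0, 1.0, 0], [1.0, 0, 0, 0]]),
    np.array([[1.0, 0, 0, 0], [0, 0, 0, 1.0]]),
]
toy_saliency = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
toy_picks = select_regions(toy_descriptors, toy_saliency)
```

With equal saliency everywhere, the selection in this toy collection is driven by commonality alone, so the recurring region is chosen in each image. In a real pipeline the selected regions would then seed the weakly supervised localization step.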

Acknowledgments

This research work was supported by the Fundamental Research Funds for the Central Universities under Grant No. 2010121067, the National Defence Basic Scientific Research program of China (B1420****55), the National Natural Science Foundation of China under Grant No. 61170179, the Special Research Fund for the Doctoral Program of Higher Education of China under Project (20110121110033), and the Xiamen Science and Technology Planning Project Fund (3502Z20116005) of China.

Author information

Corresponding author

Correspondence to Hanzi Wang.

About this article

Cite this article

Qu, Y., Liu, H., Yang, X. et al. Weakly-supervised object localization in unlabeled image collection. Multimedia Systems 19, 51–63 (2013). https://doi.org/10.1007/s00530-012-0293-x

  • DOI: https://doi.org/10.1007/s00530-012-0293-x
