Abstract
Supervised learning requires fully annotated image datasets, but labeling images by hand is laborious and monotonous. In this paper, we focus on automatic image labeling for a class-specified image dataset. We propose a weakly supervised approach that localizes objects in a class of unlabelled images without using any manually labeled examples. First, each image is segmented with a multiple segmentation algorithm. Second, the segmented regions are mined based on commonality and saliency to discover the category pattern in the image. Third, objects are localized with a weakly supervised learning algorithm. To assess the effectiveness of the proposed approach, we evaluate it experimentally on 12 object classes of the Caltech101 dataset and 2 landmark classes collected from the Internet. The experimental results demonstrate that our approach is effective and accurate for automatic image labeling.
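The region-mining step described above can be illustrated with a toy sketch. The abstract does not specify how commonality and saliency are computed or combined, so everything here is an assumption made for illustration: the `Region` container, the use of histogram intersection as the similarity measure, and the product `commonality * saliency` as the ranking score are all hypothetical and are not the paper's actual method. The segmentation step and the final weakly supervised learning step are not covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]


@dataclass
class Region:
    """Hypothetical container for one segment of an image."""
    box: Box            # (x0, y0, x1, y1) bounding box of the segment
    hist: List[float]   # L1-normalised feature histogram (assumed descriptor)
    saliency: float     # mean saliency in [0, 1], assumed precomputed


def intersection(a: List[float], b: List[float]) -> float:
    """Histogram intersection; equals 1.0 for identical L1-normalised histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def commonality(region: Region, image_idx: int, images: List[List[Region]]) -> float:
    """Mean, over the other images, of the best match to any of their regions."""
    others = [regs for i, regs in enumerate(images) if i != image_idx and regs]
    if not others:
        return 0.0
    best_matches = [max(intersection(region.hist, r.hist) for r in regs) for regs in others]
    return sum(best_matches) / len(others)


def pick_regions(images: List[List[Region]]) -> List[Optional[Box]]:
    """Per image, return the box of the region maximising commonality * saliency.

    Images with no regions yield None. The product is an assumed scoring rule.
    """
    picks: List[Optional[Box]] = []
    for i, regs in enumerate(images):
        if not regs:
            picks.append(None)
            continue
        best = max(regs, key=lambda r: commonality(r, i, images) * r.saliency)
        picks.append(best.box)
    return picks


# Made-up data: three images of one class, two candidate regions each.
images = [
    [Region((0, 0, 10, 10), [0.8, 0.2, 0.0], 0.9), Region((10, 0, 20, 10), [0.0, 0.1, 0.9], 0.3)],
    [Region((5, 5, 15, 15), [0.7, 0.3, 0.0], 0.8), Region((0, 0, 5, 5), [0.1, 0.1, 0.8], 0.2)],
    [Region((2, 2, 12, 12), [0.9, 0.1, 0.0], 0.7), Region((12, 2, 20, 12), [0.3, 0.3, 0.4], 0.4)],
]
print(pick_regions(images))
```

In this made-up data, the first region of each image shares a similar histogram across the class and is also the more salient one, so it is the region selected. In a pipeline like the one the abstract outlines, such selected regions would presumably serve as candidates for the subsequent weakly supervised localization step rather than as final labels.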
Acknowledgments
This research work was supported by the Fundamental Research Funds for the Central Universities under Grant No. 2010121067, the National Defence Basic Scientific Research program of China (B1420****55), the National Natural Science Foundation of China under Grant No. 61170179, the Special Research Fund for the Doctoral Program of Higher Education of China under Project (20110121110033), and the Xiamen Science and Technology Planning Project Fund (3502Z20116005) of China.
About this article
Cite this article
Qu, Y., Liu, H., Yang, X. et al. Weakly-supervised object localization in unlabeled image collection. Multimedia Systems 19, 51–63 (2013). https://doi.org/10.1007/s00530-012-0293-x
DOI: https://doi.org/10.1007/s00530-012-0293-x