Abstract
In this paper, we address the problem of image annotation with incomplete labelling, where the multiple objects in each training image are not fully labeled. The conventional one-versus-all SVM (OVA-SVM), which performs fairly well under full labelling, degrades drastically under the incomplete setting. Recently, a structured learning method termed OVA-SSVM was proposed to boost the performance of OVA-SVM by modeling the structured associations of labels, and it has shown its efficiency under the incomplete setting. OVA-SSVM assumes that each training sample has a single label and adopts a classification-style loss measure: as long as one of the predicted labels is correct, the overall prediction is considered correct. However, this may not be appropriate for the multi-label annotation task. We therefore extend the OVA-SSVM method to the multi-label situation and design a novel image-specific structured loss measure that accounts for the dependencies between predicted labels based on the image-label associations. We then develop an efficient optimization algorithm to learn the model parameters. Finally, we present extensive empirical results on two benchmark datasets with various degrees of incompleteness, and show that the proposed method outperforms OVA-SSVM and achieves competitive performance compared with other state-of-the-art methods that are also designed for incomplete labelling.
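The difference between the two loss styles mentioned above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: `hit_loss` follows the verbal description of the classification-style measure (zero loss as soon as one predicted label is correct), while `missed_fraction` is a generic multi-label stand-in and is not the image-specific structured loss proposed in the paper. Both function names and the example labels are hypothetical.

```python
def hit_loss(predicted, truth):
    """Classification-style loss: 0 if any predicted label is correct, else 1."""
    return 0 if set(predicted) & set(truth) else 1


def missed_fraction(predicted, truth):
    """Illustrative multi-label loss: share of true labels that were not predicted.

    Assumption: a generic stand-in, not the loss defined in the paper.
    """
    truth = set(truth)
    if not truth:
        return 0.0
    return len(truth - set(predicted)) / len(truth)


# Hypothetical example: one correct label out of three true labels.
truth = {"sky", "sea", "boat"}
predicted = ["sky", "car", "tree"]

single_label_view = hit_loss(predicted, truth)        # one hit is enough
multi_label_view = missed_fraction(predicted, truth)  # two of three labels missed
```

Under the classification-style measure the prediction counts as fully correct, although two of the three true labels are missing, which is why such a measure can be a poor fit for multi-label annotation.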
Acknowledgement
This work was partly supported by Grant-in-Aid for Scientific Research (B), Grant Number 24300074. We thank the reviewers for their valuable comments.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, X., Shimada, A., Taniguchi, R.i. (2015). Exploring Image Specific Structured Loss for Image Annotation with Incomplete Labelling. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_46
DOI: https://doi.org/10.1007/978-3-319-16865-4_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science, Computer Science (R0)