
Multi-label Image Annotation by Structural Grouping Sparsity

  • Chapter
Social Media Modeling and Computing

Abstract

High-dimensional heterogeneous features can be extracted from real-world images on photo-sharing websites such as Flickr. These features describe various aspects of an image's visual characteristics, such as color, texture, and shape. The heterogeneous features are often over-complete for describing a given semantic concept, so selecting a limited set of discriminative features for each concept is crucial to making image understanding more interpretable. This chapter introduces an approach to multi-label image annotation with a regularized penalty, called Multi-label Image Boosting by the selection of heterogeneous features with structural Grouping Sparsity (MtBGS). MtBGS induces a structural sparse selection model that identifies subgroups of homogeneous features for predicting a given label. Moreover, MtBGS exploits the correlations among multiple tags to boost the performance of multi-label annotation. Extensive experiments on public image datasets show that the proposed approach achieves better multi-label image annotation performance and yields a readily interpretable model for image understanding.
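The idea of selecting whole subgroups of features can be illustrated with a small sketch. The abstract does not give the exact form of the MtBGS penalty, so the code below assumes a generic group-lasso-style shrinkage step (group soft-thresholding), not the chapter's own algorithm. The function name, the feature groups, the weights, and the threshold `lam` are all hypothetical values chosen for illustration.

```python
import math


def group_soft_threshold(weights, groups, lam):
    """Shrink each feature group toward zero (assumed group-lasso-style step).

    A group whose Euclidean norm is at most `lam` is zeroed out entirely,
    which is what removes a whole feature group from the model.
    """
    result = list(weights)
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        # Scale factor is 0 when the group is too weak to survive the penalty.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in idx:
            result[i] = weights[i] * scale
    return result


# Hypothetical weights for one label, split into three feature groups.
weights = [3.0, 4.0, 0.1, -0.2, 0.0, 6.0]
groups = {"color": [0, 1], "texture": [2, 3], "shape": [4, 5]}

shrunk = group_soft_threshold(weights, list(groups.values()), lam=1.0)

# A group counts as selected if any of its weights is still non-zero.
selected = [
    name for name, idx in groups.items()
    if any(shrunk[i] != 0.0 for i in idx)
]
print(selected)
```

In this toy setting the weak "texture" group falls below the threshold and is dropped as a unit, while "color" and "shape" are kept with reduced weights. A penalty that also encourages sparsity within each kept group would need an additional element-wise shrinkage step, which is omitted here.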



Acknowledgements

This work is supported by NSFC (90920303, 61070068), the 863 Program (2006 AA010107), and the Program for Changjiang Scholars and Innovative Research Team in University (IRT0652, PCSIRT).

Author information

Corresponding author

Correspondence to Yahong Han.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Han, Y., Wu, F., Zhuang, Y. (2011). Multi-label Image Annotation by Structural Grouping Sparsity. In: Hoi, S., Luo, J., Boll, S., Xu, D., Jin, R., King, I. (eds) Social Media Modeling and Computing. Springer, London. https://doi.org/10.1007/978-0-85729-436-4_5

  • DOI: https://doi.org/10.1007/978-0-85729-436-4_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-435-7

  • Online ISBN: 978-0-85729-436-4

  • eBook Packages: Computer Science (R0)
