Multi-label Image Annotation by Structural Grouping Sparsity

Han, Yahong; Wu, Fei; Zhuang, Yueting

doi:10.1007/978-0-85729-436-4_5

Abstract

High-dimensional heterogeneous features can be extracted from real-world images on photo-sharing websites such as Flickr. These features describe different aspects of an image's visual characteristics, such as color, texture, and shape. The heterogeneous features are often over-complete for describing a given semantic concept, so selecting a small set of discriminative features for each concept is crucial to making image understanding more interpretable. This chapter introduces an approach to multi-label image annotation with a regularized penalty, called Multi-label Image Boosting by the selection of heterogeneous features with structural Grouping Sparsity (MtBGS). MtBGS induces a (structural) sparse selection model to identify subgroups of homogeneous features for predicting a certain label. Moreover, MtBGS exploits the correlations among multiple tags to boost the performance of multi-label annotation. Extensive experiments on public image datasets show that the proposed approach achieves better multi-label image annotation performance and yields a quite interpretable model for image understanding.
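The abstract does not state the exact MtBGS objective, so the following is only a minimal sketch of what a structural grouping sparsity penalty can look like. It assumes the common sparse group lasso form, an l1 term plus a sum of per-group l2 norms, where each group holds the coefficients of one feature type (for example color or texture). The function names, the group layout, and the weights `lam1` and `lam2` are illustrative assumptions and are not taken from the chapter.

```python
import math


def soft_threshold(x, t):
    """Shrink a scalar toward zero by t (proximal step of the l1 term)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def sparse_group_penalty(beta, groups, lam1, lam2):
    """Assumed penalty: lam1 * ||beta||_1 + lam2 * sum over groups of ||beta_g||_2."""
    l1 = sum(abs(b) for b in beta)
    l2 = sum(math.sqrt(sum(beta[i] ** 2 for i in g)) for g in groups)
    return lam1 * l1 + lam2 * l2


def prox_sparse_group(beta, groups, lam1, lam2):
    """Proximal operator of the assumed penalty.

    Elementwise soft-thresholding removes weak individual features, then a
    group-wise shrinkage removes whole feature groups whose remaining norm
    is at most lam2. Groups are assumed to partition the indices of beta.
    """
    out = [soft_threshold(b, lam1) for b in beta]
    for g in groups:
        norm = math.sqrt(sum(out[i] ** 2 for i in g))
        scale = max(0.0, 1.0 - lam2 / norm) if norm > 0 else 0.0
        for i in g:
            out[i] *= scale
    return out


# Hypothetical example: two feature groups with two coefficients each.
beta = [3.0, 4.0, 0.1, -0.2]
groups = [[0, 1], [2, 3]]
shrunk = prox_sparse_group(beta, groups, lam1=0.5, lam2=1.0)
```

In this toy example the second group is zeroed out entirely while the first group is kept but shrunk, which is the kind of group-level selection that makes a model easier to interpret.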

Acknowledgements

This work is supported by NSFC (90920303, 61070068), 863 Program (2006 AA010107) and Program for Changjiang Scholars and Innovative Research Team in University (IRT0652, PCSIRT).

Author information

Authors and Affiliations

College of Computer Science, Zhejiang University, Hangzhou, China
Yahong Han, Fei Wu & Yueting Zhuang

Corresponding author

Correspondence to Yahong Han.

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, Singapore, 639798, Singapore
Steven C.H. Hoi
Kodak Research Laboratories, Lake Avenue 1999, Rochester, 14650, New York, USA
Jiebo Luo
Media Informatics and Multimedia Systems, University of Oldenburg, Escherweg 2, Oldenburg, 26121, Germany
Susanne Boll
School of Computer Engineering, Nanyang Technological University, Singapore, 639798, Singapore
Dong Xu
Dept. Computer Science and Engineering, Michigan State University, Engineering Building 3115, East Lansing, 48824, Michigan, USA
Rong Jin
Dept. Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong/PR China
Irwin King

About this chapter

Cite this chapter

Han, Y., Wu, F., Zhuang, Y. (2011). Multi-label Image Annotation by Structural Grouping Sparsity. In: Hoi, S., Luo, J., Boll, S., Xu, D., Jin, R., King, I. (eds) Social Media Modeling and Computing. Springer, London. https://doi.org/10.1007/978-0-85729-436-4_5

DOI: https://doi.org/10.1007/978-0-85729-436-4_5
Publisher Name: Springer, London
Print ISBN: 978-0-85729-435-7
Online ISBN: 978-0-85729-436-4
eBook Packages: Computer Science, Computer Science (R0)
