Stable multi-label boosting for image annotation with structural feature selection

Zhuang, YueTing; Han, YaHong; Wu, Fei; Yang, JiaCheng

doi:10.1007/s11432-011-4483-5

Research Papers
Special Focus
Published: 03 December 2011

Science China Information Sciences, Volume 54, pages 2508–2521 (2011)

Abstract

Automatically annotating images with multiple appropriate tags is very important to image retrieval and image understanding. High-dimensional heterogeneous visual features can be extracted from real-world images to describe various aspects of their visual characteristics, such as color, texture, and shape. Different kinds of heterogeneous features have different intrinsic discriminative power for image understanding. Selecting groups of discriminative features for particular semantics is hence crucial to making image understanding more interpretable. This paper proposes an approach, called stable multi-label boosting with structural feature selection (S-MtBFS), for image annotation. S-MtBFS comprises two steps, namely structural feature selection for each label and stable multi-label boosting by curds and whey. In the first step, a (structural) sparse selection model is learned to identify subgroups of homogeneous features for predicting a given label. In the second step, a stable method of multi-label boosting with a re-sampling policy is employed to exploit the correlations among multiple tags. Extensive experiments on public image datasets show that the proposed approach achieves better and more stable performance in multi-label image annotation and leads to a quite interpretable model for image understanding.
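The structural feature selection step can be illustrated with a small sketch. The exact model is not given in this abstract, so the code below assumes a standard group-lasso-style penalty solved by plain proximal gradient descent. The function names, the three feature groups, the penalty weight `lam`, and the synthetic data are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def group_soft_threshold(w, groups, tau):
    """Shrink each group's sub-vector by tau in Euclidean norm; zero it if its norm <= tau."""
    out = np.zeros_like(w)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * w[idx]
    return out


def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient for (1 / 2n) * ||Xw - y||^2 + lam * sum_g ||w_g||_2.

    Assumed objective (not taken from the paper): squared loss plus a group penalty.
    """
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the smooth part.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w


def selected_groups(w, groups, tol=1e-8):
    """Indices of the groups whose weights were not shrunk to zero."""
    return [g for g, idx in enumerate(groups) if np.linalg.norm(w[idx]) > tol]


# Synthetic example: 9 features in 3 hypothetical groups (color, texture, shape).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 9))
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]
group_names = ["color", "texture", "shape"]

# The label score depends only on the "texture" group, plus small noise.
y = X[:, 3:6] @ np.array([1.5, -2.0, 1.0]) + 0.1 * rng.standard_normal(200)

w_hat = group_lasso(X, y, groups, lam=0.5)
print([group_names[g] for g in selected_groups(w_hat, groups)])
```

On this toy problem only the group that actually drives the label should survive with nonzero weights. That is the sense in which a group-sparse model is interpretable: whole feature families are kept or discarded for each label.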

Author information

Authors and Affiliations

College of Computer Science, Zhejiang University, Hangzhou, 310027, China
YueTing Zhuang, YaHong Han, Fei Wu & JiaCheng Yang

Corresponding author

Correspondence to YaHong Han.

Additional information

ZHUANG YueTing was born in 1965. He received the B.S., M.S., and Ph.D. degrees from Zhejiang University, Hangzhou, China in 1986, 1989, and 1998, respectively. Currently, he is a Professor and Ph.D. Supervisor with the College of Computer Science, Zhejiang University. His current research interests include multimedia databases, artificial intelligence, and video-based animation.

WU Fei was born in 1973. He received the B.S. degree from Lanzhou University, Lanzhou, Gansu, China, the M.S. degree from Macao University, Taipa, Macau, and the Ph.D. degree from Zhejiang University, Hangzhou, China. He is currently a Professor with the College of Computer Science, Zhejiang University. His current research interests include multimedia analysis, retrieval, statistical learning, and pattern recognition.

HAN YaHong was born in 1977. He received the B.S. degree from Zhengzhou University, Zhengzhou, Henan, China in 2000, and the M.S. degree from Hohai University, Nanjing, Jiangsu, China in 2003. He is currently pursuing the Ph.D. degree at the College of Computer Science, Zhejiang University, Hangzhou, China. His current research interests include multimedia analysis, retrieval, and machine learning.

Electronic supplementary material

Supplementary material, approximately 9.71 MB.

About this article

Cite this article

Zhuang, Y., Han, Y., Wu, F. et al. Stable multi-label boosting for image annotation with structural feature selection. Sci. China Inf. Sci. 54, 2508–2521 (2011). https://doi.org/10.1007/s11432-011-4483-5

Received: 12 June 2011
Accepted: 25 July 2011
Published: 03 December 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s11432-011-4483-5
