Abstract
High-dimensional heterogeneous features can be extracted from real-world images on photo-sharing websites such as Flickr. These features describe various aspects of an image's visual characteristics, such as color, texture, and shape. The heterogeneous features are often over-complete for describing a given semantic concept. Selecting a limited set of discriminative features for each semantic concept is therefore crucial to making image understanding more interpretable. This chapter introduces an approach to multi-label image annotation with a regularized penalty, called Multi-label Image Boosting by the selection of heterogeneous features with structural Grouping Sparsity (MtBGS). MtBGS induces a structurally sparse selection model to identify subgroups of homogeneous features for predicting a certain label. Moreover, MtBGS exploits the correlations among multiple tags to boost multi-label annotation performance. Extensive experiments on public image datasets show that the proposed approach achieves better multi-label image annotation performance and yields a readily interpretable model for image understanding.
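The abstract does not give the MtBGS objective itself, so the following is only a minimal sketch of the general idea behind group-structured sparsity, not the chapter's algorithm. It assumes a weight vector whose dimensions are partitioned into feature-type groups (for example color, texture, shape) and applies one shrinkage step that combines an L1 penalty with a per-group L2 penalty. The function names, the group layout, and the penalty values `lam_l1` and `lam_group` are all hypothetical.

```python
import math


def soft_threshold(x, t):
    """Shrink a scalar toward zero by t (the shrinkage step for an L1 penalty)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sparse_group_shrink(weights, groups, lam_l1, lam_group):
    """Apply an L1 plus group-L2 shrinkage step to a weight vector.

    weights: list of floats, one per feature dimension.
    groups: list of (start, end) index pairs, one per feature type
            (a hypothetical layout chosen for illustration).
    """
    out = [0.0] * len(weights)
    for start, end in groups:
        # Within-group sparsity: elementwise soft-thresholding.
        shrunk = [soft_threshold(w, lam_l1) for w in weights[start:end]]
        norm = math.sqrt(sum(s * s for s in shrunk))
        # Group-level sparsity: the whole group stays at zero
        # unless its norm exceeds the group penalty.
        if norm > lam_group:
            scale = 1.0 - lam_group / norm
            out[start:end] = [scale * s for s in shrunk]
    return out


# Toy weights for three assumed feature groups.
weights = [0.9, -0.05, 0.4, 0.02, -0.03, 0.01, 3.0, 4.0]
groups = [(0, 3), (3, 6), (6, 8)]
result = sparse_group_shrink(weights, groups, lam_l1=0.1, lam_group=0.2)
```

In this toy run the second group is zeroed out entirely, while the first group is kept but loses its weakest coefficient. That pairing of whole-group selection with within-group selection is the kind of behavior the abstract attributes to a structurally sparse model.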
Acknowledgements
This work is supported by NSFC (90920303, 61070068), 863 Program (2006 AA010107) and Program for Changjiang Scholars and Innovative Research Team in University (IRT0652, PCSIRT).
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Han, Y., Wu, F., Zhuang, Y. (2011). Multi-label Image Annotation by Structural Grouping Sparsity. In: Hoi, S., Luo, J., Boll, S., Xu, D., Jin, R., King, I. (eds) Social Media Modeling and Computing. Springer, London. https://doi.org/10.1007/978-0-85729-436-4_5
DOI: https://doi.org/10.1007/978-0-85729-436-4_5
Publisher Name: Springer, London
Print ISBN: 978-0-85729-435-7
Online ISBN: 978-0-85729-436-4
eBook Packages: Computer Science, Computer Science (R0)