Graph regularized low-rank feature mapping for multi-label learning with application to image annotation

Multidimensional Systems and Signal Processing

Abstract

Automatic image annotation has been an active research topic over the last two decades because of its application to organizing social images. Most studies treat image annotation as a typical multi-label classification problem. The shortcoming of this approach is that learning a reliable model for label prediction requires a sufficient number of training images with accurate annotations. To address this, we develop a novel graph regularized low-rank feature mapping for image annotation under a semi-supervised multi-label learning framework. Specifically, the proposed method concatenates the prediction models for different tags into a matrix and introduces the matrix trace norm to capture the correlations among labels and to control the model complexity. In addition, by using graph Laplacian regularization as a smoothness operator, the proposed approach explicitly takes into account the local geometric structure of both labeled and unlabeled images. Moreover, because the tags of labeled images tend to be missing or noisy, we introduce a supplementary ideal label matrix that automatically fills in missing tags and corrects noisy tags for the given training images. Extensive experiments on five multi-label image datasets demonstrate the effectiveness of the proposed approach.
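
The two regularizers named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not state the paper's objective function or solver, so this is a generic illustration and not the authors' formulation. The names `X` (image features), `W` (one column of weights per tag), `L` (graph Laplacian), `k`, and `tau` are assumptions introduced here, as is the choice of a symmetrized k-nearest-neighbour graph.

```python
import numpy as np

def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - A of a symmetrized k-NN graph (assumed graph type)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    A = np.maximum(A, A.T)  # make the adjacency symmetric
    return np.diag(A.sum(axis=1)) - A

def smoothness(F, L):
    """Graph Laplacian penalty tr(F^T L F); small when neighbours get similar predictions."""
    return float(np.trace(F.T @ L @ F))

def trace_norm(W):
    """Trace (nuclear) norm: the sum of the singular values of W."""
    return float(np.linalg.svd(W, compute_uv=False).sum())

def svt(W, tau):
    """Singular value thresholding, the proximal operator of tau * trace_norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# Toy data: 20 images (labeled and unlabeled together), 5 features, 4 tags.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
W = rng.normal(size=(5, 4))
L = knn_graph_laplacian(X, k=3)

print("smoothness of predictions:", smoothness(X @ W, L))
print("trace norm before / after shrinkage:", trace_norm(W), trace_norm(svt(W, tau=0.5)))
```

Shrinking the singular values of `W` pushes the stacked per-tag models toward a shared low-dimensional subspace, which is how a trace norm penalty can couple correlated labels. The Laplacian term uses every row of `X`, so unlabeled images contribute to it even though they have no tags.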

Notes

  1. We only consider the first 5 returned tags on the Corel5K image dataset, since each image has at most 5 tags.
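
Keeping only the first 5 returned tags amounts to a top-k cut on a ranked prediction list. The sketch below assumes each image receives one score per vocabulary tag; the function name `top_k_tags`, the vocabulary, and the scores are hypothetical and not taken from the paper.

```python
def top_k_tags(scores, vocabulary, k=5):
    """Return the k highest-scoring tags, best first (ties keep vocabulary order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return [vocabulary[i] for i in order]

# Hypothetical vocabulary and per-tag scores for a single image.
vocab = ["sky", "water", "tree", "people", "grass", "building", "sun"]
scores = [0.9, 0.1, 0.7, 0.3, 0.8, 0.2, 0.6]
print(top_k_tags(scores, vocab))  # ['sky', 'grass', 'tree', 'sun', 'people']
```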

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China (61472028, 61403423, 61502026, 61673048, 61372148), the Beijing Natural Science Foundation (4162048, 4163075), and the Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science & Technology, Nanjing, China.

Author information

Corresponding author

Correspondence to Songhe Feng.

About this article

Cite this article

Feng, S., Lang, C. Graph regularized low-rank feature mapping for multi-label learning with application to image annotation. Multidim Syst Sign Process 29, 1351–1372 (2018). https://doi.org/10.1007/s11045-017-0505-9

  • DOI: https://doi.org/10.1007/s11045-017-0505-9
