Semi-supervised feature selection via hierarchical regression for web image classification

  • Special Issue Paper
Multimedia Systems

Abstract

Feature selection is an important step in large-scale image data analysis, which is difficult because such data are large in both dimensionality and number of samples. Feature selection eliminates redundant and irrelevant features and chooses a subset of features that performs as well as the complete set. Generally, supervised feature selection yields better performance than unsupervised feature selection because it uses label information. However, labeled samples are expensive to obtain, which constrains the performance of supervised feature selection, especially on large web image datasets. In this paper, we propose a semi-supervised feature selection algorithm based on a hierarchical regression model. Our contributions are: (1) The algorithm uses a statistical approach to exploit both labeled and unlabeled data while preserving the manifold structure of each feature type. (2) The predicted label matrix of the training data and the feature selection matrix are learned simultaneously, so that each benefits the other. Extensive experiments on three large-scale image datasets show that our algorithm performs better than the state-of-the-art algorithms it is compared with.
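
To make the idea of learning the predicted label matrix and the feature selection matrix together more concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a single feature type, a k-nearest-neighbour graph Laplacian as the manifold term, and an \(l_{2,1}\)-norm penalty on the projection matrix; the hierarchical, per-feature-type regression described in the abstract is not reproduced. The function names, the objective, and all parameters (`mu`, `gamma`, `pin`, `k`) are illustrative assumptions.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbour graph."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                       # no self-edges
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)                             # symmetrize
    return np.diag(A.sum(axis=1)) - A


def semi_supervised_select(X, Y, labeled, mu=1.0, gamma=1.0,
                           pin=1e3, n_iter=30, eps=1e-8):
    """Alternate between the soft label matrix F and the projection W.

    Assumed generic objective (not taken from the paper):
      tr(F'LF) + tr((F-Y)'U(F-Y))
        + mu * (||Xc W + 1 b' - F||_F^2 + gamma * ||W||_{2,1})
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)                   # centring decouples the bias b
    L = knn_laplacian(X)
    U = np.diag(np.where(labeled, pin, 0.0))  # large weight pins labeled rows to Y
    W = np.zeros((d, Y.shape[1]))
    b = np.zeros(Y.shape[1])
    D = np.eye(d)
    for _ in range(n_iter):
        # F-step: closed-form minimizer with W and b fixed.
        F = np.linalg.solve(L + U + mu * np.eye(n), U @ Y + mu * (Xc @ W + b))
        # (W, b)-step: reweighted ridge regression for the l2,1 penalty.
        b = F.mean(axis=0)
        W = np.linalg.solve(Xc.T @ Xc + gamma * D, Xc.T @ F)
        row_norms = np.sqrt(np.sum(W ** 2, axis=1))
        D = np.diag(1.0 / (2.0 * (row_norms + eps)))
    ranking = np.argsort(-row_norms)          # features by descending row norm of W
    return F, W, ranking


# Toy usage: feature 0 carries the class, the other five are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = 0.5 * rng.standard_normal((60, 6))
X[:, 0] = np.where(y == 0, -2.0, 2.0) + 0.1 * rng.standard_normal(60)
labeled = np.zeros(60, dtype=bool)
labeled[:10] = True
labeled[30:40] = True
Y = np.zeros((60, 2))
Y[labeled, y[labeled]] = 1.0                  # one-hot rows for labeled samples only
F, W, ranking = semi_supervised_select(X, Y, labeled)
print(ranking[:3])
```

In this sketch the two updates feed each other: the soft labels `F` are smoothed over the graph built from all samples and pulled toward the regression output, while `W` is refit to the current `F`, so rows of `W` with large norm mark the features that best explain the propagated labels.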

Acknowledgments

This paper was partially supported by the National Program on the Key Basic Research Project (under Grant 2013CB329301), NSFC (under Grant 61202166), and Doctoral Fund of Ministry of Education of China (under Grant 20120032120042).

Author information

Corresponding author

Correspondence to Yahong Han.

About this article

Cite this article

Song, X., Zhang, J., Han, Y. et al. Semi-supervised feature selection via hierarchical regression for web image classification. Multimedia Systems 22, 41–49 (2016). https://doi.org/10.1007/s00530-014-0390-0

  • DOI: https://doi.org/10.1007/s00530-014-0390-0
