
Discrete semi-supervised learning for multi-label image classification and large-scale image retrieval

Multimedia Tools and Applications

Abstract

Multi-label image classification is a critical problem in image semantic learning. Traditional semi-supervised multi-label learning methods learn from both labelled and unlabelled data in a continuous setting: they usually learn classification functions over a continuous label space. Neglecting the discrete constraint on labels impedes classification performance. In this paper, we explicitly consider the discrete constraint and propose Discrete Semi-supervised Multi-label Learning (DSML) for image classification. DSML is a semi-supervised framework with a discrete constraint on the labels. We introduce anchor graph learning to improve scalability and derive an ADMM-based alternating optimization procedure to solve the framework. Experiments on two real-world image datasets, MIR Flickr and NUS-WIDE, demonstrate the superiority of DSML over several advanced multi-label methods. Additional image retrieval experiments show the potential of DSML in other image applications.
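The abstract names anchor graph learning as the scalability device but this preview does not give DSML's formulation. The following is therefore only a minimal sketch of the standard anchor graph construction, not the authors' implementation. The function names, the Gaussian kernel, the bandwidth `sigma`, the number of nearest anchors `s`, and the random choice of anchors are all assumptions made for illustration.

```python
import numpy as np


def anchor_graph(X, anchors, s=3, sigma=1.0):
    """Return the n x m matrix Z linking each sample to its s nearest anchors.

    Each row of Z holds Gaussian kernel weights, normalised to sum to 1.
    (Hypothetical helper; kernel and parameters are illustrative assumptions.)
    """
    # Squared Euclidean distances between every sample and every anchor.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    n, m = d2.shape
    # Indices of the s closest anchors per sample, nearest first.
    idx = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(n)[:, None]
    d_sel = d2[rows, idx]
    # Subtracting the smallest distance leaves the normalised weights
    # unchanged and avoids underflow in the exponential.
    w = np.exp(-(d_sel - d_sel[:, :1]) / (2.0 * sigma ** 2))
    Z = np.zeros((n, m))
    Z[rows, idx] = w / w.sum(axis=1, keepdims=True)
    return Z


def low_rank_adjacency(Z):
    """Return W = Z diag(Z^T 1)^{-1} Z^T, the adjacency implied by Z."""
    lam = Z.sum(axis=0)
    # Anchors that no sample selected have an all-zero column; guard the division.
    lam_safe = np.where(lam > 0, lam, 1.0)
    return (Z / lam_safe) @ Z.T


# Toy data: 50 samples in 4 dimensions, 8 anchors drawn from the samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
anchors = X[rng.choice(50, size=8, replace=False)]
Z = anchor_graph(X, anchors, s=3)
W = low_rank_adjacency(Z)
```

The point of the construction is that only the n x m matrix Z needs to be stored, with m much smaller than n. W is formed here only to check its properties on toy data; at scale one would work with Z directly.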

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61702388), the Fundamental Research Funds for the Central Universities (WUT: 2018IVB021), and the Fundamental Research Funds for the Central Universities (No. 2018IB016).

Author information

Corresponding author

Correspondence to Liang Xie.

Cite this article

He, L., Xie, L., Shu, H. et al. Discrete semi-supervised learning for multi-label image classification and large-scale image retrieval. Multimed Tools Appl 78, 24519–24537 (2019). https://doi.org/10.1007/s11042-019-7157-8

  • DOI: https://doi.org/10.1007/s11042-019-7157-8
