Discrete semi-supervised learning for multi-label image classification and large-scale image retrieval

He, Lang; Xie, Liang; Shu, Haohao; Hu, Shengyuan

doi:10.1007/s11042-019-7157-8

Published: 19 February 2019

Volume 78, pages 24519–24537, (2019)

Multimedia Tools and Applications

Lang He¹, Liang Xie¹ (ORCID: orcid.org/0000-0003-1718-7556), Haohao Shu¹ & Shengyuan Hu¹

Abstract

Multi-label image classification is a critical problem in image semantic learning. Traditional semi-supervised multi-label learning methods mainly learn from labelled and unlabelled data in a continuous setting: they usually learn classification functions in a continuous label space, and neglecting the discrete constraint on labels impedes classification performance. In this paper, we explicitly consider the discrete constraint and propose Discrete Semi-supervised Multi-label Learning (DSML) for image classification. DSML is a semi-supervised framework with a discrete constraint on the labels. We introduce anchor graph learning to improve scalability and derive an ADMM-based alternating optimization procedure to solve the framework. Experiments on two real-world image datasets, MIR Flickr and NUS-WIDE, demonstrate the superiority of DSML over several advanced multi-label methods. Additional image retrieval experiments show the potential advantages of DSML in other image applications.
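The abstract names anchor graph learning as the scalability device but gives no formulas. As a rough illustration only, the sketch below builds a generic anchor graph: each sample is linked to its s nearest anchors with Gaussian weights, and the graph Laplacian is applied through the low-rank factor rather than an n × n matrix. The function names, the Gaussian kernel, and the parameters s and sigma are assumptions made for this example; this is not the DSML formulation, which is not given in the abstract.

```python
import numpy as np

def anchor_graph(X, anchors, s=3, sigma=1.0):
    """Return Z (n x m): each row holds normalised weights to the s nearest anchors.

    Illustrative sketch; kernel choice and parameters are assumptions.
    """
    n, m = X.shape[0], anchors.shape[0]
    # Squared Euclidean distance from every sample to every anchor.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Column indices of the s closest anchors for each sample.
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(n)[:, None]
    Z = np.zeros((n, m))
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / (2.0 * sigma ** 2))
    # Normalise so that each row sums to one.
    Z /= Z.sum(axis=1, keepdims=True)
    return Z

def anchor_laplacian_apply(Z, F):
    """Compute L @ F with L = I - Z diag(Z^T 1)^{-1} Z^T, never forming L."""
    lam = Z.sum(axis=0)                # anchor degrees
    lam = np.where(lam > 0, lam, 1.0)  # unused anchors contribute nothing
    return F - Z @ ((Z.T @ F) / lam[:, None])
```

With m anchors and n samples, Z has only s non-zeros per row, and applying the Laplacian this way costs on the order of n·m operations per column of F instead of n². This is the usual reason anchor graphs are used on large image collections.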

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61702388), the Fundamental Research Funds for the Central Universities (WUT: 2018IVB021), and the Fundamental Research Funds for the Central Universities (No. 2018IB016).

Author information

Authors and Affiliations

Department of Mathematics, Wuhan University of Technology, Wuhan, China
Lang He, Liang Xie, Haohao Shu & Shengyuan Hu

Corresponding author

Correspondence to Liang Xie.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

He, L., Xie, L., Shu, H. et al. Discrete semi-supervised learning for multi-label image classification and large-scale image retrieval. Multimed Tools Appl 78, 24519–24537 (2019). https://doi.org/10.1007/s11042-019-7157-8

Received: 25 June 2018
Revised: 25 December 2018
Accepted: 03 January 2019
Published: 19 February 2019
Issue Date: 15 September 2019
DOI: https://doi.org/10.1007/s11042-019-7157-8
