
Regularized label relaxation-based stacked autoencoder for zero-shot learning

Published in: Applied Intelligence

Abstract

Recently, Zero-Shot Learning (ZSL) has attracted great attention due to its strong classification performance on novel, unobserved classes. Because the seen and unseen classes are completely disjoint, current ZSL methods inevitably suffer from the domain shift problem when transferring knowledge from the observed to the unseen classes. Additionally, most ZSL methods, especially those targeting the semantic space, are prone to the hubness problem because they rely on nearest-neighbor classifiers in a high-dimensional space. To tackle these issues, we propose a novel approach, termed Regularized Label Relaxation-based Stacked Autoencoder (RLRSA), that diminishes the domain difference between seen and unseen classes by exploiting an effective label space, which offers several notable advantages. First, the proposed method establishes tight relations among the visual representation, semantic information, and label space via the stacked autoencoder, which helps avoid the projection domain shift. Second, by incorporating a slack variable matrix into the label space, our RLRSA method has more freedom to fit the test samples whether they come from the observed or unseen classes, resulting in a very robust and discriminative projection. Third, we construct a manifold regularization based on a class compactness graph to further reduce the domain gap between the seen and unseen classes. Finally, the learned projection is used to predict the class label of the target sample, so the hubness issue is avoided. Extensive experiments on benchmark datasets clearly show that our RLRSA method produces new state-of-the-art results under two standard ZSL settings. For example, RLRSA obtains the highest average accuracy of 67.82% across five benchmark datasets under the pure ZSL setting. For the generalized ZSL task, the proposed RLRSA remains highly effective, e.g., it achieves the best H result of 58.9% on the AwA2 dataset.
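The slack variable matrix described above resembles the "epsilon-dragging" idea from label-relaxation regression: the rigid one-hot targets are relaxed by a non-negative slack matrix that pushes each sample's score further toward its own class. The sketch below is a minimal NumPy illustration of that general technique, not the authors' full RLRSA model (which additionally involves a stacked autoencoder and a manifold regularizer); the function name, toy data, and hyperparameters are our own assumptions.

```python
import numpy as np

def label_relaxation_regression(X, labels, n_classes, lam=0.1, n_iter=20):
    """Minimal sketch of slack-variable (epsilon-dragging) label relaxation.

    X: (n, d) feature matrix; labels: (n,) integer class ids.
    Learns W so that X @ W approaches a relaxed +/-1 label matrix
    T = Y + B * M, where M >= 0 is the slack matrix and B gives the
    per-entry dragging direction.
    """
    n, d = X.shape
    Y = -np.ones((n, n_classes))
    Y[np.arange(n), labels] = 1.0           # +/-1 label coding
    B = np.sign(Y)                          # dragging directions
    M = np.zeros_like(Y)                    # non-negative slack matrix
    XtX = X.T @ X + lam * np.eye(d)         # ridge normal-equation matrix
    for _ in range(n_iter):
        T = Y + B * M                       # relaxed targets
        W = np.linalg.solve(XtX, X.T @ T)   # closed-form update of W
        M = np.maximum(B * (X @ W - Y), 0)  # closed-form slack update
    return W

# toy usage: two well-separated Gaussian classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)) + 2, rng.normal(0, 1, (20, 5)) - 2])
y = np.array([0] * 20 + [1] * 20)
W = label_relaxation_regression(X, y, n_classes=2)
pred = np.argmax(X @ W, axis=1)
print((pred == y).mean())
```

Because the slack only ever widens the margin in the correct direction, each test sample has extra freedom to deviate from a rigid one-hot target, which is the intuition behind the robustness claim in the abstract.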


Data Availability

The datasets used in this study are publicly available; they can be obtained directly from the references or from the authors upon request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61876139), the Project of Science and Technology of Henan (No. 212102310383), the Key Technologies R&D Program of Anyang (No. 2022C01SF112), the Research Foundation of Anyang Institute of Technology (No. YPY2021007), and the Research Start-up Foundation of Dr. Song Jianqiang (No. BSJ2022026).

Author information


Corresponding author

Correspondence to Jianqiang Song.

Ethics declarations

Conflicts of interest

The authors declare that they have no pertinent or potential conflicts of interest that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Song, J., Zhao, H., Wei, X. et al. Regularized label relaxation-based stacked autoencoder for zero-shot learning. Appl Intell 53, 22348–22362 (2023). https://doi.org/10.1007/s10489-023-04686-2

