
Enhanced VAEGAN: a zero-shot image classification method

Published in Applied Intelligence

Abstract

Zero-shot learning (ZSL) aims to classify samples of unseen categories for which no training data are available. The VAEGAN framework, which combines Generative Adversarial Networks (GANs) with Variational Auto-Encoders (VAEs), has achieved good performance in zero-shot image classification. Building on VAEGAN, we propose a new zero-shot image classification method named Enhanced VAEGAN (E-VAEGAN). First, we design a feature alignment module that aligns visual features with attribute features; the aligned features are then fused with the hidden-layer features of the encoder to improve the encoder's output features. Second, the triplet loss is applied during encoder training, which further increases the discriminability of the features. Finally, the hidden-layer features of the discriminator are passed through a transform module and fed back to the generator, which improves the quality of the generated fake samples. The novelty of this paper lies in a new E-VAEGAN that employs the feature alignment module, the triplet loss and the transform module to reduce the ambiguity between categories and to make the generated fake features resemble the real features. Experiments show that our method outperforms the compared methods on five zero-shot learning benchmarks.
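To make the triplet-loss step concrete, the following is a minimal sketch of a standard hinge-style triplet loss as it is commonly applied to encoder features. The Euclidean distance and the `margin` parameter are assumptions for illustration; the paper's exact formulation may differ.

```python
# Hypothetical sketch of the triplet loss applied during encoder training.
# An anchor feature is pulled toward a positive (same class) and pushed
# away from a negative (different class), which increases discriminability.

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy example: the anchor already lies much closer to the positive than
# to the negative, so the margin is satisfied and the loss is zero.
anchor   = [1.0, 0.0]
positive = [1.1, 0.1]
negative = [5.0, 5.0]
print(triplet_loss(anchor, positive, negative))  # → 0.0
```

In practice this loss would be computed over mini-batches of encoder outputs and added to the VAEGAN objective; the zero-loss case above simply shows that well-separated triplets contribute no gradient.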





Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61673142) and the Natural Science Foundation of Heilongjiang Province of China (No. JJ2019JQ0013).

Author information


Corresponding author

Correspondence to Yongjun He.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ding, B., Fan, Y., He, Y. et al. Enhanced VAEGAN: a zero-shot image classification method. Appl Intell 53, 9235–9246 (2023). https://doi.org/10.1007/s10489-022-03869-7

