
Enhanced VAEGAN: a zero-shot image classification method

Published in Applied Intelligence

Abstract

Zero-shot learning (ZSL) aims to classify samples of unseen categories for which no training data are available. The VAEGAN framework, which combines Generative Adversarial Networks (GANs) with Variational Auto-Encoders (VAEs), has achieved good performance in zero-shot image classification. Building on VAEGAN, we propose a new zero-shot image classification method named Enhanced VAEGAN (E-VAEGAN). First, we design a feature alignment module that aligns visual features with attribute features; the aligned features are then fused with the hidden-layer features of the encoder to improve the encoder's output features. Second, the triplet loss is applied during encoder training, which further increases the discriminability of the features. Finally, the hidden-layer features of the discriminator are passed through a transform module and fed back to the generator, which improves the quality of the generated fake samples. The novelty of this paper lies in a new E-VAEGAN that employs the feature alignment module, the triplet loss and the transform module to reduce the ambiguity between categories and to make the generated fake features resemble the real features. Experiments show that our method outperforms the compared methods on five zero-shot learning benchmarks.
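To make the triplet-loss step concrete, the following is a minimal sketch of a standard hinge-style triplet loss as it is commonly applied to encoder features. The Euclidean distance and the `margin` parameter are assumptions for illustration; the paper's exact formulation may differ.

```python
# Hypothetical sketch of the triplet loss applied during encoder training.
# An anchor feature is pulled toward a positive (same class) and pushed
# away from a negative (different class), which increases discriminability.

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Toy example: the anchor already lies much closer to the positive than
# to the negative, so the margin is satisfied and the loss is zero.
anchor   = [1.0, 0.0]
positive = [1.1, 0.1]
negative = [5.0, 5.0]
print(triplet_loss(anchor, positive, negative))  # → 0.0
```

In practice this loss would be computed over mini-batches of encoder outputs and added to the VAEGAN objective; the zero-loss case above simply shows that well-separated triplets contribute no gradient.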





Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61673142) and the Natural Science Foundation of Heilongjiang Province of China (No. JJ2019JQ0013).

Author information


Corresponding author

Correspondence to Yongjun He.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ding, B., Fan, Y., He, Y. et al. Enhanced VAEGAN: a zero-shot image classification method. Appl Intell 53, 9235–9246 (2023). https://doi.org/10.1007/s10489-022-03869-7

