Abstract
Zero-shot learning (ZSL) aims to classify samples from unseen categories for which no training data are available. The VAEGAN framework, which combines a Generative Adversarial Network (GAN) with a Variational Auto-Encoder (VAE), has recently achieved strong performance in zero-shot image classification. Building on VAEGAN, we propose a new zero-shot image classification method named Enhanced VAEGAN (E-VAEGAN). First, we design a feature alignment module that aligns visual features with attribute features; the aligned features are then fused with the hidden-layer features of the encoder to improve the encoder's output features. Second, a triplet loss is applied during encoder training, which further increases the discriminability of the features. Finally, the hidden-layer features of the discriminator are passed through a transform module and fed back to the generator, which improves the quality of the generated fake samples. The originality of this paper lies in the design of E-VAEGAN, which employs the feature alignment module, the triplet loss and the transform module to reduce the ambiguity between categories and to make the generated fake features resemble the real features. Experiments show that our method outperforms the compared methods on five zero-shot learning benchmarks.
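To make the three additions concrete, the sketch below shows one way the feature alignment module, the encoder fusion, the triplet loss and the transform-module feedback could be wired together in PyTorch. It is a minimal illustration under assumed dimensions and module designs, not the authors' implementation.

# Minimal PyTorch sketch of the abstract's three additions to a VAEGAN pipeline.
# All names, dimensions, and wiring are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class FeatureAlignment(nn.Module):
    """Projects visual and attribute features into a shared space and combines them."""
    def __init__(self, vis_dim=2048, attr_dim=312, hid_dim=1024):
        super().__init__()
        self.vis_proj = nn.Linear(vis_dim, hid_dim)
        self.attr_proj = nn.Linear(attr_dim, hid_dim)

    def forward(self, x, a):
        # Aligned features: element-wise combination of the two projections.
        return torch.relu(self.vis_proj(x) + self.attr_proj(a))

class Encoder(nn.Module):
    """VAE encoder whose hidden features are fused with the aligned features."""
    def __init__(self, vis_dim=2048, attr_dim=312, hid_dim=1024, z_dim=64):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(vis_dim + attr_dim, hid_dim), nn.ReLU())
        self.mu = nn.Linear(hid_dim, z_dim)
        self.logvar = nn.Linear(hid_dim, z_dim)

    def forward(self, x, a, aligned):
        h = self.hidden(torch.cat([x, a], dim=1))
        h = h + aligned  # fuse aligned features with the encoder's hidden-layer features
        return self.mu(h), self.logvar(h)

class TransformModule(nn.Module):
    """Maps discriminator hidden-layer features into a feedback signal for the generator."""
    def __init__(self, disc_hid=1024, gen_hid=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(disc_hid, gen_hid), nn.ReLU())

    def forward(self, disc_hidden):
        return self.net(disc_hidden)

# Triplet loss on encoder outputs keeps same-class latents close and different-class latents apart,
# e.g. loss = triplet_loss(z_anchor, z_positive, z_negative).
triplet_loss = nn.TripletMarginLoss(margin=1.0)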
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61673142) and the Natural Science Foundation of Heilongjiang Province of China (No. JJ2019JQ0013).
Cite this article
Ding, B., Fan, Y., He, Y. et al. Enhanced VAEGAN: a zero-shot image classification method. Appl Intell 53, 9235–9246 (2023). https://doi.org/10.1007/s10489-022-03869-7