Bidirectional generative transductive zero-shot learning

  • Review Article
  • Published in Neural Computing and Applications

Abstract

Most zero-shot learning (ZSL) methods learn a mapping from the visual feature space to the semantic feature space, or from both spaces into a common joint space in which they are aligned. However, these methods neither exploit the visual and semantic information sufficiently nor exclude useless information. Moreover, most ZSL methods suffer from a strong bias problem: instances from unseen classes tend to be predicted as seen classes. In this paper, exploiting the advantages of generative adversarial networks (GANs), we propose a method based on bidirectional projections between the visual and semantic feature spaces. GANs are used to perform bidirectional generation and alignment between visual and semantic features, and a cycle-mapping structure ensures that the important information is preserved during alignment. Furthermore, to better address the bias problem, pseudo-labels are generated for unseen instances and the model is adjusted iteratively along with them. We conduct extensive experiments in both the traditional and generalized ZSL settings. The results confirm that our method achieves state-of-the-art performance on the popular AWA2, aPY and SUN datasets.
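To make the two core ideas of the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): two linear maps stand in for the learned generators between a toy visual space and a toy semantic space, a cycle-consistency loss penalizes round trips that do not return to the starting point, and unseen instances receive pseudo-labels by nearest class embedding after projection. All names, dimensions, and class embeddings are illustrative assumptions.

```python
# Hypothetical sketch of cycle-consistent bidirectional mapping and
# pseudo-labelling; the real method uses GAN generators, not fixed matrices.

def matvec(M, x):
    # Multiply matrix M (list of rows) by vector x.
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Toy 2-D spaces: W projects visual -> semantic, V projects semantic -> visual.
W = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]

def cycle_loss(x, s):
    # ||V(Wx) - x||^2 + ||W(Vs) - s||^2: both round trips should return home,
    # which is what keeps important information from being discarded.
    return (sq_dist(matvec(V, matvec(W, x)), x)
            + sq_dist(matvec(W, matvec(V, s)), s))

def pseudo_label(x, class_embeddings):
    # Transductive step: project an unseen visual feature into semantic space
    # and assign the nearest unseen-class embedding as its pseudo-label.
    proj = matvec(W, x)
    return min(class_embeddings, key=lambda c: sq_dist(proj, class_embeddings[c]))

unseen_classes = {"zebra": [1.0, 0.0], "whale": [0.0, 1.0]}  # toy embeddings
x_unseen = [0.9, 0.1]                                        # toy visual feature
print(cycle_loss(x_unseen, [1.0, 0.0]))   # 0.0 for identity maps
print(pseudo_label(x_unseen, unseen_classes))  # zebra
```

In the paper's scheme, such pseudo-labels would then be fed back to adjust the model iteratively, progressively reducing the bias toward seen classes.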



Acknowledgements

This work was supported in part by the National Key R&D Program of China (2018YFE0203900), National Natural Science Foundation of China (61773093), Important Science and Technology Innovation Projects in Chengdu (2018-YF08-00039-GX) and Research Programs of Sichuan Science and Technology Department (17ZDYF3184).

Author information

Correspondence to Mao Ye.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, X., Zhang, D., Ye, M. et al. Bidirectional generative transductive zero-shot learning. Neural Comput & Applic 33, 5313–5326 (2021). https://doi.org/10.1007/s00521-020-05322-7
