Abstract
Zero-shot learning (ZSL) aims to recognize instances from unseen classes by training a classification model with only seen data. Because they are trained exclusively on seen data, most existing approaches suffer from a classification bias toward seen categories. In this paper, we tackle ZSL with a novel Unseen Prototype Learning (UPL) model, a simple yet effective framework that learns visual prototypes for unseen categories from the corresponding class-level semantic information; the learned prototypes are then used directly as latent classifiers. Two types of constraints are proposed to improve the quality of the learned prototypes. First, we use an autoencoder framework that learns visual prototypes from the semantic prototypes and reconstructs the original semantic information with a decoder, ensuring that each prototype remains strongly correlated with its category. Second, we employ a triplet loss in which the per-class mean of visual features supervises the learned visual prototypes. The resulting prototypes are thus more discriminative, which effectively alleviates the classification bias problem. In addition, by adopting the episodic training paradigm from meta-learning, the model accumulates rich experience in predicting unseen classes. Extensive experiments on four datasets under both traditional ZSL and generalized ZSL settings demonstrate the effectiveness of the proposed UPL method.
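The two constraints described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the linear encoder/decoder, all dimensions, and the variable names are assumptions, and the data is random placeholder input standing in for attribute vectors and CNN features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 85-d semantic attributes, 2048-d visual features,
# 5 classes (illustrative only, not taken from the paper).
SEM_DIM, VIS_DIM, N_CLASSES = 85, 2048, 5

semantic_protos = rng.normal(size=(N_CLASSES, SEM_DIM))  # class attribute vectors
class_means = rng.normal(size=(N_CLASSES, VIS_DIM))      # per-class mean visual features

# A linear encoder/decoder pair as a stand-in for the autoencoder.
W_enc = rng.normal(scale=0.01, size=(SEM_DIM, VIS_DIM))
W_dec = rng.normal(scale=0.01, size=(VIS_DIM, SEM_DIM))

def forward(s):
    v = s @ W_enc       # visual prototype (acts as a latent classifier)
    s_rec = v @ W_dec   # reconstructed semantics
    return v, s_rec

def losses(margin=1.0):
    v, s_rec = forward(semantic_protos)
    # Constraint 1: reconstructing the semantics keeps each prototype
    # correlated with its category.
    recon = np.mean((s_rec - semantic_protos) ** 2)
    # Constraint 2: triplet terms pull each prototype toward its own
    # class mean and push it away from other class means by `margin`.
    trip = 0.0
    for c in range(N_CLASSES):
        d_pos = np.sum((v[c] - class_means[c]) ** 2)
        for k in range(N_CLASSES):
            if k != c:
                d_neg = np.sum((v[c] - class_means[k]) ** 2)
                trip += max(0.0, d_pos - d_neg + margin)
    return recon, trip

recon, trip = losses()
print(recon >= 0 and trip >= 0)  # prints True: both losses are non-negative
```

At test time, an unseen class would be recognized by encoding its semantic prototype and assigning each test feature to the nearest learned visual prototype; the gradient updates themselves are omitted here for brevity.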
Acknowledgements
This research is partially supported by the Fundamental Research Funds for the Central Universities 2020QNA5010 and the National Natural Science Foundation of China under Grant 61771329 and Grant 62002320, the Central Funds Guiding the Local Science and Technology Development (Grant No. 206Z5001G).
Ethics declarations
Conflict of interest
We confirm that there are no known conflicts of interest associated with this publication and that there has been no significant financial support for this work that could have influenced its outcome.
Ethical approval
We confirm that the manuscript has been read and approved by all named authors. We further confirm that the order of authors listed in the manuscript has been approved by all of us. The roles of all authors are listed as follows: Zhong Ji contributed to conceptualization and writing—review. Biying Cui contributed to software and writing—original draft. Yunlong Yu (Corresponding author) contributed to methodology and supervision. Yanwei Pang contributed to writing—review and editing. Zhongfei Zhang contributed to writing—review and editing.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ji, Z., Cui, B., Yu, Y. et al. Zero-shot classification with unseen prototype learning. Neural Comput & Applic 35, 12307–12317 (2023). https://doi.org/10.1007/s00521-021-05746-9