Abstract
Zero-shot learning (ZSL) aims to recognize instances from unseen classes by training a classification model with only seen data. Because they are trained exclusively on seen data, most existing approaches suffer from a classification bias toward seen categories. In this paper, we tackle ZSL with a novel Unseen Prototype Learning (UPL) model, a simple yet effective framework that learns visual prototypes for unseen categories from the corresponding class-level semantic information; the learned prototypes are then used directly as latent classifiers. Two types of constraints are proposed to improve the quality of the learned prototypes. First, we use an autoencoder framework that learns visual prototypes from the semantic prototypes and reconstructs the original semantic information with a decoder, ensuring that each prototype remains strongly correlated with its category. Second, we employ a triplet loss in which the per-class mean of visual features supervises the learned visual prototypes. The resulting prototypes are thus more discriminative, which effectively alleviates the classification bias problem. In addition, by adopting the episodic training paradigm from meta-learning, the model accumulates rich experience in predicting unseen classes. Extensive experiments on four datasets under both traditional ZSL and generalized ZSL settings demonstrate the effectiveness of the proposed UPL method.
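The two constraints described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the linear encoder/decoder, all dimensions, and the variable names are assumptions, and the data is random placeholder input standing in for attribute vectors and CNN features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 85-d semantic attributes, 2048-d visual features,
# 5 classes (illustrative only, not taken from the paper).
SEM_DIM, VIS_DIM, N_CLASSES = 85, 2048, 5

semantic_protos = rng.normal(size=(N_CLASSES, SEM_DIM))  # class attribute vectors
class_means = rng.normal(size=(N_CLASSES, VIS_DIM))      # per-class mean visual features

# A linear encoder/decoder pair as a stand-in for the autoencoder.
W_enc = rng.normal(scale=0.01, size=(SEM_DIM, VIS_DIM))
W_dec = rng.normal(scale=0.01, size=(VIS_DIM, SEM_DIM))

def forward(s):
    v = s @ W_enc       # visual prototype (acts as a latent classifier)
    s_rec = v @ W_dec   # reconstructed semantics
    return v, s_rec

def losses(margin=1.0):
    v, s_rec = forward(semantic_protos)
    # Constraint 1: reconstructing the semantics keeps each prototype
    # correlated with its category.
    recon = np.mean((s_rec - semantic_protos) ** 2)
    # Constraint 2: triplet terms pull each prototype toward its own
    # class mean and push it away from other class means by `margin`.
    trip = 0.0
    for c in range(N_CLASSES):
        d_pos = np.sum((v[c] - class_means[c]) ** 2)
        for k in range(N_CLASSES):
            if k != c:
                d_neg = np.sum((v[c] - class_means[k]) ** 2)
                trip += max(0.0, d_pos - d_neg + margin)
    return recon, trip

recon, trip = losses()
print(recon >= 0 and trip >= 0)  # prints True: both losses are non-negative
```

At test time, an unseen class would be recognized by encoding its semantic prototype and assigning each test feature to the nearest learned visual prototype; the gradient updates themselves are omitted here for brevity.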
Acknowledgements
This research is partially supported by the Fundamental Research Funds for the Central Universities 2020QNA5010 and the National Natural Science Foundation of China under Grant 61771329 and Grant 62002320, the Central Funds Guiding the Local Science and Technology Development (Grant No. 206Z5001G).
Ethics declarations
Conflict of interest
We confirm that there are no known conflicts of interest associated with this publication and that there has been no significant financial support for this work that could have influenced its outcome.
Ethical approval
We confirm that the manuscript has been read and approved by all named authors. We further confirm that the order of authors listed in the manuscript has been approved by all of us. The roles of all authors are listed as follows: Zhong Ji contributed to conceptualization and writing—review. Biying Cui contributed to software and writing—original draft. Yunlong Yu (Corresponding author) contributed to methodology and supervision. Yanwei Pang contributed to writing—review and editing. Zhongfei Zhang contributed to writing—review and editing.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ji, Z., Cui, B., Yu, Y. et al. Zero-shot classification with unseen prototype learning. Neural Comput & Applic 35, 12307–12317 (2023). https://doi.org/10.1007/s00521-021-05746-9