Abstract
Zero-shot learning (ZSL) aims to recognize unseen (novel) classes using only labeled samples from seen (base) classes. Existing methods typically learn visual-semantic interactions or generate the absent visual features of unseen classes to compensate for data imbalance. However, they ignore the representation quality of visual-semantic pairs, which leads to unsatisfactory alignment and prediction bias. To tackle these issues, we propose a Hierarchical Contrastive Representation learning paradigm, termed HCR, which fully exploits the model's representation capability and discriminative information. Specifically, we first propose a contrastive embedding that preserves not only high-quality representations but also sufficiently discriminative information drawn from class-level and instance-level supervision. We then introduce a regressor, guided by valuable prior knowledge, to achieve more desirable visual-semantic alignment for unseen classes. A pluggable calibrator is also integrated to further alleviate prediction bias in the contrastive embedding. Extensive experiments show that the proposed HCR significantly outperforms state-of-the-art methods on popular benchmarks under both the ZSL and the more challenging generalized ZSL (GZSL) settings.
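The class-level and instance-level supervision mentioned in the abstract can be illustrated with a small sketch. The snippet below is not the HCR objective itself, but a minimal NumPy illustration of the general idea: a SupCon-style class-level term (samples sharing a class label are positives) combined with an InfoNCE-style instance-level term in which a sample's only positive is its own augmented view. The function names, the temperature, and the `alpha` weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """SupCon-style contrastive loss: samples sharing a label are positives.

    Generic illustration of class-level supervision, not the HCR loss itself.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)  # project to unit sphere
    sim = (z @ z.T) / temperature                                   # temperature-scaled cosine logits
    n = len(labels)
    total = 0.0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue  # anchor has no positive pair in this batch
        others = np.delete(sim[i], i)               # exclude self-similarity
        log_denom = np.log(np.exp(others).sum())    # log-sum-exp over all candidate pairs
        total += -np.mean([sim[i, j] - log_denom for j in pos])
    return total / n

def hierarchical_loss(feats, views, labels, alpha=0.5, temperature=0.1):
    """Weighted sum of class-level and instance-level contrastive terms."""
    z = np.vstack([feats, views])
    # class level: any same-class sample (or augmented view) is a positive
    class_term = supcon_loss(z, np.concatenate([labels, labels]), temperature)
    # instance level: only a sample and its own augmented view are positives
    ids = np.arange(len(feats))
    inst_term = supcon_loss(z, np.concatenate([ids, ids]), temperature)
    return alpha * class_term + (1 - alpha) * inst_term
```

Reusing `supcon_loss` with per-instance ids reduces the instance-level term to standard InfoNCE, since each anchor then has exactly one positive: its own augmented view.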










Data availability and access
The datasets generated or analyzed during this study are available in [7] at https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/zero-shot-learning/zero-shot-learning-the-good-the-bad-and-the-ugly.
References
Lu Z, Yu Y, Lu Z-M, Shen F-L, Zhang Z (2020) Attentive semantic preservation network for zero-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 682–683
Lu Z, Lu Z, Yu Y, Wang Z (2022) Learn more from less: generalized zero-shot learning with severely limited labeled data. Neurocomputing 477:25–35
Ou G, Yu G, Domeniconi C, Lu X, Zhang X (2020) Multi-label zero-shot learning with graph convolutional networks. Neural Netw 132:333–341
Xian Y, Lorenz T, Schiele B, Akata Z (2018) Feature generating networks for zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 5542–5551
Li J, Jing M, Lu K, Ding Z, Zhu L, Huang Z (2019) Leveraging the invariant side of generative zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 7402–7411
Xu B, Zeng Z, Lian C, Ding Z (2022) Generative mixup networks for zero-shot learning. IEEE Trans Neural Netw Learn Syst
Xian Y, Lampert CH, Schiele B, Akata Z (2018) Zero-shot learning-a comprehensive evaluation of the good, the bad and the ugly. IEEE Trans Pattern Anal Mach Intell 41(9):2251–2265
Min S, Yao H, Xie H, Wang C, Zha Z-J, Zhang Y (2020) Domain-aware visual bias eliminating for generalized zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 12664–12673
Zhang L, Xiang T, Gong S (2017) Learning a deep embedding model for zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 2021–2030
Chen T, Kornblith S, Norouzi M, Hinton G (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pp 1597–1607. PMLR
Khosla P, Teterwak P, Wang C, Sarna A, Tian Y, Isola P, Maschinot A, Liu C, Krishnan D (2020) Supervised contrastive learning. Adv Neural Inf Process Syst 33:18661–18673
Chen X, Fan H, Girshick R, He K (2020) Improved baselines with momentum contrastive learning. Preprint at arXiv:2003.04297
Ye H-J, Ming L, Zhan D-C, Chao W-L (2022) Few-shot learning with a strong teacher. IEEE Trans Pattern Anal Mach Intell
Zhang J, Gao L, Luo X, Shen H, Song J (2023) Deta: Denoised task adaptation for few-shot learning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 11541–11551
Wu J, Zhang Y, Sun S, Li Q, Zhao X (2022) Generalized zero-shot emotion recognition from body gestures. Appl Intell 1–19
Kumar Verma V, Arora G, Mishra A, Rai P (2018) Generalized zero-shot learning via synthesized examples. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 4281–4289
Gao R, Hou X, Qin J, Chen J, Liu L, Zhu F, Zhang Z, Shao L (2020) Zero-vae-gan: generating unseen features for generalized and transductive zero-shot learning. IEEE Trans Image Process 29:3665–3680
Han Z, Fu Z, Yang J (2020) Learning the redundancy-free features for generalized zero-shot object recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 12865–12874
Huang H, Wang C, Yu PS, Wang C-D (2019) Generative dual adversarial network for generalized zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 801–810
Li Y, Liu Z, Yao L, Wang X, McAuley J, Chang X (2022) An entropy-guided reinforced partial convolutional network for zero-shot learning. IEEE Trans Circuits Syst Video Technol 32(8):5175–5186
Ji Z, Wang Q, Cui B, Pang Y, Cao X, Li X (2021) A semi-supervised zero-shot image classification method based on soft-target. Neural Netw 143:88–96
Akata Z, Perronnin F, Harchaoui Z, Schmid C (2013) Label-embedding for attribute-based classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 819–826
Akata Z, Reed S, Walter D, Lee H, Schiele B (2015) Evaluation of output embeddings for fine-grained image classification. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2927–2936
Sung F, Yang Y, Zhang L, Xiang T, Torr PH, Hospedales TM (2018) Learning to compare: relation network for few-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 1199–1208
Zhang L, Wang P, Liu L, Shen C, Wei W, Zhang Y, Van Den Hengel A (2020) Towards effective deep embedding for zero-shot learning. IEEE Trans Circuits Syst Video Technol 30(9):2843–2852
Zhu Y, Elhoseiny M, Liu B, Peng X, Elgammal A (2018) A generative adversarial approach for zero-shot learning from noisy texts. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 1004–1013
Schonfeld E, Ebrahimi S, Sinha S, Darrell T, Akata Z (2019) Generalized zero-and few-shot learning via aligned variational autoencoders. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 8247–8255
He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9729–9738
Li J, Wei Y, Wang C, Hu Q, Liu Y, Xu L (2022) 3-d cnn-based multichannel contrastive learning for alzheimer’s disease automatic diagnosis. IEEE Trans Instrum Meas 71:1–11
Han Z, Fu Z, Chen S, Yang J (2021) Contrastive embedding for generalized zero-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2371–2381
Cheng D, Wang G, Wang N, Zhang D, Zhang Q, Gao X (2023) Discriminative and robust attribute alignment for zero-shot learning. IEEE Trans Circuits Syst Video Technol
Zhu F, Zhang W, Chen X, Gao X, Ye N (2023) Large margin distribution multi-class supervised novelty detection. Expert Syst Appl 224:119937
Hendrycks D, Gimpel K (2016) A baseline for detecting misclassified and out-of-distribution examples in neural networks. In: International conference on learning representations
Zhang J, Gao L, Hao B, Huang H, Song J, Shen H (2023) From global to local: Multi-scale out-of-distribution detection. IEEE Trans Image Process
Yang J, Zhou K, Liu Z (2023) Full-spectrum out-of-distribution detection. Int J Comput Vis 1–16
Socher R, Ganjoo M, Manning CD, Ng A (2013) Zero-shot learning through cross-modal transfer. In: Advances in neural information processing systems (NeurIPS), pp 935–943
Chao W-L, Changpinyo S, Gong B, Sha F (2016) An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II 14, pp. 52–68. Springer
Atzmon Y, Chechik G (2019) Adaptive confidence smoothing for generalized zero-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 11671–11680
Chen X, Lan X, Sun F, Zheng N (2020) A boundary based out-of-distribution classifier for generalized zero-shot learning. In: European conference on computer vision (ECCV), pp 572–588
Su H, Li J, Chen Z, Zhu L, Lu K (2022) Distinguishing unseen from seen for generalized zero-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7885–7894
Mettes P, Pol E, Snoek C (2019) Hyperspherical prototype networks. Adv Neural Inf Process Syst 32
Wang T, Isola P (2020) Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In: International conference on machine learning, pp. 9929–9939. PMLR
Borodachov SV, Hardin DP, Saff EB (2019) Discrete energy on rectifiable sets
Farhadi A, Endres I, Hoiem D, Forsyth D (2009) Describing objects by their attributes. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 1778–1785
Nilsback M-E, Zisserman A (2008) Automated flower classification over a large number of classes. In: Indian conference on computer vision, graphics & image processing, pp 722–729
Wah C, Branson S, Welinder P, Perona P, Belongie S (2011) The caltech-ucsd birds-200-2011 dataset
Felix R, Kumar VB, Reid I, Carneiro G (2018) Multi-modal cycle-consistent generalized zero-shot learning. In: European conference on computer vision (ECCV), pp 21–37
Li Q, Hou M, Lai H, Yang M (2022) Cross-modal distribution alignment embedding network for generalized zero-shot learning. Neural Netw 148:176–182
Annadani Y, Biswas S (2018) Preserving semantic relations for zero-shot learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7603–7612
Zhang R, Zhu Q, Xu X, Zhang D, Huang S-J (2021) Visual-guided attentive attributes embedding for zero-shot learning. Neural Netw 143:709–718
Changpinyo S, Chao W-L, Gong B, Sha F (2020) Classifier and exemplar synthesis for zero-shot learning. Int J Comput Vis 128:166–201
Gao R, Hou X, Qin J, Shen Y, Long Y, Liu L, Zhang Z, Shao L (2022) Visual-semantic aligned bidirectional network for zero-shot learning. IEEE Trans Multimedia
Li Y, Liu Z, Yao L, Chang X (2021) Attribute-modulated generative meta learning for zero-shot learning. IEEE Trans Multimedia 25:1600–1610
Chen Z, Huang Y, Chen J, Geng Y, Zhang W, Fang Y, Pan JZ, Chen H (2023) Duet: Cross-modal semantic grounding for contrastive zero-shot learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 37, pp 405–413
Cheng D, Wang G, Wang B, Zhang Q, Han J, Zhang D (2023) Hybrid routing transformer for zero-shot learning. Pattern Recognit 137:109270
Han Z, Fu Z, Li G, Yang J (2021) Inference guided feature generation for generalized zero-shot learning. Neurocomputing 430:150–158
Chen L, Zhang H, Xiao J, Liu W, Chang S-F (2018) Zero-shot visual recognition using semantics-preserving adversarial embedding networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1043–1052
Chen S, Xie G, Liu Y, Peng Q, Sun B, Li H, You X, Shao L (2021) Hsva: Hierarchical semantic-visual adaptation for zero-shot learning. Adv Neural Inf Process Syst 34:16622–16634
Xian Y, Sharma S, Schiele B, Akata Z (2019) f-vaegan-d2: A feature generating framework for any-shot learning. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 10275–10284
Ding B, Fan Y, He Y, Zhao J (2023) Enhanced vaegan: a zero-shot image classification method. Appl Intell 53(8):9235–9246
Yun Y, Wang S, Hou M, Gao Q (2022) Attributes learning network for generalized zero-shot learning. Neural Netw 150:112–118
Li K, Min MR, Fu Y (2019) Rethinking zero-shot learning: A conditional visual classification perspective. In: IEEE Conference on computer vision and pattern recognition (CVPR), pp 3583–3592
Shen J, Xiao Z, Zhen X, Zhang L (2021) Spherical zero-shot learning. IEEE Trans Circuits Syst Video Technol 32(2):634–645
Huynh D, Elhamifar E (2020) Fine-grained generalized zero-shot learning via dense attribute-based attention. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4483–4493
Li X, Xu Z, Wei K, Deng C (2021) Generalized zero-shot learning via disentangled representation. In: the Association for the advancement of artificial intelligence (AAAI), vol 35, pp 1966–1974
Chen S, Hong Z, Liu Y, Xie G-S, Sun B, Li H, Peng Q, Lu K, You X (2022) Transzero: attribute-guided transformer for zero-shot learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 36, pp 330–338
Chen S, Hong Z, Xie G-S, Yang W, Peng Q, Wang K, Zhao J, You X (2022) Msdn: Mutually semantic distillation network for zero-shot learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7612–7621
Li Z, Chen Q, Liu Q (2021) Augmented semantic feature based generative network for generalized zero-shot learning. Neural Netw 143:1–11
Chen S, Wang W, Xia B, Peng Q, You X, Zheng F, Shao L (2021) Free: Feature refinement for generalized zero-shot learning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 122–131
Yue Z, Wang T, Sun Q, Hua X-S, Zhang H (2021) Counterfactual zero-shot and open-set visual recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 15404–15414
Romera-Paredes B, Torr P (2015) An embarrassingly simple approach to zero-shot learning. In: International conference on machine learning (ICML), pp 2152–2161
Kwon G, Al Regib G (2022) A gating model for bias calibration in generalized zero-shot learning. IEEE Trans Image Process
Funding
This work was supported in part by the National Key Research and Development Program of China under Grant No. 2020AAA0140004 and in part by the China Postdoctoral Science Foundation under Grant No. 2022M712792. This work was also partially supported by the Ningbo Science and Technology Innovation 2025 Major Project under Grants 2020Z106 and 2023Z040.
Author information
Contributions
Conceptualization and methodology were performed by Ziqian Lu. Software and programming were handled by Zewei He. Validation was performed by Xuecheng Sun. Formal analysis and writing were performed by Hao Luo. Supervision was provided by Yangming Zheng. Funding was acquired by Zheming Lu and Zewei He.
Ethics declarations
Ethical and informed consent for data used
Written informed consent for publication of this paper was obtained from Zhejiang University and all authors. This study did not involve human or animal subjects; thus, no ethical approval was required. The study protocol adhered to the guidelines established by the journal.
Competing Interests
All authors are with the School of Aeronautics and Astronautics, Zhejiang University. The authors declare no potential conflicts of interest with respect to the research, authorship, and publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lu, Z., Lu, Z., He, Z. et al. Hierarchical contrastive representation for zero shot learning. Appl Intell 54, 9213–9229 (2024). https://doi.org/10.1007/s10489-024-05531-w