Abstract
Most few-shot learning methods focus on improving a single high-level feature extractor; in practice, however, low-level features also contain abundant visual information and play an important role in learning discriminative embeddings. In this paper, we propose a multi-level adaptive vision transformer few-shot learning network (MLVT-FSL). First, we design a two-branch feature extraction network that employs a multi-level feature extractor together with a vision transformer to extract multi-level features and capture global relationships; under the 5-way 5-shot setting on MiniImageNet, this brings a 1.16% improvement over the baseline. We then use a feature adjustment module (FAM) to adaptively adjust the features, yielding task-specific and more discriminative embeddings and a further 0.65% improvement under the same 5-way 5-shot setting on MiniImageNet. To evaluate MLVT-FSL further, we conduct extensive experiments on several standard few-shot classification benchmarks: MiniImageNet, TieredImageNet, and the fine-grained dataset CUB-200. MLVT-FSL achieves 82.46% and 84.97% top-1 classification accuracy under the 5-way 5-shot setting on MiniImageNet and TieredImageNet, respectively, and 87.04% 5-way 5-shot accuracy on CUB-200. These results verify the effectiveness of the proposed model.
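To make the two-branch idea concrete, the following is a minimal PyTorch sketch of a network that pools multi-level CNN features and feeds them to a small transformer encoder, followed by a task-conditioned adjustment step. All module names (TwoBranchEmbedding, FeatureAdjustment), layer counts, and dimensions are illustrative assumptions, not the authors' implementation of MLVT-FSL or its FAM.

```python
# Hypothetical sketch of a two-branch, multi-level embedding with a
# task-conditioned feature adjustment step. Not the paper's actual code.
import torch
import torch.nn as nn

class TwoBranchEmbedding(nn.Module):
    """CNN branch yields low/mid/high-level feature maps; a transformer
    branch models global relations among the pooled level features."""
    def __init__(self, dim=64):
        super().__init__()
        # Three conv stages; each stage's output serves as one feature level.
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, dim, 3, stride=2, padding=1),
                          nn.BatchNorm2d(dim), nn.ReLU())
            for c_in in (3, dim, dim)
        ])
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Treat the pooled level features as a short token sequence.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        tokens = []
        for stage in self.stages:
            x = stage(x)
            tokens.append(self.pool(x).flatten(1))   # (B, dim) per level
        seq = torch.stack(tokens, dim=1)              # (B, levels, dim)
        fused = self.encoder(seq)                     # global relations across levels
        return fused.mean(dim=1)                      # final embedding (B, dim)

class FeatureAdjustment(nn.Module):
    """Stand-in for a FAM-style module: rescales embeddings with a gate
    computed from the mean support embedding of the current episode."""
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, embeddings, support):
        task_ctx = support.mean(dim=0, keepdim=True)  # summarize the episode
        return embeddings * self.gate(task_ctx)       # task-specific rescaling
```

In an episodic evaluation loop, one would embed both support and query images with TwoBranchEmbedding, adjust them with FeatureAdjustment conditioned on the support set, and classify queries by distance to class prototypes, in the style of prototypical networks.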
Acknowledgements
This work was supported by the Natural Science Foundation of Liaoning Province (No. 2020-MS-080) and the National Key Research and Development Program of China (No. 2017YFF0108800).
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Cite this article
Zhu, H., Cai, X., Dou, J. et al. Multi-level adaptive few-shot learning network combined with vision transformer. J Ambient Intell Human Comput 14, 12477–12491 (2023). https://doi.org/10.1007/s12652-022-04327-5