Skip to main content
Log in

Self-attention network for few-shot learning based on nearest-neighbor algorithm

  • Original Paper
  • Published:
Machine Vision and Applications Aims and scope Submit manuscript

A Correction to this article was published on 14 March 2023

This article has been updated

Abstract

Few-shot learning is a challenging task because it focuses on classifying new object categories given only limited labeled samples and often results in poor generalization. Most of existing methods based on metric learning are not very simple for the networks. Moreover, these methods cannot solve low-data problem and are unable to extract discriminative features. This paper presents a simple, effective and general framework for few-shot image classification. Specifically, we first exploit data augmentation technique to alleviate overfitting problem. We propose a self-attention position attention module (PAM) which is utilized to extract discriminative features for constructing a few-shot representation model. Furthermore, we design a novel nearest-neighbor learner with feature transformation to obtain the appealing accuracy in few-shot learning (FSL). Our network is comprised of backbone and attention module and trained from scratch in an end-to-end manner. The backbone module is to extract multi-level features. The self-attention PAM is used to discover non-local information and allow long-range dependency. Excellent performance on benchmark demonstrates that our work provides a unified and effective approach for few-shot image classification.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

Similar content being viewed by others

Change history

References

  1. Gidaris, S., Bursuc, A., Komodakis, N., Pérez, P., Cord, M.: Boosting few-shot visual learning with self-supervision. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8059–8068 (2019)

  2. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  3. Zagoruyko, S., Komodakis, N.: Wide residual networks. arXiv preprint arXiv:1605.07146 (2016)

  4. Käding, C., Rodner, E., Freytag, A., Denzler, J.: Fine-tuning deep neural networks in continuous learning scenarios. In: Asian Conference on Computer Vision, pp. 588–605. Springer (2016)

  5. Wang, Y., Hu, C., Wang, G., Lin, X.: Contrastive representation for few-shot vehicle footprint recognition. In: 2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1–6. IEEE (2021)

  6. Yoon, S.W., Seo, J., Moon, J.: Tapnet: neural network augmented with task-adaptive projection for few-shot learning. In: International Conference on Machine Learning, pp. 7115–7123. PMLR (2019)

  7. Oreshkin, B.N., Rodriguez, P., Lacoste, A.: Tadam: task dependent adaptive metric for improved few-shot learning. arXiv preprint arXiv:1805.10123 (2018)

  8. Ravi, S., Larochelle, H.: Optimization as a model for few-shot learning. In: 5th International Conference on Learning Representations, ICLR 2017—Conference Track Proceedings, pp. 1–11 (2017)

  9. Andrychowicz, M., Denil, M., Gomez, S., Hoffman, M.W., Pfau, D., Schaul, T., Shillingford, B., De Freitas, N.: Learning to learn by gradient descent by gradient descent. Adv. Neural Inf. Process. Syst. 29, 3981–3989 (2016)

    Google Scholar 

  10. Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. In: International Conference on Machine Learning, pp. 1126–1135. PMLR (2017)

  11. Li, Z., Zhou, F., Chen, F., Li, H.: Meta-sgd: learning to learn quickly for few-shot learning. arXiv preprint arXiv:1707.09835 (2017)

  12. Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 30 (2017)

  13. Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1199–1208 (2018)

  14. Li, A., Huang, W., Lan, X., Feng, J., Li, Z., Wang, L.: Boosting few-shot learning with adaptive margin loss. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12576–12584 (2020)

  15. Qiao, S., Liu, C., Shen, W., Yuille, A.L.: Few-shot image recognition by predicting parameters from activations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7229–7238 (2018)

  16. Hou, R., Chang, H., Ma, B., Shan, S., Chen, X.: Cross attention network for few-shot classification. Adv. Neural Inf. Process. Syst. 32 (2019)

  17. Sun, Q., Liu, Y., Chua, T.-S., Schiele, B.: Meta-transfer learning for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 403–412 (2019)

  18. Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al.: Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 29, 3630–3638 (2016)

    Google Scholar 

  19. Wang, Y., Chao, W.-L., Weinberger, K.Q., van der Maaten, L.: Simpleshot: revisiting nearest-neighbor classification for few-shot learning. arXiv preprint arXiv:1911.04623 (2019)

  20. Hui, B., Zhu, P., Hu, Q., Wang, Q.: Self-attention relation network for few-shot learning. In: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 198–203. IEEE (2019)

  21. Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. In: International Conference on Machine Learning, pp. 7354–7363. PMLR (2019)

  22. Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: Cutmix: regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6023–6032 (2019)

  23. Thrun, S., Pratt, L.: Learning to learn: introduction and overview. In: Thrun, S., Pratt, L. (eds.) Learning to Learn, pp. 3–17. Springer, Boston (1998)

    Chapter  MATH  Google Scholar 

  24. Rusu, A.A., Rao, D., Sygnowski, J., Vinyals, O., Pascanu, R., Osindero, S., Hadsell, R.: Meta-learning with latent embedding optimization. arXiv preprint arXiv:1807.05960 (2018)

  25. Nichol, A., Achiam, J., Schulman, J.: On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999 (2018)

  26. Cai, Q., Pan, Y., Yao, T., Yan, C., Mei, T.: Memory matching networks for one-shot image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4080–4088 (2018)

  27. Munkhdalai, T., Yu, H.: Meta networks. In: International Conference on Machine Learning, pp. 2554–2563. PMLR (2017)

  28. Zhou, D., Bousquet, O., Lal, T.N., Weston, J., Schölkopf, B.: Learning with local and global consistency. Adv. Neural Inf. Process. Syst. 16, 321–328 (2004)

    Google Scholar 

  29. Chen, W.-Y., Liu, Y.-C., Kira, Z., Wang, Y.-C.F., Huang, J.-B.: A closer look at few-shot classification. arXiv preprint arXiv:1904.04232 (2019)

  30. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)

  31. Verma, V., Lamb, A., Beckham, C., Najafi, A., Mitliagkas, I., Lopez-Paz, D., Bengio, Y.: Manifold mixup: better representations by interpolating hidden states. In: International Conference on Machine Learning, pp. 6438–6447. PMLR (2019)

  32. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)

  33. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)

  34. Park, J., Woo, S., Lee, J.-Y., Kweon, I.S.: Bam: bottleneck attention module. arXiv preprint arXiv:1807.06514 (2018)

  35. Zhou, Q., Zhang, X., Zhang, Y.-D.: Ensemble learning with attention-based multiple instance pooling for classification of SPT. IEEE Trans. Circuits Syst. II Express Br. 69(3), 1927–1931 (2021)

    Google Scholar 

  36. Wang, S.-H., Fernandes, S., Zhu, Z., Zhang, Y.-D.: AVNC: attention-based VGG-style network for COVID-19 diagnosis by CBAM. IEEE Sens. J. 22(18), 17431–17438 (2021)

    Article  Google Scholar 

  37. Zhang, X., Qiang, Y., Sung, F., Yang, Y., Hospedales, T.: Relationnet2: deep comparison network for few-shot learning. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2020)

  38. Mishra, N., Rohaninejad, M., Chen, X., Abbeel, P.: A simple neural attentive meta-learner. arXiv preprint arXiv:1707.03141 (2017)

  39. Lee, K., Maji, S., Ravichandran, A., Soatto, S.: Meta-learning with differentiable convex optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10657–10665 (2019)

  40. Gidaris, S., Komodakis, N.: Generating classification weights with gnn denoising autoencoders for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21–30 (2019)

  41. Ren, M., Triantafillou, E., Ravi, S., Snell, J., Swersky, K., Tenenbaum, J.B., Larochelle, H., Zemel, R.S.: Meta-learning for semi-supervised few-shot classification. arXiv preprint arXiv:1803.00676 (2018)

  42. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The caltech-ucsd birds-200-2011 dataset (2011)

  43. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)

    Article  MathSciNet  Google Scholar 

  44. Li, W., Wang, L., Xu, J., Huo, J., Gao, Y., Luo, J.: Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7260–7268 (2019)

  45. Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic differentiation in pytorch (2017)

  46. Ye, H.-J., Hu, H., Zhan, D.-C., Sha, F.: Few-shot learning via embedding adaptation with set-to-set functions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8808–8817 (2020)

  47. Tseng, H.-Y., Lee, H.-Y., Huang, J.-B., Yang, M.-H.: Cross-domain few-shot classification via learned feature-wise transformation. In: International Conference on Learning Representations (2020)

  48. Liu, B., Cao, Y., Lin, Y., Li, Q., Zhang, Z., Long, M., Hu, H.: Negative margin matters: understanding margin in few-shot classification. In: European Conference on Computer Vision, pp. 438–455 (2020). Springer

  49. Liu, C., Fu, Y., Xu, C., Yang, S., Li, J., Wang, C., Zhang, L.: Learning a few-shot embedding model with contrastive learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 8635–8643 (2021)

  50. Ziko, I., Dolz, J., Granger, E., Ayed, I.B.: Laplacian regularized few-shot learning. In: International Conference on Machine Learning, pp. 11660–11670. PMLR (2020)

Download references

Funding

This work was supported by the National Science Foundation of Shanghai (No.22ZR1443700).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yongxiong Wang.

Ethics declarations

Conflict of interest

The authors declares that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: Funding information updated.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, G., Wang, Y. Self-attention network for few-shot learning based on nearest-neighbor algorithm. Machine Vision and Applications 34, 28 (2023). https://doi.org/10.1007/s00138-023-01375-5

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00138-023-01375-5

Keywords

Navigation