Spatial Attention Network for Few-Shot Learning

He, Xianhao; Qiao, Peng; Dou, Yong; Niu, Xin

doi:10.1007/978-3-030-30484-3_46

Xianhao He¹²,
Peng Qiao¹²,
Yong Dou¹² &
…
Xin Niu¹²

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 11728))

Included in the following conference series:

International Conference on Artificial Neural Networks

4057 Accesses
1 Citations

Abstract

Metric learning is one of the feasible approaches to few-shot learning. However, most metric learning methods encode images through CNN directly, without considering image contents. The general CNN features may lead to hard discrimination among distinct classes. Based on observation that feature maps correspond to image regions, we assume that image regions relevant to target objects should be salient in image features. To this end, we propose an effective framework, called Spatial Attention Network (SAN), to exploit spatial context of images. SAN produces attention weights on clustered regional features indicating the contributions of different regions to classification, and takes weighted sum of regional features as discriminative features. Thus, SAN highlights important contents by giving them large weights. Once trained, SAN compares unlabeled data with class prototypes of few labeled data in nearest-neighbor manner and identifies classes of unlabeled data. We evaluate our approach on three disparate datasets: miniImageNet, Caltech-UCSD Birds and miniDogsNet. Experimental results show that when compared with state-of-the-art models, SAN achieves competitive accuracy in miniImageNet and Caltech-UCSD Birds, and it improves 5-shot accuracy in miniDogsNet by a large margin.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Pytorch. https://github.com/pytorch/pytorch
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, June 2009. https://doi.org/10.1109/cvprw.2009.5206848
Finn, C., Abbeel, P., Levine, S.: Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400 (2017)
Fu, J., Zheng, H., Mei, T.: Look closer to see better: recurrent attention convolutional neural network for fine-grained image recognition. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), p. 3. IEEE, July 2017. https://doi.org/10.1109/cvpr.2017.476
Hara, K., Liu, M.Y., Tuzel, O., Farahmand, A.m.: Attentional network for visual object detection. arXiv preprint arXiv:1702.01478 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE, June 2016. https://doi.org/10.1109/cvpr.2016.90
Hilliard, N., Phillips, L., Howland, S., Yankov, A., Corley, C.D., Hodas, N.O.: Few-shot learning with metric-agnostic conditional embeddings. arXiv preprint arXiv:1802.04376 (2018)
Ji, Z., Fu, Y., Guo, J., Pang, Y., Zhang, Z.M., et al.: Stacked semantics-guided attention model for fine-grained zero-shot learning. In: Advances in Neural Information Processing Systems, pp. 5998–6007 (2018)
Google Scholar
Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop, vol. 2 (2015)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Google Scholar
Li, Z., Zhou, F., Chen, F., Li, H.: Meta-SGD: learning to learn quickly for few shot learning. arXiv preprint arXiv:1707.09835 (2017)
Luo, W., Li, Y., Urtasun, R., Zemel, R.: Understanding the effective receptive field in deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 4898–4906 (2016)
Google Scholar
Maaten, L.v.d., Hinton, G.J.: Visualizing data using T-SNE. Mach. Learn. Res. 9, 2579–2605 (2008)
Google Scholar
Mathe, S., Pirinen, A., Sminchisescu, C.: Reinforcement learning for visual object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2894–2902. IEEE, June 2016. https://doi.org/10.1109/cvpr.2016.316
Ravi, S., Larochelle, H.: Optimization as a model for few-shot learning. In: International Conference on Learning Representations (2017)
Google Scholar
Rippel, O., Paluri, M., Dollar, P., Bourdev, L.: Metric learning with adaptive density discrimination. arXiv preprint arXiv:1511.05939 (2015)
Schwartz, E., et al.: Delta-encoder: an effective sample synthesis method for few-shot object recognition. In: Advances in Neural Information Processing Systems, pp. 2850–2860 (2018)
Google Scholar
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Snell, J., Swersky, K., Zemel, R.: Prototypical networks for few-shot learning. In: Advances in Neural Information Processing Systems, pp. 4077–4087 (2017)
Google Scholar
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
MathSciNet MATH Google Scholar
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P.H., Hospedales, T.M.: Learning to compare: relation network for few-shot learning. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1199–1208. IEEE, June 2018. https://doi.org/10.1109/cvpr.2018.00131
Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al.: Matching networks for one shot learning. In: Advances in Neural Information Processing Systems, pp. 3630–3638 (2016)
Google Scholar
Xiao, T., Xu, Y., Yang, K., Zhang, J., Peng, Y., Zhang, Z.: The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 842–850. IEEE, June 2015. https://doi.org/10.1109/cvpr.2015.7298685
Xu, K., et al.: Show, attend and tell: neural image caption generation with visual attention. In: International Conference on Machine Learning, pp. 2048–2057 (2015)
Google Scholar
Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks? In: Advances in Neural Information Processing Systems, pp. 3320–3328 (2014)
Google Scholar
You, Q., Jin, H., Wang, Z., Fang, C., Luo, J.: Image captioning with semantic attention. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4651–4659. IEEE, June 2016. https://doi.org/10.1109/cvpr.2016.503
Zheng, H., Fu, J., Mei, T., Luo, J.: Learning multi-attention convolutional neural network for fine-grained image recognition. In: 2017 IEEE International Conference on Computer Vision (ICCV). IEEE, October 2017. https://doi.org/10.1109/iccv.2017.557
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2921–2929. IEEE, June 2016. https://doi.org/10.1109/cvpr.2016.319

Download references

Acknowledgements

This paper is supported by the National Key Research and Development Program of China (Grant No. 2018YFB1003405) and the National Natural Science Foundation of China (Grant No. 61732018).

Author information

Authors and Affiliations

National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
Xianhao He, Peng Qiao, Yong Dou & Xin Niu

Authors

Xianhao He
View author publications
You can also search for this author in PubMed Google Scholar
Peng Qiao
View author publications
You can also search for this author in PubMed Google Scholar
Yong Dou
View author publications
You can also search for this author in PubMed Google Scholar
Xin Niu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xianhao He .

Editor information

Editors and Affiliations

Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Igor V. Tetko
Institute of Computer Science, Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Pavel Karpov
Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH), Neuherberg, Germany
Fabian Theis

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

He, X., Qiao, P., Dou, Y., Niu, X. (2019). Spatial Attention Network for Few-Shot Learning. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science(), vol 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_46

Download citation

DOI: https://doi.org/10.1007/978-3-030-30484-3_46
Published: 09 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30483-6
Online ISBN: 978-3-030-30484-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics