Embedded adaptive cross-modulation neural network for few-shot learning

Wang, Peng; Cheng, Jun; Hao, Fusheng; Wang, Lei; Feng, Wei

doi:10.1007/s00521-019-04605-y

ATCI 2019
Published: 16 November 2019

Volume 32, pages 5505–5515, (2020)
Neural Computing and Applications

Peng Wang^1,2,3,
Jun Cheng ORCID: orcid.org/0000-0002-3131-3275^1,2,3,
Fusheng Hao^1,2,3,
Lei Wang^1,2,3 &
Wei Feng^1,2,3

Abstract

Although deep neural networks have achieved great success in several machine learning scenarios, they still struggle when only small training datasets are available. Few-shot learning aims to learn from a few labeled examples. However, limited training samples and weakly distinguishable embedding vectors in the metric space often lead to unsatisfactory test results, and directly calculating the distance between tensors can cause ambiguity. This paper proposes an embedded adaptive cross-modulation (EACM) method for few-shot learning that combines information from the support and query examples. Specifically, an adaptive cosine metric module enhances the inter-class categorizability of the support set prototype representations to improve few-shot recognition accuracy. A cross-modulation module performs learning at multiple levels of abstraction along the prediction pipeline, so that the support set and query set features enhance each other, which improves the generalization ability and robustness of image recognition. We then combine these two modules through a weight balance scalar to determine the task-related metric space and construct a joint loss function. Theoretical analysis demonstrates the generalization ability of EACM. We conduct comprehensive experiments on the mini-ImageNet and CUB datasets. Experimental results show that our approach achieves state-of-the-art performance by significant margins.
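The abstract names three ingredients: a cosine metric over support-set prototypes, cross-modulation between support and query features, and a weight balance scalar that joins two losses. The following is a minimal NumPy sketch of how such pieces could fit together. It is not the authors' implementation. The FiLM-style scale-and-shift modulation, the fixed `scale` temperature, the use of set means as conditioning context, and every function and parameter name (`class_prototypes`, `film_modulate`, `w_gamma`, `w_beta`, `weight`) are assumptions made for illustration only.

```python
import numpy as np

def class_prototypes(support, labels, n_way):
    """Mean embedding of each class; returns shape (n_way, dim)."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def cosine_logits(queries, protos, scale=10.0):
    """Scaled cosine similarity between queries and prototypes.
    `scale` is a fixed temperature here (assumption; it could be learned)."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    return scale * (q @ p.T)

def film_modulate(features, context, w_gamma, w_beta):
    """FiLM-style scale-and-shift of `features` conditioned on `context`.
    Stand-in for cross-modulation (assumption, not the paper's layer)."""
    gamma = 1.0 + context @ w_gamma  # per-channel scale, shape (dim,)
    beta = context @ w_beta          # per-channel shift, shape (dim,)
    return gamma * features + beta

def cross_entropy(logits, targets):
    """Mean softmax cross-entropy with a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def joint_loss(support, labels, queries, targets, n_way,
               w_gamma, w_beta, weight=0.5, scale=10.0):
    """Weighted sum of a cosine-metric loss and a cross-modulated loss.
    `weight` plays the role of a weight balance scalar (assumption)."""
    protos = class_prototypes(support, labels, n_way)
    loss_cos = cross_entropy(cosine_logits(queries, protos, scale), targets)

    # Cross-modulation: each set is conditioned on the mean of the other.
    s_mod = film_modulate(support, queries.mean(axis=0), w_gamma, w_beta)
    q_mod = film_modulate(queries, support.mean(axis=0), w_gamma, w_beta)
    protos_mod = class_prototypes(s_mod, labels, n_way)
    loss_mod = cross_entropy(cosine_logits(q_mod, protos_mod, scale), targets)

    return weight * loss_cos + (1.0 - weight) * loss_mod
```

For a 2-way, 2-shot episode with 2-dimensional embeddings, `joint_loss(support, labels, queries, targets, 2, w_gamma, w_beta)` takes `support` of shape (4, 2), integer `labels` in {0, 1}, and `w_gamma`, `w_beta` of shape (2, 2). With both weight matrices set to zero the modulation is the identity, so the two loss terms coincide; this makes a convenient sanity check.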

Acknowledgements

Funding was provided by the National Natural Science Foundation of China (Grant Nos. U1713213, 61772508), the National Key R&D Program of China (Grant No. 2018YFB1308000), the Key Research and Development Program of Guangdong Province (Grant No. 2019B090915001), the Shenzhen Technology Project (Grant Nos. JCYJ20180507182610734, JCYJ20170413152535587), and the CAS Key Technology Talent Program.

Author information

Authors and Affiliations

CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Peng Wang, Jun Cheng, Fusheng Hao, Lei Wang & Wei Feng
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Beijing, China
Peng Wang, Jun Cheng, Fusheng Hao, Lei Wang & Wei Feng
The Chinese University of Hong Kong, Hong Kong, China
Peng Wang, Jun Cheng, Fusheng Hao, Lei Wang & Wei Feng

Corresponding author

Correspondence to Jun Cheng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, P., Cheng, J., Hao, F. et al. Embedded adaptive cross-modulation neural network for few-shot learning. Neural Comput & Applic 32, 5505–5515 (2020). https://doi.org/10.1007/s00521-019-04605-y

Received: 23 July 2019
Accepted: 08 November 2019
Published: 16 November 2019
Issue Date: May 2020
DOI: https://doi.org/10.1007/s00521-019-04605-y
