
Model-agnostic multi-stage loss optimization meta learning

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Model-Agnostic Meta-Learning (MAML) has become the most representative meta-learning algorithm for solving few-shot learning problems. This paper focuses on the MAML framework and on the key problem of solving few-shot learning through meta-learning. However, MAML is sensitive to the choice of base model in the inner loop, and training instability arises during optimization, which increases the difficulty of training and validation and degrades model performance. To address these problems, we propose a multi-stage loss optimization meta-learning algorithm. By introducing a learning mechanism for the inner and outer loops, it improves training stability, accelerates convergence, and enhances the generalization ability of MAML.
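To make the inner/outer-loop structure concrete, below is a minimal sketch in JAX of one way to realize a multi-stage loss: the outer (meta) loss is a weighted sum of query-set losses evaluated after every inner adaptation step, rather than after the final step only. The toy sine-regression task, the two-layer network, and the values of inner_lr, inner_steps, and stage_weights are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch: MAML with a multi-stage (per-inner-step) outer loss.
# Assumptions: toy 1-D sine regression, a tiny MLP, uniform stage weights.
import jax
import jax.numpy as jnp

def init_params(key):
    # A small two-layer MLP for 1-D regression.
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (1, 40)) * 0.1, "b1": jnp.zeros(40),
        "w2": jax.random.normal(k2, (40, 1)) * 0.1, "b2": jnp.zeros(1),
    }

def forward(params, x):
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

def mse(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

def meta_loss(params, task, inner_lr=0.01, inner_steps=5, stage_weights=None):
    """Outer loss = weighted sum of query losses taken after EVERY inner
    gradient step (multi-stage), not only after the last step."""
    x_s, y_s, x_q, y_q = task
    if stage_weights is None:
        stage_weights = jnp.ones(inner_steps) / inner_steps
    adapted, total = params, 0.0
    for step in range(inner_steps):
        # Inner loop: one SGD step on the support set.
        grads = jax.grad(mse)(adapted, x_s, y_s)
        adapted = jax.tree_util.tree_map(
            lambda p, g: p - inner_lr * g, adapted, grads)
        # Stage loss: evaluate the partially adapted model on the query set.
        total = total + stage_weights[step] * mse(adapted, x_q, y_q)
    return total

# Usage on one hypothetical sine-regression task.
key = jax.random.PRNGKey(0)
k_init, k_s, k_q = jax.random.split(key, 3)
params = init_params(k_init)
amp, phase = 2.0, 0.5
x_s = jax.random.uniform(k_s, (10, 1), minval=-5.0, maxval=5.0)
x_q = jax.random.uniform(k_q, (10, 1), minval=-5.0, maxval=5.0)
task = (x_s, amp * jnp.sin(x_s + phase), x_q, amp * jnp.sin(x_q + phase))

# Outer loop: differentiate through the inner adaptation and update.
outer_grads = jax.grad(meta_loss)(params, task)
params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g, params, outer_grads)
```

Weighting the query loss at every stage gives the outer optimizer a gradient signal through each inner step, which is one plausible reading of how a multi-stage loss can stabilize training compared with back-propagating through the final-step loss alone.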





Acknowledgements

This work is supported by the Fundamental Research Funds for the Central Universities under Grant B200202205; the Key Research and Development Program of Jiangsu under Grants BE2017071, BE2017647, BE2018004-04, and BK20192004; the National Natural Science Foundation of China under Grants 61501170, 41876097, 61401148, 61471157, and 61671202; the Open Research Fund of the State Key Laboratory of Bioelectronics, Southeast University, under Grant 2019005; the State Key Laboratory of Integrated Management of Pest Insects and Rodents under Grant IPM1914; and the Shenzhen Science and Technology Plan Project under Grant JSGG20180507183020876.

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Xiao Yao or Guanying Huo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yao, X., Zhu, J., Huo, G. et al. Model-agnostic multi-stage loss optimization meta learning. Int. J. Mach. Learn. & Cyber. 12, 2349–2363 (2021). https://doi.org/10.1007/s13042-021-01316-6


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-021-01316-6

