Abstract
Model-Agnostic Meta-Learning (MAML) has become the most representative meta-learning algorithm for few-shot learning problems. This paper examines the MAML framework, focusing on the key problem of solving few-shot learning through meta-learning. However, MAML is sensitive to the base model used in the inner loop, and instability occurs during training. This makes the model harder to train and validate and degrades its performance. To address these problems, we propose a multi-stage loss optimization meta-learning algorithm. Through a learning mechanism for the inner and outer loops, the algorithm improves training stability and accelerates convergence of the model, and it enhances the generalization ability of MAML.
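To make the inner and outer loops concrete, the following is a minimal sketch of a MAML-style meta-gradient on a toy problem, not the authors' implementation. It assumes a scalar model y = w * x, tasks of the form y = a * x, and a squared-error loss, so the second-order term can be written by hand. The abstract does not define the multi-stage loss; the per-step weights in `step_weights`, which sum the query loss after each inner step, are an assumption made purely for illustration. All function names and hyperparameter values are hypothetical.

```python
import random

def loss(w, xs, ys):
    # Mean squared error of the scalar model y_hat = w * x.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w, xs, ys):
    # d(loss)/dw for the scalar model.
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def hessian(xs):
    # d^2(loss)/dw^2; constant in w because the loss is quadratic.
    return sum(2.0 * x * x for x in xs) / len(xs)

def sample_task(rng, n=5):
    # Hypothetical task family: y = a * x with a random slope a.
    a = rng.uniform(-2.0, 2.0)
    xs_s = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xs_q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    support = (xs_s, [a * x for x in xs_s])
    query = (xs_q, [a * x for x in xs_q])
    return support, query

def meta_gradient(w, support, query, alpha, step_weights):
    # Inner loop: one gradient step on the support set per entry of step_weights.
    # The query loss after each step is added with its weight (assumed scheme).
    xs_s, ys_s = support
    xs_q, ys_q = query
    h = hessian(xs_s)
    w_k, dwk_dw = w, 1.0
    meta_loss, meta_grad = 0.0, 0.0
    for v in step_weights:
        w_k = w_k - alpha * grad(w_k, xs_s, ys_s)
        dwk_dw *= 1.0 - alpha * h  # chain rule through the inner update
        meta_loss += v * loss(w_k, xs_q, ys_q)
        meta_grad += v * grad(w_k, xs_q, ys_q) * dwk_dw
    return meta_loss, meta_grad

def meta_train(steps=200, alpha=0.1, beta=0.05,
               step_weights=(0.2, 0.3, 0.5), seed=0):
    # Outer loop: update the shared initialization w with the meta-gradient.
    rng = random.Random(seed)
    w = 1.5
    for _ in range(steps):
        support, query = sample_task(rng)
        _, g = meta_gradient(w, support, query, alpha, step_weights)
        w -= beta * g
    return w
```

Calling `meta_train()` returns the meta-learned initialization for this toy task family. Setting `step_weights=(1.0,)` gives a single inner step with the query loss taken only after it, which corresponds to the basic one-step MAML objective in this toy setting.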
Acknowledgements
This work is supported by the Fundamental Research Funds for the Central Universities under Grant B200202205; the Key Research and Development Program of Jiangsu under Grants BE2017071, BE2017647, BE2018004-04, and BK20192004; the National Natural Science Foundation of China under Grants 61501170, 41876097, 61401148, 61471157, and 61671202; the Open Research Fund of State Key Laboratory of Bioelectronics, Southeast University, under Grant 2019005; the State Key Laboratory of Integrated Management of Pest Insects and Rodents under Grant IPM1914; and the Shenzhen Science and Technology Plan Project (JSGG20180507183020876).
About this article
Cite this article
Yao, X., Zhu, J., Huo, G. et al. Model-agnostic multi-stage loss optimization meta learning. Int. J. Mach. Learn. & Cyber. 12, 2349–2363 (2021). https://doi.org/10.1007/s13042-021-01316-6
DOI: https://doi.org/10.1007/s13042-021-01316-6