Abstract
Neural machine translation (NMT) systems fall short when training data is insufficient. For low-resource domain adaptation, meta-learning has proven to be an effective training scheme: it seeks an optimal initialization that adapts easily to new domains. However, standard meta-learning assumes that all samples contribute equally within a task and that all tasks contribute equally to the training task distribution, an assumption that degrades the performance of the meta-model. In the inner loop, we propose a dynamic tuning strategy that distinguishes tasks' adaptation abilities and weights the loss by sample representativeness, so that tasks drawn from the same domain can be discriminated. In the outer loop, to measure the effect of each task on the meta-parameters, we compute an uncertainty-aware confidence and weight the meta-update steps accordingly. Experiments show that the proposed approaches yield stable improvements in all domains (up to +1.35 BLEU points). We also analyze the ability of our strategies to alleviate domain imbalance in non-ideal settings.
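To make the two-level structure concrete, here is a minimal PyTorch sketch of confidence-weighted meta-updates on a toy regression problem. It illustrates the general scheme the abstract describes, not the authors' implementation: the linear model, loss_fn, adapt, mc_confidence (a weight-perturbation stand-in for Monte-Carlo-dropout confidence), and the normalized task weights are all hypothetical placeholders.

```python
import torch

torch.manual_seed(0)

def loss_fn(w, x, y):
    # Squared error of a linear model; stands in for the per-task NMT loss.
    return ((x @ w - y) ** 2).mean()

def adapt(w, support, inner_lr=0.1, steps=3):
    # Inner loop: a few gradient steps on the task's support set.
    # create_graph=True lets the outer update differentiate through them.
    for _ in range(steps):
        (g,) = torch.autograd.grad(loss_fn(w, *support), w, create_graph=True)
        w = w - inner_lr * g
    return w

def mc_confidence(w, x, y, n_samples=8, noise=0.1):
    # Hypothetical uncertainty-aware confidence: variance of the query loss
    # under random weight perturbations (a crude stand-in for MC dropout).
    # Low variance -> high confidence -> larger meta-update weight.
    losses = torch.stack([
        loss_fn(w + noise * torch.randn_like(w), x, y).detach()
        for _ in range(n_samples)
    ])
    return 1.0 / (1.0 + losses.var())

def make_task(dim=4, n=16):
    # A synthetic regression "task" with support and query splits.
    true_w = torch.randn(dim)
    xs, xq = torch.randn(n, dim), torch.randn(n, dim)
    return {"support": (xs, xs @ true_w), "query": (xq, xq @ true_w)}

meta_w = torch.zeros(4, requires_grad=True)  # meta-parameters (the shared initialization)
opt = torch.optim.SGD([meta_w], lr=0.05)

for step in range(100):
    tasks = [make_task() for _ in range(4)]
    opt.zero_grad()
    q_losses, confs = [], []
    for t in tasks:
        w_adapted = adapt(meta_w, t["support"])
        q_losses.append(loss_fn(w_adapted, *t["query"]))
        confs.append(mc_confidence(w_adapted, *t["query"]))
    # Outer loop: normalize confidences into task weights and take a
    # weighted meta-update instead of averaging tasks uniformly.
    weights = torch.stack(confs)
    weights = weights / weights.sum()
    meta_loss = (weights * torch.stack(q_losses)).sum()
    meta_loss.backward()
    opt.step()
```

In the paper's setting, loss_fn would be the NMT objective on a domain's support/query split, and the inner loop would additionally carry the proposed per-sample representativeness weights; this sketch omits both.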
Acknowledgement
This research was funded by the National Natural Science Foundation of China (Grant Nos. 61772337 and U1736207).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Song, Z., Ma, Z., Qi, K., Liu, G. (2021). Dynamic Tuning and Weighting of Meta-learning for NMT Domain Adaptation. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol. 12895. Springer, Cham. https://doi.org/10.1007/978-3-030-86383-8_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86382-1
Online ISBN: 978-3-030-86383-8
eBook Packages: Computer Science (R0)