Dynamic Tuning and Weighting of Meta-learning for NMT Domain Adaptation

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)
  • Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12895)

Abstract

Neural machine translation (NMT) systems fall short when training data is insufficient. For low-resource domain adaptation, meta-learning has proven to be an effective training scheme: it seeks an optimal initialization that can be adapted to new domains with little data. However, existing approaches assume that all samples within a task, and all tasks within the training task distribution, contribute equally, which degrades the performance of the meta-model. In the inner loop, we propose a dynamic tuning strategy that distinguishes tasks' adapting abilities and weights each sample's loss by its representativeness, so that tasks drawn from the same domain can be discriminated. In the outer loop, to measure the effect of each task on the meta-parameters, we compute uncertainty-aware confidence scores and use them to weight the meta-updating steps. Experiments show that the proposed approaches yield consistent improvements in all domains (up to +1.35 BLEU points). We also analyze how well our strategies alleviate domain imbalance in non-ideal settings.
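
A rough illustration of the scheme the abstract describes (per-sample loss weighting in the inner loop, uncertainty-aware task weighting in the outer loop) is sketched below on a toy classifier. This is a minimal sketch under stated assumptions, not the paper's implementation: the Reptile-style first-order meta-update, the Monte-Carlo-dropout confidence estimate, the uniform sample weights, and all function names and hyperparameters are illustrative stand-ins for the authors' NMT setting.

```python
# Illustrative sketch only: a first-order (Reptile-style) weighted meta-update,
# not the paper's exact MAML formulation or its NMT models.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def inner_adapt(meta_model, support_x, support_y, sample_weights, lr=1e-2, steps=3):
    """Inner loop: adapt a copy of the meta-model to one task, weighting each
    sample's loss (the weights stand in for 'representativeness' scores)."""
    model = copy.deepcopy(meta_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        per_sample = F.cross_entropy(model(support_x), support_y, reduction="none")
        loss = (sample_weights * per_sample).sum() / sample_weights.sum()
        loss.backward()
        opt.step()
    return model


def task_confidence(model, query_x, query_y, n_passes=5):
    """Uncertainty-aware confidence via Monte-Carlo dropout (an assumption):
    lower variance of the query loss across stochastic passes -> higher confidence."""
    model.train()  # keep dropout active for stochastic forward passes
    with torch.no_grad():
        losses = torch.stack(
            [F.cross_entropy(model(query_x), query_y) for _ in range(n_passes)]
        )
    return 1.0 / (1.0 + losses.var().item())


def meta_step(meta_model, tasks, meta_lr=0.1):
    """Outer loop: move the meta-parameters toward each task's adapted parameters,
    weighting every task by its confidence (first-order, Reptile-style update)."""
    adapted, conf = [], []
    for support_x, support_y, query_x, query_y, sample_weights in tasks:
        model = inner_adapt(meta_model, support_x, support_y, sample_weights)
        adapted.append(model)
        conf.append(task_confidence(model, query_x, query_y))
    weights = torch.tensor(conf)
    weights = weights / weights.sum()  # normalize task weights
    with torch.no_grad():
        for i, p in enumerate(meta_model.parameters()):
            delta = sum(
                w * (list(m.parameters())[i] - p) for w, m in zip(weights, adapted)
            )
            p.add_(meta_lr * delta)


# Toy usage: two synthetic "domains", each a (support, query, sample-weight) task.
def make_task(n=20, dim=8):
    x = torch.randn(n, dim)
    y = (x[:, 0] > 0).long()
    half = n // 2
    return x[:half], y[:half], x[half:], y[half:], torch.ones(half)


meta_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.1), nn.Linear(16, 2))
meta_step(meta_model, [make_task(), make_task()])
```

The sketch only shows where the two weighting mechanisms enter the inner and outer loops; in the paper, the sample weights come from representativeness within each task and the task weights from uncertainty-aware confidence, rather than from the fixed heuristics used here.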

Notes

  1. http://opus.nlpl.eu.
  2. https://github.com/google/sentencepiece.
  3. https://github.com/THUNLP-MT/THUMT.
  4. https://github.com/kpu/kenlm.
  5. https://tfhub.dev/google/universal-sentence-encoder-xling/en-de/1.

Acknowledgement

This research work was funded by the National Natural Science Foundation of China (Grant Nos. 61772337 and U1736207).

Author information

Correspondence to Kaiyue Qi or Gongshen Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Song, Z., Ma, Z., Qi, K., Liu, G. (2021). Dynamic Tuning and Weighting of Meta-learning for NMT Domain Adaptation. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science, vol. 12895. Springer, Cham. https://doi.org/10.1007/978-3-030-86383-8_46

  • DOI: https://doi.org/10.1007/978-3-030-86383-8_46

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86382-1

  • Online ISBN: 978-3-030-86383-8
