
Terminology-Enriched Meta-curriculum Learning for Domain Neural Machine Translation

  • Conference paper

Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14089)


Abstract

As a data-driven approach, neural machine translation (NMT) depends heavily on large parallel corpora. Such corpora, however, are frequently unavailable in the domains users care about, which degrades the domain robustness and domain adaptability of NMT models. To tackle this challenge, this paper presents a training method for multi-domain translation that leverages meta-curriculum learning and terminology information. To exploit domain-specific terminology, the proposed method first extracts, aligns, and filters terms, integrating the resulting bilingual terminology into the training data. The aligned sentences are then sorted, in a curriculum-learning manner, by their domain-similarity scores with respect to the general domain, and the training data is divided into sub-datasets in ascending order of difficulty. A meta-learning technique then treats these sub-datasets as tasks and trains the model on them, ultimately yielding a translation model with strong domain robustness and domain adaptability. Experimental results on test data from both seen and unseen domains show that the proposed method improves BLEU by an average of 2.44 points across multiple domain test sets over the pre-training and fine-tuning method, and by 1.54 points over meta-curriculum learning without terminology information. After fine-tuning on a small amount of target-domain data, the proposed method outperforms these two baselines by 2.62 and 1.5 BLEU points, respectively. These results underscore the efficacy of the proposed method in improving NMT performance when domain-specific data is limited.
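
To make the pipeline concrete, the following is a minimal sketch of the curriculum-partitioning and meta-training steps described above. It is an illustration, not the authors' implementation: it assumes sentence pairs are represented by precomputed embeddings (e.g., from a sentence encoder), scores difficulty as cosine similarity to a general-domain centroid, substitutes a toy quadratic loss for the NMT objective, and uses a first-order Reptile-style update in place of the paper's exact meta-learning procedure; the terminology-integration step is omitted, and all function names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def grad_loss(theta, batch):
    """Gradient of a toy quadratic loss ||theta - mean(batch)||^2.
    A real system would backpropagate through an NMT model instead."""
    return 2.0 * (theta - batch.mean(axis=0))


def difficulty_scores(embeddings, general_centroid):
    """Cosine similarity of each sentence-pair embedding to the
    general-domain centroid; lower similarity = harder example."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(general_centroid)
    return embeddings @ general_centroid / norms


def curriculum_split(embeddings, scores, num_stages):
    """Sort examples from easiest (most general-domain-like) to hardest
    and slice them into curriculum stages of roughly equal size."""
    order = np.argsort(-scores)  # descending similarity = ascending difficulty
    return np.array_split(embeddings[order], num_stages)


def meta_train(theta, stages, inner_lr=0.1, meta_lr=0.5, inner_steps=3):
    """First-order meta-update (Reptile-style): adapt a copy of the
    parameters on each curriculum stage (treated as a task), then move
    the meta-parameters toward the adapted ones."""
    for stage in stages:  # visited easy -> hard, in curriculum order
        phi = theta.copy()
        for _ in range(inner_steps):
            phi -= inner_lr * grad_loss(phi, stage)  # task-specific adaptation
        theta = theta + meta_lr * (phi - theta)      # meta step
    return theta


# Usage: 200 sentence-pair embeddings in R^8, split into 4 curriculum stages.
embeddings = rng.normal(size=(200, 8))
general_centroid = embeddings[:50].mean(axis=0)  # proxy for the general domain
scores = difficulty_scores(embeddings, general_centroid)
stages = curriculum_split(embeddings, scores, num_stages=4)
theta = meta_train(np.zeros(8), stages)
```

Sorting by descending similarity makes the earliest stages look most like the general domain, matching the ascending-difficulty schedule the abstract describes; swapping the toy `grad_loss` for real NMT gradients and the inner loop for optimizer steps would recover a MAML-style scheme.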



Acknowledgements

This work is supported by the Natural Science Foundation of Sichuan Province (2022NSFSC0503) and the Sichuan Science and Technology Program (2022ZHCG0007).

Author information


Corresponding author

Correspondence to Zheng Chen.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, Z., Wang, Y. (2023). Terminology-Enriched Meta-curriculum Learning for Domain Neural Machine Translation. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol. 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_32

  • DOI: https://doi.org/10.1007/978-981-99-4752-2_32

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4751-5

  • Online ISBN: 978-981-99-4752-2

  • eBook Packages: Computer Science (R0)
