Abstract
As a data-driven approach, neural machine translation (NMT) depends heavily on large parallel corpora. Such corpora, however, are frequently unavailable in the domains users care about, which degrades NMT performance in terms of both domain robustness and domain adaptability. To tackle this challenge, this paper presents a training method for multi-domain translation that leverages meta-curriculum learning and terminology information. To exploit domain-specific terminology, the proposed method first extracts, aligns, and filters terms, integrating the resulting bilingual terminology into the training data. The aligned sentences are then sorted by their domain-similarity scores with respect to the general domain, in a curriculum learning manner, and the training data is divided into sub-datasets in ascending order of difficulty. A meta-learning technique is then employed to train the model, treating the partitioned sub-datasets as tasks, ultimately yielding a translation model with strong domain robustness and domain adaptability. Experimental results on test data from both seen and unseen domains show that the proposed method improves BLEU by an average of 2.44 points across multiple domains over the pre-training and fine-tuning baseline, and by 1.54 points over meta-curriculum learning without terminology information. After fine-tuning on a small amount of target-domain data, the proposed method outperforms these two baselines by 2.62 and 1.5 BLEU points, respectively. These results underscore the efficacy of the proposed method in improving NMT performance when domain-specific data is limited.
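The training pipeline the abstract describes — scoring sentence pairs by similarity to the general domain, partitioning the sorted corpus into increasingly difficult sub-datasets, and meta-training over those sub-datasets as tasks — can be sketched as follows. This is an illustrative sketch only: the toy corpus, the made-up similarity scores, and the scalar stand-in for the model are all assumptions, not the paper's implementation. A real system would derive similarity scores from sentence embeddings and apply a first-order MAML-style update to a Transformer's parameters.

```python
# Toy corpus of (sentence pair, similarity-to-general-domain score).
# Scores here are invented for illustration; in practice they would come
# from a domain-similarity measure such as embedding cosine similarity.
corpus = [
    ("src1 ||| tgt1", 0.95),
    ("src2 ||| tgt2", 0.70),
    ("src3 ||| tgt3", 0.30),
    ("src4 ||| tgt4", 0.90),
    ("src5 ||| tgt5", 0.60),
    ("src6 ||| tgt6", 0.40),
]

# 1) Curriculum ordering: sentences most similar to the general domain
#    (the "easiest") come first, so sort by descending similarity.
corpus.sort(key=lambda pair: -pair[1])

# 2) Partition into K sub-datasets of ascending difficulty; each
#    partition is treated as one meta-learning task.
K = 3
n = len(corpus)
tasks = [corpus[i * n // K : (i + 1) * n // K] for i in range(K)]

# 3) First-order MAML-style meta-training loop, shown on a scalar
#    "model" so the loop structure is runnable without a DL framework.
def loss(theta, batch):
    # Stand-in loss: squared distance of theta from the batch's mean score.
    target = sum(score for _, score in batch) / len(batch)
    return (theta - target) ** 2

def grad(theta, batch):
    target = sum(score for _, score in batch) / len(batch)
    return 2 * (theta - target)

theta = 0.0                      # stand-in for model parameters
inner_lr, meta_lr = 0.1, 0.05
for step in range(100):
    meta_grad = 0.0
    for task in tasks:                                   # easy -> hard
        theta_i = theta - inner_lr * grad(theta, task)   # inner (task) update
        meta_grad += grad(theta_i, task)                 # first-order outer gradient
    theta -= meta_lr * meta_grad / len(tasks)            # meta update
```

The key design point is that the meta update aggregates gradients taken *after* each task-specific inner step, so the resulting parameters are a good initialization for fast adaptation to any one task — which is what enables the quick fine-tuning on a small amount of target-domain data reported in the abstract.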
Acknowledgements
This work is supported by the Natural Science Foundation of Sichuan Province (2022NSFSC0503), and Sichuan Science and Technology Program (2022ZHCG0007).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Chen, Z., Wang, Y. (2023). Terminology-Enriched Meta-curriculum Learning for Domain Neural Machine Translation. In: Huang, D.S., Premaratne, P., Jin, B., Qu, B., Jo, K.H., Hussain, A. (eds.) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol. 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_32
Print ISBN: 978-981-99-4751-5
Online ISBN: 978-981-99-4752-2