
Terminology-Enriched Meta-curriculum Learning for Domain Neural Machine Translation

  • Conference paper

Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14089)


Abstract

As a data-driven approach, neural machine translation (NMT) depends heavily on large parallel corpora. Such corpora, however, are frequently unavailable in the domains users care about, which degrades the domain robustness and domain adaptability of NMT models. To tackle this challenge, this paper presents a training method for multi-domain translation that leverages meta-curriculum learning and terminology information. To exploit domain-specific terminology, the proposed method first extracts, aligns, and filters terms, integrating the resulting bilingual terminology into the training data. The aligned sentences are then sorted, in a curriculum-learning manner, by their domain-similarity scores with respect to the general domain, and the training data is divided into sub-datasets in ascending order of difficulty. A meta-learning technique then treats these sub-datasets as tasks and trains the model on them, ultimately yielding a translation model with strong domain robustness and domain adaptability. Experimental results on test data from both seen and unseen domains show that the proposed method improves BLEU by an average of 2.44 points across multiple domain test sets over the pre-training and fine-tuning method, and by 1.54 points over meta-curriculum learning without terminology information. After fine-tuning on a small amount of target-domain data, the proposed method outperforms these two baselines by 2.62 and 1.5 BLEU points, respectively. These results underscore the efficacy of the proposed method in improving NMT performance when domain-specific data is limited.
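
To make the pipeline concrete, the following is a minimal sketch of the curriculum-partitioning and meta-training steps described above. It is an illustration, not the authors' implementation: it assumes sentence pairs are represented by precomputed embeddings (e.g., from a sentence encoder), scores difficulty as cosine similarity to a general-domain centroid, substitutes a toy quadratic loss for the NMT objective, and uses a first-order Reptile-style update in place of the paper's exact meta-learning procedure; the terminology-integration step is omitted, and all function names and hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def grad_loss(theta, batch):
    """Gradient of a toy quadratic loss ||theta - mean(batch)||^2.
    A real system would backpropagate through an NMT model instead."""
    return 2.0 * (theta - batch.mean(axis=0))


def difficulty_scores(embeddings, general_centroid):
    """Cosine similarity of each sentence-pair embedding to the
    general-domain centroid; lower similarity = harder example."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(general_centroid)
    return embeddings @ general_centroid / norms


def curriculum_split(embeddings, scores, num_stages):
    """Sort examples from easiest (most general-domain-like) to hardest
    and slice them into curriculum stages of roughly equal size."""
    order = np.argsort(-scores)  # descending similarity = ascending difficulty
    return np.array_split(embeddings[order], num_stages)


def meta_train(theta, stages, inner_lr=0.1, meta_lr=0.5, inner_steps=3):
    """First-order meta-update (Reptile-style): adapt a copy of the
    parameters on each curriculum stage (treated as a task), then move
    the meta-parameters toward the adapted ones."""
    for stage in stages:  # visited easy -> hard, in curriculum order
        phi = theta.copy()
        for _ in range(inner_steps):
            phi -= inner_lr * grad_loss(phi, stage)  # task-specific adaptation
        theta = theta + meta_lr * (phi - theta)      # meta step
    return theta


# Usage: 200 sentence-pair embeddings in R^8, split into 4 curriculum stages.
embeddings = rng.normal(size=(200, 8))
general_centroid = embeddings[:50].mean(axis=0)  # proxy for the general domain
scores = difficulty_scores(embeddings, general_centroid)
stages = curriculum_split(embeddings, scores, num_stages=4)
theta = meta_train(np.zeros(8), stages)
```

Sorting by descending similarity makes the earliest stages look most like the general domain, matching the ascending-difficulty schedule the abstract describes; swapping the toy `grad_loss` for real NMT gradients and the inner loop for optimizer steps would recover a MAML-style scheme.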



Acknowledgements

This work is supported by the Natural Science Foundation of Sichuan Province (2022NSFSC0503) and the Sichuan Science and Technology Program (2022ZHCG0007).

Author information


Corresponding author

Correspondence to Zheng Chen.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chen, Z., Wang, Y. (2023). Terminology-Enriched Meta-curriculum Learning for Domain Neural Machine Translation. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol. 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_32

  • DOI: https://doi.org/10.1007/978-981-99-4752-2_32

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4751-5

  • Online ISBN: 978-981-99-4752-2

  • eBook Packages: Computer Science (R0)
