
Effective Soft-Adaptation for Neural Machine Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Abstract

Domain mismatch between training data and test data often degrades translation quality, so domain adaptation is necessary for machine translation. In this paper, we propose a novel method for the Neural Machine Translation (NMT) domain adaptation problem, in which a soft-domain adapter (SDA) is added to the encoder-decoder NMT framework. The SDA automatically learns domain representations from the training corpus and dynamically computes a domain-aware context for each input, which guides the decoder to generate domain-aware translations. Our method softly leverages domain information when translating source sentences, which not only improves translation quality in specific domains but is also robust and scalable across domains. Experiments on Chinese-English and English-French tasks show that the proposed method significantly improves translation quality on in-domain test sets without sacrificing performance on out-of-domain and general-domain test sets.
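The core idea in the abstract can be pictured as soft attention over a small set of learned domain embeddings: instead of hard-assigning a sentence to one domain, the adapter computes a weighted mixture of domain vectors and feeds it to the decoder as extra context. The sketch below is only an illustration of that mechanism under assumptions; the function and variable names (`soft_domain_context`, `domain_embeddings`) are hypothetical and do not come from the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_domain_context(h, domain_embeddings):
    """Compute a soft, domain-aware context for an encoder summary h.

    h: (d,) encoder representation of the source sentence.
    domain_embeddings: (k, d) matrix of learned domain vectors.
    Returns the (d,) mixed context vector and the (k,) domain weights.
    """
    scores = domain_embeddings @ h        # similarity of h to each domain
    weights = softmax(scores)             # soft assignment over domains
    context = weights @ domain_embeddings # convex mixture of domain vectors
    return context, weights

# Toy illustration: 3 hypothetical domains, hidden size 4.
rng = np.random.default_rng(0)
D = rng.normal(size=(3, 4))  # stand-in for learned domain embeddings
h = rng.normal(size=4)       # stand-in for an encoder summary vector
context, weights = soft_domain_context(h, D)
```

Because the weights form a distribution rather than a one-hot choice, sentences that mix domains (e.g. a news article quoting spoken dialogue) still receive a sensible blended context, which is what makes the approach robust across domains.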


Notes

  1. In the rest of this paper, characters in bold refer to vectors.

  2. LDC2002E17, LDC2002E18, LDC2003E07, LDC2003E14, LDC2005E83, LDC2005T06, LDC2005T10, LDC2006E17, LDC2006E26, LDC2006E34, LDC2006E85, LDC2006E92, LDC2006T06, LDC2004T08, LDC2005T10.

  3. https://github.com/rsennrich/subword-nmt.


Author information


Corresponding author

Correspondence to Shuangzhi Wu.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Wu, S., Zhang, D., Zhou, M. (2019). Effective Soft-Adaptation for Neural Machine Translation. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_22


  • DOI: https://doi.org/10.1007/978-3-030-32236-6_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32235-9

  • Online ISBN: 978-3-030-32236-6

  • eBook Packages: Computer Science (R0)
