Abstract
Neural machine translation systems depend heavily on large-scale parallel corpora, which are not available for some low-resource languages, resulting in poor translation quality. To alleviate this data-hungry problem, we present a high-quality data augmentation method that uses only the given parallel corpus. Specifically, we propose to augment the low-resource parallel corpus with language-mixed bitext, which is simply built by concatenating two sentences in different languages. Furthermore, because our approach relies only on the parallel corpus, it is complementary to existing data manipulation strategies, i.e., back-translation, self-training and knowledge distillation. Experiments on several low-resource datasets show that our approach achieves significant improvements over a strong baseline, despite its simplicity.
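The abstract describes the augmentation only at a high level: language-mixed bitext is built by concatenating two sentences in different languages, drawn from the given parallel corpus alone. The sketch below is a minimal, hedged illustration of one plausible reading of that idea; the specific pairing scheme (mixed-language source side, target-language target side) and the helper name random_concat_augment are assumptions made for illustration, not the paper's exact recipe.

```python
import random


def random_concat_augment(bitext, num_aug=None, seed=0):
    """Build language-mixed augmented pairs from a parallel corpus.

    `bitext` is a list of (src, tgt) sentence pairs; no external or
    monolingual data is used. Each augmented example concatenates
    sentences from two randomly chosen pairs: the new source mixes the
    two languages (src of pair A + tgt of pair B), while the new target
    stays in the target language (tgt of pair A + tgt of pair B).
    This pairing is an assumption, not the paper's confirmed scheme.
    """
    if num_aug is None:
        num_aug = len(bitext)
    rng = random.Random(seed)
    augmented = []
    for _ in range(num_aug):
        (src_a, tgt_a), (src_b, tgt_b) = rng.sample(bitext, 2)
        mixed_src = f"{src_a} {tgt_b}"   # language-mixed source side
        mixed_tgt = f"{tgt_a} {tgt_b}"   # target side in one language
        augmented.append((mixed_src, mixed_tgt))
    # Train on the original corpus plus the language-mixed bitext.
    return bitext + augmented


if __name__ == "__main__":
    toy_corpus = [
        ("ich bin müde", "i am tired"),
        ("guten morgen", "good morning"),
        ("wie geht es dir", "how are you"),
    ]
    for pair in random_concat_augment(toy_corpus, num_aug=2):
        print(pair)
```

Because the augmented data is derived purely from the original bitext, this step can be applied on top of back-translation, self-training or knowledge distillation pipelines without extra resources.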
This work was supported by the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xiao, N., Zhang, H., Jin, C., Duan, X. (2022). Random Concatenation: A Simple Data Augmentation Method for Neural Machine Translation. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol 13551. Springer, Cham. https://doi.org/10.1007/978-3-031-17120-8_6