Abstract
This paper describes Tencent's minority-to-Mandarin translation systems submitted to CCMT 2019. We participate in three translation directions: Uighur→Chinese, Tibetan→Chinese, and Mongolian→Chinese. Our systems, collectively called TenTrans, are neural machine translation systems based on Google's Transformer architecture and trained with our improved version of Marian. We also adopt techniques that have recently proven effective in academia, such as sampling-based back-translation, data selection, sequence-level knowledge distillation, ensemble distillation, model ensembling, and reranking. With these techniques, our submitted systems achieve stable performance improvements.
B. Hu, A. Han, Z. Zhang—Equal contribution.
Notes
- 1.
- 2.
- 3.
- 4. This setting differs slightly from the big-model setting in tensor2tensor and Fairseq; it is the best parameter setting we have tried.
- 5.
- 6.
- 7. XMU corpus: http://nlp.nju.edu.cn/cwmt2018/resources.html.
- 8. The BERT model we use was developed by our department's NLP team.
- 9.
- 10.
- 11. We use the official scoring programs and follow the official requirements.
References
Cho, K., van Merrienboer, B., Gulcehre, C., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of EMNLP (2014)
Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. In: Proceedings of ICLR (2015)
Sennrich, R., Haddow, B., Birch, A.: Edinburgh neural machine translation systems for WMT 2016. In: Proceedings of the First Conference on Machine Translation. Association for Computational Linguistics, Berlin, Germany (2016)
Wu, Y., et al.: Google's neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998–6008 (2017)
Junczys-Dowmunt, M., et al.: Marian: fast neural machine translation in C++. In: Proceedings of ACL 2018, System Demonstrations. Association for Computational Linguistics, Melbourne, Australia (2018)
Hu, B., Han, A., Huang, S.: TencentFmRD neural machine translation for WMT 2018. In: Proceedings of the Third Conference on Machine Translation: Shared Task Papers. Association for Computational Linguistics, Brussels, Belgium (2018)
Hu, B., Han, A., Huang, S.: TencentFmRD neural machine translation system. In: Chen, J., Zhang, J. (eds.) CWMT 2018. CCIS, vol. 954, pp. 111–123. Springer, Singapore (2019). https://doi.org/10.1007/978-981-13-3083-4_11
Sennrich, R., Haddow, B., Birch, A.: Neural machine translation of rare words with subword units. In: Proceedings of ACL (2016)
Luong, M., Manning, C.D.: Achieving open vocabulary neural machine translation with hybrid word-character models. In: Proceedings of ACL (2016)
Sennrich, R., Haddow, B., Birch, A.: Improving neural machine translation models with monolingual data. In: Proceedings of ACL (2016)
Edunov, S., Ott, M., Auli, M., Grangier, D.: Understanding back-translation at scale. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium (2018)
Imamura, K., Fujita, A., Sumita, E.: Enhancement of encoder and attention using target monolingual corpora in neural machine translation. In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pp. 55–63 (2018)
Hoang, V.C.D., Koehn, P., Haffari, G., Cohn, T.: Iterative back-translation for neural machine translation. In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, pp. 18–24 (2018)
Kim, Y., Rush, A.M.: Sequence-level knowledge distillation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1317–1327. Association for Computational Linguistics, Austin, Texas (2016)
Freitag, M., Al-Onaizan, Y., Sankaran, B.: Ensemble distillation for neural machine translation. arXiv preprint arXiv:1702.01802 (2017)
Denkowski, M., Neubig, G.: Stronger baselines for trustable results in neural machine translation. In: Proceedings of the First Workshop on Neural Machine Translation, pp. 18–27 (2017)
Sennrich, R., et al.: The University of Edinburgh neural MT systems for WMT 2017. In: Proceedings of the Second Conference on Machine Translation, pp. 389–399 (2017)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Cherry, C., Foster, G.: Batch tuning strategies for statistical machine translation. In: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 427–436. Association for Computational Linguistics (2012)
Ott, M., et al.: fairseq: a fast, extensible toolkit for sequence modeling. In: Proceedings of NAACL-HLT 2019: Demonstrations (2019)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017)
Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318. Association for Computational Linguistics (2002)
Chen, B., Cherry, C., Foster, G., Larkin, S.: Cost weighting for neural machine translation domain adaptation. In: Proceedings of the First Workshop on Neural Machine Translation, pp. 40–46 (2017)
Zhang, S., Xiong, D.: Sentence weighting for neural machine translation domain adaptation. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 3181–3190 (2018)
Wang, R., Utiyama, M., Liu, L., Chen, K., Sumita, E.: Instance weighting for neural machine translation domain adaptation. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1482–1488 (2017)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Chen, W., Matusov, E., Khadivi, S., Peter, J.T.: Guided alignment training for topic-aware neural machine translation. arXiv preprint arXiv:1607.01628 (2016)
Deng, Y., et al.: Alibaba’s neural machine translation systems for WMT 2018. In: Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pp. 368–376 (2018)
Junczys-Dowmunt, M.: Microsoft's submission to the WMT 2018 news translation task: how I learned to stop worrying and love the data. In: Proceedings of the Third Conference on Machine Translation: Shared Task Papers, pp. 425–430 (2018)
Axelrod, A., He, X., Gao, J.: Domain adaptation via pseudo in-domain data selection. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 355–362. Association for Computational Linguistics (2011)
Chen, B., Huang, F.: Semi-supervised convolutional networks for translation adaptation with tiny amount of in-domain data. In: Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, pp. 314–323 (2016)
Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Luong, M.T., Manning, C.D.: Stanford neural machine translation systems for spoken language domains. In: Proceedings of the International Workshop on Spoken Language Translation, pp. 76–79 (2015)
Liu, L., Utiyama, M., Finch, A., Sumita, E.: Agreement on target-bidirectional neural machine translation. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 411–416 (2016)
Tu, Z., Liu, Y., Shang, L., Liu, X., Li, H.: Neural machine translation with reconstruction. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)