The Influence of Different Methods on the Quality of the Russian-Tatar Neural Machine Translation

  • Conference paper
Artificial Intelligence (RCAI 2020)

Abstract

This article presents the results of experiments on applying various methods and algorithms to build a Russian-Tatar machine translation system. As the basic approach, we used a neural network based on the Transformer architecture, combined with algorithms that increase the amount of parallel data by exploiting monolingual corpora (back-translation). For the first time, transfer learning experiments (based on a Kazakh-Russian parallel corpus) were conducted for the Russian-Tatar language pair. As the main training data, we created and used a parallel corpus of about 1 million Russian-Tatar sentence pairs. Experiments show that the resulting system outperforms the currently available Russian-Tatar translators. The best quality in the Russian-to-Tatar direction was achieved by our basic model (BLEU 35.4), and in the Tatar-to-Russian direction by the model trained with back-translation (BLEU 39.2).
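The back-translation idea mentioned in the abstract can be sketched as follows: monolingual target-language (Tatar) text is run through a reverse translation model to synthesize extra source-target training pairs. This is a minimal illustration only; the `reverse_translate` stub is hypothetical and stands in for a trained Tatar-Russian model, not the authors' implementation.

```python
from typing import Callable, List, Tuple

def reverse_translate(tt_sentence: str) -> str:
    """Hypothetical stand-in for a trained Tatar->Russian NMT model."""
    return f"<ru translation of: {tt_sentence}>"

def back_translate(
    monolingual_tt: List[str],
    reverse_model: Callable[[str], str] = reverse_translate,
) -> List[Tuple[str, str]]:
    """Pair each monolingual Tatar sentence with a synthetic Russian
    source, yielding additional (source, target) pairs that can be
    mixed into the genuine parallel corpus."""
    return [(reverse_model(tt), tt) for tt in monolingual_tt]

# Usage: augment the genuine parallel data with synthetic pairs.
genuine_pairs = [("русское предложение", "татарча җөмлә")]
synthetic_pairs = back_translate(["татарча җөмлә 1", "татарча җөмлә 2"])
training_data = genuine_pairs + synthetic_pairs
```

The synthetic pairs have possibly noisy source sides but clean, human-written target sides, which is why the technique tends to help the direction whose target language the monolingual corpus covers.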

Acknowledgments

The reported study was funded by RFBR, project number 20-07-00823.

Author information

Corresponding author

Correspondence to Aidar Khusainov.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Khusainov, A., Suleymanov, D., Gilmullin, R. (2020). The Influence of Different Methods on the Quality of the Russian-Tatar Neural Machine Translation. In: Kuznetsov, S.O., Panov, A.I., Yakovlev, K.S. (eds) Artificial Intelligence. RCAI 2020. Lecture Notes in Computer Science(), vol 12412. Springer, Cham. https://doi.org/10.1007/978-3-030-59535-7_18

  • DOI: https://doi.org/10.1007/978-3-030-59535-7_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59534-0

  • Online ISBN: 978-3-030-59535-7

  • eBook Packages: Computer Science, Computer Science (R0)
