Kazakh-Uzbek Speech Cascade Machine Translation on Complete Set of Endings

  • Conference paper
Advances in Computational Collective Intelligence (ICCCI 2023)

Abstract

Studies of speech-to-speech machine translation for Turkic languages are practically absent because parallel speech corpora for training neural models are difficult to create. The pressing problem is therefore the creation of synthetic parallel corpora for studying speech machine translation of Turkic languages. This work proposes a technology for forming parallel speech corpora for Turkic languages based on the CSE (Complete Set of Endings) morphology model. The technology uses a cascade scheme: speech-to-text, text-to-text, and text-to-speech. A distinctive feature of this scheme is that the text-to-text phase uses a relational translation model based on the CSE morphology model. The scientific contribution of this work is a technology for forming parallel speech corpora for the Kazakh-Uzbek language pair, based on a cascade scheme for speech machine translation that uses relational models. In future work, the resulting synthetic parallel speech corpora will be used to train neural speech machine translation models for Turkic languages.
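The cascade scheme named in the abstract can be sketched as three pluggable stages chained together. The following is only an illustrative outline, not the authors' implementation: the class `CascadeTranslator`, the helper `build_parallel_corpus`, the toy stages, and the placeholder lexicon are all hypothetical, and audio is stood in for by raw bytes. In a real system the three stages would be replaced by a speech recognizer, the CSE-based relational translator, and a speech synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stage signatures; audio is modelled as raw bytes for illustration.
SpeechToText = Callable[[bytes], str]
TextToText = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class CascadeTranslator:
    """Chains speech-to-text, text-to-text, and text-to-speech stages."""
    stt: SpeechToText
    mt: TextToText
    tts: TextToSpeech

    def translate(self, source_audio: bytes) -> Tuple[str, str, bytes]:
        source_text = self.stt(source_audio)   # phase 1: speech-to-text
        target_text = self.mt(source_text)     # phase 2: text-to-text
        target_audio = self.tts(target_text)   # phase 3: text-to-speech
        return source_text, target_text, target_audio


def build_parallel_corpus(
    translator: CascadeTranslator, recordings: List[bytes]
) -> List[Tuple[bytes, bytes]]:
    """Pairs each source recording with its synthesized target-language audio."""
    return [(audio, translator.translate(audio)[2]) for audio in recordings]


# Toy stand-ins so the sketch runs; they are NOT real models.
TOY_LEXICON = {"src1": "tgt1", "src2": "tgt2"}  # placeholder tokens, not real words


def toy_stt(audio: bytes) -> str:
    # Pretend the "audio" bytes already hold the transcript.
    return audio.decode("utf-8")


def toy_mt(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def toy_tts(text: str) -> bytes:
    # Pretend the encoded text is the synthesized waveform.
    return text.encode("utf-8")


if __name__ == "__main__":
    translator = CascadeTranslator(stt=toy_stt, mt=toy_mt, tts=toy_tts)
    print(translator.translate(b"src1 src2"))
    print(build_parallel_corpus(translator, [b"src1", b"src2 other"]))
```

Keeping each stage behind a plain callable means any one phase can be swapped without touching the other two, and the synthetic parallel speech corpus is simply the list of (source audio, target audio) pairs that the cascade produces.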

Author information

Corresponding author

Correspondence to Ualsher Tukeyev.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Balabekova, T., Kairatuly, B., Tukeyev, U. (2023). Kazakh-Uzbek Speech Cascade Machine Translation on Complete Set of Endings. In: Nguyen, N.T., et al. Advances in Computational Collective Intelligence. ICCCI 2023. Communications in Computer and Information Science, vol 1864. Springer, Cham. https://doi.org/10.1007/978-3-031-41774-0_34

  • DOI: https://doi.org/10.1007/978-3-031-41774-0_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41773-3

  • Online ISBN: 978-3-031-41774-0

  • eBook Packages: Computer Science, Computer Science (R0)
