A Comparative Analysis of Speech Recognition Systems for the Tatar Language

Khusainov, Aidar

doi:10.1007/978-3-319-77113-7_40

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10761)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

This paper presents a comparative study of several approaches to speech recognition for the Tatar language. All of the compared systems are corpus-based, so recent results in the creation of speech and text corpora are also reported. The recognition systems differ in their acoustic modelling algorithms, basic acoustic units, and language modelling techniques. The DNN-based system achieves the best recognition result on the test part of the speech corpus.

Author information

Authors and Affiliations

Institute of Applied Semiotics of the Tatarstan Academy of Sciences, Kazan Federal University, Kazan, Russia
Aidar Khusainov

Corresponding author

Correspondence to Aidar Khusainov.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Khusainov, A. (2018). A Comparative Analysis of Speech Recognition Systems for the Tatar Language. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham. https://doi.org/10.1007/978-3-319-77113-7_40

DOI: https://doi.org/10.1007/978-3-319-77113-7_40
Published: 10 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77112-0
Online ISBN: 978-3-319-77113-7
eBook Packages: Computer Science, Computer Science (R0)
