Study of Morphological Factors of Factored Language Models for Russian ASR

Kipyatkova, Irina; Karpov, Alexey

doi:10.1007/978-3-319-11581-8_56

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8773)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

In this paper, we study factored language models (FLMs) for Russian speech recognition. The FLM was applied at the N-best list rescoring stage, and its parameters were optimized with a genetic algorithm. The best models used four factors: lemma, morphological tag, stem, and word. Experiments on large-vocabulary continuous Russian speech recognition showed a relative WER reduction of 8% when the FLM was interpolated with the baseline 3-gram model.
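To make the rescoring step concrete, the following is a minimal sketch of N-best rescoring with a linearly interpolated language model. It is not the authors' implementation: the `Hypothesis` container, the interpolation weight `lam`, the `lm_weight` scale, and the way acoustic and language model scores are combined are all assumptions made for illustration. The per-word probabilities are assumed to come from an already trained FLM and 3-gram model.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One entry of an N-best list (hypothetical container, not from the paper)."""
    text: str
    acoustic_logp: float            # acoustic log-score from the first decoding pass
    flm_word_probs: List[float]     # P_FLM(w_i | history), one value per word
    ngram_word_probs: List[float]   # P_3gram(w_i | history), one value per word


def interpolated_lm_logp(hyp: Hypothesis, lam: float) -> float:
    """Sum over words of log(lam * P_FLM + (1 - lam) * P_3gram)."""
    return sum(
        math.log(lam * p_flm + (1.0 - lam) * p_ngram)
        for p_flm, p_ngram in zip(hyp.flm_word_probs, hyp.ngram_word_probs)
    )


def rescore(nbest: List[Hypothesis], lam: float = 0.5,
            lm_weight: float = 1.0) -> List[Hypothesis]:
    """Return the N-best list sorted by combined score, best hypothesis first.

    Assumed combination: acoustic log-score + lm_weight * interpolated LM log-prob.
    """
    def total(hyp: Hypothesis) -> float:
        return hyp.acoustic_logp + lm_weight * interpolated_lm_logp(hyp, lam)

    return sorted(nbest, key=total, reverse=True)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, e.g. 0.08 corresponds to 8%."""
    return (baseline_wer - new_wer) / baseline_wer
```

With `lam = 0` the ranking depends only on the 3-gram model, and with `lam = 1` only on the FLM. In practice the weight would be tuned on held-out data.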

Author information

Authors and Affiliations

St. Petersburg Institute for Informatics and Automation of RAS, SPIIRAS, St. Petersburg, 199178, Russia
Irina Kipyatkova & Alexey Karpov
ITMO University, 49 Kronverkskiy av., St. Petersburg, 197101, Russia
Alexey Karpov

Editor information

Editors and Affiliations

Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation of the Russian Academy of Sciences, 39, 14th line, 199178, St. Petersburg, Russia
Andrey Ronzhin
Institute of Applied and Mathematical Linguistics, Moscow State Linguistic University, 38, Ostozhenka, 119034, Moscow, Russia
Rodmonga Potapova
Faculty of Technical Sciences, University of Novi Sad, 6, Trg Dositeja Obradovića, 21000, Novi Sad, Serbia
Vlado Delic

About this paper

Cite this paper

Kipyatkova, I., Karpov, A. (2014). Study of Morphological Factors of Factored Language Models for Russian ASR. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_56

DOI: https://doi.org/10.1007/978-3-319-11581-8_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science, Computer Science (R0)