Large Vocabulary Continuous Speech Recognizer for Slovenian Language

Žgank, Andrej; Kačič, Zdravko; Horvat, Bogomir

doi:10.1007/3-540-44805-5_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

The paper describes the development of a large vocabulary continuous speech recogniser for the Slovenian language using the SNABI database. The problems that inflectional languages pose for speech recognition are presented. The system is based on hidden Markov models. Biphones were used for acoustic modeling, and bigrams and trigrams for language modeling. Speaker adaptation is also used, both to improve the recognition result and to enable fast operation of the recogniser. The optimal system, with the adapted acoustic model and a bigram language model, achieved a word accuracy of 91.30% at nearly 10× real time. The unadapted system with the trigram language model achieved a word accuracy of 89.56% and was also slower than the optimal system, running at 15.3× real time.
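
The two figures of merit quoted in the abstract, word accuracy and the real-time factor, can be illustrated with a minimal sketch. The abstract does not state how they were computed, so this assumes the common definition of word accuracy as 100 · (N − S − D − I) / N, where N is the number of reference words and S, D, I are substitutions, deletions and insertions, and it assumes the real-time factor is processing time divided by audio duration. The function names `word_accuracy` and `real_time_factor` are hypothetical, and the alignment uses plain equal-cost edit distance, which may differ from the scoring tool the authors used.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy in percent, assuming 100 * (N - S - D - I) / N.

    S + D + I is taken as the minimum word-level edit distance
    between the reference and the hypothesis (all edits cost 1).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)

    errors = dist[len(ref)][len(hyp)]
    return 100.0 * (len(ref) - errors) / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (assumed definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) against 4 reference words.
    print(word_accuracy("a b c d", "a x c"))  # 50.0
    # 153 s of decoding for 10 s of speech corresponds to 15.3x real time.
    print(real_time_factor(153.0, 10.0))
```

Under this definition insertions are penalised, so a hypothesis with many spurious words can score below 0%. This is one reason accuracy is usually reported rather than the plain percentage of correct words.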

Author information

Authors and Affiliations

Laboratory for Digital Signal Processing, Faculty of EE & CS, University of Maribor, Smetanova 17, SI-2000, Maribor, Slovenia
Andrej Žgank, Zdravko Kačič & Bogomir Horvat

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, University of West Bohemia in Plzeň, Faculty of Applied Sciences, Univerzitní 22, 306-14, Plzeň, Czech Republic
Václav Matoušek, Pavel Mautner, Roman Mouček & Karel Taušer

About this paper

Cite this paper

Žgank, A., Kačič, Z., Horvat, B. (2001). Large Vocabulary Continuous Speech Recognizer for Slovenian Language. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_32

DOI: https://doi.org/10.1007/3-540-44805-5_32
Published: 24 August 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive
