Large Vocabulary Continuous Speech Recognizer for Slovenian Language

  • Conference paper
Text, Speech and Dialogue (TSD 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2166)

Abstract

This paper describes the development of a large vocabulary continuous speech recogniser for the Slovenian language using the SNABI database. It presents the problems that inflectional languages pose for speech recognition. The system is based on hidden Markov models. Biphones were used for acoustic modeling, and bigrams and trigrams for language modeling. Speaker adaptation is also used to improve recognition results and to enable fast operation of the recogniser. The optimal system, with the adapted acoustic model and a bigram language model, achieved a word accuracy of 91.30% at nearly 10× real time. The unadapted system with the trigram language model achieved a word accuracy of 89.56% but was slower than the optimal system, running at 15.3× real time.
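
The two figures of merit quoted above, word accuracy and run time relative to real time, can be illustrated with a short sketch. It assumes the common definition of word accuracy as 100 · (N − S − D − I) / N, where N is the number of reference words and S, D, I are substitutions, deletions and insertions in a minimum edit-distance alignment. The abstract does not state which scoring tool or alignment weights were used, so the function names and the uniform edit costs here are illustrative assumptions, not the authors' procedure.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Percent word accuracy, 100 * (N - S - D - I) / N.

    Assumption: uniform cost 1 for substitutions, deletions and insertions,
    so S + D + I equals the word-level Levenshtein distance.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    errors = prev[-1]
    return 100.0 * (len(ref) - errors) / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Decoding time divided by audio duration (1.0 means real time)."""
    return processing_seconds / audio_seconds
```

Under this definition, a run time of 15.3× real time means that 10 seconds of speech take about 153 seconds to decode. Word accuracy can be negative when the hypothesis contains many insertions, which is why it is reported separately from the percentage of correct words.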

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Žgank, A., Kačič, Z., Horvat, B. (2001). Large Vocabulary Continuous Speech Recognizer for Slovenian Language. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_32

  • DOI: https://doi.org/10.1007/3-540-44805-5_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42557-1

  • Online ISBN: 978-3-540-44805-1

  • eBook Packages: Springer Book Archive
