Abstract
This paper describes a speaker-independent recognition engine for continuous Czech speech, designed for Czech telephone applications. The engine is the recognition module of a telephone dialogue system being designed and built at the Department of Cybernetics, University of West Bohemia. Recognition is based on a statistical approach. Triphones, the basic units of acoustic modelling, are modelled by left-to-right three-state HMMs whose output probability density functions are multivariate Gaussian mixtures, and stochastic regular grammars are used to reduce task perplexity. Real-time recognition is supported by an approach that greatly reduces the computational cost of estimating the log-likelihood scores of the Gaussian mixtures, and by beam pruning during Viterbi decoding.
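To make the decoding step concrete, the following is a minimal sketch of Viterbi decoding with beam pruning over a single left-to-right HMM with diagonal-covariance Gaussian mixture outputs. It is an illustration under stated assumptions, not the engine's implementation: the function names (`log_gmm`, `viterbi_beam`), the shared transition log-probabilities, the diagonal covariances, and the toy one-dimensional data are all hypothetical. The mixture score is computed exactly with log-sum-exp; the cost-reduction technique mentioned in the abstract is not specified here and is not reproduced.

```python
import math

NEG_INF = float("-inf")

def log_gmm(x, weights, means, variances):
    """Exact log-likelihood of vector x under a diagonal-covariance Gaussian mixture."""
    comps = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        comps.append(ll)
    top = max(comps)
    # log-sum-exp keeps the sum numerically stable
    return top + math.log(sum(math.exp(c - top) for c in comps))

def viterbi_beam(frames, states, log_self, log_next, beam):
    """Viterbi pass over a left-to-right HMM (self-loop or step to the next state).

    states: list of (weights, means, variances) tuples, one per HMM state.
    beam:   hypotheses scoring more than `beam` below the best hypothesis
            of the current frame are pruned (set to -inf).
    Returns (log score of ending in the last state, best state path).
    """
    n = len(states)
    score = [NEG_INF] * n
    score[0] = log_gmm(frames[0], *states[0])  # decoding must start in state 0
    back = []
    for x in frames[1:]:
        new = [NEG_INF] * n
        ptr = [0] * n
        for j in range(n):
            stay = score[j] + log_self
            move = score[j - 1] + log_next if j > 0 else NEG_INF
            best, src = (stay, j) if stay >= move else (move, j - 1)
            ptr[j] = src
            if best > NEG_INF:  # skip the emission for unreachable/pruned states
                new[j] = best + log_gmm(x, *states[j])
        frame_best = max(new)
        # beam pruning: drop hypotheses that fall too far behind the frame best
        score = [s if s >= frame_best - beam else NEG_INF for s in new]
        back.append(ptr)
    # trace back from the final state
    path = [n - 1]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[n - 1], path

# Toy example: three single-Gaussian states with means 0, 5 and 10.
toy_states = [([1.0], [[m]], [[1.0]]) for m in (0.0, 5.0, 10.0)]
toy_frames = [[0.0], [0.1], [5.0], [5.2], [10.0], [9.9]]
toy_score, toy_path = viterbi_beam(
    toy_frames, toy_states, math.log(0.5), math.log(0.5), beam=50.0
)
```

A narrower beam skips more emission evaluations per frame but risks pruning the hypothesis that would have won; if the final state itself is pruned, the returned score is `-inf` and the path is not meaningful.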
The work was partially supported by the Ministry of Education of the Czech Republic, project no. MSM235200004, and by the Grant Agency of the Czech Republic, project no. 102/96/K087.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Müller, L., Psutka, J., Šmídl, L. (2000). Design of Speech Recognition Engine. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2000. Lecture Notes in Computer Science, vol 1902. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45323-7_44
DOI: https://doi.org/10.1007/3-540-45323-7_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41042-3
Online ISBN: 978-3-540-45323-9
eBook Packages: Springer Book Archive