Abstract
This paper introduces a novel combination of Artificial Neural Networks (ANNs) and Hidden Markov Models (HMMs) for Automatic Speech Recognition (ASR), in which an ANN provides non-parametric estimates of the emission probabilities of an underlying HMM. A global gradient-ascent training technique that maximizes the likelihood of the acoustic observations given the model (the ML criterion) is presented. A maximum a posteriori variant of the algorithm is also proposed as a viable solution to the “divergence problem” that may arise in the ML setup. A 46.34% relative word error rate reduction with respect to standard HMMs was obtained in a speaker-independent, continuous ASR task with a small vocabulary.
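The general idea of a network supplying the emission term of an HMM, with parameters adjusted by gradient ascent on the likelihood, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-layer sigmoid `emission_net`, the names `log_likelihood` and `ascent_step`, the toy parameters, and the use of finite differences in place of whatever gradient computation the authors actually derive. The sigmoid outputs are treated as bounded emission scores, not as properly normalized densities.

```python
import copy
import math


def emission_net(x, weights, biases):
    """Hypothetical one-layer network: one sigmoid unit per HMM state.

    Returns a positive emission score b_i(x) for every state i.
    """
    scores = []
    for ws, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(ws, x)) + b
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def log_likelihood(obs, init, trans, weights, biases):
    """Scaled forward algorithm: log-likelihood of obs under the hybrid model."""
    n = len(init)
    alpha = None
    total = 0.0
    for t, x in enumerate(obs):
        b = emission_net(x, weights, biases)
        if t == 0:
            alpha = [init[i] * b[i] for i in range(n)]
        else:
            alpha = [
                b[j] * sum(alpha[i] * trans[i][j] for i in range(n))
                for j in range(n)
            ]
        scale = sum(alpha)          # rescale to avoid numerical underflow
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


def ascent_step(obs, init, trans, weights, biases, lr=0.1, eps=1e-5):
    """One gradient-ascent step on the network parameters.

    Central finite differences stand in for an analytic gradient (assumption).
    The inputs are not modified; updated copies are returned.
    """
    def ll(w, b):
        return log_likelihood(obs, init, trans, w, b)

    new_w = copy.deepcopy(weights)
    new_b = list(biases)
    for i in range(len(biases)):
        for k in range(len(weights[i])):
            up = copy.deepcopy(weights)
            dn = copy.deepcopy(weights)
            up[i][k] += eps
            dn[i][k] -= eps
            new_w[i][k] += lr * (ll(up, biases) - ll(dn, biases)) / (2 * eps)
        up_b = list(biases)
        dn_b = list(biases)
        up_b[i] += eps
        dn_b[i] -= eps
        new_b[i] += lr * (ll(weights, up_b) - ll(weights, dn_b)) / (2 * eps)
    return new_w, new_b


if __name__ == "__main__":
    # Toy two-state model and three 2-dimensional "acoustic" frames (made up).
    frames = [[0.2, 0.7], [0.9, 0.1], [0.4, 0.4]]
    pi = [0.6, 0.4]
    a = [[0.7, 0.3], [0.2, 0.8]]
    w0 = [[0.5, -0.3], [-0.2, 0.4]]
    b0 = [0.0, 0.1]
    w1, b1 = ascent_step(frames, pi, a, w0, b0)
    print(log_likelihood(frames, pi, a, w0, b0))
    print(log_likelihood(frames, pi, a, w1, b1))
```

In this toy setting nothing but the sigmoid saturation stops the emission scores from growing, so repeated steps keep pushing the scores upward. That gives some intuition for why an unconstrained ML criterion may need a regularized alternative, although the sketch does not reproduce the paper's maximum a posteriori variant.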
Keywords
- Hidden Markov Model
- Speech Recognition
- Automatic Speech Recognition
- Emission Probability
- Continuous Speech Recognition
These keywords were added by machine and not by the authors.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Trentin, E., Gori, M. (2001). Continuous Speech Recognition with a Robust Connectionist/Markovian Hybrid Model. In: Dorffner, G., Bischof, H., Hornik, K. (eds) Artificial Neural Networks — ICANN 2001. ICANN 2001. Lecture Notes in Computer Science, vol 2130. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44668-0_81
DOI: https://doi.org/10.1007/3-540-44668-0_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42486-4
Online ISBN: 978-3-540-44668-2
eBook Packages: Springer Book Archive