Abstract
This paper discusses the use of feed-forward and recurrent Artificial Neural Networks (ANNs) in whole-word speech recognition. A Long Short-Term Memory (LSTM) network was trained to perform speaker-independent recognition of arbitrary strings of connected digits in the Polish language, using only acoustic features extracted from speech. It is also shown how to effectively convert the analog network output into binary information about the recognized words. The parameters of this conversion are fine-tuned using artificial evolution.
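The abstract does not spell out how the analog-to-binary conversion or the evolutionary tuning works, so the following is only a minimal sketch of one plausible scheme: per-frame word activations are thresholded with a minimum-duration constraint, and the threshold is tuned by a simple (1+1) evolution strategy standing in for the paper's "artificial evolution". All function names and parameters here are assumptions, not the authors' actual method.

```python
import random

def to_binary(activations, threshold, min_frames):
    """Convert a per-frame activation sequence into binary word detections:
    a word onset is reported when the activation stays above `threshold`
    for at least `min_frames` consecutive frames."""
    detections, run = [], 0
    for i, a in enumerate(activations):
        if a > threshold:
            run += 1
            if run == min_frames:
                detections.append(i - min_frames + 1)  # onset frame index
        else:
            run = 0
    return detections

def evolve_threshold(activations, target_onsets, generations=200, seed=0):
    """(1+1) evolution strategy: perturb the threshold with Gaussian noise
    and keep the mutant whenever it matches the target onsets at least
    as well as the current best."""
    rng = random.Random(seed)

    def fitness(t):
        got = to_binary(activations, t, 2)
        # Penalize count mismatch and distance of each detection
        # from the nearest target onset (higher is better).
        penalty = abs(len(got) - len(target_onsets))
        penalty += sum(min(abs(g - w) for w in target_onsets) for g in got)
        return -penalty

    best_t = 0.5
    best_fit = fitness(best_t)
    for _ in range(generations):
        cand = min(max(best_t + rng.gauss(0, 0.1), 0.0), 1.0)
        f = fitness(cand)
        if f >= best_fit:
            best_t, best_fit = cand, f
    return best_t
```

For example, an activation trace with two clear peaks, such as `[0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.8, 0.8, 0.1]`, yields onsets at frames 2 and 7 with a threshold of 0.5 and a two-frame minimum duration.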
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Brocki, Ł., Koržinek, D., Marasek, K. (2006). Recognizing Connected Digit Strings Using Neural Networks. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science(), vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_43
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6