Abstract
This paper discusses the use of feed-forward and recurrent Artificial Neural Networks (ANNs) in whole-word speech recognition. A Long Short-Term Memory (LSTM) network was trained to perform speaker-independent recognition of arbitrary strings of connected digits in the Polish language, using only acoustic features extracted from speech. It is also shown how to effectively convert the analog network output into binary information about the recognized words. The parameters of this conversion are fine-tuned using artificial evolution.
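The abstract does not spell out how the analog-to-binary conversion or the evolutionary tuning works, so the following is only a minimal sketch of one plausible scheme: per-frame word activations are thresholded with a minimum-duration constraint, and the threshold is tuned by a simple (1+1) evolution strategy standing in for the paper's "artificial evolution". All function names and parameters here are assumptions, not the authors' actual method.

```python
import random

def to_binary(activations, threshold, min_frames):
    """Convert a per-frame activation sequence into binary word detections:
    a word onset is reported when the activation stays above `threshold`
    for at least `min_frames` consecutive frames."""
    detections, run = [], 0
    for i, a in enumerate(activations):
        if a > threshold:
            run += 1
            if run == min_frames:
                detections.append(i - min_frames + 1)  # onset frame index
        else:
            run = 0
    return detections

def evolve_threshold(activations, target_onsets, generations=200, seed=0):
    """(1+1) evolution strategy: perturb the threshold with Gaussian noise
    and keep the mutant whenever it matches the target onsets at least
    as well as the current best."""
    rng = random.Random(seed)

    def fitness(t):
        got = to_binary(activations, t, 2)
        # Penalize count mismatch and distance of each detection
        # from the nearest target onset (higher is better).
        penalty = abs(len(got) - len(target_onsets))
        penalty += sum(min(abs(g - w) for w in target_onsets) for g in got)
        return -penalty

    best_t = 0.5
    best_fit = fitness(best_t)
    for _ in range(generations):
        cand = min(max(best_t + rng.gauss(0, 0.1), 0.0), 1.0)
        f = fitness(cand)
        if f >= best_fit:
            best_t, best_fit = cand, f
    return best_t
```

For example, an activation trace with two clear peaks, such as `[0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.8, 0.8, 0.1]`, yields onsets at frames 2 and 7 with a threshold of 0.5 and a two-frame minimum duration.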
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Brocki, Ł., Koržinek, D., Marasek, K. (2006). Recognizing Connected Digit Strings Using Neural Networks. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science(), vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_43
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6