Abstract
In this paper we investigate whether a combination of statistical, neural network and cache language models can outperform a basic statistical model. The models were developed, tested and applied to Czech spontaneous speech data, which differs considerably from common written Czech and is characterized by a small amount of available data and highly inflected words. A trigram model served as the baseline; after training it, we interpolated several cache models with the baseline and compared them in terms of perplexity. Finally, the model with the lowest perplexity was evaluated on speech recordings of phone calls.
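As a rough illustration of interpolating a cache model with a baseline and scoring the result by perplexity, the Python sketch below mixes a unigram cache over recent words with a baseline probability. It is not the authors' implementation: the function names, the interpolation weight `lam`, the `cache_size`, and the uniform stand-in baseline `uniform_base` are all assumptions made for illustration (the paper's baseline is a trigram model).

```python
import math
from collections import Counter, deque


def cache_interpolated_perplexity(tokens, base_prob, lam=0.1, cache_size=100):
    """Perplexity under P(w) = (1 - lam) * P_base(w | history) + lam * P_cache(w).

    base_prob is a hypothetical callable (word, history) -> probability > 0.
    P_cache is the relative frequency of the word among the last cache_size words.
    """
    cache = deque(maxlen=cache_size)  # most recent words, oldest first
    counts = Counter()                # word counts inside the cache
    history = []
    log_sum = 0.0
    for w in tokens:
        p_cache = counts[w] / len(cache) if cache else 0.0
        p = (1.0 - lam) * base_prob(w, history) + lam * p_cache
        log_sum += math.log(p)
        # Slide the cache: drop the oldest word's count before it is evicted.
        if len(cache) == cache.maxlen:
            counts[cache[0]] -= 1
        cache.append(w)
        counts[w] += 1
        history.append(w)
    return math.exp(-log_sum / len(tokens))


def uniform_base(word, history, vocab_size=4):
    """Stand-in baseline (assumption): uniform over a 4-word vocabulary."""
    return 1.0 / vocab_size


if __name__ == "__main__":
    text = ["a", "a", "b", "a", "a", "b", "a", "a"]
    print("baseline only:", cache_interpolated_perplexity(text, uniform_base, lam=0.0))
    print("with cache:   ", cache_interpolated_perplexity(text, uniform_base, lam=0.5))
```

On repetitive text the cache term raises the probability of recently seen words, so the interpolated perplexity drops below the baseline's; in practice `lam` and `cache_size` would be tuned on held-out data.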
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Soutner, D., Loose, Z., Müller, L., Pražák, A. (2012). Neural Network Language Model with Cache. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_64
DOI: https://doi.org/10.1007/978-3-642-32790-2_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science (R0)