Abstract
In language engineering, language models are employed to improve system performance. These are usually N-gram models estimated from large text corpora using the occurrence frequencies of the N-grams. An alternative to conventional frequency-based estimation of N-gram probabilities is to use neural networks for this purpose. In this paper, an approach to language modeling with a hybrid language model is presented: a linear combination of a connectionist N-gram model, which represents the global relations between certain linguistic categories, and a stochastic model of the distribution of words into those categories. The hybrid language model is evaluated on the Wall Street Journal corpus as processed in the Penn Treebank project.
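The decomposition underlying such category-based hybrid models can be sketched with a minimal, count-based example. This is an illustration, not the paper's method: the toy corpus, tag set, and the standard factorization P(w | history) = P(c(w) | category history) · P(w | c(w)) are assumptions, and a simple bigram table stands in for the connectionist (MLP) category N-gram component.

```python
from collections import Counter

# Toy corpus of (word, category) pairs; the POS-like tags are hypothetical.
tagged = [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
          ("the", "DET"), ("dog", "NOUN"), ("ran", "VERB")]

# Stochastic word-distribution component: P(word | category).
cat_counts = Counter(c for _, c in tagged)
word_cat = Counter(tagged)

def p_word_given_cat(w, c):
    return word_cat[(w, c)] / cat_counts[c]

# Category N-gram component: P(category | previous category).
# A count-based bigram table stands in for the connectionist model here.
cats = [c for _, c in tagged]
bigrams = Counter(zip(cats, cats[1:]))
prev_counts = Counter(cats[:-1])

def p_cat_given_prev(c, prev):
    return bigrams[(prev, c)] / prev_counts[prev]

# Hybrid estimate: P(w | prev) = P(c(w) | prev) * P(w | c(w)).
def p_word(w, c, prev):
    return p_cat_given_prev(c, prev) * p_word_given_cat(w, c)

# After "the" (DET), both nouns are possible, each with probability 0.5.
print(p_word("cat", "NOUN", "DET"))  # -> 0.5
```

In the paper's setting, the category-level distribution would come from a multilayer perceptron rather than raw counts, which is what lets the model generalize over category histories unseen in training.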
This work has been supported by the Spanish CICYT under contract TIC2003-07158-C04-03.
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Blat, F., Castro, M.J., Tortajada, S., Sánchez, J.A. (2005). A Hybrid Approach to Statistical Language Modeling with Multilayer Perceptrons and Unigrams. In: Matoušek, V., Mautner, P., Pavelka, T. (eds.) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol. 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0