A Hybrid Approach to Statistical Language Modeling with Multilayer Perceptrons and Unigrams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3658)

Abstract

In language engineering, language models are employed to improve system performance. These are usually N-gram models estimated from large text corpora using the occurrence frequencies of the N-grams. An alternative to conventional frequency-based estimation of N-gram probabilities is to use neural networks for this purpose. In this paper, a hybrid language model is presented as a linear combination of a connectionist N-gram model, which represents the global relations between certain linguistic categories, and a stochastic model of the distribution of words into those categories. The hybrid language model is evaluated on the Wall Street Journal corpus as processed in the Penn Treebank project.
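The category-based decomposition the abstract describes can be illustrated with a minimal sketch. This is not the paper's implementation: the paper trains a multilayer perceptron on the Penn Treebank WSJ corpus, whereas the toy corpus, the category names, and the frequency-based category bigram below are all placeholders standing in for the connectionist category N-gram.

```python
from collections import Counter

# Toy tagged corpus (hypothetical data; the paper uses the Penn Treebank
# WSJ corpus and an MLP over category histories).
tagged = [("the", "DT"), ("cat", "NN"), ("sat", "VB"),
          ("the", "DT"), ("dog", "NN"), ("ran", "VB")]

# Unigram word-given-category model P(w | c), estimated by relative frequency.
cat_counts = Counter(c for _, c in tagged)
word_cat = Counter(tagged)

def p_word_given_cat(w, c):
    return word_cat[(w, c)] / cat_counts[c] if cat_counts[c] else 0.0

# Frequency-based category bigram standing in for the connectionist
# N-gram model of category histories.
cats = [c for _, c in tagged]
cat_bigrams = Counter(zip(cats, cats[1:]))

def p_cat_given_prev(c, prev):
    return cat_bigrams[(prev, c)] / cat_counts[prev] if cat_counts[prev] else 0.0

# Combined estimate: P(w | prev category) = sum_c P(w | c) * P(c | prev),
# i.e. the word-into-category distribution weighted by the category model.
def p_word(w, prev_cat):
    return sum(p_word_given_cat(w, c) * p_cat_given_prev(c, prev_cat)
               for c in cat_counts)

print(round(p_word("cat", "DT"), 3))  # → 0.5
```

Replacing the category bigram with an MLP that maps the category history to a softmax over categories, and interpolating the result with a conventional word N-gram, gives the general shape of the hybrid model the paper studies.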

This work has been supported by the Spanish CICYT under contract TIC2003-07158-C04-03.



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Blat, F., Castro, M.J., Tortajada, S., Sánchez, J.A. (2005). A Hybrid Approach to Statistical Language Modeling with Multilayer Perceptrons and Unigrams. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science(), vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_25

  • DOI: https://doi.org/10.1007/11551874_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28789-6

  • Online ISBN: 978-3-540-31817-0

  • eBook Packages: Computer Science, Computer Science (R0)
