
Incorporating linguistic structure into maximum entropy language models


Abstract

Integrating diverse linguistic knowledge into a general framework that captures long-distance dependencies is a challenging issue in statistical language modeling. This paper presents an improved language model that incorporates linguistic structure into the maximum entropy framework. The proposed model combines a trigram model with the structural knowledge of base phrases: the trigram captures local relations between words, while base-phrase structure represents long-distance relations between syntactic constituents. Knowledge of syntax, semantics, and vocabulary is thus integrated into a single maximum entropy framework. Experimental results show that, compared with the trigram model, the proposed model reduces perplexity by 24% and improves the sign language recognition rate by about 3%.
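As a rough illustration of the model form described above (a sketch of the standard conditional maximum entropy formulation; the notation and example features are assumptions, not taken from this paper), such a model assigns

    p(w \mid h) = \frac{\exp\left( \sum_i \lambda_i f_i(h, w) \right)}{\sum_{w'} \exp\left( \sum_i \lambda_i f_i(h, w') \right)}

where h is the word history, w is the predicted word, and each f_i is a binary feature function with weight \lambda_i estimated from training data. A trigram feature fires when the last two words of h together with w match a given triple, while a base-phrase feature might fire when the base-phrase structure preceding w matches a given syntactic pattern, so that local and long-distance constraints are imposed on a single distribution. Perplexity, the figure of merit quoted above, is the standard quantity

    PP = 2^{-\frac{1}{N} \sum_{i=1}^{N} \log_2 p(w_i \mid h_i)}

computed over N held-out words; lower is better, so a 24% improvement means a 24% reduction.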



Author information

Corresponding author

Correspondence to Fang GaoLin.

Additional information

This work is supported by the National Natural Science Foundation of China (Grant No. 69789301) and the National High-Technology Development ‘863’ Program of China (Grant No. 2001AA114160).


About this article

Cite this article

Fang, G., Gao, W. & Wang, Z. Incorporating linguistic structure into maximum entropy language models. J. Comput. Sci. & Technol. 18, 131–136 (2003). https://doi.org/10.1007/BF02946662

