Training and application of integrated grammar/bigram language models

Wright, J. H.; Jones, G. J. F.; Lloyd-Thomas, H.

doi:10.1007/3-540-58473-0_153

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 862)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

This paper discusses a robust language model consisting of context-free grammar rules and symbol bigrams, integrated into a single framework. The aim is to remove the sharp grammatical/ungrammatical distinction by exploiting whatever grammar structure is present in every sentence, and hence to achieve continuity of scoring across the language. Both training and scoring are based on a similar principle: summing over paths that span the sentence. In addition to finding the overall score, a procedure for finding the best interpretation is described. Efficiency is maximised by the use of node-based (rather than path-based) procedures.
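The idea of summing over paths that span the sentence, computed node by node rather than path by path, can be illustrated with a generic lattice recursion. The following is a minimal sketch under stated assumptions and is not the authors' procedure: the function name `score_lattice`, the edge format, the boundary symbols, and all probabilities are hypothetical, chosen only to show how a sum over spanning paths and a best interpretation can be obtained in one pass over the nodes.

```python
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary symbols


def score_lattice(n, edges, bigram):
    """Score a lattice over a sentence of n words.

    edges:  list of (start, end, symbol, prob), 0 <= start < end <= n.
            A symbol may be a word category or a grammar constituent.
    bigram: dict mapping (prev_symbol, symbol) -> probability.
    Returns (sum over all spanning paths, best path score, best path).
    """
    # Group edges by end position so each node is visited once.
    by_end = defaultdict(list)
    for start, end, symbol, prob in edges:
        by_end[end].append((start, symbol, prob))

    # A state is (position, last symbol). fwd accumulates the sum over
    # partial paths, best keeps the maximum, back stores backpointers.
    fwd = {(0, START): 1.0}
    best = {(0, START): 1.0}
    back = {}

    for end in range(1, n + 1):
        for start, symbol, prob in by_end[end]:
            for (pos, prev), mass in list(fwd.items()):
                if pos != start:
                    continue
                step = bigram.get((prev, symbol), 0.0) * prob
                if step == 0.0:
                    continue
                state = (end, symbol)
                fwd[state] = fwd.get(state, 0.0) + mass * step
                cand = best[(pos, prev)] * step
                if cand > best.get(state, 0.0):
                    best[state] = cand
                    back[state] = (pos, prev)

    # Close every path that reaches the final node with an end transition.
    total, best_score, last = 0.0, 0.0, None
    for (pos, prev), mass in fwd.items():
        if pos != n:
            continue
        stop = bigram.get((prev, END), 0.0)
        total += mass * stop
        if best[(pos, prev)] * stop > best_score:
            best_score = best[(pos, prev)] * stop
            last = (pos, prev)

    # Follow backpointers to recover the best sequence of spans.
    path = []
    while last in back:
        prev_state = back[last]
        path.append((prev_state[0], last[0], last[1]))
        last = prev_state
    path.reverse()
    return total, best_score, path


# Toy lattice (made-up numbers): three words, with one grammar
# constituent "NP" covering words 2-3 alongside the word-level symbols.
edges = [
    (0, 1, "V", 1.0),
    (1, 2, "DET", 1.0),
    (2, 3, "N", 1.0),
    (1, 3, "NP", 1.0),
]
bigram = {
    (START, "V"): 1.0,
    ("V", "DET"): 0.5,
    ("V", "NP"): 0.5,
    ("DET", "N"): 0.5,
    ("N", END): 1.0,
    ("NP", END): 1.0,
}
total, best_score, best_path = score_lattice(3, edges, bigram)
print(total, best_score, best_path)
```

Because partial sums are stored per node state rather than per path, the work grows with the number of edges and states instead of the number of distinct paths, which is the general motivation for node-based over path-based procedures.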

Author information

Authors and Affiliations

Department of Engineering Mathematics, University of Bristol, Queen's Building, University Walk, BS8 1TR, Bristol, UK
J. H. Wright & H. Lloyd-Thomas
Ensigma Limited, Turing House, Station Road, NP6 5PB, Chepstow, Gwent, UK
J. H. Wright & H. Lloyd-Thomas
Department of Engineering, University of Cambridge, Trumpington Street, CB2 1PZ, Cambridge, UK
G. J. F. Jones

Editor information

Rafael C. Carrasco, Jose Oncina

About this paper

Cite this paper

Wright, J.H., Jones, G.J.F., Lloyd-Thomas, H. (1994). Training and application of integrated grammar/bigram language models. In: Carrasco, R.C., Oncina, J. (eds) Grammatical Inference and Applications. ICGI 1994. Lecture Notes in Computer Science, vol 862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58473-0_153

DOI: https://doi.org/10.1007/3-540-58473-0_153
Published: 04 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58473-5
Online ISBN: 978-3-540-48985-6
eBook Packages: Springer Book Archive
