Abstract
In this work, we present an approach to language understanding that uses corpus-based statistical language models based on multigrams. Assuming that meanings can be assigned to segments of words, n-multigram modeling is well suited to representing sequences of segments that carry associated semantic information. The approach has been applied to speech understanding within a dialogue system that answers queries about train timetables in Spanish. Experimental results are also reported.
Work partially funded by CICYT under project TIC2002-04103-C03-03, Spain.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hurtado, L., Segarra, E., García, F., Sanchis, E. (2004). Language Understanding Using n-multigram Models. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds) Advances in Natural Language Processing. EsTAL 2004. Lecture Notes in Computer Science, vol 3230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30228-5_19
DOI: https://doi.org/10.1007/978-3-540-30228-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23498-2
Online ISBN: 978-3-540-30228-5
eBook Packages: Springer Book Archive