Finite State Grammar Transduction from Distributed Collected Knowledge

Gupta, Rakesh; Hennacy, Ken

doi:10.1007/11671299_36

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3878)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

In this paper, we discuss the use of the Open Mind Indoor Common Sense (OMICS) project for speech recognition of user requests. As part of OMICS data collection, we asked users to enter different ways of asking a robot to perform specific tasks. This paraphrase data is processed using natural language processing techniques and lexical resources such as WordNet to generate a Finite State Grammar Transducer (FSGT). The transducer captures both the variations in user requests and their structure.
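To make the idea of turning paraphrases into a transducer concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-written paraphrase set and a hand-written synonym table standing in for WordNet lookups; the task labels, the phrases, and the names `PARAPHRASES`, `SYNONYMS`, `build_transducer`, and `transduce` are all hypothetical. Each paraphrase becomes a path through a trie-shaped state machine, and the accepting state of a path emits the task label.

```python
from collections import defaultdict

# Hypothetical paraphrase data: task label -> ways users phrased the request.
PARAPHRASES = {
    "make_coffee": ["make coffee", "please make coffee", "brew coffee"],
    "open_door": ["open the door", "please open the door"],
}

# Hand-written synonym table; an assumption standing in for WordNet lookups.
SYNONYMS = {"make": {"prepare"}, "open": {"unlock"}}


def build_transducer(paraphrases, synonyms):
    """Build a trie-shaped transducer: word paths in, task labels out."""
    transitions = defaultdict(dict)  # state -> {word: next state}
    outputs = {}                     # accepting state -> task label
    next_id = 1                      # state 0 is the start state
    for task, phrases in paraphrases.items():
        for phrase in phrases:
            state = 0
            for word in phrase.split():
                if word not in transitions[state]:
                    target = next_id
                    next_id += 1
                    # The word and its listed synonyms share one arc target.
                    for alt in {word} | synonyms.get(word, set()):
                        transitions[state].setdefault(alt, target)
                state = transitions[state][word]
            outputs[state] = task
    return transitions, outputs


def transduce(utterance, transitions, outputs):
    """Return the task label for an in-grammar utterance, else None."""
    state = 0
    for word in utterance.lower().split():
        state = transitions.get(state, {}).get(word)
        if state is None:
            return None
    return outputs.get(state)


TRANSITIONS, OUTPUTS = build_transducer(PARAPHRASES, SYNONYMS)
```

With this toy data, `transduce("prepare coffee", TRANSITIONS, OUTPUTS)` follows the synonym arc for "make" and yields `"make_coffee"`, while an unseen request such as "make tea" falls off the machine and yields `None`.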

We compare the task recognition performance of this FSGT model with that of an n-gram Statistical Language Model (SLM) trained on the same data used to generate the FSGT. The FSGT and the SLM are combined in a two-pass system to optimize full and partial recognition for both in-grammar and out-of-grammar user requests. Our work validates the use of a web-based knowledge capture system to harvest phrases for building grammar models. The work was performed using the Nuance Speech Recognition System.
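One plausible reading of the two-pass arrangement can be sketched as follows. This is an illustration under loud assumptions, not the system described in the paper: it operates on already-transcribed text rather than audio, it does not use any Nuance API, the first pass is reduced to an exact phrase lookup standing in for the FSGT, and the second pass uses per-task bigram models with add-one smoothing as a simple stand-in for the SLM. All data and names (`TRAINING`, `GRAMMAR`, `train`, `log_prob`, `recognize`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training phrases shared by both passes.
TRAINING = {
    "make_coffee": ["make coffee", "please make coffee", "brew some coffee"],
    "open_door": ["open the door", "please open the door"],
}

# First pass stand-in for the grammar: exact phrase -> task lookup.
GRAMMAR = {p: task for task, phrases in TRAINING.items() for p in phrases}


def bigrams(words):
    """Pad with sentence markers and return adjacent word pairs."""
    padded = ["<s>"] + words + ["</s>"]
    return list(zip(padded, padded[1:]))


def train(training):
    """Count bigrams per task; return the models and the vocabulary size."""
    models, vocab = {}, set()
    for task, phrases in training.items():
        pair_counts, context_counts = Counter(), Counter()
        for phrase in phrases:
            for prev, word in bigrams(phrase.split()):
                pair_counts[(prev, word)] += 1
                context_counts[prev] += 1
                vocab.update((prev, word))
        models[task] = (pair_counts, context_counts)
    return models, len(vocab)


def log_prob(words, model, vocab_size):
    """Add-one smoothed bigram log-probability of a word sequence."""
    pair_counts, context_counts = model
    return sum(
        math.log((pair_counts[(p, w)] + 1) / (context_counts[p] + vocab_size))
        for p, w in bigrams(words)
    )


def recognize(utterance, grammar, models, vocab_size):
    """Pass 1: grammar lookup. Pass 2: best-scoring bigram model."""
    words = utterance.lower().split()
    key = " ".join(words)
    if key in grammar:
        return grammar[key], "grammar"
    best = max(models, key=lambda t: log_prob(words, models[t], vocab_size))
    return best, "slm"


MODELS, VOCAB_SIZE = train(TRAINING)
```

In this sketch an in-grammar request such as "open the door" is resolved by the first pass, while an out-of-grammar request such as "could you make me some coffee" falls through to the second pass, where the overlapping bigrams favor `make_coffee`.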

Author information

Authors and Affiliations

Honda Research Institute USA, Inc., 800 California Street, Suite 300, Mountain View, CA, 94041, USA
Rakesh Gupta
Institute for Advanced Computer Studies, University of Maryland, College Park, MD, 20742, USA
Ken Hennacy

Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh

About this paper

Cite this paper

Gupta, R., Hennacy, K. (2006). Finite State Grammar Transduction from Distributed Collected Knowledge. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_36

DOI: https://doi.org/10.1007/11671299_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32205-4
Online ISBN: 978-3-540-32206-1
eBook Packages: Computer Science, Computer Science (R0)