Computer assisted grammar construction

  • Conference paper
Grammatical Inference and Applications (ICGI 1994)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 862)

Abstract

This paper presents a system for computer assisted grammar construction (CAGC). The CAGC system is designed to generate broad-coverage grammars for large natural language corpora by combining an extended inside-outside algorithm with an automatic phrase bracketing (AUTO) system, which supplies the extended algorithm with constituent information during learning. The paper demonstrates that the CAGC system can handle realistic natural language problems and that the AUTO system is useful in inside-outside based grammar re-estimation. Performance results, including an analysis of the degree of coverage and the bracketing precision, are presented for a grammar constructed for the Wall Street Journal (WSJ) corpus.
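
The abstract does not spell out how constituent information enters the inside-outside computation. One common way to do this, in the spirit of re-estimation from partially bracketed corpora (Pereira and Schabes, 1992), is to skip any span that crosses a known bracket when computing inside probabilities. The following is only a minimal sketch under that assumption and is not the CAGC or AUTO implementation: the function names, the dictionary-based grammar encoding, the Chomsky-normal-form restriction, and the toy grammar are all hypothetical.

```python
from collections import defaultdict


def crosses(span, brackets):
    """Return True if the half-open span (i, j) overlaps a bracket without nesting."""
    i, j = span
    for a, b in brackets:
        if i < a < j < b or a < i < b < j:
            return True
    return False


def inside_probability(words, lexical, binary, brackets=(), start="S"):
    """Inside pass for a stochastic CFG in Chomsky normal form.

    lexical:  {(A, word): prob} for rules A -> word   (assumed encoding)
    binary:   {(A, B, C): prob} for rules A -> B C    (assumed encoding)
    brackets: half-open (i, j) spans that a parse must not cross
    """
    n = len(words)
    # beta[i, j, A] = probability that A derives words[i:j]
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                beta[i, i + 1, A] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            if crosses((i, j), brackets):
                continue  # span is incompatible with the supplied bracketing
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    beta[i, j, A] += p * beta[i, k, B] * beta[k, j, C]
    return beta[0, n, start]


# Toy grammar (hypothetical): N -> N N (0.5), N -> "a" (0.5)
lexical = {("N", "a"): 0.5}
binary = {("N", "N", "N"): 0.5}
sentence = ["a", "a", "a"]

unconstrained = inside_probability(sentence, lexical, binary, start="N")
constrained = inside_probability(sentence, lexical, binary,
                                 brackets=[(0, 2)], start="N")
print(unconstrained, constrained)
```

In this toy case the bracket over the first two words rules out the right-branching analysis, so the constrained inside probability is half of the unconstrained one. A full re-estimation step would also need the matching outside pass and expected rule counts, which are omitted here.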

Editor information

Rafael C. Carrasco, Jose Oncina

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Young, S.J., Shih, H.H. (1994). Computer assisted grammar construction. In: Carrasco, R.C., Oncina, J. (eds) Grammatical Inference and Applications. ICGI 1994. Lecture Notes in Computer Science, vol 862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58473-0_156

  • DOI: https://doi.org/10.1007/3-540-58473-0_156

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58473-5

  • Online ISBN: 978-3-540-48985-6

  • eBook Packages: Springer Book Archive
