Parsing Ill-Formed Text Using an Error Grammar

Artificial Intelligence Review

Abstract

This paper presents a robust parsing approach designed to handle syntactic errors in text. The approach is based on the concept of an error grammar, that is, a grammar of ungrammatical sentences. The error grammar is derived from a conventional grammar through analysis of a corpus of observed ill-formed sentences. A robust parsing algorithm is presented which is applied only after a conventional bottom-up parsing algorithm has failed. It combines a rule from the error grammar with rules from the normal grammar to arrive at a parse for an ungrammatical sentence. The algorithm is applied to 50 test sentences, with encouraging results.
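The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which is not shown in this preview. It assumes a toy lexicon, a toy grammar in binary (CNF-style) form, and a CYK-style bottom-up recognizer. All names here (`LEXICON`, `NORMAL_RULES`, `ERROR_RULES`, `min_error_rules`, `parse_with_fallback`) are hypothetical. The sketch first tries the normal grammar alone and, only if that fails, allows exactly one error rule to combine with the normal rules.

```python
# Toy lexicon and grammars (hypothetical, for illustration only).
LEXICON = {
    "the": {"Det"},
    "dog": {"N_sg"},
    "dogs": {"N_pl"},
    "barks": {"V_sg"},
    "bark": {"V_pl"},
}
# Normal grammar: binary rules (parent, left child, right child).
NORMAL_RULES = [
    ("S", "NP_sg", "V_sg"),
    ("S", "NP_pl", "V_pl"),
    ("NP_sg", "Det", "N_sg"),
    ("NP_pl", "Det", "N_pl"),
]
# Error grammar: rules describing one kind of ill-formed sentence
# (subject-verb agreement errors in this toy example).
ERROR_RULES = [
    ("S", "NP_sg", "V_pl"),
    ("S", "NP_pl", "V_sg"),
]


def min_error_rules(words, max_errors):
    """CYK-style bottom-up recognition.

    Returns the smallest number of error rules needed to derive S over the
    whole input, or None if no derivation uses at most max_errors of them.
    """
    n = len(words)
    if n == 0:
        return None
    # chart[(i, j)] maps a category to the fewest error rules used
    # to build that category over words[i:j].
    chart = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = {cat: 0 for cat in LEXICON.get(word, ())}
    # Normal rules cost 0 error rules; each error rule costs 1.
    rules = [(r, 0) for r in NORMAL_RULES] + [(r, 1) for r in ERROR_RULES]
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for (parent, a, b), cost in rules:
                    if a in left and b in right:
                        total = left[a] + right[b] + cost
                        if total <= max_errors and total < cell.get(parent, max_errors + 1):
                            cell[parent] = total
            chart[(i, j)] = cell
    return chart[(0, n)].get("S")


def parse_with_fallback(sentence):
    """Try the normal grammar first; fall back to one error rule on failure."""
    words = sentence.lower().split()
    if min_error_rules(words, 0) is not None:
        return "grammatical"
    if min_error_rules(words, 1) is not None:
        return "ill-formed"
    return "no parse"


if __name__ == "__main__":
    for s in ["the dog barks", "the dogs barks", "dog the barks"]:
        print(s, "->", parse_with_fallback(s))
```

Capping the budget at one error rule mirrors the abstract's description of combining "a rule" from the error grammar with normal rules. A fuller treatment would return parse trees rather than a label.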

Cite this article

Foster, J., Vogel, C. Parsing Ill-Formed Text Using an Error Grammar. Artificial Intelligence Review 21, 269–291 (2004). https://doi.org/10.1023/B:AIRE.0000036259.68818.1e
