Abstract
A desirable property for any system dealing with unrestricted natural language text is robustness: the ability to analyze any input regardless of its grammaticality. In this paper we present a novel, general transformation technique that automatically obtains robust, error-repair parsers from standard non-robust parsers. The resulting error-repair parsing schema is guaranteed to be correct whenever our method is applied to a correct parsing schema satisfying certain conditions, which are weak enough to be fulfilled by a wide variety of parsers used in natural language processing.
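To make the notion of error repair concrete, the following is a minimal sketch of a least-errors recognizer. It is not the transformation presented in the paper, which operates on parsing schemata. It is a hand-written dynamic program over a grammar in Chomsky normal form that returns the smallest number of token insertions, deletions and substitutions needed to make an input grammatical. The function name `min_repair_cost`, the rule encoding and the toy grammar are all assumptions made for this illustration.

```python
def min_repair_cost(tokens, terminal_rules, binary_rules, start):
    """Least number of token insertions, deletions and substitutions
    needed to turn `tokens` into a sentence derivable from `start`.

    terminal_rules: list of (A, word) pairs, meaning A -> word
    binary_rules:   list of (A, B, C) triples, meaning A -> B C
    (Hypothetical encoding chosen for this sketch.)
    """
    INF = float("inf")
    nonterminals = {a for a, _ in terminal_rules}
    for a, b, c in binary_rules:
        nonterminals.update((a, b, c))
    n = len(tokens)
    # cost[(A, i, j)]: cheapest rewrite of tokens[i:j] into a string derived from A
    cost = {}
    for length in range(n + 1):
        for i in range(n - length + 1):
            j = i + length
            span = tokens[i:j]
            for a in nonterminals:
                cost[(a, i, j)] = INF
            # A -> word: keep one matching token if there is one (otherwise
            # substitute or insert one) and delete every other token in the span.
            for a, word in terminal_rules:
                c = length - 1 if word in span else max(length, 1)
                cost[(a, i, j)] = min(cost[(a, i, j)], c)
            # A -> B C: split the span at k. The splits k == i and k == j refer
            # back to the same span, so relax until nothing improves.
            changed = True
            while changed:
                changed = False
                for a, b, c in binary_rules:
                    for k in range(i, j + 1):
                        total = cost[(b, i, k)] + cost[(c, k, j)]
                        if total < cost[(a, i, j)]:
                            cost[(a, i, j)] = total
                            changed = True
    return cost[(start, 0, n)]


# Toy grammar, invented for this example.
TERMINAL_RULES = [("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "sees")]
BINARY_RULES = [("S", "NP", "VP"), ("NP", "Det", "N"), ("VP", "V", "NP")]

grammatical = min_repair_cost("the dog sees the cat".split(),
                              TERMINAL_RULES, BINARY_RULES, "S")
missing_det = min_repair_cost("the dog sees cat".split(),
                              TERMINAL_RULES, BINARY_RULES, "S")
```

A non-robust recognizer would simply reject the second input. The sketch instead reports a cost of 0 for the grammatical sentence and 1 for the sentence with a missing determiner, which is the kind of graded answer an error-repair parser provides.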
Partially supported by Ministerio de Educación y Ciencia and FEDER (HUM2007-66607-C04) and Xunta de Galicia (PGIDIT07SIN005206PR, INCITE08E1R104022ES, INCITE08ENA305025ES, INCITE08PXIB302179PR and Rede Galega de Procesamento da Linguaxe e Recuperación de Información).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gómez-Rodríguez, C., Alonso, M.A., Vilares, M. (2009). A General Method for Transforming Standard Parsers into Error-Repair Parsers. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_17
DOI: https://doi.org/10.1007/978-3-642-00382-0_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00381-3
Online ISBN: 978-3-642-00382-0
eBook Packages: Computer Science, Computer Science (R0)