Abstract
The Earley algorithm is a widely used parsing method in natural language processing applications. We introduce a variant of Earley parsing that is based on a “delayed” recognition of constituents: the recognition of a constituent starts only when all of its subconstituents have been found in the input string. This is particularly advantageous when the partial analysis of a constituent cannot be completed and, more generally, whenever productions share a suffix of their right-hand sides (even if their left-hand side nonterminals differ). Although the two algorithms have the same asymptotic time and space complexity, in practice our algorithm reduces the time and space requirements of the original method, as shown by the reported experimental results.
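For orientation, the following is a minimal sketch of the standard (textbook) Earley recognizer that the abstract takes as its baseline. It is not the delayed-recognition variant introduced in the paper. The names `Item`, `earley_recognize`, and `GRAMMAR` are hypothetical, and the sketch assumes a grammar without empty right-hand sides.

```python
from collections import namedtuple

# An Earley item: production lhs -> rhs, dot position, and the input
# position (origin) at which recognition of this production started.
Item = namedtuple("Item", "lhs rhs dot origin")


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Assumption: no production has an empty right-hand side.
    """
    n = len(tokens)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, rhs, 0, 0))

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            item = agenda.pop()
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predictor: start every production of the expected nonterminal.
                    for rhs in grammar[sym]:
                        new = Item(sym, rhs, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < n and tokens[i] == sym:
                    # Scanner: move the dot over a matching terminal.
                    chart[i + 1].add(
                        Item(item.lhs, item.rhs, item.dot + 1, item.origin)
                    )
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        new = Item(parent.lhs, parent.rhs,
                                   parent.dot + 1, parent.origin)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)

    return any(
        it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
        for it in chart[n]
    )


# Hypothetical toy grammar: S -> S + a | a
GRAMMAR = {"S": [("S", "+", "a"), ("a",)]}
```

With this toy grammar, `earley_recognize(GRAMMAR, "S", ["a", "+", "a"])` accepts the input, while the incomplete string `["a", "+"]` is rejected. Note that the predictor creates items before any subconstituent has been seen, which is the behaviour the delayed variant is described as avoiding.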
Research by the first author is carried out within the framework of the Priority Programme Language and Speech Technology (TST). The TST-Programme is sponsored by NWO (Dutch Organization for Scientific Research).
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nederhof, MJ., Satta, G. (1997). A variant of Earley parsing. In: Lenzerini, M. (eds) AI*IA 97: Advances in Artificial Intelligence. AI*IA 1997. Lecture Notes in Computer Science, vol 1321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63576-9_98
DOI: https://doi.org/10.1007/3-540-63576-9_98
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63576-5
Online ISBN: 978-3-540-69601-8
eBook Packages: Springer Book Archive