Abstract
The Generalized LR parsing algorithm for context-free grammars, introduced by Tomita in 1986, is a polynomial-time implementation of nondeterministic LR parsing that uses a graph-structured stack to represent the contents of the nondeterministic parser's pushdown for all possible branches of computation at a single computation step. It was developed specifically as a solution for practical parsing tasks arising in computational linguistics, and has indeed proved well suited to natural language processing. Conjunctive grammars extend context-free grammars by allowing the use of an explicit intersection operation within grammar rules. This paper develops a new LR-style parsing algorithm for these grammars, based on the same idea of a graph-structured pushdown, where the simultaneous existence of several paths in the graph is used to perform this intersection operation. The underlying finite automata are treated in the most general way: instead of showing the algorithm's correctness for one particular way of constructing automata, the paper defines a wide class of automata usable with a given grammar, which includes not only the traditional LR(k) automata but also, for instance, a trivial automaton with a single reachable state. A modification of the SLR(k) table construction method that makes use of specific properties of conjunctive grammars is provided as one possible way of constructing finite automata for use with the algorithm. It is shown that the algorithm is applicable to any conjunctive grammar and can be implemented to work in no more than cubic time. Additionally, the algorithm can be made to work in linear time for the Boolean closure of the family of deterministic context-free languages.
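To make the intersection operation concrete, the following minimal sketch encodes a commonly used example of a conjunctive grammar, one that generates the non-context-free language { a^n b^n c^n : n ≥ 0 } through the rule S → AB & DC, and checks membership with a naive memoized recognizer. This is not the LR-style algorithm developed in the paper; it only illustrates that a rule with several conjuncts applies to a substring when every conjunct derives that same substring. The names GRAMMAR, recognize, derives and matches are hypothetical, and the sketch assumes a grammar, such as this one, in which no nonterminal depends on itself over the same span.

```python
from functools import lru_cache

# Nonterminal -> list of rules; a rule is a list of conjuncts;
# a conjunct is a tuple of symbols. Keys of GRAMMAR are nonterminals,
# every other symbol is treated as a terminal character.
GRAMMAR = {
    "S": [[("A", "B"), ("D", "C")]],   # S -> AB & DC
    "A": [[("a", "A")], [()]],         # A -> aA | epsilon
    "B": [[("b", "B", "c")], [()]],    # B -> bBc | epsilon
    "C": [[("c", "C")], [()]],         # C -> cC | epsilon
    "D": [[("a", "D", "b")], [()]],    # D -> aDb | epsilon
}

def recognize(word, start="S"):
    """Return True if `word` is generated by GRAMMAR from `start`."""

    @lru_cache(maxsize=None)
    def derives(nt, i, j):
        # Some rule applies if ALL of its conjuncts derive word[i:j].
        return any(
            all(matches(conjunct, i, j) for conjunct in rule)
            for rule in GRAMMAR[nt]
        )

    @lru_cache(maxsize=None)
    def matches(symbols, i, j):
        # Does the symbol sequence derive exactly word[i:j]?
        if not symbols:
            return i == j
        head, rest = symbols[0], symbols[1:]
        if head in GRAMMAR:
            # Try every split point for the leading nonterminal.
            return any(
                derives(head, i, k) and matches(rest, k, j)
                for k in range(i, j + 1)
            )
        return i < j and word[i] == head and matches(rest, i + 1, j)

    return derives(start, 0, len(word))

if __name__ == "__main__":
    for w in ["", "abc", "aabbcc", "aabbc", "abcc"]:
        print(repr(w), recognize(w))
```

The first conjunct AB accepts a^i b^n c^n and the second conjunct DC accepts a^n b^n c^k, so only strings with equal numbers of a, b and c satisfy both.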
References
A. V. Aho, R. Sethi and J. D. Ullman, Compilers: Principles, Techniques and Tools, Addison-Wesley, Reading, MA, 1986.
N. P. Chapman, LR Parsing, Cambridge University Press, Cambridge, 1987.
F. DeRemer, Simple LR(k) grammars, Communications of the ACM, 14(7): 453–460, 1971.
J. Earley, An efficient context-free parsing algorithm, Communications of the ACM, 13: 94–102, 1970.
S. L. Graham, M. A. Harrison and W. L. Ruzzo, An improved context-free recognizer, ACM Transactions on Programming Languages and Systems, 2(3): 415–462, 1980.
M. A. Harrison, Introduction to Formal Language Theory, Addison-Wesley, Reading, MA, 1978.
D. E. Knuth, On the translation of languages from left to right, Information and Control, 11: 269–289, 1967.
A. Okhotin, Conjunctive grammars, Journal of Automata, Languages and Combinatorics, 6(4): 2001.
A. Okhotin, Top-down parsing of conjunctive languages, Grammars, 5(1): 21–40, 2002.
A. Okhotin, Whale Calf, a parser generator for conjunctive grammars, to be presented at CIAA 2002, Tours, France, July 3–6, 2002. The software is available at http://www.cs.queensu.ca/home/okhotin/whalecalf/.
A. Okhotin, A recognition and parsing algorithm for arbitrary conjunctive grammars, to appear in Theoretical Computer Science.
K. Sikkel and A. Nijholt, Parsing of context-free languages, in G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, Vol. 2, 61–100, Springer, Berlin, 1997.
S. Sippu and E. Soisalon-Soininen, Parsing Theory, Vol. II: LR(k) and LL(k) Parsing, EATCS Monographs on Theoretical Computer Science, Vol. 20, Springer, Berlin, 1991.
H. Tanaka, T. Tokunaga and M. Aizawa, Integration of morphological and syntactic analysis based on GLR parsing, in H. C. Bunt and M. Tomita, editors, Recent Advances in Parsing Technology, 325–342, Kluwer, Dordrecht, 1996.
M. Tomita, Efficient Parsing for Natural Language, Kluwer, Dordrecht, 1986.
D. H. Younger, Recognition and parsing of context-free languages in time n^3, Information and Control, 10: 189–208, 1967.
Cite this article
Okhotin, A. LR Parsing for Conjunctive Grammars. Grammars 5, 81–124 (2002). https://doi.org/10.1023/A:1016329527130
DOI: https://doi.org/10.1023/A:1016329527130