Conjunctive and Boolean grammars: The true general case of the context-free grammars

doi:10.1016/j.cosrev.2013.06.001

Computer Science Review

Volume 9, August 2013, Pages 27-59

https://doi.org/10.1016/j.cosrev.2013.06.001

Abstract

Conjunctive grammars extend the definition of a context-free grammar by allowing a conjunction operation in the rules; Boolean grammars are further equipped with an explicit negation. These grammars maintain the main principle of the context-free grammars, that of defining syntactically correct strings inductively from their substrings, but lift the restriction of using disjunction only. This paper surveys the results on conjunctive and Boolean grammars obtained over the last decade, comparing them to the corresponding results for ordinary context-free grammars and their main subfamilies. Much attention is given to parsing algorithms, most of which are inherited from the case of ordinary context-free grammars without increasing their computational complexity. The intended readership includes any computer scientists looking for a compact and accessible description of this formal model and its properties, as well as for a general outlook on formal grammars. The paper is also addressed to theoretical computer scientists seeking a subject for research; an account of pure theoretical research in the area presented in this paper is accompanied by a list of significant open problems, with an award offered for the first correct solution of each problem. Several directions for future investigation are proposed.

Introduction

Syntax of languages, both natural and artificial, is defined inductively, in the sense that syntactic properties of strings of symbols are logically determined by the properties of their substrings. A grammar of a particular language gives names to these syntactic properties, and explains how shorter strings with certain properties can be concatenated to obtain longer strings with another property. For instance, a rule of a hypothetical grammar for a natural language may say that a subject followed by a predicate is a sentence; the form of a subject and a predicate is defined by other rules of the grammar. Similarly, a grammar for a programming language may define a loop statement as a keyword “while” followed by an expression and a statement (where the latter may, in particular, be another loop statement). An abstract language $\{a^{n} b^{n} ∣ n ⩾ 0\}$ over an alphabet $\{a, b\}$ is completely defined by saying that a string belongs to this language if and only if it is either an empty sequence of symbols, or a string of the form $a w b$, where $w$ is a string belonging to this language.

Definitions of this kind are naturally given when the syntax of any language needs to be clearly described, such as in the textbook on the English grammar by Reed and Kellogg [1] or in the first description of the Algol programming language [2]. Following Chomsky’s [3] influential works, such definitions became known as context-free grammars, reflecting the fact that syntactic properties of a substring do not depend on the context in which it occurs. The form of the rules was fixed to $A \to X_{1} \dots X_{ℓ}$ $(*)$, where the symbol $A$ represents a syntactic notion defined in the grammar, such as a sentence or a loop statement (these auxiliary notions are called “nonterminal symbols” for historical reasons), and each symbol $X_{i}$ may be another nonterminal symbol or a symbol of the target alphabet. A rule $(*)$ means that every string representable as a concatenation $X_{1} \dots X_{ℓ}$ therefore has the property $A$. Returning to the above examples, the hypothetical grammar for a natural language has a rule $\mathit{Sentence} \to \mathit{Subject} \; \mathit{Predicate}$, while the grammar for the abstract language $\{a^{n} b^{n} ∣ n ⩾ 0\}$ consists of two rules, $S \to a S b$ and $S \to ε$ (where $ε$ denotes the empty string), and represents a complete inductive definition.
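
The inductive definition given by the two rules $S \to a S b$ and $S \to ε$ can be read directly as a recursive membership test. The following Python sketch is only an illustration of that reading; the function name `in_anbn` is hypothetical and does not come from the survey.

```python
def in_anbn(w: str) -> bool:
    """Return True iff w belongs to { a^n b^n | n >= 0 }.

    Each branch mirrors one rule of the grammar S -> a S b | epsilon.
    """
    if w == "":
        # Rule S -> epsilon: the empty string has the property S.
        return True
    if len(w) >= 2 and w[0] == "a" and w[-1] == "b":
        # Rule S -> a S b: strip one 'a' and one 'b', test the middle part.
        return in_anbn(w[1:-1])
    return False
```

The test inspects only the string itself and its substrings, which is the sense in which the definition does not depend on any surrounding context.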

Context-free grammars may be thought of as a logic for inductive descriptions of syntax, in which the propositional connectives available for combining syntactical conditions are restricted to disjunction only. Indeed, having multiple rules for a single nonterminal $A$ essentially represents disjunction: two rules $A \to α$ and $A \to β$ mean that a string has the property $A$ if and only if it is representable as $α$ or as $β$. Any other Boolean operations, such as conjunction and negation, are not expressible using ordinary context-free grammars: that is, there is no general way of specifying all strings that satisfy a condition $α$ and at the same time another condition $β$, and no way to denote the set of strings that do not satisfy a condition $α$. Furthermore, it is well known that the intersection of two context-free languages or the complement of a context-free language need not be context-free [4]; in other words, in this particular logic, conjunction or negation cannot be represented through disjunction only.

This omission in the formalism suggests the idea of defining a more general logic, which would maintain the main principles behind the context-free grammars, but at the same time extend the set of available propositional connectives. The result could then be regarded as a completion of the incomplete standard definition of context-free grammars. Early attempts in this direction were made by Latta and Wall [5] and by Heilbrunner and Schmitz [6], who proposed formalisms for specifying Boolean combinations of context-free languages. Latta and Wall [5], in particular, argued for the relevance of their formalism to linguistics. However, the use of conjunction and negation in these grammars was heavily restricted, and one still could not use them as freely as the disjunction.

An extension of the definition of context-free grammars featuring unrestricted conjunction, introduced by the author [7], is known as a conjunctive grammar. In these grammars, the conjunction of two syntactical conditions can be directly expressed in the form of a rule $A \to X_{1} \dots X_{ℓ} \& Y_{1} \dots Y_{m}$, which asserts that every string representable both as $X_{1} \dots X_{ℓ}$ and as $Y_{1} \dots Y_{m}$ therefore has the property $A$. The more general Boolean grammars [8] further extend the definition by allowing an explicit negation, that is, every Boolean operation is directly expressible in their formalism. For instance, the set of strings representable as $X_{1} \dots X_{ℓ}$, and at the same time not representable as $Y_{1} \dots Y_{m}$, can be written as a rule $A \to X_{1} \dots X_{ℓ} \& \neg Y_{1} \dots Y_{m}$. Both types of grammars remain essentially context-free, in the sense that the deduction of the properties of a string does not depend on the context in which it occurs; the properties of a string are defined as a function of the properties of the substrings into which the string can be split. Therefore, as rightfully noted by Kountouriotis et al. [9], “conjunctive context-free grammars” and “Boolean context-free grammars” would be appropriate names for these models. Along with most of the literature, this paper assumes the shorter names, yet refers to the fragment of Boolean grammars featuring the disjunction only as the ordinary context-free grammars.
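
To illustrate a rule with conjunction, consider the language $\{a^{n} b^{n} c^{n} ∣ n ⩾ 0\}$, which no ordinary context-free grammar describes. The grammar below is the example commonly given for it in the literature on conjunctive grammars; it is recalled here as a sketch rather than quoted from this survey, and the nonterminal names are arbitrary.

$$
\begin{aligned}
S &\to A B \mathbin{\&} D C \\
A &\to a A \mid ε \\
B &\to b B c \mid ε \\
C &\to c C \mid ε \\
D &\to a D b \mid ε
\end{aligned}
$$

Here $AB$ describes the strings $a^{i} b^{n} c^{n}$ and $DC$ describes the strings $a^{n} b^{n} c^{k}$, so a string satisfying both conjuncts of the rule for $S$ has equally many symbols $a$, $b$ and $c$.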

All semiformal interpretations of rules given so far show the intended meaning of grammars, but are not yet definitions in the mathematical sense. Viewing grammars as a logic, the most direct approach for defining their semantics is by introducing a deduction of elementary statements of the form “string $w$ has property $A$ ”, which are inferred from each other according to the rules of the grammar. This general approach may be regarded as folklore; in particular, it was used by Sikkel [10] to explain computations carried out by parsers. Alternatively, the same logical dependencies can be represented by interpreting a grammar as a system of equations with formal languages as unknowns, as done by Ginsburg and Rice [11]. Finally, the most widespread definition of ordinary context-free grammars, given by Chomsky [3], is by string rewriting, when a rule $A \to α$ is regarded as a production for rewriting a symbol $A$ with a substring $α$ , so that an abstract symbol for a sentence is eventually rewritten into an actual sentence. However, it must be stressed that even though the definition by rewriting is indeed the simplest one, it is nothing more than a convenient characterization, which leaves behind the true nature of formal grammars: that is, logical dependence between items of the form “string $w$ has property $A$ ”. Identifying formal grammars with rewriting systems is a grave error in judgement.

A conjunctive grammar can be defined in the same three ways as an ordinary context-free grammar: by a deduction system [12] with appropriately extended inference rules, by language equations [13] involving the intersection operation, and by rewriting [7], which is augmented to a special kind of term rewriting. These definitions are explained in detail in Section 2 of this survey, along with several characteristic examples of conjunctive grammars, which demonstrate how conjunction can be put to use in the well-known setting of inductive definitions of syntax.
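
The definition by deduction can be made concrete with a small program. The Python sketch below infers items of the form “substring $w[i:j]$ has property $A$” until no new item can be deduced, and accepts if the item for the whole string and the initial symbol was obtained. It is a naive illustration under stated assumptions, not one of the parsing algorithms of this survey: the grammar encoding, the names `recognizes` and `ANBNCN`, and the example grammar for $\{a^{n} b^{n} c^{n} ∣ n ⩾ 0\}$ are all choices made here.

```python
from typing import Dict, List, Set, Tuple

Conjunct = List[str]             # a sequence of terminals and nonterminals
Rule = List[Conjunct]            # a conjunction of conjuncts
Grammar = Dict[str, List[Rule]]  # alternative rules for each nonterminal


def recognizes(grammar: Grammar, start: str, w: str) -> bool:
    """Deduce all items (A, i, j), meaning that w[i:j] has property A."""
    n = len(w)
    items: Set[Tuple[str, int, int]] = set()

    def matches(conjunct: Conjunct, i: int, j: int) -> bool:
        # Positions reachable after matching a prefix of the conjunct.
        ends = {i}
        for sym in conjunct:
            nxt = set()
            for p in ends:
                if sym in grammar:  # nonterminal: use items deduced so far
                    nxt.update(q for q in range(p, j + 1)
                               if (sym, p, q) in items)
                elif p < j and w[p] == sym:  # terminal symbol
                    nxt.add(p + 1)
            ends = nxt
        return j in ends

    changed = True
    while changed:  # repeat until no new item can be deduced
        changed = False
        for a, rules in grammar.items():
            for i in range(n + 1):
                for j in range(i, n + 1):
                    if (a, i, j) in items:
                        continue
                    # A rule applies if every one of its conjuncts matches.
                    if any(all(matches(c, i, j) for c in rule)
                           for rule in rules):
                        items.add((a, i, j))
                        changed = True
    return (start, 0, n) in items


# Assumed example: S -> AB & DC, A -> aA | eps, B -> bBc | eps,
# C -> cC | eps, D -> aDb | eps.  An empty conjunct [] stands for epsilon.
ANBNCN: Grammar = {
    "S": [[["A", "B"], ["D", "C"]]],
    "A": [[["a", "A"]], [[]]],
    "B": [[["b", "B", "c"]], [[]]],
    "C": [[["c", "C"]], [[]]],
    "D": [[["a", "D", "b"]], [[]]],
}
```

Restricting the deduction to substrings of the input loses nothing, because every premise used to infer an item for a string concerns one of its substrings. An ordinary context-free grammar is the special case in which every rule has exactly one conjunct.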

The definition of Boolean grammars is more complicated, because a grammar may express a contradiction of the form $A \to \neg A$ , which states that a string has the property $A$ if and only if it does not have this property. Thus, for Boolean grammars, the dependence of items of the form “string $w$ has property $A$ ” has a more complicated form, which calls for being expressed by equations. The simpler approach to the definition uses a system of language equations, in which the negation is interpreted by complementation, and imposes a certain condition upon this system, which ensures that it has a unique solution; this unique solution then defines the meaning of a grammar. A more general definition was given by Kountouriotis et al. [14], who interpreted a Boolean grammar in terms of three-valued languages, so that a string may belong to a language, not belong to it, or have an undetermined membership status. Both methods are explained in Section 3. There is no known definition of Boolean grammars by rewriting.

This survey aims to present the research on conjunctive and Boolean grammars carried out over the last decade, and to justify the thesis that these grammars are the true general case of the context-free grammars. The crucial points in support of this statement are that, on the one hand, conjunctive and Boolean grammars maintain the main inductive principles behind the ordinary context-free grammars, which account for their intuitive clarity and suitability for representing syntax, and only offer additional logical connectives within the same framework; these further expressive means allow giving meaningful descriptions of quite a few syntactic constructs not representable by ordinary context-free grammars. On the other hand, this extra power does not damage the crucial properties of context-free grammars: the intuitive clarity of descriptions is preserved, the upper bounds on time complexity remain the same, and most of the parsing algorithms are directly inherited from the context-free case. In particular, the basic bottom-up parsing algorithms for ordinary context-free grammars of the general form, such as the Cocke–Kasami–Younger algorithm [15], [16] and its variants [17], [18], [19], extend to Boolean grammars so smoothly and obviously that one can hardly see any reason for limiting logical connectives in a grammar to disjunction only. Applying some other algorithms, such as the Lang–Tomita generalized LR [20], [21] and the recursive descent, to Boolean grammars requires elaborating their flow control, but, in general, the elaboration amounts to having a parser compute conjunction and negation wherever these operations occur in the grammar.

Although conjunctive and Boolean grammars inherit many practical properties of ordinary context-free grammars, they have some essential differences in their theoretical properties. One of these differences concerns sublinear-time parallel recognition algorithms operating on a circuit: such algorithms are known for ordinary context-free grammars [22], [23], [24], but most likely do not exist even for conjunctive grammars, since these grammars are capable of representing some P-complete languages. Another difference concerns the decision problems: for an ordinary context-free grammar, one can effectively test whether it generates a non-empty language, but it is undecidable whether a given grammar generates the set of all strings; both problems are undecidable for conjunctive grammars. Yet another difference concerns grammars over a one-symbol alphabet: while ordinary context-free grammars are limited to regular subsets of $a^{*}$, conjunctive grammars can generate a wide variety of one-symbol languages [25], [26], [27].

The last topic of this survey is summarizing the properties of conjunctive and Boolean grammars, and comparing them with the properties of other important families of formal grammars. Considering all types of grammars together, and understanding conjunctive and Boolean grammars as an essential part of the theory of formal grammars, leads to a new outlook on grammars as such. This new outlook begins with a new classification of meaningful families of formal grammars, done in terms of the amount of ambiguity and nondeterminism, various motivated restrictions on the form of rules (such as linear concatenation), and the set of allowed logical connectives (limited to disjunction alone in ordinary context-free grammars).

The proposed classification of grammars notably ignores the first and still the most well-known classification of families of formal languages: the Chomsky hierarchy. Why is it ignored? Chomsky’s hierarchy comprises the regular languages (“type 3”), the context-free languages (“type 2”), the languages recognized in nondeterministic linear space (“type 1”) and the recursively enumerable sets (“type 0”). These are the families of languages considered in the early days of computer science by Chomsky [3], who had formalized the intuitive notion of a formal grammar using string-rewriting systems, and then attempted to implement further linguistic ideas by altering this definition. This had a surprising outcome: though none of the modifications had anything to do with syntax, all three of them turned out to be important models of computation: “type 0” is a reformulation of a nondeterministic Turing machine, “type 3” reformulates nondeterministic finite automata, and “type 1” became the first computational complexity class to be ever considered. Putting these three models of computability, along with the basic model of syntax, within a single framework had a significant impact on the early development of the theory of computation, and Chomsky’s hierarchy remains a milestone in the history of computer science. However, as far as models of syntax are concerned, this hierarchy did not serve its purpose. Despite decades of subsequent laborious studies, the research in string-rewriting systems centred around context-free rewriting revealed no other viable model of syntax besides the context-free grammar. This leads to a conclusion that the representation of context-free grammars by string rewriting is a unique coincidence, rather than a systematic association between rewriting and grammars. Furthermore, the ungrammatical levels of the Chomsky hierarchy (“type 1” and “type 0”) are useless even as a point of reference, because meaningful syntax has complexity much below $\mathrm{NSPACE}(n)$. In spite of its historical importance, the Chomsky hierarchy is hardly relevant anymore.

The research on formal grammars carried out over the last fifty years revealed quite a few important families of formal grammars, obtained by restricting ordinary context-free grammars: $LR(1)$ grammars (and the deterministic context-free languages they generate), linear grammars, unambiguous grammars, etc. These families form the basis of the proposed hierarchy of formal grammars, which is then extended towards the conjunctive and Boolean grammars, as well as to their subfamilies defined by analogy with the subfamilies of ordinary context-free grammars, such as, for instance, linear conjunctive grammars or unambiguous Boolean grammars. In Section 8, all families in the hierarchy are compared in terms of their expressive power, closure properties under basic operations and the decidability and complexity of various properties. The expressive power of different grammars is furthermore related to the computational complexity classes between $NC^{1}$ and $P$.

The survey is concluded with some suggested directions for research on conjunctive and Boolean grammars, and on formal grammars in general. First, there is a list of significant theoretical open problems, with an award of 360 Canadian dollars offered by the author [28] for the first correct solution of each problem. Nine problems were originally stated in 2006; since then, two problems were solved [25], [29], and, at the time of writing, seven remain open. This survey includes the statements of all problems, and briefly comments on possible approaches to them. Furthermore, some general questions worth investigation are suggested, including possible discovery of new variants of formal grammars, as well as implementation and application of conjunctive and Boolean grammars as they are.

Section snippets

Three equivalent definitions

A conjunctive grammar is a quadruple $G = (Σ, N, R, S)$ , in which:

• $Σ$ is the alphabet of the language being defined, that is, a finite set of symbols, from which the strings in the language are built;
• $N$ is a finite set of auxiliary notions used in the grammar, each of them represents a syntactic property that a string in $Σ^{*}$ may have or not have; for historical reasons, they are called nonterminal symbols or nonterminals, even though this name ought to have been deprecated long ago;
• $R$ is a finite set of

Intuitive definition

Boolean grammars are context-free grammars equipped with all propositional connectives, or, in other words, conjunctive grammars augmented with negation. Conversely, conjunctive grammars are the monotone fragment of Boolean grammars.

A Boolean grammar is a quadruple $G = (Σ, N, R, S)$ , in which

• $Σ$ is the alphabet;
• $N$ is the set of nonterminal symbols;
• $R$ is a finite set of rules of the form $A \to α_{1} \& \dots \& α_{m} \& \neg β_{1} \& \dots \& \neg β_{n}$ with $A \in N$, $m, n ⩾ 0$, $m + n ⩾ 1$ and $α_{i}, β_{j} \in (Σ \cup N)^{*}$;
• $S \in N$ is the initial symbol.

The only difference from a

Grammars with linear concatenation

A special case of ordinary context-free grammars, which can express a concatenation of a nonterminal symbol only with terminal strings, is known as a linear context-free grammar. In such grammars, every rule $A \to α$ has $α \in Σ^{*} \cup Σ^{*} N Σ^{*}$ . These grammars are notable for their lower computational complexity and other noteworthy properties.
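
The condition $α \in Σ^{*} \cup Σ^{*} N Σ^{*}$ says that the right-hand side of a rule contains at most one nonterminal symbol. A minimal Python sketch of this check follows; the representation of rules as pairs and the function names are assumptions made for the illustration.

```python
from typing import List, Set, Tuple


def is_linear_rule(rhs: List[str], nonterminals: Set[str]) -> bool:
    """True iff rhs lies in Sigma* or Sigma* N Sigma*.

    That is, rhs contains at most one occurrence of a nonterminal.
    """
    return sum(1 for sym in rhs if sym in nonterminals) <= 1


def is_linear_grammar(rules: List[Tuple[str, List[str]]],
                      nonterminals: Set[str]) -> bool:
    """True iff every rule A -> rhs of the grammar is linear."""
    return all(is_linear_rule(rhs, nonterminals) for _, rhs in rules)
```

For example, the grammar with the rules $S \to a S b$ and $S \to ε$ passes this check, whereas a rule such as $S \to S S$ does not.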

Similarly to the case of grammars with disjunction only, a conjunctive grammar is called linear conjunctive, if every rule it contains is either of the form $A \to u_{1} B_{1} v_{1} & \dots & u_{n} B_{n}$

Basic parsing algorithms

Parsing means decomposing a string into substrings according to a grammar, and verifying that it is a well-formed sentence. Given a string as an input, a parsing algorithm should determine whether the string belongs to the language described by a fixed (or a given) grammar, and if it does, construct a parse tree of the string, as it is defined by the grammar.
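
As an illustration of the recognition half of this task, the following Python sketch adapts the tabular Cocke–Kasami–Younger method to rules with conjunction and negation. It rests on assumptions that are not taken from this text: the grammar is in a binary normal form where every rule is either $A \to a$ or a conjunction of positive and negated pairs of nonterminals with at least one positive pair, no nonterminal defines the empty string, and the names `cyk_boolean`, `TERMINAL_RULES` and `LONG_RULES` are hypothetical. Parse tree construction is omitted.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]
# (A, positive pairs, negated pairs) encodes A -> B1C1 & ... & ~D1E1 & ...
LongRule = Tuple[str, List[Pair], List[Pair]]


def cyk_boolean(terminal_rules: Dict[str, Set[str]],
                long_rules: List[LongRule],
                start: str, w: str) -> bool:
    n = len(w)
    if n == 0:
        return False  # assumption: no nonterminal defines the empty string
    # table[i][j] holds the nonterminals A such that w[i:j] has property A.
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, ch in enumerate(w):
        table[i][i + 1] = set(terminal_rules.get(ch, set()))
    for length in range(2, n + 1):  # shorter substrings come first
        for i in range(n - length + 1):
            j = i + length
            # All pairs (B, C) such that w[i:j] splits into a B and a C.
            pairs: Set[Pair] = set()
            for k in range(i + 1, j):
                pairs.update(product(table[i][k], table[k][j]))
            for a, pos, neg in long_rules:
                if (all(p in pairs for p in pos)
                        and not any(q in pairs for q in neg)):
                    table[i][j].add(a)
    return start in table[0][n]


# Assumed example grammar: S defines { a^n b^n c^n | n >= 1 } and
# N defines { a^n b^n c^k | n, k >= 1, k != n }.
TERMINAL_RULES = {"a": {"Ta", "A"}, "b": {"Tb"}, "c": {"Tc", "C"}}
LONG_RULES: List[LongRule] = [
    ("A", [("Ta", "A")], []),             # A: a^+
    ("C", [("Tc", "C")], []),             # C: c^+
    ("X", [("Ta", "Tb")], []),            # X: a^n b^n
    ("X", [("Ta", "X1")], []),
    ("X1", [("X", "Tb")], []),
    ("Y", [("Tb", "Tc")], []),            # Y: b^n c^n
    ("Y", [("Tb", "Y1")], []),
    ("Y1", [("Y", "Tc")], []),
    ("S", [("X", "C"), ("A", "Y")], []),  # S -> XC & AY
    ("N", [("X", "C")], [("A", "Y")]),    # N -> XC & ~AY
]
```

Because all pairs for a substring depend only on strictly shorter substrings, the negated conjuncts can be evaluated on an already completed part of the table. Compared with the ordinary context-free case, the only change in this sketch is the final test on the set of pairs.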

Advanced approaches to parsing

This section describes several parsing methods that have theoretically superior performance to the basic parsing algorithms discussed above. Even though some of them are quite unlikely to be useful in practice, they are important for understanding the theoretical complexity of formal grammars.

Grammars over a one-symbol alphabet

Conjunctive grammars over a unary alphabet form a special area of study. Though such grammars are completely irrelevant to the main purpose of formal grammars, that of representing syntax, they are theoretically important as a pure case of conjunctive grammars, which already shows some of their distinctive properties. Furthermore, they are crucial in the study of language equations, where their properties form the basis of the study of the more general language equations over a unary alphabet
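
A distinctive property of the unary case is that conjunctive grammars can describe non-regular languages such as $\{a^{4^{n}} ∣ n ⩾ 0\}$. The Python sketch below evaluates a four-nonterminal grammar for this language on all string lengths up to a bound. The grammar is recalled from the literature on unary conjunctive grammars and is not quoted from this text, so its exact rules should be treated as an assumption that the program checks only up to the chosen bound; the names `RULES` and `unary_lengths` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed grammar over the alphabet {a}:
#   A1 -> A1 A3 & A2 A2 | a
#   A2 -> A1 A1 & A2 A6 | aa
#   A3 -> A1 A2 & A6 A6 | aaa
#   A6 -> A1 A2 & A3 A3
# Each entry: (conjuncts of the long rule, length of the terminal rule).
RULES: Dict[str, Tuple[List[Tuple[str, str]], Optional[int]]] = {
    "A1": ([("A1", "A3"), ("A2", "A2")], 1),
    "A2": ([("A1", "A1"), ("A2", "A6")], 2),
    "A3": ([("A1", "A2"), ("A6", "A6")], 3),
    "A6": ([("A1", "A2"), ("A3", "A3")], None),
}


def unary_lengths(bound: int) -> Dict[str, Set[int]]:
    """For each nonterminal, the lengths n <= bound with a^n in its language."""
    lengths: Dict[str, Set[int]] = {a: set() for a in RULES}

    def concat(x: str, y: str) -> Set[int]:
        # Over a one-symbol alphabet, concatenation adds string lengths.
        return {p + q for p in lengths[x] for q in lengths[y]
                if p + q <= bound}

    changed = True
    while changed:  # iterate the language equations to their least solution
        changed = False
        for a, (conjuncts, base) in RULES.items():
            # Conjunction is the intersection of the concatenations.
            new = set.intersection(*(concat(x, y) for x, y in conjuncts))
            if base is not None and base <= bound:
                new.add(base)
            if not new <= lengths[a]:
                lengths[a] |= new
                changed = True
    return lengths
```

With a bound of 300, the set computed for `A1` consists of the lengths 1, 4, 16, 64 and 256, while the other three nonterminals yield the lengths $2 \cdot 4^{n}$, $3 \cdot 4^{n}$ and $6 \cdot 4^{n}$. An ordinary context-free grammar over a one-symbol alphabet cannot define such a set, since its language must be regular.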

Hierarchy of language families

In order to compare the expressive power of meaningful models of syntax, one should begin with compiling a list of such models. The main point of reference is, of course, the family of ordinary context-free grammars (CF). Many important families of languages are defined by restricting context-free grammars in one or another way. Prohibiting syntactic ambiguity leads to the unambiguous context-free grammars (UnambCF), and to their special cases: the $LR(k)$ context-free grammars, which define the

Nine theoretical problems

The previous survey of Boolean grammars [28] introduced nine open problems, each concerned with some theoretical property of Boolean grammars. Since then, two problems have been solved, and seven others remain open. An award is offered for the first correct solution of each of the remaining problems.

Acknowledgements

The author was supported by the Academy of Finland under grants 134860 and 257857.

References (145)

N. Chomsky, On certain formal properties of grammars, Information and Control (1959)
S. Scheinberg, Note on the Boolean properties of context free languages, Information and Control (1960)
S. Heilbrunner et al., An efficient recognizer for the Boolean closure of context-free languages, Theoretical Computer Science (1991)
A. Okhotin, Boolean grammars, Information and Computation (2004)
V. Kountouriotis et al., A game-theoretic characterization of Boolean grammars, Theoretical Computer Science (2011)
A. Okhotin, The dual of concatenation, Theoretical Computer Science (2005)
V. Kountouriotis et al., Well-founded semantics for Boolean grammars, Information and Computation (2009)
D.H. Younger, Recognition and parsing of context-free languages in time $n^{3}$, Information and Control (1967)
W.L. Ruzzo, On uniform circuit complexity, Journal of Computer and System Sciences (1981)
E. Moriya, A grammatical characterization of alternating pushdown automata, Theoretical Computer Science (1989)
Preliminary report: international algebraic language, Communications of the ACM (1958)
M. Latta, R. Wall, Intersective context-free languages, in: Lenguajes Naturales y Lenguajes Formales IX, Barcelona,...
A. Okhotin, Conjunctive grammars, Journal of Automata, Languages and Combinatorics (2001)
K. Sikkel, Parsing Schemata (1997)
S. Ginsburg et al., Two families of languages related to ALGOL, Journal of the ACM (1962)
A. Okhotin, Conjunctive grammars and systems of language equations, Programming and Computer Software (2002)
T. Kasami, An efficient recognition and syntax-analysis algorithm for context-free languages, Report AF CRL-65-758, Air...
S.L. Graham et al., An improved context-free recognizer, ACM Transactions on Programming Languages and Systems (1980)
T. Kasami et al., A syntax-analysis procedure for unambiguous context-free grammars, Journal of the ACM (1969)
L.G. Valiant, General context-free recognition in less than cubic time, Journal of Computer and System Sciences (1975)
B. Lang, Deterministic techniques for efficient non-deterministic parsers
M. Tomita, An efficient augmented context-free parsing algorithm, Computational Linguistics (1987)
R.P. Brent et al., A parallel algorithm for context-free parsing, Australian Computer Science Communications (1984)

^☆: This paper supersedes the earlier surveys, “An overview of conjunctive grammars” (Bulletin of the EATCS, 2004) and “Nine open problems for conjunctive and Boolean grammars” (Bulletin of the EATCS, 2007).
