Abstract
Nested words allow modeling of linear and hierarchical structure in data, and nested word automata are special kinds of pushdown automata whose push/pop actions are directed by the hierarchical structure of the input nested word. The resulting class of regular languages of nested words has many appealing theoretical properties and has found many applications, including model checking of procedural programs. In the nested word model, the hierarchical matching of open and close tags must be properly nested, which is not the case, for instance, in program executions in the presence of exceptions. This limitation narrows the model checking applications of nested words to programs without exceptions. We introduce the model of colored nested words, which allows such hierarchical structures with mismatches. We say that a language of colored nested words is regular if the language obtained by inserting the missing closing tags is a well-colored regular language of nested words. We define an automata model that accepts regular languages of colored nested words. These automata can execute restricted forms of \(\varepsilon \)-pop/push transitions. We provide an equivalent grammar characterization and show that the class of regular languages of colored nested words has the same appealing closure and decidability properties as that of nested words, thus removing the requirement that programs be exception-free in order to be amenable to model checking via the nested words paradigm.
Notes
Our notation for internal letters, marking a letter with a dot as in \({\dot{a}}\), differs slightly from the nested words literature, which simply uses a. When there is no risk of confusion we may use un-dotted versions too.
Nested word automata were called visibly pushdown automata in the paper that first introduced them [5]. The name reflects the fact that these are pushdown automata whose push and pop operations are directly determined by the alphabet, which is partitioned into push letters, pop letters and internal letters; in this sense the operations on the stack are visible.
A nested word automaton is a pushdown automaton working on an alphabet of the form \(\varSigma \times \{ \langle , \cdot , \rangle \}\) where all letters in \(\varSigma \times \{\langle \}\) cause a push transition, all letters in \(\varSigma \times \{\cdot \}\) have no effect on the stack, and all letters in \(\varSigma \times \{\rangle \} \) cause a pop transition [6].
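As an illustration of the note above, the following is a minimal Python sketch of a deterministic nested word automaton whose stack action is fixed by the tag of each letter. The function `run_nwa`, the marker `BOTTOM`, the tag encoding `"<"`, `"."`, `">"`, and the toy automaton are all assumptions made for this sketch and do not appear in the paper. The sketch also assumes the common convention that a return read on the bottom marker does not pop it.

```python
CALL, INTERNAL, RETURN = "<", ".", ">"
BOTTOM = "BOT"  # hypothetical bottom-of-stack marker


def run_nwa(word, delta_call, delta_internal, delta_return, initial, accepting):
    """Run a deterministic nested word automaton on a list of (letter, tag) pairs.

    The tag alone decides the stack action: calls push, internals leave the
    stack untouched, returns pop. A missing transition rejects the word.
    """
    state, stack = initial, [BOTTOM]
    for letter, tag in word:
        if tag == CALL:
            step = delta_call.get((state, letter))
            if step is None:
                return False
            state, pushed = step
            stack.append(pushed)
        elif tag == INTERNAL:
            state = delta_internal.get((state, letter))
            if state is None:
                return False
        elif tag == RETURN:
            top = stack[-1]
            state = delta_return.get((state, letter, top))
            if state is None:
                return False
            if top != BOTTOM:  # assumed convention: never pop the bottom marker
                stack.pop()
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return state in accepting


# Toy automaton (an assumption for illustration): a call pushes its own letter,
# and a return is allowed only if it carries the letter on top of the stack.
DELTA_CALL = {("ok", a): ("ok", a) for a in "ab"}
DELTA_INTERNAL = {("ok", "x"): "ok"}
DELTA_RETURN = {("ok", a, a): "ok" for a in "ab"}


def accepts(word):
    return run_nwa(word, DELTA_CALL, DELTA_INTERNAL, DELTA_RETURN, "ok", {"ok"})
```

With this toy automaton, a well-nested word such as `[("a", "<"), ("b", "<"), ("x", "."), ("b", ">"), ("a", ">")]` is accepted, while `[("a", "<"), ("b", ">")]` is rejected because the return letter does not match the pushed symbol.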
We note that since -transitions are conducted only immediately before -transitions, one can define an equivalent model without -transitions by extending the type of to and performing in the case of letter with color greater than c the two steps of and the original together. We chose to define the automaton this way to ease the correspondence between \(\varepsilon \)-pop (resp. \(\varepsilon \)-push) transitions and recovered calls (resp. recovered returns).
Here \(\gamma [j]\) refers to the \(j\)-th letter of \(\gamma \).
Note that a blind cna is still different from a traditional nested word automaton, as it has the means to skip all the unmatched calls of lower color and arrive at the matching call, if such exists, and at a greater call otherwise.
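The skipping behaviour described in this note can be sketched in Python as follows. This is only an illustration under stated assumptions, not the paper's transition relation: stack entries are modelled as (symbol, color) pairs with numeric colors, the bottom entry carries color infinity (as in configurations such as \((even,(\bot ,\infty ))\)), and the names `skip_lower_colors` and `classify_top` are hypothetical. State changes during the skipped steps are omitted.

```python
import math

BOTTOM = ("BOT", math.inf)  # assumed bottom entry; its color exceeds every real color


def skip_lower_colors(stack, color):
    """Pop every entry whose color is strictly lower than `color`.

    Returns the remaining stack and the list of skipped entries, top first.
    The input list is not modified. The loop stops at the bottom entry at the
    latest, because its color is infinite.
    """
    remaining = list(stack)
    skipped = []
    while remaining[-1][1] < color:
        skipped.append(remaining.pop())
    return remaining, skipped


def classify_top(stack, color):
    """Report whether skipping ends at a call of the same color or a greater one."""
    remaining, skipped = skip_lower_colors(stack, color)
    top = remaining[-1]
    kind = "matching" if top[1] == color else "greater"
    return kind, top, skipped
```

For example, with the stack `[BOTTOM, ("p", 2), ("q", 1), ("r", 1)]` and a return of color 2, the two entries of color 1 are skipped and the search ends at the matching entry `("p", 2)`.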
References
Alur R, Bouajjani A, Esparza J (2016) Model checking of procedural programs. In: Handbook of Model Checking. Springer. To Appear
Alur R, Chaudhuri S (2010) Temporal reasoning for procedural programs. In: VMCAI, pp. 45–60
Alur R, Chaudhuri S, Madhusudan P (2006) A fixpoint calculus for local and global program flows. In: POPL, pp. 153–165
Alur R, Chaudhuri S, Madhusudan P (2011) Software model checking using languages of nested trees. ACM Trans. Program. Lang. Syst. 33(5):15
Alur R, Madhusudan P (2004) Visibly pushdown languages. In: STOC, pp. 202–211
Alur R, Madhusudan P (2009) Adding nesting structure to words. J. ACM 56(3). http://robotics.upenn.edu/~alur/Jacm09.pdf
Caucal D, Hassen S (2008) Synchronization of grammars. In: CSR, pp. 110–121
Chaudhuri S, Alur R (2007) Instrumenting C programs with nested word monitors. In: SPIN, pp. 279–283
Crespi-Reghizzi S, Mandrioli D (2012) Operator precedence and the visibly pushdown property. J. Comput. Syst. Sci. 78(6):1837–1867
Debarbieux D, Gauwin O, Niehren J, Sebastian T, Zergaoui M (2013) Early nested word automata for XPath query answering on XML streams. In: CIAA’13, pp. 292–305
Driscoll E, Burton A, Reps TW (2011) Checking conformance of a producer and a consumer. In: SIGSOFT/FSE, pp. 113–123
Filiot E, Gauwin O, Reynier PA, Servais F (2011) Streamability of nested word transductions. In: Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS, pp. 312–324
Filiot E, Raskin JF, Reynier PA, Servais F, Talbot JM (2010) Properties of visibly pushdown transducers. In: Proc. 35th MFCS, pp. 355–367
Filiot E, Servais F (2012) Visibly pushdown transducers with look-ahead. In: SOFSEM 2012: Conf. on Current Trends The. and Prac. of CS, pp. 251–263
Hague M, Murawski AS, Ong CL, Serre O (2008) Collapsible pushdown automata and recursion schemes. In: LICS, pp. 452–461
Madhusudan P, Viswanathan M (2009) Query automata for nested words. In: MFCS, pp. 561–573
Mozafari B, Zeng K, Zaniolo C (2012) High-performance complex event processing over XML streams. In: SIGMOD Conference, pp. 253–264
Nowotka D, Srba J (2007) Height-deterministic pushdown automata. In: MFCS, pp. 125–134
Raskin JF, Servais F (2008) Visibly pushdown transducers. In: Automata, Languages and Programming, ICALP 2008, pp. 386–397
Staworko S, Laurence G, Lemay A, Niehren J (2009) Equivalence of deterministic nested word to word transducers. In: Proc. 17th FCT, pp. 310–322
Thomo A, Venkatesh S (2011) Rewriting of visibly pushdown languages for XML data integration. Theor. Comput. Sci. 412(39):5285–5297
Acknowledgements
This is an extended version of the paper that appeared in LATA’16. The version in the proceedings has an error in Theorem 6: Regular languages of nested words as defined in the proceedings version are not closed under reversal. The current version defines regular languages of colored nested words (and colored nested word automata) differently. Under the current definition, regular languages of colored nested words are closed under reversal as well. We would like to thank Sarai Sheinvald for pointing out the problem in the proceedings version.
Appendices
Examples of runs of the CNAs in Fig. 7
Example 1
Consider the cna at the right of the first line in Fig. 7. Let \(\mathbb {w}=\mathbb {w}_1\mathbb {w}_2\) be the word depicted in Fig. 5. The sequence of configurations the cna goes through when reading \(\mathbb {w}_1\) is:
Note that an \(\varepsilon \)-pop transition is taken first, and only then is a pop transition consuming the input letter performed.
The sequence of configurations the cna goes through when reading \(\mathbb {w}_2\) is:
Here an \(\varepsilon \)-pop transition is performed from state li (upon reading ). Note that it corresponds to the first which is the only unmatched and that the recovery is performed just before reading the closing exactly as depicted in Fig. 5.
Example 2
Consider the cna at the left of the second line in Fig. 7. The sequence of configurations it goes through when reading \(\mathbb {w}_3\) is:
Since \((even,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_3\) is accepted by this cna.
Consider the cna in the middle of the second line in Fig. 7. The sequence of configurations it goes through when reading \(\mathbb {w}_4\) is:
Since \((odd,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_4\) is accepted by this cna.
Consider the cna \(A_5\) at the right of the second line in Fig. 7. The word \(\mathbb {w}_5\) should be accepted, whereas \(\mathbb {w}_6\) should be rejected. We show here the sequence of configurations \(A_5\) goes through when reading \(\mathbb {w}_5\).
Since \((odd,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_5\) is accepted by \(A_5\).
Below we show the sequence of configurations \(A_5\) goes through when reading \(\mathbb {w}_6\), where the first steps are omitted as they are similar to those on reading \(\mathbb {w}_5\).
At this point \(A_5\) gets stuck because there is no pop transition from state even on stack letter (e, 2), and thus \(\mathbb {w}_6\) is rejected.
Proofs of lemmas of Sect. 5.1
Lemma 3 states that every cna can be converted into an equivalent icna. Here is its proof.
Proof
(of Lemma 3) Let a cna be given. We define an equivalent icna as follows. Let \(p_\bot \) be a fresh stack symbol, and \(c_\bot \) a color bigger than all colors in C. The stack symbols of the icna are \(P \cup \{p_\bot \}\). Its states are pairs \((q,g)\) of a state \(q\) of the cna and a stack entry \(g\). We use \(g_\bot \) for \((p_\bot ,c_\bot )\) and \(g,g',g''\) for arbitrary stack entries. The idea is that the icna records in its state the initial stack symbol. If the icna encounters \(g_\bot \) on a pop operation, it proceeds as the cna would if the current stack symbol were the one recorded in its state. Formally, the initial frontiers are \(\{((q,g),g_\bot )~|~(q,g)\in I\}\). The final frontiers are . For the transition relation we have:
- if \(q'\in {{\dot{\delta }}}(q,{\dot{a}})\)
- if
- if \(g'\ne g_\bot \) and
- if
- if \(g'\ne g_\bot \) and
- if
- if
\(\square \)
Lemma 4 states that every cna can be converted into an equivalent cna with a single initial frontier. Here is its proof.
Proof
(of Lemma 4) By Lemma 3 we can convert the given cna into a cna with , where \(I \subseteq Q \times \{p_\bot \} \times \{c_\bot \}\) for some \(g_\bot =(p_\bot ,c_\bot )\in \varGamma \). Let , where \(q_I\) is a fresh state, is the same as F if \(F\cap I = \emptyset \) and otherwise . For the transition relations we connect \(q_I\) to the states that are reachable from one of the initial states, thus , , , and . \(\square \)
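The idea behind this construction can be illustrated on plain nondeterministic finite automata, leaving the stack aside. The Python sketch below is an analogue and not the construction for cnas itself: the function names `single_initial` and `accepts` are hypothetical, and the choice of adding the fresh state to the final states when \(F\cap I \ne \emptyset \) is the standard one, assumed here because the corresponding formula is not legible above.

```python
def single_initial(states, initials, finals, delta, fresh="q_I"):
    """Return an NFA with the single initial state `fresh`.

    `delta` maps (state, letter) to a set of successor states. The fresh state
    copies every outgoing transition of every original initial state.
    """
    if fresh in states:
        raise ValueError("the new initial state must be fresh")
    new_delta = {key: set(targets) for key, targets in delta.items()}
    for (source, letter), targets in delta.items():
        if source in initials:
            new_delta.setdefault((fresh, letter), set()).update(targets)
    new_finals = set(finals)
    if set(finals) & set(initials):
        # Assumed standard choice: the empty word was accepted before,
        # so the fresh state must be final as well.
        new_finals.add(fresh)
    return set(states) | {fresh}, fresh, new_finals, new_delta


def accepts(initials, finals, delta, word):
    """Standard subset simulation of an NFA on `word`."""
    current = set(initials)
    for letter in word:
        current = {t for s in current for t in delta.get((s, letter), ())}
    return bool(current & set(finals))
```

The fresh state has no incoming transitions, so it is visited only at the start of a run, and the language is unchanged.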
Cite this article
Alur, R., Fisman, D. Colored nested words. Form Methods Syst Des 58, 347–374 (2021). https://doi.org/10.1007/s10703-021-00384-2
DOI: https://doi.org/10.1007/s10703-021-00384-2