
Colored nested words

Formal Methods in System Design

Abstract

Nested words allow modeling of linear and hierarchical structure in data, and nested word automata are special kinds of pushdown automata whose push/pop actions are directed by the hierarchical structure in the input nested word. The resulting class of regular languages of nested words has many appealing theoretical properties and has found many applications, including model checking of procedural programs. In the nested word model, the hierarchical matching of open and close tags must be properly nested, and this is not the case, for instance, in program executions in the presence of exceptions. This limitation narrows the model checking applications of nested words to programs with no exceptions. We introduce the model of colored nested words, which allows such hierarchical structures with mismatches. We say that a language of colored nested words is regular if the language obtained by inserting the missing closing tags is a well-colored regular language of nested words. We define an automata model that accepts regular languages of colored nested words. These automata can execute restricted forms of \(\varepsilon \)-pop/push transitions. We provide an equivalent grammar characterization and show that the class of regular languages of colored nested words has the same appealing closure and decidability properties as nested words, thus removing the restriction that programs be exception-free in order to be amenable to model checking via the nested words paradigm.
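To make the idea of "inserting the missing closing tags" concrete, the following is a minimal Python sketch and not the construction used in the paper. It assumes that colors are integers, that a return of color c first closes every open call of strictly lower color (the skipping behaviour described in Note 6), and that it then matches a call of the same color if one is on top of the stack. The letter encoding, the function name `insert_missing_closers`, the choice to label an inserted closing tag with the symbol of the call it closes, and the try/catch trace are all hypothetical and serve only as illustration.

```python
from typing import List, Tuple

# A letter is (kind, symbol, color); kind is "call", "ret" or "int".
# This encoding is an assumption made for the sketch, not the paper's notation.
Letter = Tuple[str, str, int]


def insert_missing_closers(word: List[Letter]) -> List[Letter]:
    """Return `word` with a closing tag inserted for every open call that is
    skipped by a later return of strictly greater color."""
    stack: List[Letter] = []  # calls that are currently open
    out: List[Letter] = []
    for kind, symbol, color in word:
        if kind == "call":
            stack.append((kind, symbol, color))
        elif kind == "ret":
            # Close every open call of strictly lower color first.
            while stack and stack[-1][2] < color:
                _, open_symbol, open_color = stack.pop()
                out.append(("ret", open_symbol, open_color))
            # Match a call of the same color if it is on top. A greater call
            # (or an empty stack) leaves this return unmatched (assumption).
            if stack and stack[-1][2] == color:
                stack.pop()
        out.append((kind, symbol, color))
    return out


# Hypothetical trace: a try block (color 2) calls f, which calls g (color 1);
# an exception then jumps straight to the handler, skipping both returns.
trace = [("call", "try", 2), ("call", "f", 1), ("call", "g", 1), ("ret", "catch", 2)]
completed = insert_missing_closers(trace)
```

On this trace the sketch inserts a closing tag for g and then for f just before the color-2 return, so the completed word is properly nested. A word that is already well nested is returned unchanged.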


Notes

  1. Our notation for internal letters, marking a letter with a dot as in \({\dot{a}}\), differs slightly from the nested words literature, which simply uses a. When there is no risk of confusion we may use un-dotted versions too.

  2. Nested word automata were called visibly pushdown automata in the paper that first introduced them [5]. The name originates from the fact that these are pushdown automata in which the push and pop operations are directly determined by the alphabet, which is partitioned into push letters, pop letters and internal letters; in this sense the operations on the stack are visible.

  3. A nested word automaton is a pushdown automaton working on an alphabet of the form \(\varSigma \times \{ \langle , \cdot , \rangle \}\) where all letters in \(\varSigma \times \{\langle \}\) cause a push transition, all letters in \(\varSigma \times \{\cdot \}\) have no effect on the stack, and all letters in \(\varSigma \times \{\rangle \} \) cause a pop transition [6].

  4. We note that since -transitions are conducted only immediately before -transitions, one can define an equivalent model without -transitions by extending the type of to and performing in the case of letter with color greater than c the two steps of and the original together. We chose to define the automaton this way to ease the correspondence between \(\varepsilon \)-pop (resp. \(\varepsilon \)-push) transitions and recovered calls (resp. recovered returns).

  5. Where \(\gamma [j]\) refers to the j’th letter of \(\gamma \).

  6. Note that a blind cna is still different from a traditional nested word automaton, as it has the means to skip all the unmatched calls of lower color and arrive at the matching call, if one exists, and at a greater call otherwise.
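The stack discipline described in Note 3 can be sketched in a few lines of Python. This is an illustrative, deterministic simplification rather than the formal definition: letters are encoded as pairs (symbol, tag) with tag "<" for push, "." for internal and ">" for pop, and the sketch assumes that only well-matched words are accepted (the stack must be empty at the end and a pop on an empty stack rejects). The class name, the dictionary-based transition tables and the example automaton `tag_checker` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (symbol, tag) where tag is "<" (push), "." (internal) or ">" (pop).
Letter = Tuple[str, str]


class NestedWordAutomaton:
    """Deterministic automaton over Sigma x {<, ., >} (illustrative sketch)."""

    def __init__(
        self,
        initial: str,
        accepting: Set[str],
        call: Dict[Tuple[str, str], Tuple[str, str]],
        internal: Dict[Tuple[str, str], str],
        ret: Dict[Tuple[str, str, str], str],
    ) -> None:
        self.initial = initial
        self.accepting = accepting
        self.call = call          # (state, symbol) -> (next state, pushed letter)
        self.internal = internal  # (state, symbol) -> next state
        self.ret = ret            # (state, popped letter, symbol) -> next state

    def accepts(self, word: List[Letter]) -> bool:
        state, stack = self.initial, []
        for symbol, tag in word:
            if tag == "<":    # every letter tagged "<" pushes
                step = self.call.get((state, symbol))
                if step is None:
                    return False
                state, pushed = step
                stack.append(pushed)
            elif tag == ".":  # internal letters leave the stack untouched
                state = self.internal.get((state, symbol))
            elif tag == ">":  # every letter tagged ">" pops
                if not stack:
                    return False  # assumption: no pending returns allowed
                state = self.ret.get((state, stack.pop(), symbol))
            else:
                raise ValueError(f"unknown tag: {tag!r}")
            if state is None:
                return False      # no transition defined: the run is stuck
        return state in self.accepting and not stack


# Example: accept words in which every opening tag is closed by the same symbol.
SIGMA = ["a", "b"]
tag_checker = NestedWordAutomaton(
    initial="ok",
    accepting={"ok"},
    call={("ok", s): ("ok", s) for s in SIGMA},
    internal={("ok", s): "ok" for s in SIGMA},
    ret={("ok", s, s): "ok" for s in SIGMA},
)
```

Because the tag of each letter alone decides whether the automaton pushes, pops or leaves the stack alone, the stack height after any prefix depends only on the input and not on the run, which is the "visible" behaviour mentioned in Note 2.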

References

  1. Alur R, Bouajjani A, Esparza J (2016) Model checking of procedural programs. In: Handbook of Model Checking. Springer. To Appear

  2. Alur R, Chaudhuri S (2010) Temporal reasoning for procedural programs. In: VMCAI, pp. 45–60

  3. Alur R, Chaudhuri S, Madhusudan P (2006) A fixpoint calculus for local and global program flows. In: POPL, pp. 153–165

  4. Alur R, Chaudhuri S, Madhusudan P (2011) Software model checking using languages of nested trees. ACM Trans. Program. Lang. Syst. 33(5):15

  5. Alur R, Madhusudan P (2004) Visibly pushdown languages. In: STOC, pp. 202–211

  6. Alur R, Madhusudan P (2009) Adding nesting structure to words. J. ACM 56(3). http://robotics.upenn.edu/~alur/Jacm09.pdf

  7. Caucal D, Hassen S (2008) Synchronization of grammars. In: CSR, pp. 110–121

  8. Chaudhuri S, Alur R (2007) Instrumenting C programs with nested word monitors. In: SPIN, pp. 279–283

  9. Crespi-Reghizzi S, Mandrioli D (2012) Operator precedence and the visibly pushdown property. J. Comput. Syst. Sci. 78(6):1837–1867

  10. Debarbieux D, Gauwin O, Niehren J, Sebastian T, Zergaoui M (2013) Early nested word automata for XPath query answering on XML streams. In: CIAA’13, pp. 292–305

  11. Driscoll E, Burton A, Reps TW (2011) Checking conformance of a producer and a consumer. In: SIGSOFT/FSE, pp. 113–123

  12. Filiot E, Gauwin O, Reynier PA, Servais F (2011) Streamability of nested word transductions. In: Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS, pp. 312–324

  13. Filiot E, Raskin JF, Reynier PA, Servais F, Talbot JM (2010) Properties of visibly pushdown transducers. In: Proc. 35th MFCS, pp. 355–367

  14. Filiot E, Servais F (2012) Visibly pushdown transducers with look-ahead. In: SOFSEM 2012: Conf. on Current Trends The. and Prac. of CS, pp. 251–263

  15. Hague M, Murawski AS, Ong CL, Serre O (2008) Collapsible pushdown automata and recursion schemes. In: LICS, pp. 452–461

  16. Madhusudan P, Viswanathan M (2009) Query automata for nested words. In: MFCS, pp. 561–573

  17. Mozafari B, Zeng K, Zaniolo C (2012) High-performance complex event processing over XML streams. In: SIGMOD Conference, pp. 253–264

  18. Nowotka D, Srba J (2007) Height-deterministic pushdown automata. In: MFCS, pp. 125–134

  19. Raskin JF, Servais F (2008) Visibly pushdown transducers. In: Automata, Languages and Programming, ICALP 2008, pp. 386–397

  20. Staworko S, Laurence G, Lemay A, Niehren J (2009) Equivalence of deterministic nested word to word transducers. In: Proc. 17th FCT, pp. 310–322

  21. Thomo A, Venkatesh S (2011) Rewriting of visibly pushdown languages for XML data integration. Theor. Comput. Sci. 412(39):5285–5297

Acknowledgements

This is an extended version of the paper that appeared in LATA’16. The version in the proceedings has an error in Theorem 6: Regular languages of nested words as defined in the proceedings version are not closed under reversal. The current version defines regular languages of colored nested words (and colored nested word automata) differently. Under the current definition, regular languages of colored nested words are closed under reversal as well. We would like to thank Sarai Sheinvald for pointing out the problem in the proceedings version.

Author information

Corresponding author

Correspondence to Dana Fisman.

Appendices

Examples of runs of the cnas in Fig. 7

Example 1

Consider the cna at the right of the first line in Fig. 7. Let \(\mathbb {w}=\mathbb {w}_1\mathbb {w}_2\) be the word depicted in Fig. 5. The sequence of configurations the automaton goes through when reading \(\mathbb {w}_1\) is:

Note that on reading first an \(\varepsilon \)-pop transition is taken, and then a pop transition consuming is performed.

The sequence of configurations the automaton goes through when reading \(\mathbb {w}_2\) is:

Here an \(\varepsilon \)-pop transition is performed from state li (upon reading ). Note that it corresponds to the first which is the only unmatched and that the recovery is performed just before reading the closing exactly as depicted in Fig. 5.

Example 2

Consider the cna at the left of the second line in Fig. 7. The sequence of configurations it goes through when reading \(\mathbb {w}_3\) is:

Since \((even,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_3\) is accepted by the automaton.

Consider the cna at the middle of the second line in Fig. 7. The sequence of configurations it goes through when reading \(\mathbb {w}_4\) is:

Since \((odd,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_4\) is accepted by the automaton.

Consider the cna \(A_5\) at the right of the second line in Fig. 7, together with the two words \(\mathbb {w}_5\) and \(\mathbb {w}_6\). The word \(\mathbb {w}_5\) should be accepted, whereas \(\mathbb {w}_6\) should be rejected. We show here the sequence of configurations \(A_5\) goes through when reading \(\mathbb {w}_5\).

Since \((odd,(\bot ,\infty ))\) is an accepting configuration, the word \(\mathbb {w}_5\) is accepted by \(A_5\).

Below we show the sequence of configurations \(A_5\) goes through when reading \(\mathbb {w}_6\), where the first steps are omitted as they are similar to those on reading \(\mathbb {w}_5\).

At this point \(A_5\) gets stuck, because there is no pop transition from state even on stack letter (e, 2), and thus \(\mathbb {w}_6\) is rejected.

Proofs of lemmas of Sect. 5.1

Lemma 3 states that every cna can be converted into an equivalent icna. Here is its proof.

Proof

(of Lemma 3) Let be a cna. We define an icna , as follows. Let \(p_\bot \) be a fresh stack symbol, and \(c_\bot \) a color bigger than all colors in C. Let are \(P \cup \{p_\bot \}\), and . The set of states is . We use \(g_\bot \) for \((p_\bot ,c_\bot )\) and \(g,g',g''\) for arbitrary elements of . The idea is that records in the state the initial stack symbol. If encounters \(g_\bot \) on a pop operation, it proceeds as would if the current stack symbol was what is recorded in its state. Formally, the initial frontiers are \(\{((q,g),g_\bot )~|~(q,g)\in I\}\). The final frontiers are . For the transition relation we have:

  • if \(q'\in {{\dot{\delta }}}(q,{\dot{a}})\)

  • if

  • if \(g'\ne g_\bot \) and

  • if

  • if \(g'\ne g_\bot \) and

  • if

  • if

\(\square \)

Lemma 4 states that every cna can be converted into an equivalent cna with a single initial frontier. Here is its proof.

Proof

(of Lemma 4) By Lemma 3 we can convert the given cna into a cna with , where \(I \subseteq Q \times \{p_\bot \} \times \{c_\bot \}\) for some \(g_\bot =(p_\bot ,c_\bot )\in \varGamma \). Let , where \(q_I\) is a fresh state, is the same as F if \(F\cap I = \emptyset \) and otherwise . For the transition relations we connect \(q_I\) to the states that are reachable from one of the initial states, thus , , , and . \(\square \)

About this article

Cite this article

Alur, R., Fisman, D. Colored nested words. Form Methods Syst Des 58, 347–374 (2021). https://doi.org/10.1007/s10703-021-00384-2

  • DOI: https://doi.org/10.1007/s10703-021-00384-2
