The Missing Case in Chomsky-Schützenberger Theorem

Crespi Reghizzi, Stefano; San Pietro, Pierluigi

doi:10.1007/978-3-319-30000-9_27

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9618)

Included in the following conference series:

Language and Automata Theory and Applications

Abstract

The theorem by Chomsky and Schützenberger (CST) says that every context-free language L over an alphabet $\varSigma$ is representable as $h(D_k \cap R)$, where $D_k$ is the Dyck language over k pairs of brackets, R is a local (i.e., 2-strictly-locally-testable) regular language, and h is an alphabetic homomorphism that may erase symbols; the Dyck alphabet size depends on the size of the grammar generating L. In the Stanley variant, the Dyck alphabet size depends only on the size of $\varSigma$, but the homomorphism has to erase many more symbols than in the previous version. Berstel found that the number of erasures in CST can be linearly bounded if the grammar is in Greibach normal form, and recently Okhotin proved a non-erasing variant of CST for grammars in Double Greibach normal form. In both statements the Dyck alphabet depends on the grammar size. We present a new non-erasing variant of CST that uses a Dyck alphabet independent of the grammar size and a regular language that is strictly locally testable, similarly to a recent generalization of Medvedev's theorem for regular languages.
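
The shape $h(D_k \cap R)$ can be made concrete on a toy case. The sketch below is not taken from the paper: it assumes the language $\{a^n b^n \mid n \ge 1\}$, a single bracket pair ($k=1$), a local language that forbids the factor ")(", and the hypothetical helper names `in_dyck`, `in_local`, `h` and `represented_words`.

```python
from itertools import product

# Toy illustration of L = h(D_k ∩ R); the example language and all names
# are assumptions made for this sketch, not definitions from the paper.


def in_dyck(word):
    """Membership in the Dyck language D_1 over the bracket pair ( )."""
    depth = 0
    for symbol in word:
        depth += 1 if symbol == "(" else -1
        if depth < 0:  # a closing bracket with no partner
            return False
    return depth == 0


def in_local(word):
    """Membership in a local (2-SLT) language: it is fixed by the allowed
    first symbol, last symbol, and factors of length two."""
    if not word or word[0] != "(" or word[-1] != ")":
        return False
    allowed_factors = {"((", "()", "))"}  # the factor ")(" is forbidden
    return all(word[i:i + 2] in allowed_factors for i in range(len(word) - 1))


def h(word):
    """Alphabetic homomorphism: '(' -> a, ')' -> b (no symbol is erased)."""
    return "".join({"(": "a", ")": "b"}[symbol] for symbol in word)


def represented_words(max_len):
    """All h(w) with w in D_1 ∩ R and 1 <= |w| <= max_len."""
    words = set()
    for n in range(1, max_len + 1):
        for letters in product("()", repeat=n):
            w = "".join(letters)
            if in_dyck(w) and in_local(w):
                words.add(h(w))
    return words
```

In this toy case the homomorphism erases nothing; the difficulty addressed in the abstract is obtaining such a non-erasing representation for every context-free language while keeping the Dyck alphabet independent of the grammar size.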

Work partially supported by MIUR project PRIN 2010LYA9RH and by CNR-IEIIT.

References

Berstel, J.: Transductions and Context-Free Languages. Teubner, Stuttgart (1979)
Chomsky, N., Schützenberger, M.: The algebraic theory of context-free languages. In: Brafford, H. (ed.) Computer Programming and Formal Systems, pp. 118–161. North-Holland, Amsterdam (1963)
Crespi Reghizzi, S., San Pietro, P.: From regular to strictly locally testable languages. Int. J. Found. Comput. Sci. 23(8), 1711–1728 (2012)
Engelfriet, J.: An elementary proof of double Greibach normal form. Inf. Process. Lett. 44(6), 291–293 (1992)
Ginsburg, S.: The Mathematical Theory of Context-free Languages. McGraw-Hill, New York (1966)
Harrison, M.: Introduction to Formal Language Theory. Addison Wesley, Reading (1978)
McNaughton, R., Papert, S.: Counter-free Automata. MIT Press, Cambridge (1971)
Medvedev, Y.T.: On the class of events representable in a finite automaton. In: Moore, E.F. (ed.) Sequential machines - Selected papers (translated from Russian), pp. 215–227. Addison-Wesley, New York, NY, USA (1964)
Okhotin, A.: Non-erasing variants of the Chomsky–Schützenberger theorem. In: Yen, H.-C., Ibarra, O.H. (eds.) DLT 2012. LNCS, vol. 7410, pp. 121–129. Springer, Heidelberg (2012)
Eilenberg, S.: Automata, Languages, and Machines. Academic Press, Orlando (1974)
Stanley, R.J.: Finite state representations of context-free languages. M.I.T. Res. Lab. Electron. Quart. Progr. Rept. 76(1), 276–279 (1965)

Author information

Authors and Affiliations

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133, Milano, Italy
Stefano Crespi Reghizzi & Pierluigi San Pietro

Corresponding author

Correspondence to Pierluigi San Pietro.

Editor information

Editors and Affiliations

Rovira i Virgili University, Tarragona, Spain
Adrian-Horia Dediu
Czech Technical University, Prague, Czech Republic
Jan Janoušek
Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Justus Liebig University Giessen, Gießen, Germany
Bianca Truthe

Appendix: An Example

The example illustrates the crucial part of our constructions, namely the homomorphism $\tau$ defined by formulas (4) and (5). Consider the language $L= \{a^{2n+4} b^{6n} \mid n\ge 0\}$ (generated, e.g., by the grammar $\{S\rightarrow aaSb^6 \mid a^4\}$), and choose the value $m=2$ for Definition 3, meaning that the substrings of length two occurring in the language are mapped onto the 2-tuples $\langle a,a\rangle , \langle a,b\rangle , \langle b,b\rangle $, shortened as $\langle aa\rangle $, etc. The following grammar in Double Greibach normal form, though constructed by hand, takes the place of the grammar $G'$ of Lemma 1:

$1: S \rightarrow \langle aa\rangle \, S \,B\, \langle bb\rangle , \; 2: S\rightarrow \langle aa\rangle \,\langle aa\rangle , \; 3:B \rightarrow \langle bb\rangle \, \langle bb\rangle . $
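
As a quick sanity check on the three rules, the sketch below expands them and projects each 2-tuple back to its two letters. The function names `sentence` and `project` and the textual encoding `<aa>`, `<bb>` of the tuples are assumptions made for illustration only.

```python
# Sketch: generate the sentences of the hand-built grammar and project them
# back onto the alphabet {a, b}. Rule numbers follow the text; the names
# and the string encoding of tuples are illustrative assumptions.

AA, BB = "<aa>", "<bb>"


def sentence(n):
    """Sentence obtained with n applications of rule 1."""
    if n == 0:
        return [AA, AA]                 # rule 2: S -> <aa> <aa>
    b_part = [BB, BB]                   # rule 3: B -> <bb> <bb>
    # rule 1: S -> <aa> S B <bb>
    return [AA] + sentence(n - 1) + b_part + [BB]


def project(tuples):
    """Replace each 2-tuple <xy> by the two letters x and y."""
    return "".join(t[1:3] for t in tuples)
```

Each application of rule 1 contributes one $\langle aa\rangle$ and three $\langle bb\rangle$, that is $a^2$ and $b^6$, which agrees with the exponents $2n+4$ and $6n$ in the definition of $L$.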

The sentence $a^8 b^{12} \in L$ becomes $\langle aa\rangle ^4 \langle bb\rangle ^6 \in L(G')$; its syntax tree (a figure not reproduced here) applies rule 1 twice, rule 2 once, and rule 3 twice.
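
As a textual counterpart to the tree, the following leftmost derivation can be read off rules 1 to 3 of the grammar above. It is a reconstruction from those rules, not a copy of the original figure; the label on each arrow is the rule applied.

$$ \begin{aligned} S &\Rightarrow_{1} \langle aa\rangle \, S \, B \, \langle bb\rangle \\ &\Rightarrow_{1} \langle aa\rangle \, \langle aa\rangle \, S \, B \, \langle bb\rangle \, B \, \langle bb\rangle \\ &\Rightarrow_{2} \langle aa\rangle ^4 \, B \, \langle bb\rangle \, B \, \langle bb\rangle \\ &\Rightarrow_{3} \langle aa\rangle ^4 \, \langle bb\rangle ^3 \, B \, \langle bb\rangle \\ &\Rightarrow_{3} \langle aa\rangle ^4 \, \langle bb\rangle ^6 \end{aligned} $$

The rule sequence 1, 1, 2, 3, 3 coincides with the subscripts of the opening parentheses of $\gamma$, read from left to right.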

By Okhotin's Theorem 1, this sentence is the image under the homomorphism h of the following sequence $\gamma$ of labeled parentheses, where the numbers identify the rules and the dash marks the root:

$$ \gamma = (^{-}_{1} \quad (^{1}_{1} \quad (^{1}_{2}\quad )^{1}_{2} \quad (^{1}_{3}\quad )^{1}_{3} \quad )^{1}_{1} \quad (^{1}_{3} \quad )^{1}_{3}\quad )^-_{1} $$

We choose to represent the labeled parentheses with $m=2$ binary digits, defining $\tau $ as:

$$ \begin{array}{c|c|c|c}
\omega & \omega ' & \tau (\omega ) & \tau (\omega ') \\ \hline
(^{-}_{1} & )^{-}_{1} & [_{a,b,0} \; [_{a,b,0} & ]_{b,a,0} \; ]_{b,a,0} \\
(^{1}_{1} & )^{1}_{1} & [_{a,b,0} \; [_{a,b,1} & ]_{b,a,1} \; ]_{b,a,0} \\
(^{1}_{2} & )^{1}_{2} & [_{a,a,1} \; [_{a,a,0} & ]_{a,a,0} \; ]_{a,a,1} \\
(^{1}_{3} & )^{1}_{3} & [_{b,b,1} \; [_{b,b,1} & ]_{b,b,1} \; ]_{b,b,1}
\end{array} $$

Hence $\tau \left( \pi (h(\gamma ))\right) $ is

$ \overbrace{[_{a,b,0}\; [_{a,b,0}}^{(^{-}_{1}}\; \overbrace{[_{a,b,0} \; [_{a,b,1}}^{(^{1}_{1}} \overbrace{[_{a,a,1} \; [_{a,a,0}}^{(^{1}_{2}} \; \overbrace{]_{a,a,0} \; ]_{a,a,1}}^{)^{1}_{2}} \; \overbrace{[_{b,b,1} \; [_{b,b,1}}^{(^{1}_{3}} \overbrace{]_{b,b,1} \; ]_{b,b,1}}^{)^{1}_{3}} $

$ \overbrace{]_{b,a,1} \; ]_{b,a,0}}^{)^{1}_{1}} \overbrace{[_{b,b,1} \; [_{b,b,1}}^{(^{1}_{3}} \overbrace{]_{b,b,1} \; ]_{b,b,1}}^{)^{1}_{3}} \overbrace{]_{b,a,0}\; ]_{b,a,0}}^{)^-_{1}} $

Notice that the 2-SLT language of the classical CST (applied to language L) is now replaced by an SLT language of higher width.
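
The table for $\tau$ can be replayed mechanically on $\gamma$. The sketch below encodes the table as a dictionary and checks that the resulting bracket string is well nested. The matching convention (an opening bracket $[_{x,y,d}$ is closed by $]_{y,x,d}$) is an assumption read off the table rather than a definition quoted from the paper, and the names `GAMMA`, `TAU_OPEN`, `tau`, `tau_image` and `well_nested` are illustrative.

```python
# Sketch: apply the tabulated tau to gamma and check the result is well
# nested. Assumed matching convention (inferred from the table): the
# opening bracket [x,y,d is closed by ]y,x,d.

# A labeled parenthesis is (kind, parent, rule): kind is "(" or ")",
# parent is "-" for the root, otherwise the superscript as text.
GAMMA = [
    ("(", "-", 1), ("(", "1", 1), ("(", "1", 2), (")", "1", 2),
    ("(", "1", 3), (")", "1", 3), (")", "1", 1), ("(", "1", 3),
    (")", "1", 3), (")", "-", 1),
]

# Opening half of the table; each bracket label is (letter, letter, digit).
TAU_OPEN = {
    ("-", 1): [("a", "b", 0), ("a", "b", 0)],
    ("1", 1): [("a", "b", 0), ("a", "b", 1)],
    ("1", 2): [("a", "a", 1), ("a", "a", 0)],
    ("1", 3): [("b", "b", 1), ("b", "b", 1)],
}


def tau(paren):
    """Image of one labeled parenthesis as a list of two brackets."""
    kind, parent, rule = paren
    opening = TAU_OPEN[(parent, rule)]
    if kind == "(":
        return [("[",) + label for label in opening]
    # Closing image: mirror the opening image and swap the two letters.
    return [("]", y, x, d) for (x, y, d) in reversed(opening)]


def tau_image(gamma):
    """Concatenate the images of all parentheses in gamma."""
    return [bracket for paren in gamma for bracket in tau(paren)]


def well_nested(brackets):
    """Stack check under the assumed matching convention."""
    stack = []
    for kind, x, y, d in brackets:
        if kind == "[":
            stack.append((x, y, d))
        elif not stack or stack.pop() != (y, x, d):
            return False
    return not stack
```

Under these assumptions `tau_image(GAMMA)` has 20 brackets, two for each of the ten parentheses of $\gamma$, in the same order as the two displayed lines above, and `well_nested` accepts it.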

Cite this paper

Crespi Reghizzi, S., San Pietro, P. (2016). The Missing Case in Chomsky-Schützenberger Theorem. In: Dediu, AH., Janoušek, J., Martín-Vide, C., Truthe, B. (eds) Language and Automata Theory and Applications. LATA 2016. Lecture Notes in Computer Science, vol 9618. Springer, Cham. https://doi.org/10.1007/978-3-319-30000-9_27

DOI: https://doi.org/10.1007/978-3-319-30000-9_27
Published: 26 February 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29999-0
Online ISBN: 978-3-319-30000-9
eBook Packages: Computer Science, Computer Science (R0)
