
On the overlap assembly of strings and languages

Natural Computing

Abstract

This paper investigates properties of overlap assembly, a binary operation on strings and languages that Csuhaj-Varjú, Petre and Vaszil defined as a formal model of the linear self-assembly of DNA strands: the overlap assembly of two strings, xy and yz, which share an “overlap” y, results in the string xyz. The study of overlap assembly as a formal language operation is part of ongoing efforts to provide a formal framework and rigorous treatment of DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion/deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping catenation, conditional concatenation, contextual intra- and intermolecular recombinations, template-guided recombination, and directed extension by PCR. In this context, we investigate overlap assembly and its properties: closure properties of basic language families under this operation, decision problems, and the possible use of iterated overlap assembly to generate combinatorial DNA libraries.
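
The operation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `overlap_assembly` and `overlap_assembly_languages` are hypothetical, languages are modelled as finite sets of strings, and the overlap y is assumed to be nonempty, a condition the abstract itself does not state.

```python
from __future__ import annotations

from typing import Iterable


def overlap_assembly(left: str, right: str) -> set[str]:
    """Return every string xyz such that left = xy and right = yz.

    Assumption (not stated in the abstract): the overlap y is nonempty.
    """
    results: set[str] = set()
    # Try every possible overlap length k, from 1 up to the shorter string.
    for k in range(1, min(len(left), len(right)) + 1):
        # A suffix of `left` of length k must equal a prefix of `right`.
        if left[-k:] == right[:k]:
            # Keep the shared part once: xy followed by z.
            results.add(left + right[k:])
    return results


def overlap_assembly_languages(first: Iterable[str], second: Iterable[str]) -> set[str]:
    """Extend the string operation to finite languages, pair by pair."""
    second_words = list(second)
    assembled: set[str] = set()
    for u in first:
        for v in second_words:
            assembled |= overlap_assembly(u, v)
    return assembled


if __name__ == "__main__":
    # "ACGT" and "GTTA" share only the overlap "GT".
    print(overlap_assembly("ACGT", "GTTA"))
    # Two strings may overlap in more than one way.
    print(sorted(overlap_assembly("aba", "aba")))
```

Under these assumptions, a pair of strings can produce several results (one per admissible overlap) or none at all, which is why the sketch returns a set rather than a single string.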


References

  • Alur R, Madhusudan P (2004) Visibly pushdown languages. In: Proceedings of ACM symposium on theory of computing, STOC. ACM-Press, pp 202–211

  • Angeleska A, Jonoska N, Saito M, Landweber LF (2007) RNA-guided DNA assembly. J Theor Biol 248(4):706–720


  • Bottoni P, Labella A, Manca V, Mitrana V (2006) Superposition based on Watson–Crick-like complementarity. Theory Comput Syst 39(4):503–524


  • Braich RS, Chelyapov N, Johnson C, Rothemund PWK, Adleman L (2002) Solution of a 20-variable 3-SAT problem on a DNA computer. Science 296(5567):499–502


  • Cheptea D, Martín-Vide C, Mitrana V (2006) A new operation on words suggested by DNA biochemistry: hairpin completion. In: Proceedings of transgressive computing, TC, pp 216–228

  • Chiniforooshan E, Daley M, Ibarra OH, Kari L, Seki S (2012) One-reversal counter machines and multihead automata: revisited. Theor Comput Sci 454:81–87


  • Csuhaj-Varjú E, Petre I, Vaszil G (2007) Self-assembly of strings and languages. Theor Comput Sci 374(1–3):74–81


  • Cukras AR, Faulhammer D, Lipton RJ, Landweber LF (1999) Chess games: a model for RNA based computation. Biosystems 52(1–3):35–45


  • Daley M, Kari L, Gloor G, Siromoney R (1999) Circular contextual insertions/deletions with applications to biomolecular computation. In: Proceedings of string processing and information retrieval, SPIRE, pp 47–54

  • Dassow J, Martín-Vide C, Păun G, Rodríguez-Patón A (2000) Conditional concatenation. Fundam Inf 44(4):353–372


  • Ehrenfeucht A, Petre I, Prescott DM, Rozenberg G (2001) Circularity and other invariants of gene assembly in ciliates. World Scientific, Singapore


  • Enaganti SK, Kari L, Kopecki S (2015) A formal language model of DNA polymerase activity. Fundam Inform 138:179–192


  • Faulhammer D, Cukras AR, Lipton RJ, Landweber LF (2000) Molecular computation: RNA solutions to chess problems. Proc Natl Acad Sci 97(4):1385–1389


  • Franco G (2005) A polymerase based algorithm for SAT. In: Coppo M, Lodi E, Pinna G (eds) Theoretical computer science, Lecture notes in computer science, vol 3701. Springer, Berlin, pp 237–250

  • Franco G, Manca V (2011) Algorithmic applications of XPCR. Nat Comput 10(2):805–819


  • Franco G, Giagulli C, Laudanna C, Manca V (2005) DNA extraction by XPCR. In: Ferretti C, Mauri G, Zandron C (eds) Proceedings of DNA computing (DNA 11), LNCS, vol 3384, pp 104–112

  • Franco G, Manca V, Giagulli C, Laudanna C (2006) DNA recombination by XPCR. In: Carbone A, Pierce NA (eds) Proceedings of DNA computing (DNA 12), LNCS, vol 3892, pp 55–66

  • Gatterdam RW (1989) Splicing systems and regularity. Int J Comput Math 31(1–2):63–67


  • Head T, Pixton D, Goode E (2003) Splicing systems: regularity and below. In: Hagiya M, Ohuchi A (eds) DNA based computers: DNA computing, DNA 8. LNCS, vol 2568, pp 262–268

  • Hopcroft JE, Ullman JD (1978) Introduction to automata theory, languages, and computation. Addison-Wesley, Reading


  • Ibarra OH (1978) Reversal-bounded multicounter machines and their decision problems. J ACM 25(1):116–133


  • Ibarra OH (2014) Automata with reversal-bounded counters: a survey. In: Proceedings of descriptional complexity of formal systems, DCFS. Springer, pp 5–22

  • Ibarra OH, Seki S (2012) Characterizations of bounded semilinear languages by one-way and two-way deterministic machines. Int J Found Comput Sci 23(06):1291–1305


  • Jürgensen H, Konstantinidis S (1997) Codes. In: Handbook of formal languages. Springer, Berlin, pp 511–607

  • Kaplan PD, Ouyang Q, Thaler DS, Libchaber A (1997) Parallel overlap assembly for the construction of computational DNA libraries. J Theor Biol 188(3):333–341


  • Kari L, Losseva E (2006) Block substitutions and their properties. Fundam Inform 73(1–2):165–178


  • Kari L, Sosík P (2008) On the weight of universal insertion grammars. Theor Comput Sci 396(1–3):264–270


  • Kari L, Kopecki S (2012) Deciding whether a regular language is generated by a splicing system. In: Stefanovic D, Turberfield A (eds) DNA computing and molecular programming (DNA 18). LNCS, vol 7433, pp 98–109

  • Kari L, Kari J, Landweber L (1999a) Reversible molecular computation in ciliates. In: Karhumäki J, Maurer H, Păun G, Rozenberg G (eds) Jewels are forever. Springer, Berlin, pp 353–363


  • Kari L, Păun G, Thierrin G, Yu S (1999b) At the crossroads of DNA computing and formal languages: characterizing recursively enumerable languages using insertion–deletion systems. In: DNA based computers III (DNA3), DIMACS, vol 48, pp 329–347

  • Kim SM (1997) An algorithm for identifying spliced languages. In: Jiang T, Lee D (eds) Proceedings of computing and combinatorics conference, COCOON. LNCS, vol 1276, pp 403–411

  • Kopecki S (2011) On iterated hairpin completion. Theor Comput Sci 412(29):3629–3638


  • Landweber LF, Kari L (1999) The evolution of cellular computing: nature’s solution to a computational problem. Biosystems 52(1–3):3–13


  • Ledesma L, Manrique D, Rodríguez-Patón A (2005) A tissue P system and a DNA microfluidic device for solving the shortest common superstring problem. Soft Comput 9(9):679–685


  • Manca V, Franco G (2008) Computing by polymerase chain reaction. Math Biosci 211(2):282–298


  • Manea F, Mitrana V (2007) Hairpin completion versus hairpin reduction. In: Cooper SB, Löwe B, Sorbi A (eds) Proceedings of computability in Europe, CiE. LNCS, vol 4497, pp 532–541

  • Manea F, Martín-Vide C, Mitrana V (2009a) On some algorithmic problems regarding the hairpin completion. Discrete Appl Math 157(9):2143–2152


  • Manea F, Mitrana V, Sempere J (2009b) Some remarks on superposition based on Watson–Crick–like complementarity. In: Diekert V, Nowotka D (eds) Developments in language theory. LNCS, vol 5583, pp 372–383

  • Martín-Vide C, Păun G, Pazos J, Rodríguez-Patón A (2003) Tissue P systems. Theor Comput Sci 296(2):295–326


  • Mehlhorn K (1980) Pebbling mountain ranges and its application to DCFL-recognition. In: Proceedings of automata, languages and programming, ICALP. Springer, pp 422–435

  • Minsky ML (1961) Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Ann Math 74(3):437–455


  • Ouyang Q, Kaplan PD, Liu S, Libchaber A (1997) DNA solution of the maximal clique problem. Science 278(5337):446–449


  • Păun G, Rozenberg G, Salomaa A (2006) DNA computing: new computing paradigms. Springer, New York


  • Păun G, Pérez-Jiménez MJ, Yokomori T (2008) Representations and characterizations of languages in Chomsky hierarchy by means of insertion–deletion systems. Int J Found Comput Sci 19(4):859–871


  • Pixton D (1996) Regularity of splicing languages. Discrete Appl Math 69(1–2):101–124


  • Prescott DM, Ehrenfeucht A, Rozenberg G (2001) Molecular operations for DNA processing in hypotrichous ciliates. Eur J Protistol 37(3):241–260


  • Stemmer WP (1994) DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc Natl Acad Sci 91(22):10747–10751


  • Takahara A, Yokomori T (2003) On the computational power of insertion–deletion systems. Nat Comput 2(4):321–336


  • Yong M, Xiao-Gang J, Xian-Chuang S, Bo P (2005) Minimizing of the only-insertion insdel systems. J Zhejiang Univ Sci A 6(10):1021–1025


Acknowledgments

This research was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant R2824A01 and a University of Western Ontario Grant to L.K., and by US NSF Grant CCF-1117708 to O.H.I. The authors thank Giuditta Franco for useful suggestions and discussions on experimental aspects of XPCR, as well as Sepinoud Azimi and Florin Manea for pointing out important references.

Author information

Corresponding author

Correspondence to Srujan Kumar Enaganti.

About this article

Cite this article

Enaganti, S.K., Ibarra, O.H., Kari, L. et al. On the overlap assembly of strings and languages. Nat Comput 16, 175–185 (2017). https://doi.org/10.1007/s11047-015-9538-x

  • DOI: https://doi.org/10.1007/s11047-015-9538-x
