Abstract
This paper investigates properties of overlap assembly, a binary operation on strings and languages that was defined by Csuhaj-Varjú, Petre and Vaszil as a formal model of the linear self-assembly of DNA strands: the overlap assembly of two strings, xy and yz, which share an “overlap” y, results in the string xyz. The study of overlap assembly as a formal language operation is part of ongoing efforts to provide a formal framework for, and a rigorous treatment of, DNA-based information and DNA-based computation. Other studies along these lines include theoretical explorations of splicing systems, insertion/deletion systems, substitution, hairpin extension, hairpin reduction, superposition, overlapping catenation, conditional concatenation, contextual intra- and intermolecular recombinations, template guided recombination, as well as directed extension by PCR. In this context, we investigate overlap assembly and its properties: closure properties of basic language families under this operation, decision problems, and the possible use of iterated overlap assembly to generate combinatorial DNA libraries.
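The string-level definition above can be illustrated with a minimal Python sketch. It assumes that the shared overlap y must be non-empty (the abstract does not state this explicitly), and the function names `overlap_assembly` and `overlap_assembly_lang` are hypothetical, chosen here for illustration only. The second function lifts the operation to finite languages element-wise; it is not meant to reflect any construction used in the paper.

```python
def overlap_assembly(u: str, v: str) -> set[str]:
    """Return every string xyz such that u = xy and v = yz.

    Assumption: the overlap y is required to be non-empty.
    """
    results = set()
    # y must be a suffix of u and, at the same time, a prefix of v.
    for k in range(1, min(len(u), len(v)) + 1):
        if u[-k:] == v[:k]:
            # u already contains x and y; append z, the rest of v.
            results.add(u + v[k:])
    return results


def overlap_assembly_lang(lang1: set[str], lang2: set[str]) -> set[str]:
    """Apply overlap assembly to all pairs drawn from two finite languages."""
    return {w for u in lang1 for v in lang2 for w in overlap_assembly(u, v)}


if __name__ == "__main__":
    print(overlap_assembly("abc", "bcd"))
    print(overlap_assembly_lang({"abc", "xb"}, {"bcd", "bb"}))
```

Two strings may overlap in more than one way, which is why the result is a set rather than a single string: for "aa" and "aa", both the overlap "a" and the overlap "aa" qualify.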
References
Alur R, Madhusudan P (2004) Visibly pushdown languages. In: Proceedings of ACM symposium on theory of computing, STOC. ACM-Press, pp 202–211
Angeleska A, Jonoska N, Saito M, Landweber LF (2007) RNA-guided DNA assembly. J Theor Biol 248(4):706–720
Bottoni P, Labella A, Manca V, Mitrana V (2006) Superposition based on Watson–Crick-like complementarity. Theory Comput Syst 39(4):503–524
Braich RS, Chelyapov N, Johnson C, Rothemund PWK, Adleman L (2002) Solution of a 20-variable 3-SAT problem on a DNA computer. Science 296(5567):499–502
Cheptea D, Martín-Vide C, Mitrana V (2006) A new operation on words suggested by DNA biochemistry: hairpin completion. In: Proceedings of transgressive computing, TC, pp 216–228
Chiniforooshan E, Daley M, Ibarra OH, Kari L, Seki S (2012) One-reversal counter machines and multihead automata: revisited. Theor Comput Sci 454:81–87
Csuhaj-Varjú E, Petre I, Vaszil G (2007) Self-assembly of strings and languages. Theor Comput Sci 374(1–3):74–81
Cukras AR, Faulhammer D, Lipton RJ, Landweber LF (1999) Chess games: a model for RNA based computation. Biosystems 52(1–3):35–45
Daley M, Kari L, Gloor G, Siromoney R (1999) Circular contextual insertions/deletions with applications to biomolecular computation. In: Proceedings of string processing and information retrieval, SPIRE, pp 47–54
Dassow J, Martín-Vide C, Păun G, Rodríguez-Patón A (2000) Conditional concatenation. Fundam Inform 44(4):353–372
Ehrenfeucht A, Petre I, Prescott DM, Rozenberg G (2001) Circularity and other invariants of gene assembly in ciliates. World Scientific, Singapore
Enaganti SK, Kari L, Kopecki S (2015) A formal language model of DNA polymerase activity. Fundam Inform 138:179–192
Faulhammer D, Cukras AR, Lipton RJ, Landweber LF (2000) Molecular computation: RNA solutions to chess problems. Proc Natl Acad Sci 97(4):1385–1389
Franco G (2005) A polymerase based algorithm for SAT. In: Coppo M, Lodi E, Pinna G (eds) Theoretical computer science, Lecture notes in computer science, vol 3701. Springer, Berlin, pp 237–250
Franco G, Manca V (2011) Algorithmic applications of XPCR. Nat Comput 10(2):805–819
Franco G, Giagulli C, Laudanna C, Manca V (2005) DNA extraction by XPCR. In: Ferretti C, Mauri G, Zandron C (eds) Proceedings of DNA computing (DNA 11), LNCS, vol 3384, pp 104–112
Franco G, Manca V, Giagulli C, Laudanna C (2006) DNA recombination by XPCR. In: Carbone A, Pierce NA (eds) Proceedings of DNA computing (DNA 12), LNCS, vol 3892, pp 55–66
Gatterdam RW (1989) Splicing systems and regularity. Int J Comput Math 31(1–2):63–67
Head T, Pixton D, Goode E (2003) Splicing systems: regularity and below. In: Hagiya M, Ohuchi A (eds) DNA based computers: DNA computing, DNA 8. LNCS, vol 2568, pp 262–268
Hopcroft JE, Ullman JD (1978) Introduction to automata theory, languages, and computation. Addison-Wesley, Reading
Ibarra OH (1978) Reversal-bounded multicounter machines and their decision problems. J ACM 25(1):116–133
Ibarra OH (2014) Automata with reversal-bounded counters: a survey. In: Proceedings of descriptional complexity of formal systems, DCFS. Springer, pp 5–22
Ibarra OH, Seki S (2012) Characterizations of bounded semilinear languages by one-way and two-way deterministic machines. Int J Found Comput Sci 23(06):1291–1305
Jürgensen H, Konstantinidis S (1997) Codes. In: Handbook of formal languages. Springer, Berlin, pp 511–607
Kaplan PD, Ouyang Q, Thaler DS, Libchaber A (1997) Parallel overlap assembly for the construction of computational DNA libraries. J Theor Biol 188(3):333–341
Kari L, Losseva E (2006) Block substitutions and their properties. Fundam Inform 73(1–2):165–178
Kari L, Sosík P (2008) On the weight of universal insertion grammars. Theor Comput Sci 396(1–3):264–270
Kari L, Kopecki S (2012) Deciding whether a regular language is generated by a splicing system. In: Stefanovic D, Turberfield A (eds) DNA computing and molecular programming (DNA 18). LNCS, vol 7433, pp 98–109
Kari L, Kari J, Landweber L (1999a) Reversible molecular computation in ciliates. In: Karhumäki J, Maurer H, Păun G, Rozenberg G (eds) Jewels are forever. Springer, Berlin, pp 353–363
Kari L, Păun G, Thierrin G, Yu S (1999b) At the crossroads of DNA computing and formal languages: characterizing recursively enumerable languages using insertion–deletion systems. In: DNA based computers III (DNA3), DIMACS, vol 48, pp 329–347
Kim SM (1997) An algorithm for identifying spliced languages. In: Jiang T, Lee D (eds) Proceedings of computing and combinatorics conference, COCOON. LNCS, vol 1276, pp 403–411
Kopecki S (2011) On iterated hairpin completion. Theor Comput Sci 412(29):3629–3638
Landweber LF, Kari L (1999) The evolution of cellular computing: nature’s solution to a computational problem. Biosystems 52(1–3):3–13
Ledesma L, Manrique D, Rodríguez-Patón A (2005) A tissue P system and a DNA microfluidic device for solving the shortest common superstring problem. Soft Comput 9(9):679–685
Manca V, Franco G (2008) Computing by polymerase chain reaction. Math Biosci 211(2):282–298
Manea F, Mitrana V (2007) Hairpin completion versus hairpin reduction. In: Cooper SB, Löwe B, Sorbi A (eds) Proceedings of computability in Europe, CiE. LNCS, vol 4497, pp 532–541
Manea F, Martín-Vide C, Mitrana V (2009a) On some algorithmic problems regarding the hairpin completion. Discrete Appl Math 157(9):2143–2152
Manea F, Mitrana V, Sempere J (2009b) Some remarks on superposition based on Watson–Crick–like complementarity. In: Diekert V, Nowotka D (eds) Developments in language theory. LNCS, vol 5583, pp 372–383
Martín-Vide C, Păun G, Pazos J, Rodríguez-Patón A (2003) Tissue P systems. Theor Comput Sci 296(2):295–326
Mehlhorn K (1980) Pebbling mountain ranges and its application of DCFL-recognition. In: Proceedings of automata, languages and programming, ICALP. Springer, pp 422–435
Minsky ML (1961) Recursive unsolvability of Post’s problem of “tag” and other topics in theory of Turing machines. Ann Math 74(3):437–455
Ouyang Q, Kaplan PD, Liu S, Libchaber A (1997) DNA solution of the maximal clique problem. Science 278(5337):446–449
Păun G, Rozenberg G, Salomaa A (2006) DNA computing: new computing paradigms. Springer, New York
Păun G, Pérez-Jiménez MJ, Yokomori T (2008) Representations and characterizations of languages in Chomsky hierarchy by means of insertion–deletion systems. Int J Found Comput Sci 19(4):859–871
Pixton D (1996) Regularity of splicing languages. Discrete Appl Math 69(1–2):101–124
Prescott DM, Ehrenfeucht A, Rozenberg G (2001) Molecular operations for DNA processing in hypotrichous ciliates. Eur J Protistol 37(3):241–260
Stemmer WP (1994) DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution. Proc Natl Acad Sci 91(22):10747–10751
Takahara A, Yokomori T (2003) On the computational power of insertion–deletion systems. Nat Comput 2(4):321–336
Yong M, Xiao-Gang J, Xian-Chuang S, Bo P (2005) Minimizing of the only-insertion insdel systems. J Zhejiang Univ Sci A 6(10):1021–1025
Acknowledgments
This research was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant R2824A01 and a University of Western Ontario Grant to L.K., and US NSF Grant CCF-1117708 to O.H.I. The authors thank Giuditta Franco for useful suggestions and discussions on experimental aspects of XPCR, as well as Sepinoud Azimi and Florin Manea for pointing out important references.
Cite this article
Enaganti, S.K., Ibarra, O.H., Kari, L. et al. On the overlap assembly of strings and languages. Nat Comput 16, 175–185 (2017). https://doi.org/10.1007/s11047-015-9538-x
DOI: https://doi.org/10.1007/s11047-015-9538-x