Abstract
Duplication is a basic phenomenon that occurs on biological sequences during molecular evolution. A duplication on a string copies a substring of the string. We define the k-pseudo-duplication of a string w, which consists, roughly speaking, of all strings obtained from w by inserting, after a substring u, another substring obtained from u by at most k edit operations. We consider three variants of the duplication operation: duplication, k-pseudo-duplication and reverse-duplication. First, we give the necessary and sufficient number of states that a nondeterministic finite automaton needs to recognize the duplications of a string. Then we show that regular languages and context-free languages are not closed under duplication, k-pseudo-duplication and reverse-duplication. Furthermore, we show that the class of context-sensitive languages is closed under duplication, k-pseudo-duplication and reverse-duplication.
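The three operations on a single string can be illustrated with a small brute-force sketch. It is not taken from the paper and rests on several assumptions: the duplicated substring u is non-empty, the edit operations are single-symbol insertion, deletion and substitution, and reverse-duplication inserts the reversal of u directly after u. The function names and the `alphabet` parameter are hypothetical and serve only this illustration.

```python
def duplications(w):
    """All strings x u u y with w = x u y and u non-empty (assumed)."""
    n = len(w)
    return {w[:j] + w[i:j] + w[j:]
            for i in range(n) for j in range(i + 1, n + 1)}


def reverse_duplications(w):
    """All strings x u u^R y with w = x u y and u non-empty (assumed)."""
    n = len(w)
    return {w[:j] + w[i:j][::-1] + w[j:]
            for i in range(n) for j in range(i + 1, n + 1)}


def edit_neighbours(u, alphabet):
    """Strings reachable from u by one insertion, deletion or substitution."""
    out = set()
    for i in range(len(u) + 1):
        for a in alphabet:
            out.add(u[:i] + a + u[i:])           # insertion at position i
        if i < len(u):
            out.add(u[:i] + u[i + 1:])           # deletion of u[i]
            for a in alphabet:
                out.add(u[:i] + a + u[i + 1:])   # substitution of u[i]
    return out


def edit_ball(u, k, alphabet):
    """All strings obtainable from u by at most k edit operations."""
    ball = {u}
    frontier = {u}
    for _ in range(k):
        frontier = {v for s in frontier
                    for v in edit_neighbours(s, alphabet)} - ball
        ball |= frontier
    return ball


def pseudo_duplications(w, k, alphabet):
    """All strings x u v y with w = x u y, u non-empty, v within k edits of u."""
    n = len(w)
    return {w[:j] + v + w[j:]
            for i in range(n) for j in range(i + 1, n + 1)
            for v in edit_ball(w[i:j], k, alphabet)}


print(sorted(duplications("ab")))          # ['aab', 'abab', 'abb']
print(sorted(reverse_duplications("ab")))  # ['aab', 'abb', 'abba']
```

Under these assumptions, `pseudo_duplications(w, 0, alphabet)` coincides with `duplications(w)`, and increasing k only adds strings. The enumeration grows quickly with k and the alphabet size, so it is meant for small examples only.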
Acknowledgements
We wish to thank the referees for valuable suggestions that improved the proofs of several results.
This research was supported by the Basic Science Research Program through NRF funded by MEST (2012R1A1A2044562), the International Cooperation Program managed by NRF of Korea (2014K2A1A2048512), the Yonsei University Future-leading Research Initiative of 2014 and the Natural Sciences and Engineering Research Council of Canada Grant OGP0147224.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Cho, DJ., Han, YS., Kim, H., Palioudakis, A., Salomaa, K. (2015). Duplications and Pseudo-Duplications. In: Calude, C., Dinneen, M. (eds) Unconventional Computation and Natural Computation. UCNC 2015. Lecture Notes in Computer Science, vol 9252. Springer, Cham. https://doi.org/10.1007/978-3-319-21819-9_11
DOI: https://doi.org/10.1007/978-3-319-21819-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21818-2
Online ISBN: 978-3-319-21819-9
eBook Packages: Computer Science, Computer Science (R0)