Duplications and Pseudo-Duplications

  • Conference paper
Unconventional Computation and Natural Computation (UCNC 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9252)

Abstract

Duplication is a basic phenomenon that occurs on biological sequences during molecular evolution. A duplication on a string copies any substring of the string. We define the k-pseudo-duplication of a string w, which consists, roughly speaking, of all strings obtained from w by inserting, after a substring u, another substring obtained from u by at most k edit operations. We consider three variants of the duplication operation: duplication, k-pseudo-duplication and reverse-duplication. First, we give the necessary and sufficient number of states that a nondeterministic finite automaton needs to recognize the duplications of a string. Then, we show that the regular languages and the context-free languages are not closed under duplication, k-pseudo-duplication and reverse-duplication. Furthermore, we show that the class of context-sensitive languages is closed under duplication, pseudo-duplication and reverse-duplication.
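
To make the three operations concrete, here is a minimal Python sketch that enumerates them for a single string. It rests on assumptions the abstract does not spell out: the copy is inserted directly after a nonempty substring u (so w = xuy yields xuuy), reverse-duplication inserts the reversal of u (giving x u u^R y), and "edit operations" means single-symbol insertion, deletion and substitution. The function names are hypothetical and are not taken from the paper.

```python
def duplications(w):
    """All strings x u u y with w = x u y and u nonempty (assumed definition)."""
    n = len(w)
    return {w[:j] + w[i:j] + w[j:] for i in range(n) for j in range(i + 1, n + 1)}


def reverse_duplications(w):
    """All strings x u rev(u) y with w = x u y and u nonempty (assumed definition)."""
    n = len(w)
    return {w[:j] + w[i:j][::-1] + w[j:] for i in range(n) for j in range(i + 1, n + 1)}


def edit_neighbors(u, alphabet):
    """Strings obtained from u by one insertion, deletion or substitution."""
    out = set()
    for i in range(len(u) + 1):
        for c in alphabet:
            out.add(u[:i] + c + u[i:])          # insertion at position i
    for i in range(len(u)):
        out.add(u[:i] + u[i + 1:])              # deletion of u[i]
        for c in alphabet:
            out.add(u[:i] + c + u[i + 1:])      # substitution (yields u when c == u[i])
    return out


def within_k_edits(u, k, alphabet):
    """All strings reachable from u by at most k single-symbol edits."""
    seen = {u}
    frontier = {u}
    for _ in range(k):
        frontier = {v for s in frontier for v in edit_neighbors(s, alphabet)} - seen
        seen |= frontier
    return seen


def pseudo_duplications(w, k, alphabet):
    """All strings x u v y with w = x u y, u nonempty, and v within k edits of u."""
    n = len(w)
    return {
        w[:j] + v + w[j:]
        for i in range(n)
        for j in range(i + 1, n + 1)
        for v in within_k_edits(w[i:j], k, alphabet)
    }


if __name__ == "__main__":
    print(sorted(duplications("ab")))
    print(sorted(reverse_duplications("ab")))
    print(sorted(pseudo_duplications("a", 1, "ab")))
```

Under these assumptions, the k-pseudo-duplication with k = 0 coincides with plain duplication, and every duplication of w is also a k-pseudo-duplication of w for any k. The enumeration is exponential in k and is meant only to illustrate the definitions on short strings.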

Acknowledgements

We wish to thank the referees for valuable suggestions that improved the proofs of several results.

This research was supported by the Basic Science Research Program through NRF funded by MEST (2012R1A1A2044562), the International Cooperation Program managed by NRF of Korea (2014K2A1A2048512), the Yonsei University Future-leading Research Initiative of 2014 and the Natural Sciences and Engineering Research Council of Canada Grant OGP0147224.

Author information

Corresponding author

Correspondence to Yo-Sub Han.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cho, DJ., Han, YS., Kim, H., Palioudakis, A., Salomaa, K. (2015). Duplications and Pseudo-Duplications. In: Calude, C., Dinneen, M. (eds) Unconventional Computation and Natural Computation. UCNC 2015. Lecture Notes in Computer Science, vol 9252. Springer, Cham. https://doi.org/10.1007/978-3-319-21819-9_11

  • DOI: https://doi.org/10.1007/978-3-319-21819-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21818-2

  • Online ISBN: 978-3-319-21819-9

  • eBook Packages: Computer Science, Computer Science (R0)
