Abstract
The smallest known biological organisms are, by far, the viruses. One of the unique adaptations that many viruses have acquired is the compression of the genes in their genomes. In this paper we study a formalized model of gene compression in viruses. Specifically, we define a set of constraints that describe viral gene compression strategies and investigate the properties of these constraints from the point of view of genomes as languages. We pay special attention to the finite case (representing real viral genomes) and describe a metric for measuring the level of compression in a real viral genome. An efficient algorithm for computing this metric is given, along with applications to real genomes, including automated classification of viruses and prediction of horizontal gene transfer between host and virus.
This research was funded in part by institutional grants from the University of Saskatchewan (M. Daley), the University of Western Ontario (M. Daley) and the Natural Sciences and Engineering Research Council of Canada (M. Daley and H. Jürgensen).
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Daley, M., McQuillan, I. (2005). Viral Gene Compression: Complexity and Verification. In: Domaratzki, M., Okhotin, A., Salomaa, K., Yu, S. (eds) Implementation and Application of Automata. CIAA 2004. Lecture Notes in Computer Science, vol 3317. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30500-2_10
DOI: https://doi.org/10.1007/978-3-540-30500-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24318-2
Online ISBN: 978-3-540-30500-2
eBook Packages: Computer Science, Computer Science (R0)