Abstract
In order to analyse the genetic code, the distribution of the 64 trinucleotides w (words of 3 letters on the gene alphabet {A,C,G,T}, w∈τ={AAA,…,TTT}) in the prokaryotic protein coding genes (words of large sizes) is studied with autocorrelation functions. The trinucleotides wₚ can be read in 3 frames p (p=0: reference frame, p=1: reference frame shifted by 1 letter, p=2: reference frame shifted by 2 letters) in coding genes. Then, the autocorrelation function wₚ(N)ᵢw′ analyses the occurrence probability of the i-motif wₚ(N)ᵢw′, i.e. 2 trinucleotides, wₚ in frame p and w′ in any frame (w,w′∈τ), which are separated by any i bases N (N=A, C, G or T). The 64²×3=12288 autocorrelation functions applied to the prokaryotic protein coding genes are almost all non-random and have a modulo 3 periodicity among the 3 following types: 0 modulo 3, 1 modulo 3 and 2 modulo 3. The classification of the 12288 i-motifs wₚ(N)ᵢw′ according to the type of periodicity implies a constant preferential occurrence frame for w′, independent of w and p. Three subsets of trinucleotides are identified: 22 trinucleotides in frame 0 forming the subset τ₀={AAA, AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC, TTT} and 21 trinucleotides in each of the frames 1 and 2 forming the subsets τ₁ and τ₂ respectively. Except for AAA, CCC, GGG and TTT, the subsets τ₁ and τ₂ are generated by a circular permutation P of τ₀: P(τ₀)=τ₁ and P(τ₁)=τ₂. Furthermore, the complementarity property ∁ of the DNA double helix (i.e. ∁(A)=T, ∁(C)=G, ∁(G)=C, ∁(T)=A and if w=l₁l₂l₃ then ∁(w)=∁(l₃)∁(l₂)∁(l₁) with l₁, l₂, l₃∈{A,C,G,T}) is observed in these 3 subsets: ∁(τ₀)=τ₀, ∁(τ₁)=τ₂ and ∁(τ₂)=τ₁.
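The set relations stated in the abstract can be checked mechanically. The following is a minimal Python sketch, not the authors' method: it takes τ₀ exactly as listed above, implements ∁ as defined, and assumes that the circular permutation is P(l₁l₂l₃)=l₂l₃l₁ (the abstract does not give the direction of P). The names `TAU0`, `complement`, `permute`, `image`, `core0`, `core1` and `core2` are hypothetical. Because the abstract does not say which of CCC and GGG belongs to τ₁ and which to τ₂, the sketch only works with the 20 non-periodic trinucleotides of each subset.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Subset tau_0 copied from the abstract (22 trinucleotides).
TAU0 = frozenset(
    "AAA AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC GAG "
    "GAT GCC GGC GGT GTA GTC GTT TAC TTC TTT".split()
)


def complement(w):
    """Complementarity map of the abstract: C(l1 l2 l3) = C(l3) C(l2) C(l1)."""
    return "".join(COMPLEMENT[b] for b in reversed(w))


def permute(w):
    """Circular permutation; ASSUMPTION: P(l1 l2 l3) = l2 l3 l1."""
    return w[1:] + w[:1]


def image(f, words):
    """Apply f to every trinucleotide of a set."""
    return frozenset(f(w) for w in words)


ALL = frozenset("".join(t) for t in product(BASES, repeat=3))  # 64 words
PERIODIC = frozenset(b * 3 for b in BASES)  # AAA, CCC, GGG, TTT

# Non-periodic parts of tau_0, tau_1 and tau_2 under the assumed P.
core0 = TAU0 - PERIODIC
core1 = image(permute, core0)
core2 = image(permute, core1)

# Number of autocorrelation functions: pairs (w, w') times 3 frames p.
n_functions = len(ALL) ** 2 * 3

print("autocorrelation functions:", n_functions)
print("C(tau_0) == tau_0:", image(complement, TAU0) == TAU0)
print("C(core1) == core2:", image(complement, core1) == core2)
print("C(core2) == core1:", image(complement, core2) == core1)
print("cores cover the 60 non-periodic words:",
      core0 | core1 | core2 == ALL - PERIODIC)
```

With the opposite convention P(l₁l₂l₃)=l₃l₁l₂, the roles of `core1` and `core2` are simply exchanged, so the complementarity checks hold either way.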
© 1995 Springer-Verlag Berlin Heidelberg
Cite this paper
Arquès, D.G., Michel, C.J. (1995). A possible code in the genetic code. In: Mayr, E.W., Puech, C. (eds) STACS 95. STACS 1995. Lecture Notes in Computer Science, vol 900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59042-0_112
DOI: https://doi.org/10.1007/3-540-59042-0_112
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-59042-2
Online ISBN: 978-3-540-49175-0
eBook Packages: Springer Book Archive