Abstract
In this paper we extend the definition of codon volatility to reduced amino acid alphabets in order to characterize mutations that conserve physical-chemical properties. We also define the average relative changeability of amino acids in terms of single-base codon self-substitution frequencies (identities), taken from an empirical codon substitution matrix [14]. This index splits the amino acids into two groups: replaceable and irreplaceable. The same grouping is obtained from the size/complexity index introduced by Dufton [32]. In addition, a 71% agreement is obtained with the residues found in mutually persistent conserved (MPC) positions [31], which play a key role in determining fold and function. The remaining 29% can be readily explained. Of the residues ranked highest according to MPC positions, 75% have the highest probability of causing disease if mutated.
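As a rough illustration of the first idea in the abstract, the sketch below computes codon volatility in the sense of Plotkin et al. (the fraction of a codon's non-stop single-base neighbours that encode a different amino acid) and then repeats the calculation with amino acids collapsed into groups, so that only changes between groups count. The exact definition used in the paper is not reproduced on this page, so the grouping rule is an assumption, and the six-group alphabet `GROUPS` and the names `volatility` and `GROUP_OF` are hypothetical choices made only for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: an illustrative reduced alphabet, not the one used in the paper.
GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def neighbors(codon):
    """Yield the nine codons that differ from `codon` by exactly one base."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def volatility(codon, group_of=None):
    """Fraction of non-stop single-base neighbours that change the encoded label.

    With group_of=None the label is the amino acid itself (ordinary volatility).
    With a mapping amino acid -> group, only changes between groups are counted.
    """
    aa = CODE[codon]
    if aa == "*":
        raise ValueError("volatility is not defined for stop codons")

    def label(a):
        return a if group_of is None else group_of[a]

    sense = [n for n in neighbors(codon) if CODE[n] != "*"]
    changed = sum(label(CODE[n]) != label(aa) for n in sense)
    return changed / len(sense)


if __name__ == "__main__":
    for codon in ("CGA", "AGA", "GAT"):
        print(codon, CODE[codon], volatility(codon), volatility(codon, GROUP_OF))
```

Under this assumed grouping, the two arginine codons CGA and AGA keep different values, and a codon's reduced-alphabet volatility can never exceed its ordinary volatility, because some amino acid changes stay within a group.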
References
Shih, A.C.-C., Hsiao, T.-C., Ho, M.-S., Li, W.-H.: Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution. Proc. Natl. Acad. Sci. USA 104(15), 6283–6288 (2007)
Clark, L.A., Ganesan, S., Papp, S., van Vlijmen, H.W.T.: Trends in Antibody Sequence Changes during the Somatic Hypermutation Process. The Journal of Immunology 177, 333–340 (2006)
Keefe, A.D., Szostak, J.W.: Functional proteins from a random-sequence library. Nature 410, 715–718 (2001)
Arnold, F.H.: Design by Directed Evolution. Accounts of Chemical Research 31(3), 125–131 (1998)
Orencia, M.C., Yoon, J.S., Ness, J.E., Stemmer, W.P.C., Stevens, R.C.: Predicting the emergence of antibiotic resistance by directed evolution and structural analysis. Nature Structural Biology 8(3), 238–242 (2001)
Vitkup, D., Sander, C., Church, G.M.: The amino-acid mutational spectrum of human genetic disease. Genome Biology 4, R72 (2003)
Liò, P., Goldman, N.: Models of molecular evolution and phylogeny. Genome Res. 8, 1233–1244 (1998)
Kosiol, C., Holmes, I., Goldman, N.L.: An empirical codon model for protein sequence evolution. Mol. Biol. Evol. 24(7), 1464–1479 (2007)
Yampolsky, L.Y., Stoltzfus, A.: The exchangeability of amino acids in proteins. Genetics 170, 1459–1472 (2005)
Jiménez-Montaño, M.A., de la Mora-Basáñez, R., Pöschel, T.: The Hypercube Structure of the Genetic Code Explains Conservative and Non-Conservative Amino Acid Substitutions in Vivo and in Vitro. BioSystems 39, 117–125 (1996)
Karasev, V.A., Sorokin, S.G.: Topological structure of the genetic code. Russian Journal of Genetics 33, 622–628 (1997)
He, M.X., Petoukhov, S.V., Ricci, P.E.: Genetic code, Hamming distance and stochastic matrices. Bull. Math. Biology 66(5), 1405–1421 (2004)
Hershberg, U., Shlomchik, M.J.: Differences in potential for amino acid change after mutation reveals distinct strategies for κ and λ light-chain variation. Proc. Natl. Acad. Sci. USA 103(43), 15963–15968 (2006)
Schneider, A., Cannarozzi, G.M., Gonnet, G.H.: Empirical codon substitution matrix. BMC Bioinformatics 6, 134 (2005)
Doron-Faigenboim, A., Pupko, T.: A combined empirical and mechanistic codon model. Mol. Biol. Evol. 24(2), 388–397 (2007)
Plotkin, J.B., Dushoff, J.: Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc. Natl. Acad. Sci. USA 100(12), 7152–7157 (2003)
Plotkin, J.B., Dushoff, J., Fraser, H.B.: Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum. Nature 428, 942–945 (2004)
Grantham, R.: Amino Acid Difference Formula to Help Explain Protein Evolution. Science 185(4154), 862–864 (1974)
Miyata, T., Miyazawa, S., Yasunaga, T.: Two types of amino acid substitutions in protein evolution. J. Mol. Evol. 12, 219–236 (1979)
Cannata, N., Toppo, S., Romualdi, C., Valle, G.: Simplifying amino acid alphabets by means of a branch and bound algorithm and substitution matrices. Bioinformatics 18, 1102–1108 (2002)
Murphy, L.R., Wallqvist, A., Levy, R.M.: Simplified amino acid alphabets for protein fold recognition and implications for folding. Protein Eng. 13(3), 149–152 (2000)
Fan, K., Wang, W.: What is the minimum number of letters required to fold a protein? J. Mol. Biol. 328, 921–926 (2003)
Albatineh, A., Razeghifard, R.: Clustering Amino Acids Using Maximum Clusters Similarity. In: Doble, M., Loging, W., Malone, J., Tseng, V.S.-M. (eds.) Proc. 2008 International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC 2008), pp. 87–92. ISRST, USA (2008)
Jiménez-Montaño, M.A.: On the syntactic structure of protein sequences and the concept of grammar complexity. Bull. Math. Biol. 46(4), 641–659 (1984)
Zhou, H., Zhou, Y.: Quantifying the effect of burial of amino acid residues on protein stability. PROTEINS: Structure, Function, and Bioinformatics 54, 315–322 (2004)
Burks, E.A., Chen, G., Georgiou, G., Iverson, B.L.: In vitro scanning saturation mutagenesis of an antibody binding pocket. Proc. Natl. Acad. Sci. USA 94, 412–417 (1997)
Volkenstein, M.V.: Mutations and the value of information. J. Theor. Biol. 80, 155–169 (1979)
Bachinsky, A., Ratner, V.: Biomed. Zs. 18, 53 (1976) (in Russian)
Dayhoff, M. (ed.): Atlas of Protein Sequence and Structure. Nat. Biomed. Res. Found. (1972)
Luo, L.F.: The degeneracy rule of genetic code. Origins of Life and Evolution of the Biosphere 18, 65–70 (1988)
Friedberg, I., Margalit, H.: Persistently conserved positions in structurally similar, sequence dissimilar proteins: Roles in preserving protein fold and function. Protein Science 11, 350–360 (2002)
Dufton, M.J.: Genetic code synonym quotas and amino acid complexity: Cutting the cost of proteins? J. Theor. Biol. 187, 165–173 (1997)
Papentin, F.: On order and complexity. II. Application to chemical and biochemical structures. J. Theor. Biol. 95(2), 225–245 (1982)
Jones, D.T., Taylor, W.R., Thornton, J.: The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282 (1992)
Tourasse, N.J., Li, W.-H.: Selective constraints, amino acid composition, and the rate of protein evolution. Mol. Biol. Evol. 17(4), 656–664 (2000)
Wang, Z., Moult, J.: SNPs, protein structure, and disease. Hum. Mutat. 17, 263–270 (2001)
Li, W.-H., Wu, C.-I., Luo, C.-C.: A new method for estimating synonymous and non-synonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2, 150–174 (1985)
Jiménez-Montaño, M.A., Ramos-Fernandez, A.: An empirical method to identify positively selected sites in antigenic evolution. In: Argüello-Astorga, G.R., González, R.A., Méndez Salinas, E. (eds.) e-Proc. V National Congress of Virology. Sociedad Mexicana de Bioquimica, Mexico (2007)
Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Shaw, N., Lane, C.R., Lim, E.P., Kalyanaraman, N., et al.: Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22, 231–238 (1999)
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiménez-Montaño, M.A., He, M. (2009). Irreplaceable Amino Acids and Reduced Alphabets in Short-Term and Directed Protein Evolution. In: Măndoiu, I., Narasimhan, G., Zhang, Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01551-9_29
DOI: https://doi.org/10.1007/978-3-642-01551-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01550-2
Online ISBN: 978-3-642-01551-9
eBook Packages: Computer Science (R0)