Abstract
Correlated mutations in proteins are believed to occur in order to preserve the functional fold of the protein through evolution. They can be computed from sequence and/or structural alignments and are indicative of residue contacts in the protein three-dimensional structure. The correlation between a pair of residues is routinely evaluated with the Pearson correlation coefficient and the MCLACHLAN similarity matrix. In this paper, we describe an optimization procedure that maximizes the correlation between the Pearson coefficient and the protein residue contacts with respect to different similarity matrices, including random ones. Our results indicate that a large number of equivalent matrices perform similarly to MCLACHLAN. We also find that the upper limit on the accuracy achievable in predicting protein residue contacts is independent of the optimized similarity matrix. This suggests that poor scoring may be due to the choice of a linear correlation function for evaluating correlated mutations.
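To make the quantity discussed in the abstract concrete, the following is a minimal sketch of a Pearson-based correlated-mutation score for two alignment columns. It is an illustration under stated assumptions, not the procedure used in the paper: it uses the unweighted Pearson coefficient (published formulations typically also weight sequence pairs), and the names `TOY_SIM`, `toy_similarity`, `column_similarities`, and `correlated_mutation`, as well as the four-letter alphabet and its similarity values, are hypothetical and do not reproduce the MCLACHLAN matrix.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy alphabet and similarity values (NOT the MCLACHLAN matrix):
# identical residues score 1.0, residues in the same group 0.5, others 0.0.
GROUPS = ["AV", "DE"]


def toy_similarity(a, b):
    if a == b:
        return 1.0
    same_group = any(a in g and b in g for g in GROUPS)
    return 0.5 if same_group else 0.0


TOY_SIM = {a: {b: toy_similarity(a, b) for b in "AVDE"} for a in "AVDE"}


def column_similarities(column, sim):
    """Similarity score for every pair of sequences at one alignment column."""
    return [sim[a][b] for a, b in combinations(column, 2)]


def pearson(xs, ys):
    """Plain (unweighted) Pearson correlation coefficient of two lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A fully conserved column has no variance; the coefficient is
        # undefined, and this sketch returns 0.0 by convention.
        return 0.0
    return cov / sqrt(vx * vy)


def correlated_mutation(alignment, i, j, sim):
    """Correlate the pairwise similarity profiles of columns i and j."""
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    return pearson(column_similarities(col_i, sim),
                   column_similarities(col_j, sim))


if __name__ == "__main__":
    # Toy alignment: the two columns change together across sequences.
    alignment = ["AD", "AD", "VE", "VE"]
    print(correlated_mutation(alignment, 0, 1, TOY_SIM))
```

Swapping `TOY_SIM` for another matrix changes only the similarity profiles fed to `pearson`, which is the degree of freedom the abstract's optimization explores.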
References
Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389–3402 (1997)
Andreeva, A., Howorth, D., Brenner, S.E., Hubbard, T.J., Chothia, C., Murzin, A.G.: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 32(Database issue), D226–D229 (2004)
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The Protein Data Bank. Nucleic Acids Res. 28(1), 235–242 (2000)
Das, R., Baker, D.: Macromolecular modeling with Rosetta. Annu. Rev. Biochem. 77, 363–382 (2008)
Göbel, U., Sander, C., Schneider, R., Valencia, A.: Correlated mutations and residue contacts in proteins. Proteins 18(4), 309–317 (1994)
Graña, O., Eyrich, V.A., Pazos, F., Rost, B., Valencia, A.: EVAcon: a protein contact prediction evaluation service. Nucleic Acids Res. 33(Web Server issue), W347–W351 (2005)
Henikoff, S., Henikoff, J.G.: Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. U S A 89(22), 10915–10919 (1992)
Horner, D.S., Pirovano, W., Pesole, G.: Correlated substitution analysis and the prediction of amino acid structural contacts. Brief Bioinform. 9(1), 46–56 (2008)
Izarzugaza, J.M., Graña, O., Tress, M.L., Valencia, A., Clarke, N.D.: Assessment of intramolecular contact predictions for CASP7. Proteins 69(suppl. 8), 152–158 (2007)
Hinds, D.A., Levitt, M.: A lattice model for protein structure prediction at low resolution. Proc. Natl. Acad. Sci. U S A 89(7), 2536–2540 (1992)
Lesk, A.: Introduction to Bioinformatics. Oxford University Press, Oxford (2006)
McLachlan, A.D.: Tests for comparing related amino-acid sequences. Cytochrome c and cytochrome c 551. J. Mol. Biol. 61(2), 409–424 (1971)
Olmea, O., Valencia, A.: Improving contact predictions by the combination of correlated mutations and other sources of sequence information. Fold Des. 2(3), S25–S32 (1997)
Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C.: A model of evolutionary change in proteins. In: Dayhoff, M.O. (ed.) Atlas of Protein Sequence and Structure 5(3), 345–352 (1978)
Pollock, D.D., Taylor, W.R.: Effectiveness of correlation analysis in identifying protein residues undergoing correlated evolution. Protein Eng. 10(6), 647–657 (1997)
Samsonov, S.A., Teyra, J., Anders, G., Pisabarro, M.T.: Analysis of the impact of solvent on contacts prediction in proteins. BMC Struct. Biol. 9(1), 22 (2009)
Snyman, J.A.: Practical mathematical optimization: an introduction to basic optimization theory and classical and new gradient-based algorithms. Springer, New York (2005)
Vassura, M., Margara, L., Di Lena, P., Medri, F., Fariselli, P., Casadio, R.: Reconstruction of 3D structures from protein contact maps. IEEE/ACM Trans. Comput. Biol. Bioinform. 5(3), 357–367 (2008)
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Di Lena, P., Fariselli, P., Margara, L., Vassura, M., Casadio, R. (2009). On the Upper Bound of the Prediction Accuracy of Residue Contacts in Proteins with Correlated Mutations: The Case Study of the Similarity Matrices. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_6
DOI: https://doi.org/10.1007/978-3-642-04241-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04240-9
Online ISBN: 978-3-642-04241-6
eBook Packages: Computer Science (R0)