On the Upper Bound of the Prediction Accuracy of Residue Contacts in Proteins with Correlated Mutations: The Case Study of the Similarity Matrices

  • Conference paper
Algorithms in Bioinformatics (WABI 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5724)

Abstract

Correlated mutations in proteins are believed to occur in order to preserve the protein functional folding through evolution. Their values can be deduced from sequence and/or structural alignments and are indicative of residue contacts in the protein three-dimensional structure. The correlation between a pair of residues is routinely evaluated with the Pearson correlation coefficient and the MCLACHLAN similarity matrix. In this paper, we describe an optimization procedure that maximizes the correlation between the Pearson coefficient and the protein residue contacts with respect to different similarity matrices, including random ones. Our results indicate that a large number of equivalent matrices perform similarly to MCLACHLAN. We also find that the upper limit to the accuracy achievable in predicting protein residue contacts is independent of the optimized similarity matrix. This suggests that poor scoring may be due to the choice of a linear correlation function for evaluating correlated mutations.
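As an illustration of the quantity discussed in the abstract, the following minimal Python sketch computes a correlated-mutation score for two alignment columns: each column is turned into a vector of pairwise residue similarities (one entry per pair of sequences), and the score is the Pearson correlation of the two vectors. This follows the commonly used unweighted formulation and is not necessarily the exact procedure of the paper. The function names, the toy alignment, and the similarity values are assumptions made up for this example; in particular, `toy_similarity` does not reproduce the MCLACHLAN matrix.

```python
import math
from itertools import combinations


def toy_similarity(groups):
    """Build a hypothetical symmetric similarity table (NOT the McLachlan matrix).

    Identical residues score 1.0, residues in the same group 0.5, others 0.0.
    """
    letters = [c for g in groups for c in g]
    sim = {}
    for a in letters:
        for b in letters:
            if a == b:
                sim[(a, b)] = 1.0
            elif any(a in g and b in g for g in groups):
                sim[(a, b)] = 0.5
            else:
                sim[(a, b)] = 0.0
    return sim


def column_similarities(column, sim):
    """Similarity score for every pair of sequences at one alignment column."""
    return [sim[(a, b)] for a, b in combinations(column, 2)]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a fully conserved column carries no covariation signal
    return cov / math.sqrt(vx * vy)


def correlated_mutation_score(alignment, i, j, sim):
    """Unweighted correlated-mutation score between alignment columns i and j."""
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    return pearson(column_similarities(col_i, sim),
                   column_similarities(col_j, sim))


# Toy data: columns 0 and 1 change together, column 2 is conserved.
sim = toy_similarity(["AV", "DE", "G"])
alignment = ["ADG", "ADG", "VEG", "VEG"]
print(correlated_mutation_score(alignment, 0, 1, sim))
print(correlated_mutation_score(alignment, 0, 2, sim))
```

In this toy alignment the first two columns vary in lockstep, so their score is at the maximum of the Pearson range, while the conserved third column yields a score of zero. Swapping in a different `sim` table changes the similarity vectors, and therefore the score, which is the degree of freedom the abstract's optimization over similarity matrices refers to.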

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Di Lena, P., Fariselli, P., Margara, L., Vassura, M., Casadio, R. (2009). On the Upper Bound of the Prediction Accuracy of Residue Contacts in Proteins with Correlated Mutations: The Case Study of the Similarity Matrices. In: Salzberg, S.L., Warnow, T. (eds) Algorithms in Bioinformatics. WABI 2009. Lecture Notes in Computer Science, vol 5724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04241-6_6

  • DOI: https://doi.org/10.1007/978-3-642-04241-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04240-9

  • Online ISBN: 978-3-642-04241-6

  • eBook Packages: Computer Science; Computer Science (R0)
