Threading with environment-specific score by artificial neural networks

Soft Computing

Abstract

Protein threading programs align a probe amino acid sequence against a library of representative folds of known protein structure in order to identify structural homology. The fitness of a sequence for a structure is usually evaluated by a scoring function formulated as a threading energy. In this paper, a model named threading with environment-specific score (TES) is proposed, which builds a new threading score function using artificial neural networks. Given a protein structure with a residue-level environment description, the model predicts the compatibility of each residue in the sequence with its structural environment. The threading score is constructed from log-odds scores of the probabilities predicted by the trained model, which indicate which residue best fits each environment. The TES method is tested on two decoy sets for its ability to discriminate native from decoy three-dimensional protein structures. The results show that its performance is comparable to that of knowledge-based potential energy functions.
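The log-odds construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes the common form in which each position contributes log(P(residue | environment) / P(residue)), with P(residue) a background frequency. The function names `log_odds_score` and `native_rank`, the probability floor `eps`, and the toy two-letter alphabet are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def log_odds_score(sequence, predicted_probs, background, eps=1e-9):
    """Sum per-position log-odds of a sequence threaded onto a structure.

    sequence        -- residues as a string, one letter per position
    predicted_probs -- one dict per structural position, mapping residue
                       to the model's predicted probability for that
                       environment (assumed to come from a trained network)
    background      -- dict mapping residue to its background frequency
    eps             -- floor that avoids log(0); an assumption of this sketch
    """
    if len(sequence) != len(predicted_probs):
        raise ValueError("one probability table is needed per position")
    total = 0.0
    for residue, probs in zip(sequence, predicted_probs):
        p_env = max(probs.get(residue, 0.0), eps)
        total += math.log(p_env / background[residue])
    return total


def native_rank(native_score, decoy_scores):
    """Rank of the native structure among decoys (1 = best).

    Assumes a higher score means a better sequence-structure fit.
    """
    return 1 + sum(1 for s in decoy_scores if s > native_score)


# Toy example with a two-letter alphabet (illustrative values only).
background = {"A": 0.5, "G": 0.5}
predicted = [{"A": 0.8, "G": 0.2}, {"A": 0.25, "G": 0.75}]
good_fit = log_odds_score("AG", predicted, background)
poor_fit = log_odds_score("GA", predicted, background)
```

In this sketch a sequence whose residues match the environments favoured by the model receives a positive score, while a mismatched sequence receives a negative one. Ranking the native structure's score against the scores of its decoys gives one simple measure of discrimination.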

Author information

Corresponding author

Correspondence to N. Jiang.

About this article

Cite this article

Jiang, N., XinyuWu, W. & Mitchell, I. Threading with environment-specific score by artificial neural networks. Soft Comput 10, 305–314 (2006). https://doi.org/10.1007/s00500-005-0488-6

  • DOI: https://doi.org/10.1007/s00500-005-0488-6
