Minimizing and Learning Energy Functions for Side-Chain Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4453)

Abstract

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress, performance is still far from satisfactory. For example, the ROSETTA program, which uses simulated annealing to select minimum-energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods or from better energy functions? Because exact minimization of the energy is NP-hard, this question is difficult to answer systematically.
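To make the search question concrete, the following is a minimal sketch of side-chain placement as discrete energy minimization. It assumes the common formulation in which each residue picks one rotamer and the total energy is a sum of single-residue and residue-pair terms; the abstract does not spell this out. The toy energies are random numbers, not ROSETTA terms, and every function name here is hypothetical. The sketch compares exhaustive enumeration, which is exact but exponential in the number of residues, with a simple simulated-annealing search, which is heuristic and carries no optimality guarantee.

```python
import itertools
import math
import random


def total_energy(assignment, unary, pairwise):
    """Sum single-residue and residue-pair energies for one rotamer assignment."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def brute_force_minimum(unary, pairwise):
    """Exact minimizer by enumeration; only feasible for tiny toy problems."""
    choices = [range(len(u)) for u in unary]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, unary, pairwise))


def simulated_annealing(unary, pairwise, steps=5000,
                        t_start=2.0, t_end=0.01, seed=0):
    """Metropolis search with a geometric cooling schedule (illustrative only)."""
    rng = random.Random(seed)
    current = [rng.randrange(len(u)) for u in unary]
    current_e = total_energy(current, unary, pairwise)
    best, best_e = tuple(current), current_e
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(len(unary))          # pick a residue at random
        old = current[i]
        current[i] = rng.randrange(len(unary[i]))  # propose a new rotamer
        new_e = total_energy(current, unary, pairwise)
        delta = new_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e = new_e                  # accept the move
            if new_e < best_e:
                best, best_e = tuple(current), new_e
        else:
            current[i] = old                   # reject and restore
    return best, best_e


def random_toy_problem(n_residues=6, n_rotamers=3, seed=1):
    """Random energies on a fully connected residue graph (made-up data)."""
    rng = random.Random(seed)
    unary = [[rng.uniform(-1, 1) for _ in range(n_rotamers)]
             for _ in range(n_residues)]
    pairwise = {
        (i, j): [[rng.uniform(-1, 1) for _ in range(n_rotamers)]
                 for _ in range(n_rotamers)]
        for i in range(n_residues) for j in range(i + 1, n_residues)
    }
    return unary, pairwise


if __name__ == "__main__":
    unary, pairwise = random_toy_problem()
    exact = brute_force_minimum(unary, pairwise)
    print("exact minimum energy:", total_energy(exact, unary, pairwise))
    print("annealing best energy:", simulated_annealing(unary, pairwise)[1])
```

On a problem this small the exact answer is available for comparison, so any gap between the two printed energies is attributable to the search method rather than to the energy function. For real proteins enumeration is out of reach, which is why a method that can certify global optimality is needed to separate the two sources of error.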

In this paper, we present a novel search method and a novel method for learning energy functions from training data, both based on Tree Reweighted Belief Propagation (TRBP). We find that TRBP finds the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function, which enables learning within the Conditional Random Fields (CRF) framework.
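The role of the partition function bound in CRF learning can be sketched as follows. The notation is an assumption for illustration and is not taken from the paper: $E_k$ are the individual energy terms, $w_k$ their weights, $x$ the fixed input (backbone and sequence), $r$ a rotamer assignment, $r^{*}$ the native assignment, and any temperature factor is absorbed into the weights.

```latex
\begin{align}
E_w(r; x) &= \sum_k w_k \, E_k(r; x), \\
P(r \mid x; w) &= \frac{\exp\bigl(-E_w(r; x)\bigr)}{Z(x; w)},
\qquad
Z(x; w) = \sum_{r'} \exp\bigl(-E_w(r'; x)\bigr), \\
\ell(w) &= \log P(r^{*} \mid x; w) = -E_w(r^{*}; x) - \log Z(x; w), \\
\frac{\partial \ell}{\partial w_k} &= -E_k(r^{*}; x)
  + \mathbb{E}_{r \sim P(\cdot \mid x; w)}\bigl[E_k(r; x)\bigr], \\
\log Z(x; w) \le \log Z_{\mathrm{TRBP}}(x; w)
  \;&\Longrightarrow\;
  \ell(w) \ge -E_w(r^{*}; x) - \log Z_{\mathrm{TRBP}}(x; w).
\end{align}
```

The sum defining $Z$ runs over all rotamer assignments and cannot be computed exactly for realistic proteins. Because tree-reweighted sum-product yields an upper bound on $\log Z$, replacing $\log Z$ with that bound gives a lower bound on the conditional log-likelihood, which can then be maximized over the weights $w$.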

Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA’s default energy terms (by less than 0.1%), while learning new weights gives a significant boost from 72% to 78%. Using a recently modified ROSETTA energy function with a softer Lennard-Jones repulsive term, the global optimum does improve prediction accuracy from 77% to 78%. Here again, learning new weights improves side-chain modeling even further to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF-learned weights. Our results suggest that combining machine learning with approximate inference can improve the state-of-the-art in side-chain prediction.

Editor information

Terry Speed, Haiyan Huang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Yanover, C., Schueler-Furman, O., Weiss, Y. (2007). Minimizing and Learning Energy Functions for Side-Chain Prediction. In: Speed, T., Huang, H. (eds) Research in Computational Molecular Biology. RECOMB 2007. Lecture Notes in Computer Science, vol 4453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71681-5_27

  • DOI: https://doi.org/10.1007/978-3-540-71681-5_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71680-8

  • Online ISBN: 978-3-540-71681-5

  • eBook Packages: Computer Science, Computer Science (R0)
