Minimization-Aware Recursive \(K^{*}\) (\({ MARK}^{*}\)): A Novel, Provable Algorithm that Accelerates Ensemble-Based Protein Design and Provably Approximates the Energy Landscape

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11467)

Abstract

Protein design algorithms that model continuous sidechain flexibility and conformational ensembles better approximate the in vitro and in vivo behavior of proteins. The previous state of the art, iMinDEE-\(A^*\)-\(K^*\), computes provable \(\varepsilon\)-approximations to partition functions of protein states (e.g., bound vs. unbound) by computing provable, admissible pairwise-minimized energy lower bounds on protein conformations and using the \(A^*\) enumeration algorithm to return a gap-free list of lowest-energy conformations. iMinDEE-\(A^*\)-\(K^*\) runs in time sublinear in the number of conformations, but can be trapped in loosely bounded, low-energy conformational wells containing many conformations with highly similar energies. That is, iMinDEE-\(A^*\)-\(K^*\) is unable to exploit the correlation between protein conformation and energy: similar conformations often have similar energy. We introduce two new concepts that exploit this correlation: Minimization-Aware Enumeration and Recursive \(K^{*}\). We combine these two insights into a novel algorithm, Minimization-Aware Recursive \(K^{*}\) (\({ MARK}^{*}\)), that tightens bounds not on single conformations, but instead on distinct regions of the conformation space. We compare the performance of iMinDEE-\(A^*\)-\(K^*\) vs. \({ MARK}^{*}\) by running the \(BBK^*\) algorithm, which provably returns sequences in order of decreasing \(K^{*}\) score, using either iMinDEE-\(A^*\)-\(K^*\) or \({ MARK}^{*}\) to approximate partition functions. We show on 200 design problems that \({ MARK}^{*}\) not only enumerates and minimizes vastly fewer conformations than the previous state of the art, but also runs up to two orders of magnitude faster. Finally, we show that \({ MARK}^{*}\) not only efficiently approximates the partition function, but also provably approximates the energy landscape. To our knowledge, \({ MARK}^{*}\) is the first algorithm to do so.
We use \({ MARK}^{*}\) to analyze the change in energy landscape of the bound and unbound states of the HIV-1 capsid protein C-terminal domain in complex with camelid V\(_{\mathrm{{H}}}\)H, and measure the change in conformational entropy induced by binding. Thus, \({ MARK}^{*}\) both accelerates existing designs and offers new capabilities not possible with previous algorithms.
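
The \(\varepsilon\)-approximation idea behind the enumeration-based baseline can be illustrated with a minimal sketch. This is not the authors' implementation and not \({ MARK}^{*}\) itself: the function names, the toy energies, and the temperature (298.15 K) are assumptions made for illustration. The sketch assumes conformations arrive in order of increasing energy lower bound (a gap-free list), minimizes them one by one, and stops once the accumulated Boltzmann-weighted sum is provably within a factor \((1-\varepsilon)\) of the full partition function. The last function follows the usual definition of a \(K^{*}\) score as the ratio of the bound-state partition function to the product of the unbound-state partition functions.

```python
import math

# Gas constant (kcal/(mol*K)) times an assumed temperature of 298.15 K.
RT = 0.0019872 * 298.15


def boltzmann_weight(energy):
    """Boltzmann weight exp(-E/RT) for an energy E in kcal/mol."""
    return math.exp(-energy / RT)


def approximate_partition_function(conformations, epsilon):
    """Sum Boltzmann weights until an epsilon-approximation is guaranteed.

    `conformations` is a list of (lower_bound, minimized_energy) pairs,
    sorted by increasing lower bound, with lower_bound <= minimized_energy.
    Returns (q_star, number_of_conformations_minimized).
    """
    q_star = 0.0
    total = len(conformations)
    for i, (lower_bound, minimized_energy) in enumerate(conformations):
        # Each conformation not yet minimized has a lower bound at least as
        # large as this one, so it contributes at most exp(-lower_bound/RT).
        remaining_upper = (total - i) * boltzmann_weight(lower_bound)
        if q_star > 0 and q_star >= (1 - epsilon) * (q_star + remaining_upper):
            return q_star, i
        # "Minimizing" is simulated here by reading the stored energy.
        q_star += boltzmann_weight(minimized_energy)
    return q_star, total


def kstar_score(q_complex, q_protein, q_ligand):
    """Ratio of bound to unbound partition functions."""
    return q_complex / (q_protein * q_ligand)


# Hypothetical toy data: (lower bound, minimized energy) in kcal/mol.
toy = [(-10.0, -9.5), (-9.8, -9.0), (-5.0, -4.0), (-4.0, -3.5), (-1.0, -0.5)]
q_star, n_minimized = approximate_partition_function(toy, epsilon=0.1)
exact = sum(boltzmann_weight(e) for _, e in toy)
print(n_minimized, q_star / exact)
```

In this toy run the two lowest-energy conformations dominate the sum, so the loop stops before minimizing the rest. When many conformations share nearly the same energy, the stopping condition is reached much later, which is the situation the abstract describes as being trapped in a low-energy well.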

J. D. Jou and G. T. Holt—These authors contributed equally to the work.

Acknowledgements

We thank Goke Ojewole, Mark Hallen, Jeffrey Martin, Marcel Frenkel, Terrence Oas, Jane and Dave Richardson, Hong Niu, and all members of the lab for helpful discussions; Jeffrey Martin for software optimizations; and the NIH (R01-GM078031 and R01-GM118543 to BRD) for funding.

Corresponding author

Correspondence to Bruce R. Donald.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jou, J.D., Holt, G.T., Lowegard, A.U., Donald, B.R. (2019). Minimization-Aware Recursive \(K^{*}\) (\({ MARK}^{*}\)): A Novel, Provable Algorithm that Accelerates Ensemble-Based Protein Design and Provably Approximates the Energy Landscape. In: Cowen, L. (ed.) Research in Computational Molecular Biology. RECOMB 2019. Lecture Notes in Computer Science, vol 11467. Springer, Cham. https://doi.org/10.1007/978-3-030-17083-7_7

  • DOI: https://doi.org/10.1007/978-3-030-17083-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17082-0

  • Online ISBN: 978-3-030-17083-7

  • eBook Packages: Computer Science, Computer Science (R0)
