Abstract
Nuclear magnetic resonance (NMR) spectroscopy plays a critical role in structural genomics and serves as a primary tool for determining protein structures, dynamics and interactions under physiologically relevant solution conditions. The speed of protein structure determination by NMR is currently limited by the lengthy time required for resonance assignment, which maps spectral peaks to specific atoms and residues in the primary sequence. Although numerous algorithms have been developed to address the backbone resonance assignment problem [68, 2, 10, 37, 14, 64, 1, 31, 60], little work has been done to automate side-chain resonance assignment [43, 48, 5]. Most previous attempts at assigning side-chain resonances depend on a set of NMR experiments that record through-bond interactions with side-chain protons for each residue. Unfortunately, these NMR experiments have low sensitivity and limited performance on large proteins, which makes it difficult to obtain enough side-chain resonance assignments. Yet nearly complete side-chain resonance assignments are a prerequisite for high-resolution structure determination. To overcome this deficiency, we present a novel side-chain resonance assignment algorithm based on alternative NMR experiments that measure through-space interactions between protons in the protein; these experiments also provide crucial distance restraints and are normally required in high-resolution structure determination. We cast the side-chain resonance assignment problem into a Markov Random Field (MRF) framework, and extend and apply combinatorial protein design algorithms to compute the optimal solution that best interprets the NMR data. Our MRF framework captures the contact map information of the protein derived from NMR spectra, and exploits the structural information available from the backbone conformations determined by orientational restraints and a set of discretized side-chain conformations (i.e., rotamers).
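To make the MRF formulation concrete, the following minimal Python sketch treats each proton as a site and each candidate resonance as a label, and scores a full assignment as a sum of singleton and pairwise terms. The abstract does not give the actual energy terms, so the site names, the labels `r1` and `r2`, all numeric values, and the functions `mrf_energy` and `best_assignment` are hypothetical, and the exhaustive minimization is only a stand-in for the optimization described in the paper.

```python
from itertools import product


def mrf_energy(assignment, self_energy, pair_energy):
    """Pseudo-energy of a full assignment: singleton terms plus pairwise terms."""
    sites = sorted(assignment)
    total = sum(self_energy[s][assignment[s]] for s in sites)
    for i, s in enumerate(sites):
        for t in sites[i + 1:]:
            table = pair_energy.get((s, t), {})
            total += table.get((assignment[s], assignment[t]), 0.0)
    return total


def best_assignment(self_energy, pair_energy):
    """Exhaustively enumerate all labelings and keep the lowest-energy one."""
    sites = sorted(self_energy)
    best = None
    for labels in product(*(sorted(self_energy[s]) for s in sites)):
        assignment = dict(zip(sites, labels))
        energy = mrf_energy(assignment, self_energy, pair_energy)
        if best is None or energy < best[0]:
            best = (energy, assignment)
    return best


# Hypothetical toy problem: two protons, two candidate resonances each.
self_energy = {
    "HB2": {"r1": 1.0, "r2": 0.5},
    "HB3": {"r1": 0.2, "r2": 0.9},
}
# Hypothetical pairwise term: penalize giving both protons the same resonance.
pair_energy = {
    ("HB2", "HB3"): {("r1", "r1"): 5.0, ("r2", "r2"): 5.0, ("r2", "r1"): 0.3},
}

best_energy, best_labels = best_assignment(self_energy, pair_energy)
print(best_energy, best_labels)
```

Exhaustive enumeration grows exponentially with the number of sites, which is why a realistic problem needs pruning and guided search rather than this brute-force loop.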
A Hausdorff-based computation is employed in the scoring function to evaluate the probability that a set of side-chain resonance assignments generates the observed NMR spectra. The complexity of the assignment problem is first reduced by a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. An A* search algorithm is then used to find a set of optimal side-chain resonance assignments that best fit the NMR data. We have tested our algorithm on NMR data for five proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), human ubiquitin, the ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta (pol η UBZ), and the human Set2-Rpb1 interacting domain (hSRI). Our algorithm assigns resonances for more than 90% of the protons in the proteins, and achieves about 80% correct side-chain resonance assignments. The final structures computed using distance restraints resulting from the set of assigned side-chain resonances have backbone RMSD 0.5–1.4 Å and all-heavy-atom RMSD 1.0–2.2 Å from the reference structures that were determined by X-ray crystallography or traditional NMR approaches. These results demonstrate that our algorithm can be successfully applied to automate side-chain resonance assignment and high-quality protein structure determination. Since our algorithm does not require any specific NMR experiments for measuring the through-bond interactions with side-chain protons, it can save a significant amount of both experimental cost and spectrometer time, and hence accelerate the NMR structure determination process.
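The prune-then-search strategy can be illustrated with a small Python sketch on a toy pairwise energy model. It uses the Goldstein form of the DEE criterion [19] and an A* search with an admissible lower bound in the spirit of [39]; whether the paper uses exactly these variants is an assumption, and the proton names, resonance labels, energies, and function names below are all hypothetical.

```python
import heapq
from itertools import count


def pair(E2, i, a, j, b):
    """Pairwise energy lookup that works for either site order."""
    if (i, j) in E2:
        return E2[(i, j)].get((a, b), 0.0)
    return E2.get((j, i), {}).get((b, a), 0.0)


def dee_prune(E1, E2):
    """Goldstein DEE: drop label a at site i if some label b always beats it."""
    cands = {i: set(E1[i]) for i in E1}
    changed = True
    while changed:
        changed = False
        for i in cands:
            for a in list(cands[i]):
                for b in cands[i] - {a}:
                    gap = E1[i][a] - E1[i][b] + sum(
                        min(pair(E2, i, a, j, c) - pair(E2, i, b, j, c)
                            for c in cands[j])
                        for j in cands if j != i)
                    if gap > 0:  # a can never be part of the optimum
                        cands[i].discard(a)
                        changed = True
                        break
    return cands


def lower_bound(E1, E2, cands, sites, labels):
    """Admissible estimate of the energy still to be added by unassigned sites."""
    k, h = len(labels), 0.0
    for n in range(k, len(sites)):
        j = sites[n]
        h += min(
            E1[j][c]
            + sum(pair(E2, sites[m], labels[m], j, c) for m in range(k))
            + sum(min(pair(E2, j, c, sites[p], d) for d in cands[sites[p]])
                  for p in range(n + 1, len(sites)))
            for c in cands[j])
    return h


def a_star(E1, E2, cands):
    """Best-first search over partial assignments; first complete pop is optimal."""
    sites, tie = sorted(cands), count()
    heap = [(0.0, next(tie), 0.0, ())]
    while heap:
        _, _, g, labels = heapq.heappop(heap)
        k = len(labels)
        if k == len(sites):
            return g, dict(zip(sites, labels))
        for a in sorted(cands[sites[k]]):
            g2 = g + E1[sites[k]][a] + sum(
                pair(E2, sites[m], labels[m], sites[k], a) for m in range(k))
            new = labels + (a,)
            f = g2 + lower_bound(E1, E2, cands, sites, new)
            heapq.heappush(heap, (f, next(tie), g2, new))
    return None


# Hypothetical toy energies: three protons, three candidate resonances.
E1 = {
    "HA": {"r1": 0.0, "r2": 2.0, "r3": 20.0},
    "HB": {"r1": 3.0, "r2": 0.0, "r3": 20.0},
    "HG": {"r1": 20.0, "r2": 20.0, "r3": 1.0},
}
# Hypothetical pairwise penalty when two protons share one resonance.
E2 = {(i, j): {(a, a): 5.0 for a in E1[i]} for i in E1 for j in E1 if i < j}

pruned = dee_prune(E1, E2)
energy, assignment = a_star(E1, E2, pruned)
print(pruned, energy, assignment)
```

Because the pruning step only removes labels that are strictly dominated, the search over the reduced candidate sets still returns the same minimum as a search over the full label space.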
This work is supported by the following grants from the National Institutes of Health: R01 GM-65982 to B.R.D. and R01 GM-079376 to P.Z.
References
Bailey-Kellogg, C., Chainraj, S., Pandurangan, G.: A Random Graph Approach to NMR Sequential Assignment. Journal of Computational Biology 12(6), 569–583 (2005)
Bailey-Kellogg, C., Widge, A., Kelley, J.J., Berardi, M.J., Bushweller, J.H., Donald, B.R.: The NOESY jigsaw: automated protein secondary structure and main-chain assignment from sparse, unassigned NMR data. Journal of Computational Biology 7(3-4), 537–558 (2000)
Baker, D., Sali, A.: Protein structure prediction and structural genomics. Science 294, 93–96 (2001)
Ball, G., Meenan, N., Bromek, K., Smith, B.O., Bella, J., Uhrín, D.: Measurement of one-bond 13Cα-1Hα residual dipolar coupling constants in proteins by selective manipulation of CαHα spins. Journal of Magnetic Resonance 180, 127–136 (2006)
Baran, M.C., Huang, Y.J., Moseley, H.N., Montelione, G.T.: Automated analysis of protein NMR assignments and structures. Chem. Rev. 104, 3541–3556 (2004)
Besag, J.: Spatial interaction and the statistical analysis of lattice systems. J. Royal Stat. Soc. B 36 (1974)
Bomar, M.G., Pai, M., Tzeng, S., Li, S., Zhou, P.: Structure of the ubiquitin-binding zinc finger domain of human DNA Y-polymerase η. EMBO reports 8, 247–251 (2007)
Boykov, Y., Veksler, O., Zabih, R.: Markov random fields with efficient approximations. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 648 (1998)
Chen, C.Y., Georgiev, I., Anderson, A.C., Donald, B.R.: Computational structure-based redesign of enzyme activity. Proc. Natl. Acad. Sci. USA 106, 3764–3769 (2009)
Coggins, B.E., Zhou, P.: PACES: Protein sequential assignment by computer-assisted exhaustive search. Journal of Biomolecular NMR 26, 93–111 (2003)
Cornilescu, G., Marquardt, J.L., Ottiger, M., Bax, A.: Validation of Protein Structure from Anisotropic Carbonyl Chemical Shifts in a Dilute Liquid Crystalline Phase. Journal of the American Chemical Society 120, 6836–6837 (1998)
Desmet, J., De Maeyer, M., Hazes, B., Lasters, I.: The dead-end elimination theorem and its use in protein side-chain positioning. Nature 356, 539–542 (1992)
Donald, B.R., Martin, J.: Automated NMR assignment and protein structure determination using sparse dipolar coupling constraints. Progress in NMR Spectroscopy 55, 101–127 (2009)
Eghbalnia, H.R., Bahrami, A., Wang, L.Y., Assadi, A., Markley, J.L.: Probabilistic identification of spin systems and their assignments including coil-helix inference as output (PISTACHIO). J. Biomol. NMR 32, 219–233 (2005)
Fiorito, F., Herrmann, T., Damberger, F.F., Wüthrich, K.: Automated amino acid side-chain NMR assignment of proteins using (13)C- and (15)N-resolved 3D [(1)H, (1)H]-NOESY. J. Biomol. NMR 42, 23–33 (2008)
Fiorito, F., Hiller, S., Wider, G., Wüthrich, K.: Automated resonance assignment of proteins: 6D APSY-NMR. J. Biomol. NMR 35, 27–37 (2006)
Fowler, C.A., Tian, F., Al-Hashimi, H.M., Prestegard, J.H.: Rapid determination of protein folds using residual dipolar couplings. Journal of Molecular Biology 304, 447–460 (2000)
Georgiev, I., Lilien, R.H., Donald, B.R.: The minimized dead-end elimination criterion and its application to protein redesign in a hybrid scoring and search algorithm for computing partition functions over molecular ensembles. Journal of Computational Chemistry 29, 1527–1542 (2008)
Goldstein, R.F.: Efficient rotamer elimination applied to protein side-chains and related spin glasses. Biophysical Journal 66, 1335–1340 (1994)
Grishaev, A., Llinás, M.: CLOUDS, a protocol for deriving a molecular proton density via NMR. Proc. Natl. Acad. Sci. USA 99, 6707–6712 (2002)
Grishaev, A., Llinás, M.: Protein structure elucidation from NMR proton densities. Proc. Natl. Acad. Sci. USA 99, 6713–6718 (2002)
Güntert, P.: Automated NMR Protein Structure Determination. Progress in Nuclear Magnetic Resonance Spectroscopy 43, 105–125 (2003)
Güntert, P.: Automated NMR protein structure calculation with CYANA. Meth. Mol. Biol. 278, 353–378 (2004)
Herrmann, T., Güntert, P., Wüthrich, K.: Protein NMR Structure Determination with Automated NOE Assignment Using the New Software CANDID and the Torsion Angle Dynamics Algorithm DYANA. Journal of Molecular Biology 319(1), 209–227 (2002)
Hiller, S., Joss, R., Wider, G.: Automated NMR assignment of protein side chain resonances using automated projection spectroscopy (APSY). J. Am. Chem. Soc. 130(36), 12073–12079 (2008)
Huang, Y.J., Tejero, R., Powers, R., Montelione, G.T.: A topology-constrained distance network algorithm for protein structure determination from NOESY data. Proteins: Structure Function and Bioinformatics 62(3), 587–603 (2006)
Huttenlocher, D.P., Jaquith, E.W.: Computing visual correspondence: Incorporating the probability of a false match. In: Proceedings of the Fifth International Conference on Computer Vision (ICCV 1995), pp. 515–522 (1995)
Huttenlocher, D.P., Kedem, K.: Distance Metrics for Comparing Shapes in the Plane. In: Donald, B.R., Kapur, D., Mundy, J. (eds.) Symbolic and Numerical Computation for Artificial Intelligence, pp. 201–219. Academic Press, London (1992)
Huttenlocher, D.P., Klanderman, G.A., Rucklidge, W.: Comparing Images Using the Hausdorff Distance. IEEE Trans. Pattern Anal. Mach. Intell. 15(9), 850–863 (1993)
Kuszewski, J., Gronenborn, A.M., Clore, G.M.: Improving the Packing and Accuracy of NMR Structures with a Pseudopotential for the Radius of Gyration. Journal of the American Chemical Society 121, 2337–2338 (1999)
Kamisetty, H., Bailey-Kellogg, C., Pandurangan, G.: An efficient randomized algorithm for contact-based NMR backbone resonance assignment. Bioinformatics 22(2), 172–180 (2006)
Kamisetty, H., Xing, E.P., Langmead, C.J.: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation. Journal of Computational Biology 15, 755–766 (2008)
Kindermann, R., Snell, J.L.: Markov Random Fields and Their Applications. American Mathematical Society, Providence (1980)
Kuszewski, J., Schwieters, C.D., Garrett, D.S., Byrd, R.A., Tjandra, N., Clore, G.M.: Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear overhauser enhancement spectra and chemical shift assignments. J. Am. Chem. Soc. 126(20), 6258–6273 (2004)
Langmead, C.J., Donald, B.R.: 3D structural homology detection via unassigned residual dipolar couplings. In: Proceedings of 2003 IEEE Comput. Syst. Bioinform. Conf., pp. 209–217 (2003)
Langmead, C.J., Donald, B.R.: High-throughput 3D structural homology detection via NMR resonance assignment. In: Proceedings of 2004 IEEE Comput. Syst. Bioinform. Conf., pp. 278–289 (2004)
Langmead, C.J., Yan, A.K., Lilien, R.H., Wang, L., Donald, B.R.: A polynomial-time nuclear vector replacement algorithm for automated NMR resonance assignments. In: Proceedings of the seventh annual international conference on Research in computational molecular biology, pp. 176–187 (2003)
Langmead, C.J., Donald, B.R.: An expectation/maximization nuclear vector replacement algorithm for automated NMR resonance assignments. J. Biomol. NMR 29(2), 111–138 (2004)
Leach, A.R., Lemon, A.P.: Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm. Proteins 33(2), 227–239 (1998)
Li, K.B., Sanctuary, B.C.: Automated extracting of amino acid spin systems in proteins using 3D HCCH-COSY/TOCSY spectroscopy and constrained partitioning algorithm (CPA). J. Chem. Inf. Comput. Sci. 36, 585–593 (1996)
Li, K.B., Sanctuary, B.C.: Automated resonance assignment of proteins using heteronuclear 3D NMR. 2. Side chain and sequence-specific assignment. J. Chem. Inf. Comput. Sci. 37, 467–477 (1997)
Li, M., Phatnani, H.P., Guan, Z., Sage, H., Greenleaf, A.L., Zhou, P.: Solution structure of the Set2-Rpb1 interacting domain of human Set2 and its interaction with the hyperphosphorylated C-terminal domain of Rpb1. Proceedings of the National Academy of Sciences 102, 17636–17641 (2005)
Lin, Y., Wagner, G.: Efficient side-chain and backbone assignment in large proteins: Application to tGCN5. J. Biomol. NMR 15, 227–239 (1999)
Linge, J.P., Habeck, M., Rieping, W., Nilges, M.: ARIA: Automated NOE assignment and NMR structure calculation. Bioinformatics 19(2), 315–316 (2003)
Looger, L.L., Hellinga, H.W.: Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J. Mol. Biol. 307(1), 429–445 (2001)
Marin, A., Malliavin, T.E., Nicolas, P., Delsuc, M.A.: From NMR chemical shifts to amino acid types: investigation of the predictive power carried by nuclei. Journal of Biomolecular NMR 30, 47 (2004)
Masse, J.E., Keller, R., Pervushin, K.: SideLink: automated side-chain assignment of biopolymers from NMR data by relative-hypothesis-prioritization-based simulated logic. Journal of Magnetic Resonance 181(1), 45–67 (2006)
Montelione, G.T., Moseley, H.N.B.: Automated analysis of NMR assignments and structures for proteins. Curr. Opin. Struct. Biol. 9, 635–642 (1999)
Mumenthaler, C., Güntert, P., Braun, W., Wüthrich, K.: Automated combined assignment of NOESY spectra and three-dimensional protein structure determination. Journal of Biomolecular NMR 10(4), 351–362 (1997)
Ottiger, M., Delaglio, F., Bax, A.: Measurement of J and dipolar couplings from simplified two-dimensional NMR spectra. Journal of Magnetic Resonance 138, 373–378 (1998)
Prestegard, J.H., Bougault, C.M., Kishore, A.I.: Residual Dipolar Couplings in Structure Determination of Biomolecules. Chemical Reviews 104, 3519–3540 (2004)
Rieping, W., Habeck, M., Nilges, M.: Inferential Structure Determination. Science 309, 303–306 (2005)
Ruan, K., Briggman, K.B., Tolman, J.R.: De novo determination of internuclear vector orientations from residual dipolar couplings measured in three independent alignment media. Journal of Biomolecular NMR 41, 61–76 (2008)
Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs (2002)
Schwieters, C.D., Kuszewski, J.J., Tjandra, N., Clore, G.M.: The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 65–73 (2003)
Sun, X., Druzdzel, M.J., Yuan, C.: Dynamic Weighting A* Search-Based MAP Algorithm for Bayesian Networks. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 2385–2390 (2007)
Tjandra, N., Bax, A.: Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science 278, 1111–1114 (1997)
Tolman, J.R., Flanagan, J.M., Kennedy, M.A., Prestegard, J.H.: Nuclear magnetic dipole interactions in field-oriented proteins: Information for structure determination in solution. Proc. Natl. Acad. Sci. USA 92, 9279–9283 (1995)
Ulrich, E.L., Akutsu, H., Doreleijers, J.F., Harano, Y., Ioannidis, Y.E., Lin, J., Livny, M., Mading, S., Maziuk, D., Miller, Z., Nakatani, E., Schulte, C.F., Tolmie, D.E., Wenger, R.K., Yao, H., Markley, J.L.: BioMagResBank. Nucleic Acids Research 36, D402–D408 (2007)
Vitek, O., Bailey-Kellogg, C., Craig, B., Vitek, J.: Inferential backbone assignment for sparse data. J. Biomol. NMR 35, 187–208 (2006)
Wang, L., Donald, B.R.: Exact solutions for internuclear vectors and backbone dihedral angles from NH residual dipolar couplings in two media, and their application in a systematic search algorithm for determining protein backbone structure. J. Biomol. NMR 29(3), 223–242 (2004)
Wang, L., Mettu, R., Donald, B.R.: A Polynomial-Time Algorithm for De Novo Protein Backbone Structure Determination from NMR Data. Journal of Computational Biology 13(7), 1276–1288 (2006)
Wei, Z., Li, H.: A Markov random field model for network-based analysis of genomic data. Bioinformatics 23, 1537–1544 (2007)
Wu, K.-P., Chang, J.-M., Chen, J.-B., Chang, C.-F., Wu, W.-J., Huang, T.-H., Sung, T.-Y., Hsu, W.-L.: RIBRA-an Error-Tolerant Algorithm for the NMR Backbone Assignment Problem. In: Proceedings of the International conference on Research in Computational Molecular Biology (RECOMB 2005), pp. 229–244 (2005)
Xu, Y., Xu, D., Uberbacher, E.C.: An efficient computational method for globally optimal threading. J. Comput. Biol. 5(3), 597–614 (1998)
Zeng, J., Boyles, J., Tripathy, C., Wang, L., Yan, A., Zhou, P., Donald, B.R.: High-Resolution Protein Structure Determination Starting with a Global Fold Calculated from Exact Solutions to the RDC Equations. Journal of Biomolecular NMR 45, 265–281 (2009)
Zeng, J., Zhou, P., Donald, B.R.: A Markov Random Field Framework for Protein Side-Chain Resonance Assignment – Supplementary Material. Department of Computer Science, Duke University (January 2010), http://www.cs.duke.edu/donaldlab/Supplementary/recomb10/
Zimmerman, D.E., Kulikowski, C.A., Feng, W., Tashiro, M., Chien, C.-Y., Ríos, C.B., Moy, F.J., Powers, R., Montelione, G.T.: Automated analysis of protein NMR assignments using methods from artificial intelligence. J. Mol. Biol. 269, 592–610 (1997)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zeng, J., Zhou, P., Donald, B.R. (2010). A Markov Random Field Framework for Protein Side-Chain Resonance Assignment. In: Berger, B. (ed.) Research in Computational Molecular Biology. RECOMB 2010. Lecture Notes in Computer Science, vol 6044. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12683-3_36
DOI: https://doi.org/10.1007/978-3-642-12683-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12682-6
Online ISBN: 978-3-642-12683-3
eBook Packages: Computer Science (R0)