Abstract
To make full use of the information available in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem that treats backbone threading and side-chain packing simultaneously. Given a query protein sequence and a template structure, the goal is to find a threading alignment between the query sequence and the template structure, together with a rotamer assignment for each side chain of the query protein, that optimizes an energy function combining a backbone threading energy and a side-chain packing energy. We solve this computationally challenging problem by first formulating it as a graph-based optimization problem. Various graph-theoretic techniques, which take advantage of a number of special properties of the graph representing the generalized threading problem, are employed to make the algorithm efficient enough to be practically useful. The overall framework is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of the problem. Using additional heuristic techniques such as dead-end elimination, we demonstrate that our algorithm, the first of its kind, can solve a generalized threading problem within a practically acceptable amount of time and space.
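To make the dead-end elimination step concrete, the following is a minimal sketch of the Goldstein elimination criterion applied to a toy pairwise rotamer energy model. It is not the authors' implementation: the function names (`pair_energy`, `goldstein_dee`, `brute_force_min`), the data layout, and the energy values are all hypothetical, and the alignment part of the generalized threading problem is left out. A rotamer r at position i is discarded when some other rotamer t at the same position has lower energy no matter how the remaining positions are assigned.

```python
from itertools import product


def pair_energy(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at
    position j; positions with no stored interaction contribute 0."""
    if (i, j) in pair_e:
        return pair_e[(i, j)][r][s]
    if (j, i) in pair_e:
        return pair_e[(j, i)][s][r]
    return 0.0


def goldstein_dee(self_e, pair_e):
    """Repeatedly remove rotamers that satisfy the Goldstein criterion.

    self_e[i][r] is the self energy of rotamer r at position i.
    Returns one set of surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                for t in sorted(alive[i]):
                    if t == r:
                        continue
                    # Smallest possible energy gap between choosing r and t.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(
                                pair_energy(pair_e, i, r, j, s)
                                - pair_energy(pair_e, i, t, j, s)
                                for s in alive[j]
                            )
                    if gap > 0:  # r is always worse than t: eliminate it
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def brute_force_min(self_e, pair_e, candidates):
    """Exhaustively find the lowest-energy assignment over the candidates."""
    n = len(self_e)
    best = None
    for assign in product(*[sorted(c) for c in candidates]):
        energy = sum(self_e[i][r] for i, r in enumerate(assign))
        energy += sum(
            pair_energy(pair_e, i, assign[i], j, assign[j])
            for i in range(n)
            for j in range(i + 1, n)
        )
        if best is None or energy < best[0]:
            best = (energy, assign)
    return best


# Hypothetical toy energies: three positions with 3, 2 and 2 rotamers.
SELF_E = [[0.0, 1.0, 5.0], [0.5, 0.0], [0.0, 2.0]]
PAIR_E = {
    (0, 1): [[0.0, 1.0], [-1.5, 0.0], [0.0, 0.0]],
    (1, 2): [[0.0, 0.5], [0.3, 0.0]],
}
```

Because a rotamer is removed only when it is strictly worse than a surviving alternative, an exhaustive search over the pruned candidates finds the same minimum energy as a search over all rotamers, which is what makes this kind of pruning safe to run before a more expensive exact step.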
Cite this article
Li, G., Liu, Z., Guo, JT. et al. An Algorithm for Simultaneous Backbone Threading and Side-Chain Packing. Algorithmica 51, 435–450 (2008). https://doi.org/10.1007/s00453-007-9070-1