
Side-chain conformational space analysis (SCSA): A multi conformation-based QSAR approach for modeling and prediction of protein–peptide binding affinities

Journal of Computer-Aided Molecular Design

Abstract

In this article, we propose the concept of a multi-conformation-based quantitative structure–activity relationship (MCB-QSAR) and, building on it, describe a new approach called side-chain conformational space analysis (SCSA) for modeling and predicting protein–peptide binding affinities. SCSA considers multiple side-chain conformations rather than the traditional single conformation: statistically averaged information over these conformations is obtained using self-consistent mean field theory based on a side-chain rotamer library. The enthalpic contributions to binding (electrostatic, steric and hydrophobic interactions, and hydrogen bonds) and the conformational entropy effects are then evaluated in terms of the occurrence probabilities of residue rotamers. SCSA was applied to a dataset of 419 HLA-A*0201-binding peptides, and the nonbonded contributions of each position in the peptide ligands were well determined. For these peptides, hydrogen-bond and electrostatic interactions at the two ends are essential to binding specificity, van der Waals and hydrophobic interactions at all positions ensure strong binding affinity, and the loss of conformational entropy at anchor positions partially counteracts the other favorable nonbonded effects.
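To make the idea of rotamer occurrence probabilities concrete, the following is a minimal sketch of a generic self-consistent mean field iteration, not the authors' implementation. It assumes each residue has a small set of rotamers with a precomputed self energy and pairwise interaction energies. The function names, the `damping` parameter, the value of `kT`, and the input layout are all hypothetical choices made for illustration; the energy terms actually used in SCSA are not specified in the abstract. Each residue's rotamer probabilities are Boltzmann-weighted by the mean field from all other residues, and the side-chain conformational entropy is then estimated from those probabilities as -Σ p ln p.

```python
import numpy as np

KT = 0.593  # kcal/mol near 298 K; an assumed value for illustration


def scmf_rotamer_probabilities(self_energy, pair_energy, kT=KT,
                               damping=0.5, tol=1e-8, max_iter=1000):
    """Iterate rotamer probabilities to self-consistency (generic mean-field sketch).

    self_energy: list of 1-D arrays; self_energy[i][a] is the energy of
                 rotamer a of residue i in the fixed environment.
    pair_energy: dict mapping (i, j) with i < j to a 2-D array E[a, b], the
                 interaction energy of rotamer a of i with rotamer b of j.
    Returns a list of 1-D probability arrays, one per residue.
    """
    n = len(self_energy)
    # Start from uniform probabilities over each residue's rotamers.
    probs = [np.full(len(e), 1.0 / len(e)) for e in self_energy]
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Mean field on residue i: self energy plus the
            # probability-weighted interactions with every other residue.
            field = np.asarray(self_energy[i], dtype=float).copy()
            for j in range(n):
                if j == i:
                    continue
                if (i, j) in pair_energy:
                    field += np.asarray(pair_energy[(i, j)]) @ probs[j]
                elif (j, i) in pair_energy:
                    field += np.asarray(pair_energy[(j, i)]).T @ probs[j]
            # Boltzmann weights, shifted by the minimum for numerical stability.
            weights = np.exp(-(field - field.min()) / kT)
            target = weights / weights.sum()
            # Damped update to help convergence.
            updated = damping * target + (1.0 - damping) * probs[i]
            max_change = max(max_change, np.abs(updated - probs[i]).max())
            probs[i] = updated
        if max_change < tol:
            break
    return probs


def conformational_entropy(p):
    """Entropy of one residue in units of the Boltzmann constant: -sum p ln p."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0.0]
    return float(-(nonzero * np.log(nonzero)).sum())
```

In such a scheme, the entropy lost on binding at a given position could be estimated as the difference between the entropies computed from the free and bound rotamer distributions, which is one way to read the abstract's statement that entropy loss at anchor positions offsets favorable interactions.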



Author information


Correspondence to Zhicai Shang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

10822_2008_9245_MOESM1_ESM.doc

Supplementary material: sequences and experimental and calculated binding affinities of 419 HLA-A*0201-restricted CTL epitopes, as well as 26 crystal structures of HLA-A*0201–peptide complexes, are provided as supporting information. (DOC 509 kb)


About this article

Cite this article

Zhou, P., Chen, X. & Shang, Z. Side-chain conformational space analysis (SCSA): A multi conformation-based QSAR approach for modeling and prediction of protein–peptide binding affinities. J Comput Aided Mol Des 23, 129–141 (2009). https://doi.org/10.1007/s10822-008-9245-0


  • DOI: https://doi.org/10.1007/s10822-008-9245-0
