P2RANK: Knowledge-Based Ligand Binding Site Prediction Using Aggregated Local Features

  • Conference paper
Algorithms for Computational Biology (AlCoB 2015)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9199)

Abstract

Knowledge of protein-ligand binding sites is a vital prerequisite for any structure-based virtual screening campaign. If no prior knowledge about binding sites is available, ligand binding site prediction methods are the only way to obtain the necessary information. Here we introduce P2RANK, a novel machine learning-based method for predicting ligand binding sites from protein structure. P2RANK uses a Random Forests learner to infer the ligandability of local chemical neighborhoods near the protein surface. Each neighborhood is represented by a specific near-surface point and described by aggregating physico-chemical features projected onto that point from neighboring protein atoms. Points with high predicted ligandability are clustered and ranked to obtain the resulting list of binding site predictions. The new method was compared with Fpocket, a state-of-the-art binding site prediction method, on three representative datasets. The results show that P2RANK outperforms Fpocket by 10 to 20 percentage points on all the datasets. Moreover, since P2RANK does not rely on any external software for the computation of complex features, such as sequence conservation scores or binding energies, it represents an ideal tool for inclusion in future structural bioinformatics pipelines.
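The final step described above, clustering points with high predicted ligandability and ranking the resulting clusters, can be illustrated with a minimal sketch. It is not the P2RANK implementation: the function name `cluster_and_rank`, the score threshold, the single-linkage distance cutoff, and the use of the sum of squared scores as a cluster score are all assumptions made for illustration, since the abstract does not specify them. The ligandability scores are taken as given, for example as the output of some trained classifier.

```python
import math
from typing import List, Sequence, Tuple


def cluster_and_rank(
    points: Sequence[Sequence[float]],
    scores: Sequence[float],
    threshold: float = 0.5,   # assumed score cutoff, not from the paper
    cutoff: float = 3.0,      # assumed linkage distance, not from the paper
) -> List[Tuple[List[int], float]]:
    """Group high-scoring points by single linkage and rank the groups.

    Returns a list of (member indices, cluster score) pairs, best first.
    """
    # Keep only the points whose predicted ligandability passes the threshold.
    kept = [i for i, s in enumerate(scores) if s >= threshold]
    unvisited = set(kept)
    clusters: List[List[int]] = []

    for seed in kept:
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        stack = [seed]
        members: List[int] = []
        # Grow the cluster: any unvisited point within `cutoff` of a member joins it.
        while stack:
            i = stack.pop()
            members.append(i)
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= cutoff]
            for j in near:
                unvisited.remove(j)
                stack.append(j)
        clusters.append(sorted(members))

    # Assumed cluster score: sum of squared point scores (rewards confident points).
    def cluster_score(members: List[int]) -> float:
        return sum(scores[i] ** 2 for i in members)

    ranked = sorted(clusters, key=cluster_score, reverse=True)
    return [(members, cluster_score(members)) for members in ranked]


# Toy input: two groups of nearby points and one isolated low-scoring point.
example_points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (20, 0, 0)]
example_scores = [0.9, 0.8, 0.7, 0.95, 0.6, 0.2]
predicted_sites = cluster_and_rank(example_points, example_scores)
```

In this toy input the low-scoring point is discarded by the threshold, and the three-point group ranks above the two-point group because its summed squared score is larger. A different cluster score, such as the mean or maximum point score, would change the ranking behavior.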

Acknowledgments

This work was supported by the Czech Science Foundation grant 14-29032P, by project SVV-2015-260222, and by the Charles University in Prague, project GA UK No. 174615.

Author information

Corresponding author

Correspondence to Radoslav Krivák.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Krivák, R., Hoksza, D. (2015). P2RANK: Knowledge-Based Ligand Binding Site Prediction Using Aggregated Local Features. In: Dediu, AH., Hernández-Quiroz, F., Martín-Vide, C., Rosenblueth, D. (eds) Algorithms for Computational Biology. AlCoB 2015. Lecture Notes in Computer Science, vol 9199. Springer, Cham. https://doi.org/10.1007/978-3-319-21233-3_4

  • DOI: https://doi.org/10.1007/978-3-319-21233-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21232-6

  • Online ISBN: 978-3-319-21233-3

  • eBook Packages: Computer Science, Computer Science (R0)
