P2RANK: Knowledge-Based Ligand Binding Site Prediction Using Aggregated Local Features

Krivák, Radoslav; Hoksza, David

doi:10.1007/978-3-319-21233-3_4

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9199)

Included in the following conference series:

International Conference on Algorithms for Computational Biology

Abstract

The knowledge of protein-ligand binding sites is a vital prerequisite for any structure-based virtual screening campaign. If no prior knowledge about binding sites is available, ligand-binding site prediction methods are the only way to obtain the necessary information. Here we introduce P2RANK, a novel machine learning-based method for predicting ligand binding sites from protein structure. P2RANK uses a Random Forests learner to infer the ligandability of local chemical neighborhoods near the protein surface. These neighborhoods are represented by specific near-surface points and described by aggregating physico-chemical features projected onto those points from neighboring protein atoms. The points with high predicted ligandability are clustered and ranked to obtain the resulting list of binding site predictions. The new method was compared with Fpocket, a state-of-the-art binding site prediction method, on three representative datasets. The results show that P2RANK outperforms Fpocket by 10 to 20 percentage points on all the datasets. Moreover, since P2RANK does not rely on any external software to compute complex features such as sequence conservation scores or binding energies, it is an ideal tool for inclusion in future structural bioinformatics pipelines.
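The pipeline outlined in the abstract (aggregate atom features onto near-surface points, score each point, then cluster and rank the high-scoring points) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the linear distance weighting, the 6 Å neighborhood radius, the 0.5 score threshold, the 3 Å clustering cutoff, and the sum-of-squared-scores ranking are all assumptions chosen for illustration, and the Random Forests step is replaced by precomputed per-point scores.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Atom = Tuple[Point, Sequence[float]]  # (position, feature vector)


def aggregate_features(point: Point, atoms: Sequence[Atom],
                       radius: float = 6.0) -> List[float]:
    """Project atom features onto a surface point.

    Assumption: a linear distance weighting within `radius`;
    the abstract does not specify the weighting scheme.
    """
    size = len(atoms[0][1]) if atoms else 0
    total = [0.0] * size
    for position, features in atoms:
        distance = math.dist(point, position)
        if distance <= radius:
            weight = 1.0 - distance / radius
            for i, value in enumerate(features):
                total[i] += weight * value
    return total


def cluster_points(points: Sequence[Point],
                   cutoff: float = 3.0) -> List[List[int]]:
    """Single-linkage clustering: points within `cutoff` are linked."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            current = stack.pop()
            near = [j for j in unvisited
                    if math.dist(points[current], points[j]) <= cutoff]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        clusters.append(sorted(members))
    return clusters


def rank_sites(points: Sequence[Point], scores: Sequence[float],
               threshold: float = 0.5,
               cutoff: float = 3.0) -> List[Tuple[float, List[int]]]:
    """Keep high-scoring points, cluster them, and rank the clusters.

    `scores` stands in for the ligandability predicted by a trained
    classifier. Ranking by the sum of squared scores is an assumption.
    """
    kept = [i for i, score in enumerate(scores) if score >= threshold]
    kept_points = [points[i] for i in kept]
    sites = []
    for cluster in cluster_points(kept_points, cutoff):
        indices = [kept[k] for k in cluster]
        site_score = sum(scores[i] ** 2 for i in indices)
        sites.append((site_score, indices))
    sites.sort(key=lambda site: site[0], reverse=True)
    return sites
```

In a complete pipeline, the vectors returned by `aggregate_features` would be passed to a trained classifier to produce `scores`. Here `rank_sites` takes the scores directly so that the clustering and ranking logic can be read on its own.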

Acknowledgments

This work was supported by the Czech Science Foundation grant 14-29032P, by project SVV-2015-260222, and by Charles University in Prague, project GA UK No. 174615.

Author information

Authors and Affiliations

FMP, Department of Software Engineering, Charles University in Prague, Malostranské nám. 25, 118 00, Prague, Czech Republic
Radoslav Krivák & David Hoksza

Corresponding author

Correspondence to Radoslav Krivák.

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
Adrian-Horia Dediu
Faculty of Science, National Autonomous University of Mexico - UNAM, Mexico City, Mexico
Francisco Hernández-Quiroz
Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide
Institute for Research in Applied Mathematics and Systems – IIMAS, National Autonomous University of Mexico - UNAM, Mexico City, Mexico
David A. Rosenblueth

About this paper

Cite this paper

Krivák, R., Hoksza, D. (2015). P2RANK: Knowledge-Based Ligand Binding Site Prediction Using Aggregated Local Features. In: Dediu, AH., Hernández-Quiroz, F., Martín-Vide, C., Rosenblueth, D. (eds) Algorithms for Computational Biology. AlCoB 2015. Lecture Notes in Computer Science, vol 9199. Springer, Cham. https://doi.org/10.1007/978-3-319-21233-3_4

DOI: https://doi.org/10.1007/978-3-319-21233-3_4
Published: 28 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21232-6
Online ISBN: 978-3-319-21233-3
eBook Packages: Computer Science, Computer Science (R0)