Surrogate docking: structure-based virtual screening at high throughput speed

Journal of Computer-Aided Molecular Design

Summary

Structure-based screening using fully flexible docking is still too slow for large molecular libraries. High-quality docking of a million-molecule library can take days even on a cluster with hundreds of CPUs. This performance issue prohibits the use of fully flexible docking in the design of large combinatorial libraries. We have developed a fast structure-based screening method that docks a limited number of compounds to build a 2D QSAR model, which is then used to rapidly score the rest of the database. We compare here a model based on radial basis functions and a Bayesian categorization model. The number of compounds that actually need to be docked depends on the number of docking hits found. In our case studies, models of reasonable quality are built once enough molecules have been docked to yield 50 docking hits. The rest of the library is screened by the QSAR model. Optionally, a fraction of the QSAR-prioritized library can be docked in order to find the true docking hits. The quality of the model depends only on the training set size, not on the size of the library to be screened. Therefore, for larger libraries the method yields a higher gain in speed with no change in performance. Prioritizing a large library with these models provides a significant enrichment in docking hits: the enrichment reaches 13 and 35 at the beginning of the score-sorted libraries in our two case studies, screening of the NCI collection and of a combinatorial library on the CDK2 kinase structure. With such enrichments, only a fraction of the database must actually be docked to find many of the true hits. The throughput of the method allows its use in screening large compound collections and in designing large combinatorial libraries. The proposed strategy has an important effect on efficiency but does not affect retrieval of actives, the latter being determined by the quality of the docking method itself.
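
The workflow outlined in the summary can be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not the authors' implementation. The function names and the `dock`, `is_hit` and `features` callables are hypothetical placeholders for a real docking program, a docking-score cutoff and a 2D descriptor generator. The classifier is a plain naive Bayes with add-one smoothing over binary features, standing in for the Bayesian categorization model. Enrichment is taken here as the hit rate in the top-ranked fraction divided by the hit rate in the whole library, a common definition that the summary itself does not spell out.

```python
import math
from collections import Counter


def dock_until_n_hits(library, dock, is_hit, n_hits=50):
    """Dock molecules in order until n_hits docking hits have been found.

    Returns (training, remainder): training is a list of (mol, hit_flag)
    pairs, remainder is the undocked part of the library.
    """
    training, found = [], 0
    for i, mol in enumerate(library):
        hit = bool(is_hit(dock(mol)))
        training.append((mol, hit))
        found += hit
        if found >= n_hits:
            return training, library[i + 1:]
    return training, []


def train_naive_bayes(training, features):
    """Fit per-feature log-odds weights (hit vs. non-hit), add-one smoothed.

    Assumption: features(mol) yields hashable 2D descriptors, for example
    fingerprint bits. This is a simplified stand-in model.
    """
    n_hit = sum(1 for _, hit in training if hit)
    n_miss = len(training) - n_hit
    in_hits, in_misses = Counter(), Counter()
    for mol, hit in training:
        (in_hits if hit else in_misses).update(set(features(mol)))
    weights = {}
    for f in set(in_hits) | set(in_misses):
        p_hit = (in_hits[f] + 1) / (n_hit + 2)
        p_miss = (in_misses[f] + 1) / (n_miss + 2)
        weights[f] = math.log(p_hit / p_miss)
    return weights


def surrogate_score(mol, weights, features):
    """Score an undocked molecule by summing weights of its known features."""
    return sum(weights.get(f, 0.0) for f in set(features(mol)))


def enrichment(ranked_hit_flags, top_fraction):
    """Hit rate in the top fraction of a ranked list over the overall hit rate."""
    n_top = max(1, int(len(ranked_hit_flags) * top_fraction))
    total_hits = sum(ranked_hit_flags)
    if total_hits == 0:
        return 0.0
    top_rate = sum(ranked_hit_flags[:n_top]) / n_top
    return top_rate / (total_hits / len(ranked_hit_flags))
```

In use, the remainder returned by `dock_until_n_hits` would be sorted by `surrogate_score` in descending order, and only the top of that list would be passed back to the docking program. Because training cost is fixed by the number of docked molecules, the cost of this sketch grows with library size only through the cheap scoring step.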

Acknowledgement

The authors gratefully acknowledge Dr Weida Tong (National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR) for providing us the experimental data set and 2D structures of Estrogen Receptor compounds used in this study.

Author information

Corresponding author

Correspondence to Anton Filikov.

Cite this article

Yoon, S., Smellie, A., Hartsough, D. et al. Surrogate docking: structure-based virtual screening at high throughput speed. J Comput Aided Mol Des 19, 483–497 (2005). https://doi.org/10.1007/s10822-005-9002-6

  • DOI: https://doi.org/10.1007/s10822-005-9002-6
