
Evaluation of different virtual screening strategies on the basis of compound sets with characteristic core distributions and dissimilarity relationships

Journal of Computer-Aided Molecular Design

Abstract

In this work, computational compound screening strategies based on two- and three-dimensional (2D and 3D) molecular representations were investigated, including similarity searching and support vector machine (SVM) ranking. Calculations based on topological fingerprints and on molecular shape queries and features were compared. A unique aspect of the analysis, setting it apart from previous comparisons of 2D and 3D virtual screening approaches, was the design of compound reference, training, and test data sets with controlled incremental increases in intra-set structural diversity and different categories of structural relationships between reference/training and test sets. The use of these data sets made it possible to assess the relative performance of 2D and 3D screening strategies under increasingly challenging conditions, ultimately leading to the use of training and test sets with essentially unrelated structures. The results showed that 3D similarity searching had little advantage over 2D searching in identifying active compounds with remote structural relationships. However, 3D SVM models trained on the basis of shape features were superior to other approaches (including 2D SVM) when the detection of structure–activity relationships became increasingly challenging. Such 3D SVM methods have thus far been little investigated in virtual screening, providing a wealth of opportunities for further analyses.
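As a point of orientation, 2D similarity searching with topological fingerprints can be sketched as follows. This is a minimal illustration, not the protocol used in the study: fingerprints are modeled as plain sets of "on" bit positions, the compound names and bit sets are invented toy data, and scoring each database compound by its maximum Tanimoto similarity to any reference compound is an assumed fusion rule chosen for simplicity.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modeled as the set of its "on" bit positions (toy assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(
    references: List[Fingerprint],
    database: Dict[str, Fingerprint],
) -> List[Tuple[str, float]]:
    """Score each database compound by its best match to any reference
    (maximum fusion, an assumption) and sort by decreasing score."""
    scored = [
        (name, max(tanimoto(fp, ref) for ref in references))
        for name, fp in database.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical reference (known active) fingerprints and screening database.
references = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5})]
database = {
    "cpd_A": frozenset({1, 2, 3, 4, 6}),
    "cpd_B": frozenset({7, 8, 9}),
    "cpd_C": frozenset({2, 3, 5, 9}),
}

ranking = rank_by_similarity(references, database)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

Compounds that share no bits with any reference, such as the hypothetical cpd_B, fall to the bottom of the ranking. This is the situation that becomes more common as test compounds grow structurally remote from the reference set.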





Acknowledgements

We thank OpenEye Scientific Software, Inc., for providing a free academic license of the OpenEye chemistry toolkits.

Author information

Corresponding author

Correspondence to Tomoyuki Miyao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Miyao, T., Jasial, S., Bajorath, J. et al. Evaluation of different virtual screening strategies on the basis of compound sets with characteristic core distributions and dissimilarity relationships. J Comput Aided Mol Des 33, 729–743 (2019). https://doi.org/10.1007/s10822-019-00218-8

  • DOI: https://doi.org/10.1007/s10822-019-00218-8
