Extraction and validation of substructure profiles for enriching compound libraries

Journal of Computer-Aided Molecular Design

Abstract

Compounds known to be potent against a specific protein target may contain a signature profile of common substructures that is highly correlated with their potency. Such substructure profiles may be useful for enriching compound libraries or for prioritizing compounds against a specific protein target. With this objective in mind, a set of compounds with known potency against six selected kinases (two from each of three kinase families) was used to generate binary molecular fingerprints. Each fingerprint key represents a substructure found within a compound, and the frequency with which each key occurs was tabulated. A frequent pattern mining technique was then applied to uncover substructures that are well represented among known potent inhibitors but unrepresented among known inactive compounds, and vice versa. Substructure profiles representative of potent inhibitors against each of the three kinase families were thus extracted. In our validation, these substructure profiles demonstrated significant enrichment for highly potent compounds against their respective kinase targets. The advantages of this approach over conventional methods for analyzing such datasets, and its application to mining substructures for enriching compound libraries, are presented.
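The workflow summarized in the abstract (tabulate fingerprint key frequencies, keep key sets that are frequent among potent compounds and rare among inactives, then check enrichment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the support thresholds, the restriction to key sets of size one or two, the toy data, and the enrichment-factor definition (hit rate among profile-matching compounds divided by the overall hit rate) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def support(compounds, max_size=2):
    """Fraction of compounds containing each key set of size 1..max_size.

    `compounds` is a list of sets; each set holds the "on" fingerprint keys
    of one compound (hypothetical integer key IDs).
    """
    counts = Counter()
    for keys in compounds:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(keys), size):
                counts[itemset] += 1
    n = len(compounds)
    return {itemset: c / n for itemset, c in counts.items()}


def contrast_profile(potent, inactive, min_potent=0.6, max_inactive=0.2,
                     max_size=2):
    """Key sets frequent among potent compounds and rare among inactives.

    The two thresholds are illustrative assumptions, not values from the paper.
    """
    sup_potent = support(potent, max_size)
    sup_inactive = support(inactive, max_size)
    return sorted(
        itemset
        for itemset, freq in sup_potent.items()
        if freq >= min_potent and sup_inactive.get(itemset, 0.0) <= max_inactive
    )


def enrichment_factor(profile, compounds, labels):
    """Hit rate among compounds matching any profile key set / overall hit rate.

    `labels` holds 1 for potent and 0 for inactive compounds.
    """
    matched = [
        label
        for keys, label in zip(compounds, labels)
        if any(set(itemset) <= keys for itemset in profile)
    ]
    if not matched or not any(labels):
        return 0.0
    return (sum(matched) / len(matched)) / (sum(labels) / len(labels))


# Toy training data: on-bit key IDs per compound (entirely made up).
potent = [{1, 2, 3}, {1, 2}, {1, 2, 4}, {2, 5}]
inactive = [{3, 4}, {4, 5}, {2, 3}, {5}]

profile = contrast_profile(potent, inactive)

# Toy validation set with known labels (1 = potent, 0 = inactive).
validation = [{1, 2}, {1, 5}, {3, 4}, {2, 3}, {4, 5}, {1, 2, 3}]
labels = [1, 0, 0, 0, 0, 1]

ef = enrichment_factor(profile, validation, labels)
print("profile:", profile)
print("enrichment factor:", ef)
```

In this toy example key 2 is dropped from the profile even though every potent compound contains it, because it also appears in a quarter of the inactives. That is the contrast the abstract describes. A real analysis would use a proper frequent-pattern miner and far larger key sets than this brute-force enumeration can handle.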

Acknowledgments

The PhD scholarship awarded to WKY from the Novartis Institute for Tropical Diseases is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Wee Kiang Yeo.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 358 kb)

About this article

Cite this article

Yeo, W.K., Go, M.L. & Nilar, S. Extraction and validation of substructure profiles for enriching compound libraries. J Comput Aided Mol Des 26, 1127–1141 (2012). https://doi.org/10.1007/s10822-012-9604-8

  • DOI: https://doi.org/10.1007/s10822-012-9604-8
