Managing missing measurements in small-molecule screens

Journal of Computer-Aided Molecular Design

Abstract

In a typical high-throughput screening (HTS) campaign, less than 1% of the small-molecule library is characterized by confirmatory experiments. As much as 99% of the library’s molecules are set aside and excluded from downstream analysis, although some of these molecules would prove active if they were sent for confirmatory testing. These missing experimental measurements prevent screeners from identifying active molecules. In this study, we propose managing missing measurements using imputation, a powerful technique from the machine learning community, to fill in accurate guesses where measurements are missing. We then use these imputed measurements to construct an imputed visualization of HTS results, based on the scaffold tree visualization from the literature. This imputed visualization identifies almost all groups of active molecules from an HTS, even those that would otherwise be missed. We validate our methodology by simulating HTS experiments using the data from eight quantitative HTS campaigns, and we discuss the implications for drug discovery. In particular, this method can rapidly and economically identify novel active molecules, each of which could have novel function in either binding or selectivity in addition to representing new intellectual property.
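
As a rough illustration of what imputation means here, the sketch below fills in a missing activity value from the measured values of structurally similar molecules. The abstract does not specify the imputation model, so this is not the authors' method: the similarity-weighted nearest-neighbor scheme, the set-based fingerprints, and the names `tanimoto`, `impute_missing`, and `k` are all assumptions chosen for brevity.

```python
from typing import Dict, FrozenSet, Optional


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def impute_missing(
    fingerprints: Dict[str, FrozenSet[int]],
    measured: Dict[str, float],
    k: int = 2,
) -> Dict[str, Optional[float]]:
    """Return measured values plus guesses for unmeasured molecules.

    Hypothetical scheme: each guess is the similarity-weighted mean
    activity of the k most similar measured molecules. A molecule with
    no similar measured neighbor is left as None.
    """
    completed: Dict[str, Optional[float]] = dict(measured)
    for mol, fp in fingerprints.items():
        if mol in measured:
            continue  # keep real measurements untouched
        # Score every measured molecule by similarity to this one.
        scored = [
            (tanimoto(fp, fingerprints[other]), value)
            for other, value in measured.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        neighbors = scored[:k]
        total = sum(sim for sim, _ in neighbors)
        if total > 0:
            completed[mol] = sum(sim * val for sim, val in neighbors) / total
        else:
            completed[mol] = None
    return completed


# Toy library: molecule D was never sent for confirmatory testing.
fingerprints = {
    "A": frozenset({1, 2, 3}),
    "B": frozenset({1, 2, 4}),
    "C": frozenset({7, 8}),
    "D": frozenset({1, 2, 3, 4}),
}
measured = {"A": 1.0, "B": 0.0, "C": 0.0}
completed = impute_missing(fingerprints, measured, k=2)
```

In this toy case, D is equally similar to the active molecule A and the inactive molecule B, so its imputed activity falls halfway between them. A real campaign would likely use chemical fingerprints from a cheminformatics toolkit and a trained predictive model rather than this simple neighbor average.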

Acknowledgments

MRB collaborated with SJS to write the initial manuscript. MRB implemented the imputed tree based on an idea by SJS and ran most of the experiments. BTC prepared the imputed data downloaded from PubChem. Edward Holson provided helpful comments and edits to the manuscript. The Pathology and Immunology Department at Washington University in St. Louis supports BTC, MRB, and SJS. Marvin 5.3.5 (2010, ChemAxon, http://www.chemaxon.com) was used to generate the chemical structures in Fig. 4.

Conflict of interest

The authors declare that they have no conflicts of interest to disclose.

Author information

Corresponding author

Correspondence to S. Joshua Swamidass.

About this article

Cite this article

Browning, M.R., Calhoun, B.T. & Swamidass, S.J. Managing missing measurements in small-molecule screens. J Comput Aided Mol Des 27, 469–478 (2013). https://doi.org/10.1007/s10822-013-9642-x

  • DOI: https://doi.org/10.1007/s10822-013-9642-x
