Abstract
Data-driven decision making is a key element of today’s pharmaceutical research, including early drug discovery. It comprises questions such as which target to pursue, which chemical series to pursue, which compound to make next, or which compound to select for advanced profiling and promotion to pre-clinical development. In this paper we exemplify how data integrity, i.e. the context in which data is generated and the auxiliary information that is provided for individual result records, can influence decision making in early lead discovery programs. In addition, we describe some approaches that we pursue at Boehringer Ingelheim to reduce the risk of being misled.
Acknowledgments
We would like to thank our colleagues Ralf Heilker, Helmut Romig, Thilo Fligge, and Frank Dullweber for stimulating discussions.
Cite this article
Beck, B., Seeliger, D. & Kriegl, J.M. The impact of data integrity on decision making in early lead discovery. J Comput Aided Mol Des 29, 911–921 (2015). https://doi.org/10.1007/s10822-015-9871-2
DOI: https://doi.org/10.1007/s10822-015-9871-2