Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis

Abstract

Chemical creativity in the design of new synthetic chemical entities (NCEs) with drug-like properties has been the domain of medicinal chemists. Here, we explore the capability of a chemistry-savvy machine intelligence to generate synthetically accessible molecules. DINGOS (design of innovative NCEs generated by optimization strategies) is a virtual assembly method that combines a rule-based approach with a machine learning model trained on successful synthetic routes described in chemical patent literature. This unique combination enables a balance between ligand-similarity-based generation of innovative compounds by scaffold hopping and the forward-synthetic feasibility of the designs. In a prospective proof-of-concept application, DINGOS successfully produced sets of de novo designs for four approved drugs that were in agreement with the desired structural and physicochemical properties. Target prediction indicated more than 50% of the designs to be biologically active. Four selected computer-generated compounds were successfully synthesized in accordance with the synthetic route proposed by DINGOS. The results of this study demonstrate the capability of machine learning models to capture implicit chemical knowledge from chemical reaction data and suggest feasible syntheses of new chemical matter.
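
The interplay described above, in which reaction rules decide which building blocks may be combined and similarity to a template ligand decides which product to keep, can be illustrated with a toy example. This is only a sketch under loud assumptions and not the DINGOS implementation: the feature sets standing in for fingerprints, the single amide-coupling entry in `RULES`, the Tanimoto score, and all function names are hypothetical, and the machine learning model mentioned in the abstract is not represented at all.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Block:
    name: str
    features: frozenset  # toy stand-in for a molecular fingerprint (assumption)


# Hypothetical rule table: (group on A, group on B) -> group formed in the product.
RULES = {("amine", "carboxylic_acid"): "amide"}


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def react(a, b):
    """Yield a product feature set for every rule that matches the ordered pair (a, b)."""
    for (group_a, group_b), formed in RULES.items():
        if group_a in a.features and group_b in b.features:
            # Consume the reacting groups and add the newly formed one.
            yield (a.features - {group_a}) | (b.features - {group_b}) | {formed}


def best_single_step(template, blocks):
    """Return (score, name_a, name_b) for the rule-compatible pair whose
    product is most similar to the template, or None if no rule applies."""
    best = None
    for a, b in permutations(blocks, 2):
        for product in react(a, b):
            candidate = (tanimoto(product, template), a.name, b.name)
            if best is None or candidate > best:
                best = candidate
    return best


# Toy template and building blocks (illustrative feature labels only).
template = frozenset({"amide", "phenyl", "pyridine"})
blocks = [
    Block("aniline", frozenset({"amine", "phenyl"})),
    Block("nicotinic_acid", frozenset({"carboxylic_acid", "pyridine"})),
    Block("acetic_acid", frozenset({"carboxylic_acid", "methyl"})),
]
result = best_single_step(template, blocks)
print(result)
```

In this toy setting the rule table excludes pairs that cannot react, so every product that is scored comes with a one-step route by construction; the similarity score then only ranks products that are already synthetically plausible under the rules.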

Fig. 1: Overview of the DINGOS software.
Fig. 2: Representation of the single-step molecule assembly procedure.
Fig. 3: Flow chart summarizing the ith iteration of the DINGOS algorithm.
Fig. 4: Distance comparison of the DINGOS, ChEMBL bioactive and construction sets.
Fig. 5: Selected de novo designs generated by DINGOS.

Data availability

The trained machine learning model, CAS numbers of the training data and reaction SMARTS used in this Article are provided in the Code Ocean capsule https://doi.org/10.24433/CO.6930970.v1 (ref. 32). All molecules were preprocessed in accordance with the procedure stated in the Methods (see ‘Molecular building blocks’ section).

Code availability

The code for this Article, along with an accompanying computational environment, is available and executable online as a Code Ocean capsule: https://doi.org/10.24433/CO.6930970.v1 (ref. 32).

References

  1. Shih, H.-P., Zhang, X. & Aronov, A. M. Drug discovery effectiveness from the standpoint of therapeutic mechanisms and indications. Nat. Rev. Drug Discov. 17, 19–33 (2017).

  2. Hartenfeller, M. & Schneider, G. Enabling future drug discovery by de novo design. Wiley Interdiscip. Rev. Comput. Mol. Sci. 1, 742–759 (2011).

  3. Blakemore, D. C. et al. Organic synthesis provides opportunities to transform drug discovery. Nat. Chem. 10, 383–394 (2018).

  4. Schneider, P. & Schneider, G. De novo design at the edge of chaos. J. Med. Chem. 59, 4077–4086 (2016).

  5. Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2013).

  6. Chen, H., Engkvist, O., Wang, Y., Olivecrona, M. & Blaschke, T. The rise of deep learning in drug discovery. Drug Discov. Today 23, 1241–1250 (2018).

  7. Merk, D., Friedrich, L., Grisoni, F. & Schneider, G. De novo design of bioactive small molecules by artificial intelligence. Mol. Inform. 37, 1700153 (2018).

  8. Gupta, A. et al. Generative recurrent networks for de novo drug design. Mol. Inform. 37, 1700111 (2018).

  9. Merk, D., Grisoni, F., Friedrich, L. & Schneider, G. Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators. Commun. Chem. 1, 68 (2018).

  10. Lowe, D. M. Chemical reactions from US patents (1976–Sep2016) (2017); https://figshare.com/articles/Chemical_reactions_from_US_patents_1976-Sep2016_/5104873

  11. Coley, C. W., Rogers, L., Green, W. H. & Jensen, K. F. Computer-assisted retrosynthesis based on molecular similarity. ACS Cent. Sci. 3, 1237–1245 (2017).

  12. Feng, F., Lai, L. & Pei, J. Computational chemical synthesis analysis and pathway design. Front. Chem. 6, 199 (2018).

  13. Szymkuć, S. et al. Computer-assisted synthetic planning: the end of the beginning. Angew. Chem. Int. Ed. 55, 5904–5937 (2016).

  14. Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).

  15. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).

  16. Grisoni, F. et al. Scaffold hopping from natural products to synthetic mimetics by holistic molecular similarity. Commun. Chem. 1, 44 (2018).

  17. Merk, D., Grisoni, F., Friedrich, L., Gelzinyte, E. & Schneider, G. Scaffold hopping from synthetic RXR modulators by virtual screening and de novo design. Med. Chem. Commun. 9, 1289–1292 (2018).

  18. Grisoni, F., Merk, D., Byrne, R. & Schneider, G. Scaffold-hopping from synthetic drugs by holistic molecular representation. Sci. Rep. 8, 16469 (2018).

  19. MACCS-II (MDL Information Systems, 1987).

  20. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), 1–13 (2015).

  21. Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Res. 45, D945–D954 (2017).

  22. ChEMBL Database (EBI, 2017); https://www.ebi.ac.uk/chembl/

  23. Johnson, M. A. & Maggiora, G. M. Concepts and Applications of Molecular Similarity (Wiley, 1990).

  24. Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).

  25. Reker, D., Rodrigues, T., Schneider, P. & Schneider, G. Identifying the macromolecular targets of de novo-designed chemical entities through self-organizing map consensus. Proc. Natl Acad. Sci. USA 111, 4067–4072 (2014).

  26. Reutlinger, M. et al. Chemically advanced template search (CATS) for scaffold-hopping and prospective target prediction for ‘orphan’ molecules. Mol. Inform. 32, 133–138 (2013).

  27. Molecular Operating Environment (MOE) (Chemical Computing Group, 2017).

  28. O’Boyle, N. M. & Sayle, R. A. Comparing structural fingerprints using a literature-based similarity benchmark. J. Cheminform. 8, 1–14 (2016).

  29. RDKit: Open-source Cheminformatics (RDKit); www.rdkit.org

  30. Reaxys (Elsevier).

  31. Wolber, G. & Langer, T. LigandScout: 3D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J. Chem. Inf. Model. 45, 160–169 (2005).

  32. Button, A., Merk, D., Hiss, J. A. & Schneider, G. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Code Ocean (2019); https://doi.org/10.24433/CO.6930970.v1

Acknowledgements

The authors thank L. Friedrich, C. Brunner, B. Huisman, X. Zhang and R. Byrne for stimulating discussions and technical support. D.M. was financially supported by an ETH Zurich Postdoctoral Fellowship (grant no. 16–2 FEL-07). This research was financially supported by the Swiss National Science Foundation (grant no. 205321_182176 to G.S.).

Author information

Contributions

A.B. programmed the software and performed the computational experiments. A.B., J.A.H. and G.S. designed the algorithm and analysed the data. D.M. supervised the chemical part of the study and, together with A.B., synthesized the compounds. G.S. designed the study. All authors analysed the results and contributed to the manuscript.

Corresponding author

Correspondence to Gisbert Schneider.

Ethics declarations

Competing interests

G.S. declares a potential conflict of interest in his role as life-science industry consultant and cofounder of inSili.com GmbH, Zurich. No other competing interests are declared.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary material

Supplementary figures and tables

About this article

Cite this article

Button, A., Merk, D., Hiss, J.A. et al. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Nat Mach Intell 1, 307–315 (2019). https://doi.org/10.1038/s42256-019-0067-7

  • DOI: https://doi.org/10.1038/s42256-019-0067-7
