Ring system-based chemical graph generation for de novo molecular design

Journal of Computer-Aided Molecular Design

Abstract

Generating chemical graphs in silico by combining building blocks is a fundamental task in virtual combinatorial chemistry. A basic requirement is that the generated structures be non-redundant as well as exhaustive. In this study, we develop structure generation algorithms that combine ring systems as well as atom fragments. The proposed algorithms consist of three parts. First, chemical structures are generated through a canonical construction path; during generation, ring systems can be treated as reduced graphs with fewer vertices than the original ring systems. Second, diversified structures are generated by a simple rule-based generation algorithm. Third, the number of structures to be generated can be estimated with adequate accuracy without actual exhaustive generation. The proposed algorithms were implemented in the structure generator Molgilla. As a practical application, Molgilla generated chemical structures mimicking rosiglitazone in terms of a two-dimensional pharmacophore pattern. The strength of the algorithms lies in their simplicity and flexibility; they may therefore be applied to various computer programs that generate structures by combining building blocks.
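The idea of treating a ring system as a reduced graph can be illustrated with a minimal sketch. It is not the Molgilla implementation: the abstract does not say how the reduction is defined, so the code below assumes the simplest possible case, in which a whole ring system is contracted into a single vertex. The function name `contract_ring_system`, the adjacency-dictionary representation, and the label "R" are all hypothetical choices made for this illustration.

```python
from typing import Dict, Hashable, Set

# A molecular graph as an adjacency dictionary: vertex -> set of neighbours.
Graph = Dict[Hashable, Set[Hashable]]


def contract_ring_system(graph: Graph, ring_atoms: Set[Hashable], label: Hashable) -> Graph:
    """Return a new graph in which all `ring_atoms` are merged into one vertex `label`.

    Assumption for illustration only: the ring system collapses to a single
    vertex. A real generator may keep more detail, such as attachment points.
    """
    reduced: Graph = {label: set()}
    for atom, neighbours in graph.items():
        if atom in ring_atoms:
            # Bonds leaving the ring system become bonds of the merged vertex.
            reduced[label] |= {n for n in neighbours if n not in ring_atoms}
        else:
            # Bonds pointing into the ring system are redirected to the merged vertex.
            reduced[atom] = {label if n in ring_atoms else n for n in neighbours}
    return reduced


# Heavy-atom graph of toluene: methyl carbon 0 bonded to benzene ring atoms 1..6.
toluene: Graph = {
    0: {1},
    1: {0, 2, 6},
    2: {1, 3},
    3: {2, 4},
    4: {3, 5},
    5: {4, 6},
    6: {5, 1},
}

reduced = contract_ring_system(toluene, {1, 2, 3, 4, 5, 6}, "R")
print(len(toluene), "vertices ->", len(reduced), "vertices")
print(reduced)
```

In this toy case the seven-vertex graph shrinks to two vertices, which hints at why working on reduced graphs can make the combinatorial search smaller when ring systems are used as building blocks.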



Acknowledgments

The authors are grateful to G. Schneider and D. Reker of the Department of Chemistry and Applied Biosciences, Institute of Pharmaceutical Sciences, ETH Zurich. G. Schneider gave valuable advice on improving our structure generation algorithms, particularly on descriptor calculation and on generating structures that are feasible from a chemistry point of view. D. Reker discussed with the authors how to develop diversity-oriented generation algorithms. The authors also acknowledge the support of the Core Research for Evolutionary Science and Technology (CREST) Project ‘Development of a knowledge-generating platform driven by big data in drug discovery through production processes’ of the Japan Science and Technology Agency (JST). T.M. is a JSPS Research Fellow.

Corresponding author

Correspondence to Kimito Funatsu.


Cite this article

Miyao, T., Kaneko, H. & Funatsu, K. Ring system-based chemical graph generation for de novo molecular design. J Comput Aided Mol Des 30, 425–446 (2016). https://doi.org/10.1007/s10822-016-9916-1


  • DOI: https://doi.org/10.1007/s10822-016-9916-1
