Comparison of bioactive chemical space networks generated using substructure- and fingerprint-based measures of molecular similarity

Journal of Computer-Aided Molecular Design

Abstract

Chemical space networks (CSNs) have recently been introduced as a conceptual alternative to coordinate-based representations of chemical space. CSNs were initially designed as threshold networks using the Tanimoto coefficient as a continuous similarity measure. The analysis of CSNs generated from sets of bioactive compounds revealed that many statistical properties were strongly dependent on their edge density. While it was difficult to compare CSNs at pre-defined similarity threshold values, CSNs with constant edge density were directly comparable. In the current study, alternative CSN representations were constructed by applying the matched molecular pair (MMP) formalism as a substructure-based similarity criterion. For more than 150 compound activity classes, MMP-based CSNs (MMP-CSNs) were compared to corresponding threshold CSNs (THR-CSNs) at a constant edge density by applying different parameters from network science, measures of community structure distributions, and indicators of structure–activity relationship (SAR) information content. MMP-CSNs were found to be an attractive alternative to THR-CSNs, yielding low edge densities and well-resolved topologies. MMP-CSNs and corresponding THR-CSNs often had similar topology and closely corresponding community structures, although there was only limited overlap in similarity relationships. The homophily principle from network science was shown to affect MMP-CSNs and THR-CSNs in different ways, despite the presence of conserved topological features. Moreover, activity cliff distributions in alternative CSN designs markedly differed, which has important implications for SAR analysis.
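The threshold-network idea described above can be illustrated with a small sketch. It is not the authors' implementation: fingerprints are modeled as plain sets of feature indices, and the names `tanimoto`, `threshold_csn`, and `target_density`, as well as the toy data, are hypothetical. The sketch picks the Tanimoto threshold that yields a requested edge density, which is the property that makes networks from different compound sets comparable.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def threshold_csn(fps: dict, target_density: float):
    """Return (edges, threshold) for a threshold network near a target edge density.

    Edge density = number of edges / number of possible compound pairs.
    """
    # All pairwise similarities, most similar pair first.
    pairs = sorted(
        ((tanimoto(fps[i], fps[j]), i, j) for i, j in combinations(sorted(fps), 2)),
        reverse=True,
    )
    n_edges = round(target_density * len(pairs))
    if n_edges == 0:
        return set(), 1.0
    # The similarity of the last pair admitted becomes the threshold.
    threshold = pairs[n_edges - 1][0]
    # Keep every pair at or above the threshold; ties at the threshold
    # can push the realized density slightly above the target.
    edges = {(i, j) for sim, i, j in pairs if sim >= threshold}
    return edges, threshold


# Toy fingerprints (hypothetical feature indices, not real fingerprint output).
fps = {
    "c1": frozenset({1, 2, 3, 4}),
    "c2": frozenset({1, 2, 3, 5}),
    "c3": frozenset({1, 2, 6, 7}),
    "c4": frozenset({8, 9, 10}),
}
edges, threshold = threshold_csn(fps, target_density=0.5)
density = len(edges) / (len(fps) * (len(fps) - 1) / 2)
```

On this toy set, three of the six possible pairs are connected and `c4` remains a singleton. Fixing the density rather than the threshold means the threshold value itself varies from one compound set to the next.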

Acknowledgments

The authors thank Ye Hu for help with data set collection and Dilyana Dimova for MMP routines. BZ is supported by the China Scholarship Council.

Author information

Corresponding author

Correspondence to Jürgen Bajorath.

About this article

Cite this article

Zhang, B., Vogt, M., Maggiora, G.M. et al. Comparison of bioactive chemical space networks generated using substructure- and fingerprint-based measures of molecular similarity. J Comput Aided Mol Des 29, 595–608 (2015). https://doi.org/10.1007/s10822-015-9852-5

  • DOI: https://doi.org/10.1007/s10822-015-9852-5
