Abstract
Current approaches for the assessment of molecular similarity can generally be divided into descriptor-based and substructure-based methods. The former require the application of similarity metrics that yield continuous similarity values, whereas the readout of the latter is binary (i.e. similar vs. not similar). However, it is also possible to combine descriptor-based and substructure-based methods to exploit advantages of individual methods in context and generate similarity measures for special applications. Herein we present a hybrid measure for asymmetric similarity calculations on the basis of maximum common core structures. This similarity function can be effectively applied to compare small reference compounds with larger test molecules, which is difficult using conventional metrics.
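The asymmetry described above can be illustrated with a minimal sketch. It assumes the hybrid measure takes the standard Tversky form, with the size of the maximum common substructure (MCS) as the shared term. The function name `mcs_tversky`, the choice of size unit (atom or bond counts), and the weights `alpha` and `beta` are illustrative assumptions and are not specified in the abstract.

```python
def mcs_tversky(size_ref: int, size_test: int, size_mcs: int,
                alpha: float = 1.0, beta: float = 0.0) -> float:
    """Tversky-style similarity with the MCS size as the shared term.

    Hypothetical helper: sizes may be atom or bond counts (an assumption),
    and alpha/beta weight the parts unique to the reference and the test
    molecule, respectively.
    """
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    if not 0 <= size_mcs <= min(size_ref, size_test):
        raise ValueError("MCS size must lie between 0 and the smaller molecule size")

    unique_ref = size_ref - size_mcs    # part of the reference outside the MCS
    unique_test = size_test - size_mcs  # part of the test molecule outside the MCS
    denominator = size_mcs + alpha * unique_ref + beta * unique_test
    return size_mcs / denominator if denominator > 0 else 0.0


# Small reference (size 12) fully contained in a larger test molecule (size 30).
contained = mcs_tversky(12, 30, 12, alpha=1.0, beta=0.0)   # 1.0
symmetric = mcs_tversky(12, 30, 12, alpha=1.0, beta=1.0)   # 0.4, Tanimoto-like weighting
swapped = mcs_tversky(30, 12, 12, alpha=1.0, beta=0.0)     # 0.4, roles reversed
```

With `alpha = 1` and `beta = 0`, the extra size of the test molecule is not penalized, so a small reference that is fully embedded in a larger molecule scores 1.0, whereas the symmetric weighting yields only 0.4 for the same pair. Swapping reference and test changes the result, which is the asymmetry the abstract refers to.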
Cite this article
Kunimoto, R., Vogt, M. & Bajorath, J. Maximum common substructure-based Tversky index: an asymmetric hybrid similarity measure. J Comput Aided Mol Des 30, 523–531 (2016). https://doi.org/10.1007/s10822-016-9935-y