Abstract
The applicability domain (AD) of models developed for regulatory use has attracted great attention recently. The AD of a quantitative structure–activity relationship (QSAR) model is the response and chemical structure space in which the model makes predictions with a given reliability. The evaluation of the AD of regression QSAR models for congeneric sets of chemicals can be found in many papers and books, while metrics for evaluating the AD of non-linear models (such as neural networks) for diverse sets of chemicals represent a new field of investigation in QSAR studies. The scientific community faces the challenge of finding a reliable way to evaluate the AD of non-linear models. This article discusses new metrics for evaluating the AD of counter propagation artificial neural network (CP ANN) models: the Euclidean distance between an object (molecule) and the corresponding excited neuron of the neural network, and the Euclidean distance between an object (molecule) and the representative object (the vector of average descriptor values). The coverage of the descriptor space by the training and test set chemicals was investigated with respect to incorrectly predicted chemicals. The leverage approach was used to compare the non-linear (CP ANN) models with linear ones.
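The three quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flattened `(n_neurons, n_descriptors)` layout of the neuron weights, and the use of the standard leverage formula h = xᵀ(XᵀX)⁻¹x on descriptors that are assumed to be already scaled and centered are all assumptions made for illustration.

```python
import numpy as np


def distance_to_excited_neuron(x, weights):
    """Return (index, distance) of the neuron whose weight vector is closest to x.

    Assumption: `weights` is the Kohonen-layer weight grid flattened to
    shape (n_neurons, n_descriptors).
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    distances = np.linalg.norm(weights - x, axis=1)  # Euclidean distance to each neuron
    winner = int(np.argmin(distances))               # the "excited" (winning) neuron
    return winner, float(distances[winner])


def distance_to_representative(x, X_train):
    """Euclidean distance from x to the vector of average training descriptor values."""
    x = np.asarray(x, dtype=float)
    centroid = np.asarray(X_train, dtype=float).mean(axis=0)
    return float(np.linalg.norm(x - centroid))


def leverage(x, X_train):
    """Leverage h = x^T (X^T X)^-1 x of a molecule x relative to the training matrix X.

    Assumption: descriptors are already scaled/centered and no intercept
    column is added; the pseudo-inverse guards against a singular X^T X.
    """
    x = np.asarray(x, dtype=float)
    X = np.asarray(X_train, dtype=float)
    return float(x @ np.linalg.pinv(X.T @ X) @ x)
```

In such a sketch, a molecule with a large distance to its excited neuron or to the representative object, or with a high leverage, would be flagged as lying far from the training chemicals; any cut-off value used for that decision has to come from the model at hand and is not specified here.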
Acknowledgments
The authors thank the European Commission for financial support under the CAESAR project (SSPI-022674) and the Slovenian Ministry of Higher Education, Science and Technology (grant P1-017).
Cite this article
Fjodorova, N., Novič, M., Roncaglioni, A. et al. Evaluating the applicability domain in the case of classification predictive models for carcinogenicity based on the counter propagation artificial neural network. J Comput Aided Mol Des 25, 1147–1158 (2011). https://doi.org/10.1007/s10822-011-9499-9
DOI: https://doi.org/10.1007/s10822-011-9499-9