Machine Learning to Predict Toxicity of Compounds

Grenet, Ingrid; Yin, Yonghua; Comet, Jean-Paul; Gelenbe, Erol

doi:10.1007/978-3-030-01418-6_33

Conference paper
First Online: 27 September 2018

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11139)

Abstract

Toxicology studies raise several concerns, which makes early detection of the potential toxicity of chemical compounds important. Toxicity is currently evaluated either through in vitro assays that assess a compound's bioactivity or through costly and ethically questionable in vivo tests on animals. We therefore investigate the prediction of the bioactivity of chemical compounds from their physico-chemical structure, and propose that it be automated using machine learning (ML) techniques trained on data from the in vitro assessment of several hundred chemical compounds. We report the results of tests of this approach with several ML techniques, on both a restricted dataset and a larger one. Because the available empirical data is unbalanced, we also use data augmentation techniques to improve classification accuracy, and present the resulting improvements.
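To make the idea of data augmentation for an unbalanced dataset concrete, the following is a minimal sketch of one common approach: creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours (in the spirit of SMOTE). The abstract does not say which augmentation technique the authors used, so the choice of technique, the function names `smote_like_oversample` and `balance`, the parameter `k`, and the toy descriptor values are all assumptions made for illustration.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper, not the method used in the paper."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=3, seed=0):
    """Append synthetic minority samples until both classes have equal size
    (binary labels assumed)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_new = max(0, (len(y) - len(minority)) - len(minority))
    new_points = smote_like_oversample(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * n_new


# Toy data (made up): six "inactive" compounds (0) and two "active" ones (1),
# each described by two hypothetical numeric descriptors.
X_demo = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
          [0.1, 0.1], [0.2, 0.2], [1.0, 1.0], [1.2, 0.9]]
y_demo = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X_demo, y_demo)
```

In a sketch like this, augmentation would be applied to the training split only, so that synthetic points never leak into the data used to measure classification accuracy.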

Author information

Authors and Affiliations

University Côte d’Azur, I3S Laboratory, UMR CNRS 7271, CS 40121, 06903, Sophia Antipolis Cedex, France
Ingrid Grenet, Jean-Paul Comet & Erol Gelenbe
Intelligent Systems and Networks Group, Department of Electrical and Electronic Engineering, Imperial College, London, UK
Yonghua Yin & Erol Gelenbe

Corresponding author

Correspondence to Ingrid Grenet.

Editor information

Editors and Affiliations

Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Open University of Cyprus, Latsia, Cyprus
Yannis Manolopoulos
CITEC Bielefeld University, Bielefeld, Germany
Barbara Hammer
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Piraeus, Piraeus, Greece
Ilias Maglogiannis

Cite this paper

Grenet, I., Yin, Y., Comet, JP., Gelenbe, E. (2018). Machine Learning to Predict Toxicity of Compounds. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. Lecture Notes in Computer Science, vol 11139. Springer, Cham. https://doi.org/10.1007/978-3-030-01418-6_33

DOI: https://doi.org/10.1007/978-3-030-01418-6_33
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01417-9
Online ISBN: 978-3-030-01418-6
eBook Packages: Computer Science, Computer Science (R0)
