Abstract
This paper investigates Recurrent Neural Networks (RNNs) in the context of virtual High-Throughput Screening (vHTS). In the proposed approach, RNNs, particularly Bidirectional Dynamic Cortex Memories (BDCMs), are trained to derive the chemical activity of molecules directly from human-readable strings (SMILES) that uniquely describe entire molecular structures. The hitherto obligatory step of computing task-specific fingerprint features is thereby omitted completely. Moreover, it is shown that RNNs are in principle capable of incorporating contextual information over entire sequences. Not only can they extract information from this raw string representation, they also produce predictions of comparable reliability, yielding AUC values similar to, and in some cases better than, those of previously proposed state-of-the-art methods. Their performance is confirmed on different publicly available data sets. The research reveals a great potential of RNN-based methods in vHTS applications and opens novel perspectives in computational drug design.
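To make the idea of feeding raw SMILES strings to a recurrent network more concrete, the following is a minimal sketch of one possible input encoding: each character of a SMILES string becomes a one-hot vector, and the resulting sequence is presented in both reading directions, as a bidirectional network would consume it. The helper names `build_vocabulary` and `one_hot_sequence`, the example molecules, and the character-level tokenization are assumptions made for illustration; the abstract does not specify the encoding actually used.

```python
# Illustrative sketch only: function names and the character-level
# tokenization are assumptions, not the encoding described in the paper.

def build_vocabulary(smiles_list):
    """Map every character occurring in the SMILES strings to an index."""
    symbols = sorted({ch for smiles in smiles_list for ch in smiles})
    return {ch: i for i, ch in enumerate(symbols)}


def one_hot_sequence(smiles, vocab):
    """Encode a SMILES string as a list of one-hot vectors, one per character."""
    sequence = []
    for ch in smiles:
        vec = [0.0] * len(vocab)
        vec[vocab[ch]] = 1.0
        sequence.append(vec)
    return sequence


# Three small example molecules: ethanol, benzene, acetic acid.
molecules = ["CCO", "c1ccccc1", "CC(=O)O"]
vocab = build_vocabulary(molecules)

# A bidirectional network reads the same sequence in both directions.
forward = one_hot_sequence("CCO", vocab)
backward = forward[::-1]
```

Note that plain character-level splitting breaks two-letter atom symbols such as `Cl` or `Br` into separate tokens; a tokenizer that is aware of SMILES syntax would treat them as single symbols.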
A. Dörr and S. Otte—Equal contribution
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Dörr, A., Otte, S., Zell, A. (2016). Investigating Recurrent Neural Networks for Feature-Less Computational Drug Design. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. ICANN 2016. Lecture Notes in Computer Science, vol 9886. Springer, Cham. https://doi.org/10.1007/978-3-319-44778-0_17
DOI: https://doi.org/10.1007/978-3-319-44778-0_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44777-3
Online ISBN: 978-3-319-44778-0
eBook Packages: Computer Science, Computer Science (R0)