Abstract
Docking simulation is often used for activity prediction, instead of ligand-based machine learning methods, when no known drug information is available for a target protein, because it calculates the binding energy by virtually docking a drug candidate compound into a binding pocket of the target protein and requires no other experimental information. However, the conformation search of a compound and the evaluation of binding energy in a docking simulation are computationally heavy tasks and thus require huge computational resources. Therefore, a machine learning-based method that predicts the activity of a drug candidate compound against a novel target protein is in high demand. Recently, Tsubaki et al. proposed an end-to-end learning method to predict the activity of compounds for novel target proteins. However, its prediction accuracy was insufficient because it used only amino acid sequence information to introduce protein information into the network. In this research, we propose an end-to-end learning-based compound activity prediction method that uses binding pocket information of the target protein, which is more directly relevant to the activity. The proposed method predicts the activity by end-to-end learning with graph neural networks for both the compound structure and the protein binding pocket structure. In experiments on the MUV dataset, the proposed method showed higher accuracy than the existing method that uses only amino acid sequence information. In addition, the proposed method achieved accuracy equivalent to docking simulation with AutoDock Vina in much shorter computing time.
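The two-branch idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract only states that graph neural networks are applied to both the compound structure and the binding pocket structure. The residual message-passing update, the mean readout, the concatenation followed by a sigmoid output, the toy graphs, and all names (`gnn_embed`, `predict_activity`, `init_params`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gnn_embed(node_feats, adj, layers):
    """Run simple message passing and return one vector for the whole graph."""
    h = [row[:] for row in node_feats]
    for w in layers:
        # Sum each node's neighbour features, then apply the layer weights.
        msg = matmul(matmul(adj, h), w)
        # Residual update: add the ReLU of the message to the node state.
        h = [[hi + max(mi, 0.0) for hi, mi in zip(h_row, m_row)]
             for h_row, m_row in zip(h, msg)]
    n = len(h)
    # Mean readout over nodes, so the result does not depend on node order.
    return [sum(col) / n for col in zip(*h)]

def predict_activity(compound, pocket, params):
    """Embed both graphs, concatenate, and map to a probability-like score."""
    c_vec = gnn_embed(compound["feats"], compound["adj"], params["compound_layers"])
    p_vec = gnn_embed(pocket["feats"], pocket["adj"], params["pocket_layers"])
    joint = c_vec + p_vec  # list concatenation of the two graph vectors
    logit = sum(w * x for w, x in zip(params["out_w"], joint)) + params["out_b"]
    return 1.0 / (1.0 + math.exp(-logit))

def init_params(dim, n_layers, seed=0):
    """Create small random (untrained) weights; purely illustrative."""
    rng = random.Random(seed)

    def mat():
        return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]

    return {
        "compound_layers": [mat() for _ in range(n_layers)],
        "pocket_layers": [mat() for _ in range(n_layers)],
        "out_w": [rng.gauss(0.0, 0.1) for _ in range(2 * dim)],
        "out_b": 0.0,
    }

# Toy inputs (hypothetical): a 3-node compound chain and a 4-node pocket ring,
# each node carrying 2 made-up features.
compound = {"feats": [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
            "adj": [[0, 1, 0], [1, 0, 1], [0, 1, 0]]}
pocket = {"feats": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]],
          "adj": [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]}

params = init_params(dim=2, n_layers=2)
score = predict_activity(compound, pocket, params)
print(f"score with untrained weights: {score:.3f}")
```

In an end-to-end setting, both branches and the output layer would be trained jointly from activity labels, so no docking pose or binding energy has to be computed at prediction time.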
References
Mullard, A.: New drugs cost US$2.6 billion to develop. Nat. Rev. Drug Discov. 13(12), 877 (2014)
Trott, O., Olson, A.J.: AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31(2), 455–461 (2010)
Friesner, R.A., et al.: Glide: a new approach for rapid, accurate docking and scoring. J. Med. Chem. 47(7), 1739–1749 (2004)
Zsoldos, Z., Reid, D., Simon, A., Sadjad, S.B., Johnson, A.P.: eHiTS: a new fast, exhaustive flexible ligand docking system. J. Mol. Graph. Modell. 26(1), 198–212 (2007)
Nakazawa, T.: New paradigm for machine translation: how the neural machine translation works. J. Inf. Process. Manage. 60(5), 299–306 (2017)
Tsubaki, M., Tomii, K., Sese, J.: Compound–protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics 35(2), 309–318 (2019)
Landrum, G.: RDKit: open-source cheminformatics
Sobolev, V., Sorokine, A., Prilusky, J., Abola, E.E., Edelman, M.: Automated analysis of interatomic contacts in proteins. Bioinformatics 15(4), 327–332 (1999)
Ito, J.-I., Tabei, Y., Shimizu, K., et al.: PDB-scale analysis of known and putative ligand-binding sites with structural sketches. Proteins Struct. Funct. Bioinform. 80(3), 747–763 (2012)
Costa, F., De Grave, K.: Fast neighborhood subgraph pairwise distance kernel (2010)
Mysinger, M.M., Carchia, M., Irwin, J.J., Shoichet, B.K.: Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J. Med. Chem. 55(14), 6582–6594 (2012)
Rohrer, S.G., Baumann, K.: Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J. Chem. Inf. Model. 49(2), 169–184 (2009)
Liu, H., Sun, J., Guan, J., Zheng, J., Zhou, S.: Improving compound–protein interaction prediction by building up highly credible negative samples. Bioinformatics 31(12), i221–i229 (2015)
Wang, Y., et al.: PubChem’s BioAssay database. Nucleic Acids Res. 40(Database issue), D400–D412 (2012)
Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J., Koes, D.R.: Protein-ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57(4), 942–957 (2017)
Acknowledgement
This work was supported by JSPS KAKENHI Grant Number 18K11524. The numerical calculations were carried out on the TSUBAME3.0 supercomputer at Tokyo Institute of Technology. Part of this work was conducted as research activities of the AIST - Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tanebe, T., Ishida, T. (2019). End-to-End Learning Based Compound Activity Prediction Using Binding Pocket Information. In: Huang, DS., Jo, KH., Huang, ZK. (eds) Intelligent Computing Theories and Application. ICIC 2019. Lecture Notes in Computer Science, vol 11644. Springer, Cham. https://doi.org/10.1007/978-3-030-26969-2_21
DOI: https://doi.org/10.1007/978-3-030-26969-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26968-5
Online ISBN: 978-3-030-26969-2
eBook Packages: Computer Science, Computer Science (R0)