Abstract
The identification of compound-protein interactions (CPIs) is an essential step in drug discovery; however, existing compound representations that rely on a single granularity, whether sequence-based or graph-based, struggle to predict CPIs accurately. In this paper, we propose MGCPI (Multi-granularity CPI), an end-to-end deep learning framework for predicting compound-protein interactions. MGCPI integrates molecular features from both the graph and the sequence representation of the input, and mines protein structure information using a transformer and pre-training methods. Our experiments demonstrated that the multi-granularity molecular representation method is able to fuse protein information from multiple perspectives, which enhances the predictive capability of the model and achieves competitive or higher performance than various existing CPI prediction methods. Additionally, an ablation analysis verified that the multi-granularity model is more robust than models based on a single representation.
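To make the idea of combining two granularities concrete, the following is a toy sketch only; it is not the MGCPI architecture, whose layers and features are not specified in this abstract. It assumes a hypothetical three-symbol vocabulary, a hand-written molecular graph for ethanol (SMILES "CCO"), character-level SMILES tokens, a single neighbor-sum message-passing step, and plain concatenation as the fusion step. All function names are illustrative.

```python
# Toy illustration only: NOT the MGCPI model. Every name here is hypothetical.
from collections import Counter

VOCAB = ["C", "O", "N"]  # assumed tiny vocabulary shared by both views


def sequence_view(smiles):
    """Sequence granularity: normalized counts of single-character SMILES tokens.

    Character-level tokens are a simplification; multi-character atoms
    such as "Cl" would need a real SMILES tokenizer.
    """
    counts = Counter(ch for ch in smiles if ch in VOCAB)
    total = sum(counts.values()) or 1
    return [counts[t] / total for t in VOCAB]


def graph_view(atoms, bonds):
    """Graph granularity: one neighbor-sum message-passing step, then mean pooling."""
    one_hot = [[1.0 if a == t else 0.0 for t in VOCAB] for a in atoms]
    updated = [row[:] for row in one_hot]
    for i, j in bonds:  # bonds are undirected, so messages flow both ways
        for k in range(len(VOCAB)):
            updated[i][k] += one_hot[j][k]
            updated[j][k] += one_hot[i][k]
    n = len(atoms)
    return [sum(row[k] for row in updated) / n for k in range(len(VOCAB))]


def fuse(seq_vec, graph_vec):
    """Fusion by concatenation (an assumption; other fusion schemes are possible)."""
    return seq_vec + graph_vec


# Ethanol, with its graph written out by hand instead of parsed from SMILES.
smiles = "CCO"
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

fused = fuse(sequence_view(smiles), graph_view(atoms, bonds))
print(fused)
```

In this sketch the first three entries of the fused vector describe the compound as a token sequence and the last three describe it as a graph after one round of neighbor aggregation, so a downstream predictor would see both views of the same molecule.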
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lin, P., Jiang, L., Ahmed, F.S., Ruan, X., Liu, X., Liu, J. (2023). MGCPI: A Multi-granularity Neural Network for Predicting Compound-Protein Interactions. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14088. Springer, Singapore. https://doi.org/10.1007/978-981-99-4749-2_12
DOI: https://doi.org/10.1007/978-981-99-4749-2_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4748-5
Online ISBN: 978-981-99-4749-2
eBook Packages: Computer Science, Computer Science (R0)