Abstract
The importance of drug design cannot be overemphasized. Recently, artificial intelligence (AI) based drug design has begun to gain momentum owing to great advances in experimental data, computational power and learning models. However, a major issue that remains for all AI-based learning models is efficient molecular representation. Here we propose, for the first time, neighborhood complex (NC) based molecular featurization (or feature engineering). In particular, we reveal, also for the first time, deep connections between the NC and the Dowker complex (DC) for bipartite graphs based on molecular interactions. Further, NC-based persistent spectral models are developed, and the associated persistent attributes are used as molecular descriptors or fingerprints. To test our models, we consider protein-ligand binding affinity prediction. Our NC-based machine learning (NCML) models, in particular the NC-based gradient boosting tree (NC-GBT), are tested on the three most commonly used datasets, namely PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016, and extensively compared with existing state-of-the-art models. Our NCML models are found to achieve state-of-the-art results.
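To make the NC construction on a bipartite interaction graph more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that protein and ligand atoms are given as 3D coordinates and that an edge joins a protein atom to a ligand atom when their distance is at most a cutoff; the function names, the cutoff value and the toy coordinates are all hypothetical. In a neighborhood complex, a set of vertices spans a simplex when the vertices share a common neighbor, so for a bipartite graph the complex splits into a protein-side part and a ligand-side part, which is one way to see the relation to the two Dowker complexes of the interaction relation.

```python
from itertools import combinations
from math import dist


def bipartite_neighbors(protein, ligand, cutoff):
    """Build the protein-ligand bipartite graph as two neighbor maps.

    Hypothetical rule: atoms i (protein) and j (ligand) are joined when
    their Euclidean distance is at most `cutoff`.
    """
    nbrs_of_protein = {i: set() for i in range(len(protein))}
    nbrs_of_ligand = {j: set() for j in range(len(ligand))}
    for i, p in enumerate(protein):
        for j, q in enumerate(ligand):
            if dist(p, q) <= cutoff:
                nbrs_of_protein[i].add(j)
                nbrs_of_ligand[j].add(i)
    return nbrs_of_protein, nbrs_of_ligand


def neighborhood_simplices(neighbor_sets, max_dim=2):
    """Simplices (sorted tuples) up to dimension `max_dim`.

    Each neighbor set N(w) is a set of vertices sharing the common
    neighbor w, so every subset of N(w) is a simplex.
    """
    simplices = set()
    for nbrs in neighbor_sets.values():
        verts = sorted(nbrs)
        for k in range(1, max_dim + 2):
            simplices.update(combinations(verts, k))
    return simplices


# Toy coordinates (illustrative only, not taken from any dataset).
protein = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
ligand = [(0.5, 0.5, 0.0), (5.0, 1.0, 0.0)]

nbrs_of_protein, nbrs_of_ligand = bipartite_neighbors(protein, ligand, cutoff=1.5)

# Protein-side simplices come from the neighbor sets of ligand atoms,
# and ligand-side simplices from the neighbor sets of protein atoms.
protein_complex = neighborhood_simplices(nbrs_of_ligand)
ligand_complex = neighborhood_simplices(nbrs_of_protein)
```

Sweeping the cutoff over a range of values would give a nested sequence of such complexes, which is the kind of filtration that persistent models are built on; the spectral features themselves are not sketched here.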
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, X., Xia, K. (2021). Neighborhood Complex Based Machine Learning (NCML) Models for Drug Design. In: Reyes, M., et al. Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and Its Applications for Medical Data. IMIMIC 2021, TDA4MedicalData 2021. Lecture Notes in Computer Science, vol 12929. Springer, Cham. https://doi.org/10.1007/978-3-030-87444-5_9
DOI: https://doi.org/10.1007/978-3-030-87444-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87443-8
Online ISBN: 978-3-030-87444-5
eBook Packages: Computer Science, Computer Science (R0)