Abstract

The importance of drug design cannot be overemphasized. Recently, artificial intelligence (AI) based drug design has begun to gain momentum thanks to great advances in experimental data, computational power and learning models. However, a major issue that remains for all AI-based learning models is efficient molecular representation. Here we propose, for the first time, Neighborhood complex (NC) based molecular featurization (or feature engineering). In particular, we reveal deep connections between the NC and the Dowker complex (DC) for bipartite graphs based on molecular interactions. Further, NC-based persistent spectral models are developed, and the associated persistent attributes are used as molecular descriptors or fingerprints. To test our models, we consider protein-ligand binding affinity prediction. Our NC-based machine learning (NCML) models, in particular the NC-based gradient boosting tree (NC-GBT), are tested on the three most commonly used datasets, i.e., PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016, and extensively compared with existing state-of-the-art models. We find that our NCML models achieve state-of-the-art results.
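As a rough illustration of the NC construction on a protein-ligand bipartite graph, the sketch below uses the standard definition of a neighborhood complex: a set of vertices forms a simplex when its members share a common neighbor. It is not the authors' implementation. The function names, the toy coordinates, the distance cutoff and the `max_dim` truncation are all assumptions made for this example.

```python
from itertools import combinations
from math import dist


def bipartite_neighbors(protein_xyz, ligand_xyz, cutoff):
    """For each protein atom, collect the ligand atom indices within `cutoff`.

    An edge of the bipartite graph joins a protein atom and a ligand atom
    whose Euclidean distance is at most `cutoff` (hypothetical parameter).
    """
    return [
        {j for j, q in enumerate(ligand_xyz) if dist(p, q) <= cutoff}
        for p in protein_xyz
    ]


def neighborhood_complex(neighbor_sets, max_dim=2):
    """Ligand-side neighborhood complex, truncated at dimension `max_dim`.

    A vertex set is a simplex if its members have a common neighbor, so
    every nonempty subset of a protein atom's neighborhood is a simplex.
    """
    simplices = set()
    for nbrs in neighbor_sets:
        for size in range(1, max_dim + 2):
            for simplex in combinations(sorted(nbrs), size):
                simplices.add(simplex)
    return sorted(simplices, key=lambda s: (len(s), s))


# Toy coordinates (assumed, not taken from any dataset).
protein = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]

nbrs = bipartite_neighbors(protein, ligand, cutoff=2.5)
nc = neighborhood_complex(nbrs, max_dim=1)
print(nc)  # [(0,), (1,), (2,), (0, 1), (1, 2)]
```

Sweeping `cutoff` over increasing values yields a nested family of such complexes, which is the kind of filtration that persistent models take as input. The persistent spectral attributes themselves are not computed here.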

Author information

Corresponding author

Correspondence to Kelin Xia.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, X., Xia, K. (2021). Neighborhood Complex Based Machine Learning (NCML) Models for Drug Design. In: Reyes, M., et al. Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and Its Applications for Medical Data. IMIMIC 2021, TDA4MedicalData 2021. Lecture Notes in Computer Science, vol 12929. Springer, Cham. https://doi.org/10.1007/978-3-030-87444-5_9

  • DOI: https://doi.org/10.1007/978-3-030-87444-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87443-8

  • Online ISBN: 978-3-030-87444-5

  • eBook Packages: Computer Science, Computer Science (R0)
