Novel Methodology for Improving the Generalization Capability of Chemo-Informatics Deep Learning Models

  • Conference paper
ICT Innovations 2022. Reshaping the Future Towards a New Normal (ICT Innovations 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1740)

Abstract

Over the last decade, the research community has applied deep learning to advanced tasks in chemistry, ranging from computational chemistry to materials and drug design, and even to chemical synthesis problems at both laboratory and industrial scale. Because of its advantages as a high-performance prediction tool in molecular simulations, deep learning is becoming far more than a temporary trend; it is expected to become an essential tool for tackling a range of problems in the chemical sciences in the near future. In this paper, we propose a novel methodology for the regularization of deep neural networks used in chemo-informatics. The methodology consists of four blocks: Class of initial conditions, Orthogonalization, Activation, and Standardization. Three graph-based architectures are developed: a deep tensor neural network, a directed acyclic graph model, and a convolutional graph model. Graph-based models are convenient for modeling molecules because molecules and their features are often naturally represented as graphs. Experiments are conducted on datasets from the MoleculeNet aggregator (QM7, QM8, QM9, ToxCast, Tox21, ClinTox, BBBP, and SIDER) for predicting geometric, energetic, electronic, and thermodynamic properties of small molecules. The obtained results outperform some of the published references and give directions for further improvement. As a particular example, in one of the architectures, we have reduced the mean absolute error by a factor of more than 12 compared to conventional regression models, and by a factor of more than 3 compared to deep networks in which the proposed methodology is not implemented.
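To make the graph view of a molecule concrete, the following minimal Python sketch encodes a small molecule as an undirected graph, performs one unweighted neighbourhood-aggregation step of the kind that graph-based models build on, and computes the mean absolute error used as the comparison metric. It is an illustration under stated assumptions, not the architecture or regularization methodology proposed in the paper: the ethanol example, the function names (`adjacency`, `one_hot`, `aggregate`, `mean_absolute_error`) and the numeric values are all hypothetical and are not taken from the reported experiments.

```python
# Illustrative sketch only: not the authors' models, data, or results.

# Ethanol heavy atoms (C-C-O); hydrogens are omitted for brevity.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]  # pairs of atom indices joined by a bond


def adjacency(n_atoms, bonds):
    """Symmetric adjacency matrix of an undirected molecular graph."""
    adj = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i][j] = adj[j][i] = 1
    return adj


def one_hot(atoms, vocabulary):
    """One-hot atom-type features, one row per atom."""
    return [[1 if symbol == v else 0 for v in vocabulary] for symbol in atoms]


def aggregate(adj, feats):
    """One unweighted aggregation step: each atom sums its own
    features with the features of its bonded neighbours."""
    out = []
    for i in range(len(adj)):
        row = list(feats[i])  # the atom's own contribution
        for j in range(len(adj)):
            if adj[i][j]:
                row = [a + b for a, b in zip(row, feats[j])]
        out.append(row)
    return out


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


adj = adjacency(len(atoms), bonds)
feats = one_hot(atoms, ["C", "O"])
hidden = aggregate(adj, feats)

# Made-up targets and predictions, used only to exercise the metric.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

In this sketch each row of `hidden` counts the carbon and oxygen atoms in an atom's immediate neighbourhood, itself included. A trained graph model would replace the plain sum with learned, weighted transformations, and the paper's regularization blocks would act on those learned weights and activations.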

Author information

Corresponding author

Correspondence to Ljubinka Sandjakoska.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sandjakoska, L., Bogdanova, A.M., Pejov, L. (2022). Novel Methodology for Improving the Generalization Capability of Chemo-Informatics Deep Learning Models. In: Zdravkova, K., Basnarkov, L. (eds) ICT Innovations 2022. Reshaping the Future Towards a New Normal. ICT Innovations 2022. Communications in Computer and Information Science, vol 1740. Springer, Cham. https://doi.org/10.1007/978-3-031-22792-9_13

  • DOI: https://doi.org/10.1007/978-3-031-22792-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-22791-2

  • Online ISBN: 978-3-031-22792-9

  • eBook Packages: Computer Science, Computer Science (R0)
