
TransG-net: transformer and graph neural network based multi-modal data fusion network for molecular properties prediction


Abstract

Molecular properties prediction is an important task in materials science, especially in computational drug and materials discovery. Deep learning (DL) is one of the most popular approaches to molecular properties prediction because of its ability to establish quantitative relationships between molecular representations and target properties. To improve the performance of DL algorithms, it is crucial to select an appropriate molecular representation. The molecular graph has become a common choice because it can be fed directly into graph neural network (GNN)-based DL models. However, model performance is limited when the graph is the only representation used, since it encodes only atomic information, bond information, and the adjacency relationships between atoms. We therefore use the molecular mass spectrum as a second representation to supply information that the graph does not contain. In this paper, a transformer-based model named Mass Spectrum Transformer (MST) is proposed to perform quantitative analysis of molecular spectra; it is then combined with a graph neural network to form TransG-Net, a multi-modal data fusion model for accurate molecular properties prediction. Several feature fusion methods are evaluated, and the best one is chosen to further enhance the performance of the model. We also collect a multi-modal dataset composed of molecular graph data and spectra, and perform data augmentation that simulates experimentally measured molecular spectra to improve the generalizability of the model. Experimental results show that MST outperforms the previous best mass spectrum-based methods for molecular properties prediction. In addition, TransG-Net, which combines MST and the GNN, achieves better performance than state-of-the-art well-designed message passing models, demonstrating the effectiveness of our multi-modal data fusion method.
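To make the two-branch architecture concrete, the sketch below wires a small transformer spectrum encoder and a GCN graph encoder together with concatenation fusion in PyTorch / torch-geometric (the stack named under Code availability). All module names, layer sizes, and pooling choices here are our own illustrative assumptions, not the authors' MST or TransG-Net code, and concatenation is only one of the fusion strategies the paper compares.

```python
# Minimal sketch of the fusion idea described in the abstract. Everything
# here (names, hyperparameters, concatenation fusion) is an illustrative
# assumption, not the authors' implementation (see the linked repository).
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool


class SpectrumEncoder(nn.Module):
    """Transformer over a binned mass spectrum (a stand-in for MST)."""

    def __init__(self, n_bins=1000, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)   # per-bin intensity -> embedding
        self.pos = nn.Parameter(torch.zeros(n_bins, d_model))  # learned m/z positions
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, spectrum):             # spectrum: (B, n_bins)
        x = self.embed(spectrum.unsqueeze(-1)) + self.pos
        return self.encoder(x).mean(dim=1)   # (B, d_model)


class GraphEncoder(nn.Module):
    """Two-layer GCN over atom features (a stand-in for the GNN branch)."""

    def __init__(self, in_dim=32, d_model=128):
        super().__init__()
        self.conv1 = GCNConv(in_dim, d_model)
        self.conv2 = GCNConv(d_model, d_model)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index)
        return global_mean_pool(h, batch)    # (B, d_model)


class TransGSketch(nn.Module):
    """Concatenation fusion of both branches plus a prediction head."""

    def __init__(self, d_model=128, out_dim=1):
        super().__init__()
        self.spec = SpectrumEncoder(d_model=d_model)
        self.graph = GraphEncoder(d_model=d_model)
        self.head = nn.Sequential(
            nn.Linear(2 * d_model, d_model), nn.ReLU(), nn.Linear(d_model, out_dim)
        )

    def forward(self, spectrum, x, edge_index, batch):
        z = torch.cat([self.spec(spectrum), self.graph(x, edge_index, batch)], dim=-1)
        return self.head(z)                  # (B, out_dim)
```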


Data availability

The dataset used in this paper comes from PubChem [10] and HMDB [9]. The IDs of all molecules we used are listed in the repository https://github.com/chensaian/TransG-Net.
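For readers who want to assemble the dataset themselves, here is a hedged sketch of one way to pull a molecule record from PubChem by compound ID (CID) via its public PUG REST API; CID 2244 (aspirin) is purely an illustrative example, and the actual ID list lives in the repository above.

```python
# Hedged example: fetch the canonical SMILES for a single PubChem CID
# through the public PUG REST API.
import requests

def fetch_smiles(cid: int) -> str:
    url = ("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/"
           f"{cid}/property/CanonicalSMILES/JSON")
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.json()["PropertyTable"]["Properties"][0]["CanonicalSMILES"]

print(fetch_smiles(2244))  # CID 2244 is aspirin, used here only as a demo
```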

References

  1. Wu Z, Ramsundar B, Feinberg EN, Gomes J, Geniesse C, Pappu AS, Leswing K, Pande V (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9(2):513–530

  2. Chen H, Engkvist O, Wang Y, Olivecrona M, Blaschke T (2018) The rise of deep learning in drug discovery. Drug Discov Today 23(6):1241–1250

  3. Goh GB, Hodas NO, Vishnu A (2017) Deep learning for computational chemistry. J Comput Chem 38(16):1291–1307

  4. Cai J, Chu X, Xu K, Li H, Wei J (2020) Machine learning-driven new material discovery. Nanoscale Adv 2(8):3115–3130

  5. Wei J, Chu X, Sun XY, Xu K, Deng HX, Chen J, Wei Z, Lei M (2019) Machine learning in materials science. InfoMat 1(3):338–358

  6. Shen J, Nicolaou CA (2020) Molecular property prediction: recent trends in the era of artificial intelligence. Drug Discov Today Technol 32-33:29–36

  7. Zhang J, Mucs D, Norinder U, Svensson F (2019) LightGBM: an effective and scalable algorithm for prediction of chemical toxicity – application to the Tox21 and mutagenicity data sets. J Chem Inf Model 59(10):4150–4158

  8. Sheridan RP, Wang W, Liaw A et al (2016) Extreme gradient boosting as a method for quantitative structure-activity relationships. J Chem Inf Model 56(12):2353–2360

  9. Wishart D, Guo A, Oler E et al (2022) HMDB 5.0: the human metabolome database for 2022. Nucleic Acids Res 50(D1):D622–D631

  10. Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B, Zaslavsky L, Zhang J, Bolton EE (2021) PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 49(D1):D1388–D1395

  11. Wu Z, Pan S, Chen F et al (2021) A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst 32(1):4–24

  12. Wieder O, Kohlbacher S, Kuenemann M, Garon A, Ducrot P, Seidel T, Langer T (2020) A compact review of molecular property prediction with graph neural networks. Drug Discov Today Technol 37:1–12

  13. Xiong Z, Wang D, Liu X, Zhong F, Wan X, Li X, Li Z, Luo X, Chen K, Jiang H, Zheng M (2019) Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. J Med Chem 63(16):8749–8760

  14. Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59(8):3370–3388

  15. Wang Y, Magar R, Liang C, Barati Farimani A (2022) Improving molecular contrastive learning via faulty negative mitigation and decomposed fragment contrast. J Chem Inf Model 62(11):2713–2725

  16. Atz K, Grisoni F, Schneider G (2021) Geometric deep learning on molecular representations. Nat Mach Intell 3:1023–1032

  17. Chen J, Zheng S, Song Y et al (2021) Learning attributed graph representation with communicative message passing transformer. In: IJCAI pp. 2831–2838

  18. Chen D, Gao K, Nguyen DD, Chen X, Jiang Y, Wei GW, Pan F (2021) Algebraic graph-assisted bidirectional transformers for molecular property prediction. Nat Commun 12:3521

  19. Li S, Zhou J, Xu T, Dou D, Xiong H (2022) GeomGCL: geometric graph contrastive learning for molecular property prediction. AAAI 36(4):4541–4549

  20. Zhang D, Xia S, Zhang Y (2022) Accurate prediction of aqueous free solvation energies using 3d atomic feature-based graph neural network with transfer learning. J Chem Inf Model 62(8):1840–1848

  21. Fang X, Liu L, Lei J, He D, Zhang S, Zhou J, Wang F, Wu H, Wang H (2022) Geometry-enhanced molecular representation learning for property prediction. Nat Mach Intell 4:127–134

  22. Li Y, Hsieh CY, Lu R, Gong X, Wang X, Li P, Liu S, Tian Y, Jiang D, Yan J, Bai Q, Liu H, Zhang S, Yao X (2022) An adaptive graph learning method for automated molecular interactions and properties predictions. Nat Mach Intell 4:645–651

  23. Ji H, Deng H, Lu H, Zhang Z (2020) Predicting a molecular fingerprint from an electron ionization mass spectrum with deep neural networks. Anal Chem 92(13):8649–8653

  24. Park WB, Chung J, Jung J, Sohn K, Singh SP, Pyo M, Shin N, Sohn KS (2017) Classification of crystal structure using a convolutional neural network. IUCrJ 4(4):486–494

  25. Wang H, Xie Y, Li D, Deng H, Zhao Y, Xin M, Lin J (2020) Rapid identification of X-ray diffraction patterns based on very limited data by interpretable convolutional neural networks. J Chem Inf Model 60(4):2004–2011

  26. Lee JW, Park WB, Lee JH, Singh SP, Sohn KS (2020) A deep-learning technique for phase identification in multiphase inorganic compounds using synthetic XRD powder patterns. Nat Commun 11:86

  27. Szymanski NJ, Bartel CJ, Zeng Y, Tu Q, Ceder G (2021) Probabilistic deep learning approach to automate the interpretation of multi-phase diffraction spectra. Chem Mater 33(11):4204–4215

  28. Pattanaik L, Coley CW (2020) Molecular representation: going long on fingerprints. Chem 6(6):1204–1207

  29. Huang K, Fu T, Glass LM et al (2020) DeepPurpose: a deep learning library for drug-target interaction prediction. Bioinformatics 36(22–23):5545–5547

  30. Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: ICLR

  31. Gilmer J, Schoenholz SS, Riley PF et al (2017) Neural message passing for quantum chemistry. In: ICML 70:1263–1272

  32. Lu C, Liu Q, Wang C, Huang Z, Lin P, He L (2019) Molecular property prediction: a multilevel quantum interactions modeling perspective. AAAI 33(1):1052–1060

  33. Song Y, Zheng S, Niu Z et al (2020) Communicative representation learning on attributed molecular graphs. In: IJCAI pp. 2831–2838

  34. Wei JN, Belanger D, Adams RP, Sculley D (2019) Rapid prediction of electron-ionization mass spectrometry using neural networks. ACS Cent Sci 5(4):700–708

  35. Huber F, van der Burg S, van der Hooft JJJ, Ridder L (2021) MS2DeepScore: a novel deep learning similarity measure to compare tandem mass spectra. J Cheminform 13:84

  36. Fine JA, Rajasekar AA, Jethava KP, Chopra G (2020) Spectral deep learning for prediction and prospective validation of functional groups. Chem Sci 11(18):4618–4630

  37. Shrivastava AD, Swainston N, Samanta S, Roberts I, Wright Muelas M, Kell DB (2021) MassGenie: a transformer-based deep learning method for identifying small molecules from their mass spectra. Biomolecules 11(12):1793

  38. Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. In: NeurIPS pp. 6000-6010

  39. Devlin J, Chang MW, Lee K et al (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL pp. 4171-4186

  40. Dosovitskiy A, Beyer L, Kolesnikov A et al (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: ICLR

  41. Han K, Wang Y, Chen H et al (2022) A survey on vision transformer. IEEE Trans Pattern Anal Mach Intell

  42. Carion N, Massa F, Synnaeve G et al (2020) End-to-end object detection with transformers. In: ECCV pp. 213-229

  43. Zhang G, Luo Z, Cui K et al (2022) Meta-DETR: image-level few-shot detection with inter-class correlation exploitation. IEEE Trans Pattern Anal Mach Intell

  44. Dhamija T, Gupta A, Gupta S, Anjum, Katarya R, Singh G (2022) Semantic segmentation in medical images through transfused convolution and transformer networks. Appl Intell:1–17

  45. Lee K, Chang H, Jiang L et al (2022) ViTGAN: training GANs with vision transformers. In: ICLR

  46. Chen Y, Guo B, Shen Y et al (2022) Video summarization with u-shaped transformer. Appl Intell

  47. Veličković P, Cucurull G, Casanova A et al (2018) Graph attention networks. In: ICLR

  48. Baltrušaitis T, Ahuja C, Morency LP (2019) Multimodal machine learning: a survey and taxonomy. IEEE Trans Pattern Anal Mach Intell 41(2):423–443

  49. Rahate A, Walambe R, Ramanna S, Kotecha K (2022) Multimodal co-learning: challenges, applications with datasets, recent advances and future directions. Inform Fusion 81:203–239

  50. Kim JH, On KW, Kim J et al (2017) Hadamard product for low-rank bilinear pooling. In: ICLR

  51. Li M, Dyett B, Zhang X (2019) Automated femtoliter droplet-based determination of oil–water partition coefficient. Anal Chem 91(16):10371–10375

  52. Schütt KT, Sauceda HE, Kindermans PJ, Tkatchenko A, Müller KR (2018) SchNet – a deep learning architecture for molecules and materials. J Chem Phys 148(24):241722

Acknowledgments

This work was sponsored by the National Study Abroad Fund of China and supported by the National Key Research and Development Program of China (2017YFB1002304). It was also supported by the Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region (2022GXZDSY001).

Code availability

The code for the model is available in the repository https://github.com/chensaian/TransG-Net. The model is implemented with torch-geometric 2.0.2 and PyTorch 1.10.
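As a hint at the kind of input the GNN branch consumes, the following minimal sketch (our own illustration, not the repository's code) converts a SMILES string into a torch-geometric Data object using RDKit; the single atomic-number node feature is a deliberate simplification of the richer atom and bond features a real model would use.

```python
# Illustrative only: build a torch-geometric molecular graph from SMILES.
# Node features are just atomic numbers here; the paper's model uses richer
# atom and bond features.
import torch
from rdkit import Chem
from torch_geometric.data import Data

def smiles_to_graph(smiles: str) -> Data:
    mol = Chem.MolFromSmiles(smiles)
    # One feature per atom: its atomic number.
    x = torch.tensor([[a.GetAtomicNum()] for a in mol.GetAtoms()],
                     dtype=torch.float)
    # Store each chemical bond as two directed edges.
    src, dst = [], []
    for b in mol.GetBonds():
        i, j = b.GetBeginAtomIdx(), b.GetEndAtomIdx()
        src += [i, j]
        dst += [j, i]
    edge_index = torch.tensor([src, dst], dtype=torch.long)
    return Data(x=x, edge_index=edge_index)

g = smiles_to_graph("CC(=O)Oc1ccccc1C(=O)O")  # aspirin, as a demo molecule
print(g)  # Data(x=[13, 1], edge_index=[2, 26])
```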

Author information

Corresponding authors

Correspondence to Aziguli Wulamu or Han Zheng.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, T., Chen, S., Wulamu, A. et al. TransG-net: transformer and graph neural network based multi-modal data fusion network for molecular properties prediction. Appl Intell 53, 16077–16088 (2023). https://doi.org/10.1007/s10489-022-04351-0


  • DOI: https://doi.org/10.1007/s10489-022-04351-0
