Multimodal heterogeneous graph attention network

Jia, Xiangen; Jiang, Min; Dong, Yihong; Zhu, Feng; Lin, Haocai; Xin, Yu; Chen, Huahui

doi:10.1007/s00521-022-07862-6

Original Article
Published: 10 October 2022

Volume 35, pages 3357–3372, (2023)

Neural Computing and Applications

Xiangen Jia¹,
Min Jiang²,
Yihong Dong ORCID: orcid.org/0000-0002-6048-2377¹,
Feng Zhu¹,
Haocai Lin¹,
Yu Xin¹ &
Huahui Chen¹

2085 Accesses
9 Citations
1 Altmetric

Abstract

Many real-world graphs and networks are inherently heterogeneous: various types of relations connect multiple types of vertices. With the development of information networks, node features can be described by data of different modalities, resulting in multimodal heterogeneous graphs. However, most existing methods can only handle unimodal heterogeneous graphs. Moreover, most existing heterogeneous graph mining methods are based on meta-paths, which depend on domain experts for modeling. In this paper, we propose a novel multimodal heterogeneous graph attention network (MHGAT) to address these problems. Specifically, we use edge-level aggregation to capture graph heterogeneity and adaptively learn more informative representations. Further, we use a modality-level attention mechanism to obtain multimodal fusion information. Because plain graph convolutional networks cannot capture higher-order neighborhood information, we use residual connections and dense connections to obtain it. Extensive experimental results show that MHGAT outperforms state-of-the-art baselines on three datasets for node classification, clustering, and visualization tasks.
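
The modality-level attention mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function (a dot product between a query vector and the tanh of each embedding), the function names `softmax` and `fuse_modalities`, and the toy text and image embeddings are all assumptions made for illustration. In a trained model the query vector would be learned; here it is fixed by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(embeddings, query):
    """Fuse one node's per-modality embeddings with attention weights.

    embeddings: list of equal-length vectors, one per modality (hypothetical data).
    query: attention vector of the same length (assumed; learned in practice).
    Returns the fused vector and the attention weight of each modality.
    """
    # Score each modality as dot(query, tanh(embedding)).
    scores = [
        sum(q * math.tanh(x) for q, x in zip(query, emb))
        for emb in embeddings
    ]
    # Normalize the scores across modalities so the weights sum to 1.
    weights = softmax(scores)
    # The fused representation is the attention-weighted sum of the embeddings.
    dim = len(query)
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return fused, weights


# Toy example: one node described by a text embedding and an image embedding.
text_emb = [0.5, -0.2, 0.1]
image_emb = [0.1, 0.4, -0.3]
fused, weights = fuse_modalities([text_emb, image_emb], query=[1.0, 0.0, 0.0])
```

With this hand-picked query, the modality whose first component is larger receives the larger weight, which shows how attention lets the fusion favor one modality per node rather than averaging them uniformly.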

Data availability

The datasets that support the findings of this study are available at https://github.com/jiaxiangen/MHGAT.

Acknowledgments

This work was supported in part by Zhejiang NSF Grant No. LY20F020009, China NSF Grants No. 61572266 and No. 61602133, and Ningbo NSF Grant No. 202003N4086, as well as by programs sponsored by the K.C. Wong Magna Fund at Ningbo University. (Corresponding author: Yihong Dong.)

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, 315040, China
Xiangen Jia, Yihong Dong, Feng Zhu, Haocai Lin, Yu Xin & Huahui Chen
Ningbo Smart Urban Management Center, Ningbo, 315041, China
Min Jiang

Corresponding author

Correspondence to Yihong Dong.

Ethics declarations

Conflict of interest

The authors have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jia, X., Jiang, M., Dong, Y. et al. Multimodal heterogeneous graph attention network. Neural Comput & Applic 35, 3357–3372 (2023). https://doi.org/10.1007/s00521-022-07862-6

Received: 28 February 2022
Accepted: 21 September 2022
Published: 10 October 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s00521-022-07862-6
