Embedding text-rich graph neural networks with sequence and topical semantic structures

Yu, Zhizhi; Jin, Di; Liu, Ziyang; He, Dongxiao; Wang, Xiao; Tong, Hanghang; Han, Jiawei

doi:10.1007/s10115-022-01768-4

Regular Paper
Published: 17 October 2022

Volume 65, pages 613–640, (2023)

Knowledge and Information Systems

Zhizhi Yu¹ (ORCID: orcid.org/0000-0001-5954-3593), Di Jin¹, Ziyang Liu², Dongxiao He¹, Xiao Wang³, Hanghang Tong⁴ & Jiawei Han⁴

Abstract

Graph neural networks (GNNs) have demonstrated great power in tackling various analytical tasks on graph (i.e. network) data. However, graphs in the real world are usually text-rich, implying that valuable semantic structures need to be considered carefully. Existing GNNs for text-rich networks typically treat the text as attribute words alone, which inevitably loses important semantic structures and limits the representation capability of GNNs. To address this limitation, we propose AS-GNN, an end-to-end adaptive GNN architecture that unifies the modelling of semantic structure and network propagation on text-rich networks. Specifically, we use a semantic structure modelling part to capture both the local word-sequence and the global topic semantic structures from text. We then augment the original text-rich network into a tri-typed heterogeneous network (including document nodes, word nodes, and topic nodes) and accordingly design a semantic-aware propagation of information by introducing a discriminative convolutional mechanism. We further train these two parts together by leveraging distribution sharing and joint training strategies, so as to adaptively generate a network structure suited to the learning objectives. In addition, we present a simplified semantic architecture, S-GNN, which adopts the cascaded “Structure-GNN” pattern to improve the efficiency of the model and can be easily combined with existing GNNs. Extensive experiments on text-rich networks demonstrate the superiority of our new architectures over state-of-the-art methods. Such architectures can also be applied to e-commerce search scenarios, and experiments on a real e-commerce problem from JD further illustrate the effectiveness of AS-GNN over the baselines.
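
The abstract does not give the construction details, so the following is only a minimal sketch of what a tri-typed heterogeneous network and a type-aware propagation step could look like. It is not the authors' AS-GNN implementation. Everything here is an assumption made for illustration: the function names, the use of word counts for document-word edges and topic proportions for document-topic edges, and the reading of "discriminative convolution" as one weight matrix per neighbour type.

```python
import numpy as np


def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows stay zero."""
    m = np.asarray(m, dtype=float)
    sums = m.sum(axis=1, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m), where=sums != 0)


def build_tri_typed_adjacency(doc_doc, doc_word, doc_topic):
    """Symmetric adjacency over nodes ordered as [documents | words | topics].

    Assumed inputs (hypothetical, not from the paper):
      doc_doc   (n_d, n_d): original network links between documents
      doc_word  (n_d, n_w): e.g. word counts per document
      doc_topic (n_d, n_t): e.g. topic proportions per document
    """
    n_d, n_w = doc_word.shape
    n_t = doc_topic.shape[1]
    adj = np.zeros((n_d + n_w + n_t, n_d + n_w + n_t))
    adj[:n_d, :n_d] = doc_doc
    adj[:n_d, n_d:n_d + n_w] = doc_word
    adj[n_d:n_d + n_w, :n_d] = doc_word.T
    adj[:n_d, n_d + n_w:] = doc_topic
    adj[n_d + n_w:, :n_d] = doc_topic.T
    return adj


def type_aware_document_update(doc_doc, doc_word, doc_topic,
                               h_doc, h_word, h_topic, weights):
    """One propagation step for document nodes.

    Each neighbour type is aggregated separately and transformed with its
    own weight matrix (an illustrative stand-in for a type-discriminating
    convolution), then the three messages are summed and passed through ReLU.
    """
    self_loops = np.eye(doc_doc.shape[0])
    msg_doc = row_normalize(doc_doc + self_loops) @ h_doc @ weights["doc"]
    msg_word = row_normalize(doc_word) @ h_word @ weights["word"]
    msg_topic = row_normalize(doc_topic) @ h_topic @ weights["topic"]
    return np.maximum(msg_doc + msg_word + msg_topic, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 3 documents, 4 words, 2 topics, feature size 5, hidden size 3.
    doc_doc = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
    doc_word = np.array([[2, 0, 1, 0], [0, 1, 1, 0], [0, 0, 0, 3]])
    doc_topic = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
    h_doc, h_word, h_topic = (rng.normal(size=(n, 5)) for n in (3, 4, 2))
    weights = {k: rng.normal(size=(5, 3)) for k in ("doc", "word", "topic")}

    adj = build_tri_typed_adjacency(doc_doc, doc_word, doc_topic)
    out = type_aware_document_update(doc_doc, doc_word, doc_topic,
                                     h_doc, h_word, h_topic, weights)
    print(adj.shape, out.shape)
```

In this sketch the semantic edges are fixed inputs. In the architecture the abstract describes, the semantic structure is learned jointly with the propagation, which this example does not attempt to reproduce.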

Acknowledgements

This work was supported by the Natural Science Foundation of China under Grants 62272340, 62276187, 61876128, and 62172052.

Author information

This work was done while Ziyang Liu was working at JD.com.

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
Zhizhi Yu, Di Jin & Dongxiao He
School of Software, Tsinghua University, Beijing, 100084, China
Ziyang Liu
School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing, 100876, China
Xiao Wang
Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA
Hanghang Tong & Jiawei Han

Corresponding author

Correspondence to Dongxiao He.

About this article

Cite this article

Yu, Z., Jin, D., Liu, Z. et al. Embedding text-rich graph neural networks with sequence and topical semantic structures. Knowl Inf Syst 65, 613–640 (2023). https://doi.org/10.1007/s10115-022-01768-4

Received: 11 January 2022
Revised: 12 September 2022
Accepted: 19 September 2022
Published: 17 October 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10115-022-01768-4
