
NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning


  • Research Article
  • Published:
Frontiers of Information Technology & Electronic Engineering

Abstract

Recently, graph neural networks (GNNs) have achieved remarkable performance in representation learning on graph-structured data. However, as the number of network layers increases, GNNs based on the neighborhood aggregation strategy deteriorate because of oversmoothing, which is the major bottleneck for applying GNNs to real-world graphs. Many efforts have been made to improve the aggregation of feature information from directly connected nodes, i.e., breadth exploration. However, these models perform best only with three or fewer layers, and their performance drops rapidly as depth increases. To alleviate oversmoothing, we propose a nested graph attention network (NGAT), which can work in a semi-supervised manner. In addition to breadth exploration, a k-layer NGAT uses a layer-wise aggregation strategy guided by the attention mechanism to selectively leverage feature information from the kth-order neighborhood, i.e., depth exploration. Even with a 10-layer or deeper architecture, NGAT balances preserving locality (including root node features and the local structure) with aggregating information from a large neighborhood. In a number of experiments on standard node classification tasks, NGAT outperforms other recent models and achieves state-of-the-art performance.

Abstract (Chinese)

In recent years, graph neural networks (GNNs) have achieved remarkable results in representation learning on graph-structured data. However, as the number of network layers increases, the performance of GNNs based on the neighborhood aggregation strategy deteriorates because of oversmoothing, which is also the main bottleneck for applying GNNs to real-world graphs. Many improvements have been made to the aggregation of feature information from directly connected nodes, i.e., breadth exploration. However, these models perform best only with three or fewer layers, and their performance drops rapidly in deeper settings. To alleviate oversmoothing, this paper proposes a nested graph attention network, NGAT, a multi-scale feature fusion model based on a dual attention mechanism that can work in a semi-supervised manner. In addition to breadth exploration, a k-layer NGAT uses a layer-wise aggregation strategy guided by the attention mechanism to selectively leverage feature information from the kth-order neighborhood, i.e., depth exploration. Even with a 10-layer or deeper architecture, NGAT balances preserving locality (including root node features and local structure) with aggregating information from a large neighborhood. Comparisons with existing graph neural network models on public datasets show that the proposed NGAT has a stronger capability for learning node embeddings.
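The layer-wise, attention-guided aggregation described in the abstract can be illustrated with a short sketch. The PyTorch code below is a minimal illustration of the depth-exploration idea only, written from the abstract rather than from the authors' implementation: the class names, layer sizes, scoring function, and the plain GCN-style propagation used in place of NGAT's breadth-exploration attention are all assumptions made for this example.

```python
# Minimal sketch (not the authors' code) of depth exploration: per-hop node
# representations are fused by a learned, node-wise attention over layers,
# so each node can decide how deep a neighborhood to rely on.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGraphLayer(nn.Module):
    """One round of normalized neighborhood aggregation: H' = ReLU(A_hat @ H @ W).
    (Stands in for NGAT's breadth-exploration attention, which is not shown here.)"""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        return F.relu(a_hat @ self.lin(h))


class DepthAttentionGNN(nn.Module):
    """Stacks k propagation layers and attends over the per-hop outputs."""
    def __init__(self, in_dim, hid_dim, num_classes, num_layers=10):
        super().__init__()
        dims = [in_dim] + [hid_dim] * num_layers
        self.layers = nn.ModuleList(
            SimpleGraphLayer(dims[i], dims[i + 1]) for i in range(num_layers)
        )
        self.score = nn.Linear(hid_dim, 1)        # scores each hop's representation
        self.classifier = nn.Linear(hid_dim, num_classes)

    def forward(self, a_hat, x):
        h = x
        per_hop = []                               # representations after 1..k hops
        for layer in self.layers:
            h = layer(a_hat, h)
            per_hop.append(h)
        stack = torch.stack(per_hop, dim=1)            # [N, k, hid_dim]
        alpha = torch.softmax(self.score(stack), dim=1)  # node-wise weights over hops
        fused = (alpha * stack).sum(dim=1)               # depth-exploration readout
        return self.classifier(fused)


if __name__ == "__main__":
    n, d = 6, 8
    adj = (torch.rand(n, n) > 0.7).float()
    adj = ((adj + adj.t()) > 0).float()
    adj.fill_diagonal_(1.0)                              # add self-loops
    deg_inv_sqrt = adj.sum(1).pow(-0.5)
    a_hat = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    model = DepthAttentionGNN(in_dim=d, hid_dim=16, num_classes=3, num_layers=10)
    logits = model(a_hat, torch.randn(n, d))
    print(logits.shape)                                  # torch.Size([6, 3])
```

The softmax over the stacked per-hop representations lets each node weight shallow against deep neighborhoods individually, which is how the abstract describes retaining locality while still reaching the kth-order neighborhood; according to the abstract, NGAT additionally applies an attention mechanism within each hop (breadth exploration), which this sketch omits.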



Author information


Contributions

Jianke HU and Yin ZHANG designed the research. Jianke HU processed the data and drafted the paper. Yin ZHANG helped organize the paper. Jianke HU and Yin ZHANG revised and finalized the paper.

Corresponding author

Correspondence to Yin Zhang (张引).

Ethics declarations

Jianke HU and Yin ZHANG declare that they have no conflict of interest.

Additional information

Project supported by China Knowledge Centre for Engineering Sciences and Technology (CKCEST)


About this article


Cite this article

Hu, J., Zhang, Y. NGAT: attention in breadth and depth exploration for semi-supervised graph representation learning. Front Inform Technol Electron Eng 23, 409–421 (2022). https://doi.org/10.1631/FITEE.2000657


  • DOI: https://doi.org/10.1631/FITEE.2000657
