Abstract
Graph attention networks stack self-attention layers to compute neighbor-specific weights. Due to inherent noise and artificially correlated dimensions, the attention scores fail to produce optimal linear combinations for aggregating features from the neighborhood. Multiple attention heads mitigate the problem to an extent, but at the cost of additional memory overhead and larger variance in results. In this work, we introduce a novel approach that computes attention scores from a low-rank approximation of the target neighborhood. This subspace representation of the neighborhood features suppresses the adverse effects of noise and artificially correlated dimensions. Extensive experiments on graph datasets show that the proposed framework outperforms state-of-the-art methods. The reduced variance of our metrics under the Kruskal–Wallis test also indicates that the proposed model gives more stable results than other state-of-the-art methods.
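As a concrete illustration of the idea described above, the following is a minimal PyTorch sketch of a single attention head that scores neighbors in a low-rank subspace of their transformed features. The layer name LowRankGATLayer, the rank hyperparameter, and the use of truncated PCA via torch.pca_lowrank are illustrative assumptions; the abstract does not specify the authors' exact factorization.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankGATLayer(nn.Module):
    # Illustrative single-head layer: attention scores are computed in a
    # low-rank subspace of the transformed neighborhood features, while
    # aggregation still mixes the full-rank features. The choice of
    # truncated PCA as the low-rank approximation is an assumption,
    # not the paper's exact formulation.
    def __init__(self, in_dim, out_dim, rank=8):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared feature transform
        self.a = nn.Linear(2 * rank, 1, bias=False)      # scorer on subspace pairs
        self.rank = rank

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) adjacency with self-loops.
        h = self.W(x)                                    # (N, out_dim)
        # Truncated PCA of the transformed features: h ~ U diag(S) V^T.
        _, _, V = torch.pca_lowrank(h, q=self.rank)
        z = h @ V                                        # (N, rank) subspace coordinates
        n = z.size(0)
        zi = z.unsqueeze(1).expand(n, n, self.rank)      # center-node coords, broadcast
        zj = z.unsqueeze(0).expand(n, n, self.rank)      # neighbor coords, broadcast
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1)).squeeze(-1), 0.2)
        e = e.masked_fill(adj == 0, float("-inf"))       # score only real neighbors
        alpha = torch.softmax(e, dim=-1)                 # neighbor-specific weights
        return alpha @ h                                 # aggregate full-rank features

# Hypothetical usage on random data (dimensions chosen arbitrarily):
# layer = LowRankGATLayer(in_dim=1433, out_dim=64, rank=8)
# out = layer(torch.randn(100, 1433), torch.eye(100))   # self-loops-only adjacency

The design intuition is that projecting onto the leading principal directions keeps the high-variance structure of the neighborhood and discards low-variance (noisy) and redundant dimensions before the attention logits are formed, while the aggregation step still operates on the full-rank features.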
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Yadav, R.K., Abhishek, Verma, A. et al. Low-rank GAT: toward robust quantification of neighborhood influence. Neural Comput & Applic 35, 3925–3936 (2023). https://doi.org/10.1007/s00521-022-07914-x