Abstract
Graph Neural Networks (GNNs) have shown excellent performance in graph-related tasks and have attracted widespread attention. However, most existing work on GNNs focuses on proposing a novel GNN model or modifying existing models to improve performance on various graph-related tasks, and seldom considers possible problems in the graph data used as model input. This paper studies how low-degree nodes, which account for most nodes in real-world graphs, naturally suffer from a message-passing insufficiency problem: they receive too little information from other nodes when generating node embeddings, which harms the performance of GNNs. To solve this problem, we propose a simple but practical method, Optimize Graph Then Training (OGT), which adds edges between low-degree nodes and nodes with the same predicted label, based on the GNN's prediction results and the inherent information in the graph. OGT aims to improve the performance of GNNs on semi-supervised node classification tasks by augmenting the input data. More importantly, OGT acts as a data preprocessing technique and can be used naturally with baseline GNN models (e.g., GCN, GAT, GraphSAGE, and SGC) to improve their performance without any other modifications. Extensive experiments on three benchmark citation datasets with five typical GNN models verify that OGT consistently improves the performance of various GNNs to a great extent, with average accuracy improvements of 1.9% (Cora), 1.0% (Citeseer), and 1.3% (Pubmed) on the node classification task.
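The core idea described above, adding edges from low-degree nodes to other nodes sharing the same predicted label, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the degree threshold, and the number of added edges per node are all illustrative assumptions, and the predicted labels are passed in directly rather than produced by a trained GNN as in the paper.

```python
import random

def ogt_augment(edges, labels, num_nodes, degree_threshold=2,
                edges_per_node=1, seed=0):
    """Sketch of the OGT preprocessing step: connect each low-degree node
    to a few other nodes that share its (predicted) label, so message
    passing gives it more neighborhood information."""
    rng = random.Random(seed)
    # Compute node degrees and adjacency from the undirected edge list.
    degree = [0] * num_nodes
    adjacency = {u: set() for u in range(num_nodes)}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Group nodes by their (predicted) label.
    by_label = {}
    for node, label in enumerate(labels):
        by_label.setdefault(label, []).append(node)
    new_edges = []
    for node in range(num_nodes):
        if degree[node] > degree_threshold:
            continue  # only augment low-degree nodes
        # Candidate endpoints: same label, not the node itself, not already linked.
        candidates = [c for c in by_label[labels[node]]
                      if c != node and c not in adjacency[node]]
        for target in rng.sample(candidates,
                                 min(edges_per_node, len(candidates))):
            new_edges.append((node, target))
            adjacency[node].add(target)
            adjacency[target].add(node)
    return edges + new_edges
```

The augmented edge list can then be fed unchanged to any baseline GNN (GCN, GAT, GraphSAGE, SGC), which is what makes the approach a model-agnostic preprocessing step.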










Acknowledgements
This paper was supported by the National Natural Science Foundation of China (Nos. 62162005, 61763003 and U21A20474) and the Innovation Project of Guangxi Graduate Education (XYCSZ2022020), Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (No. 19-A-02-01), Guangxi 1000-Plan of Training Middle-aged/Young Teachers in Higher Education Institutions, Guangxi “Bagui Scholar” Teams for Innovation and Research Project, Guangxi Talent Highland Project of Big Data Intelligence and Application, Guangxi Collaborative Innovation Center of Multisource Information Integration and Intelligent Processing.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Wei, Q., Wang, J., Hu, J. et al. OGT: optimize graph then training GNNs for node classification. Neural Comput & Applic 34, 22209–22222 (2022). https://doi.org/10.1007/s00521-022-07677-5