
OGT: optimize graph then training GNNs for node classification

  • Original Article
  • Neural Computing and Applications

Abstract

Graph Neural Networks (GNNs) have shown excellent performance in graph-related tasks and have attracted widespread attention. However, most existing work on GNNs focuses on proposing novel GNN models or modifying existing models to improve performance on various graph-related tasks, and seldom considers possible problems in the graph data used as model input. This paper shows that low-degree nodes, which account for most nodes in real-world graphs, naturally suffer from insufficient message passing: they receive too little information from other nodes when generating node embeddings, which degrades the performance of GNNs. To solve this problem, we propose a simple but practical method, Optimize Graph Then Training (OGT), which adds edges between low-degree nodes and nodes with the same predicted label, based on the GNN's prediction results and the inherent information in the graph. OGT aims to improve the performance of GNNs on semi-supervised node classification tasks by augmenting the input data. More importantly, OGT is a data preprocessing technique and can be used naturally with baseline GNN models (e.g., GCN, GAT, GraphSAGE, and SGC) to improve their performance without any other modification. Extensive experiments on three benchmark citation datasets with five typical GNN models verify that OGT consistently improves the performance of various GNNs, yielding average accuracy improvements of 1.9% (Cora), 1.0% (Citeseer), and 1.3% (Pubmed) on the node classification task.
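The edge-adding step described in the abstract can be sketched in a few lines. The following is a minimal, hypothetical Python sketch, assuming a base GNN has already produced a predicted label for every node; the degree threshold, the random candidate selection, and the number of edges added per node are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of OGT-style preprocessing: connect each low-degree
# node to a few other nodes that share its GNN-predicted label.
# `degree_threshold` and `edges_per_node` are illustrative assumptions.
import random
import networkx as nx

def ogt_augment(graph: nx.Graph,
                predicted_labels: dict,
                degree_threshold: int = 2,
                edges_per_node: int = 2,
                seed: int = 0) -> nx.Graph:
    """Return a copy of `graph` with extra edges added for low-degree nodes."""
    rng = random.Random(seed)
    augmented = graph.copy()

    # Group nodes by the label the base GNN predicted for them.
    nodes_by_label = {}
    for node, label in predicted_labels.items():
        nodes_by_label.setdefault(label, []).append(node)

    for node in graph.nodes:
        if graph.degree[node] > degree_threshold:
            continue  # only low-degree nodes receive additional neighbours
        same_label = [v for v in nodes_by_label.get(predicted_labels[node], [])
                      if v != node and not augmented.has_edge(node, v)]
        # Link the node to a few same-label candidates chosen at random.
        for v in rng.sample(same_label, min(edges_per_node, len(same_label))):
            augmented.add_edge(node, v)
    return augmented
```

In this sketch the augmented graph simply replaces the original as input when retraining the baseline GNN, with no architectural change, consistent with the abstract's framing of OGT as a data preprocessing technique.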



Acknowledgements

This paper was supported by the National Natural Science Foundation of China (Nos. 62162005, 61763003, and U21A20474), the Innovation Project of Guangxi Graduate Education (XYCSZ2022020), the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (No. 19-A-02-01), the Guangxi 1000-Plan of Training Middle-aged/Young Teachers in Higher Education Institutions, the Guangxi "Bagui Scholar" Teams for Innovation and Research Project, the Guangxi Talent Highland Project of Big Data Intelligence and Application, and the Guangxi Collaborative Innovation Center of Multisource Information Integration and Intelligent Processing.

Author information


Corresponding authors

Correspondence to Jinyan Wang or Tong Yi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wei, Q., Wang, J., Hu, J. et al. OGT: optimize graph then training GNNs for node classification. Neural Comput & Applic 34, 22209–22222 (2022). https://doi.org/10.1007/s00521-022-07677-5

