Abstract
Graph Neural Networks (GNNs) have grown in popularity because they handle relational datasets such as social networks and citation networks. However, typical relational datasets are sparse, and GNNs easily overfit them. Model ensemble methods are widely studied and adopted to alleviate overfitting, yet ensemble methods for GNNs remain underexplored. In this study, we propose simple but effective model ensemble methods for GNNs. This is the first study to adopt stochastic weights averaging (SWA) for GNNs. Furthermore, we propose a new model ensemble method, Dirichlet stochastic weights averaging (DSWA). DSWA computes running averages of the trained weights with random proportions sampled from Dirichlet distributions, yielding diverse models and their ensembles at inference time without increasing training time. We validate our models on the Cora, Citeseer, and Pubmed datasets under both standard and few-shot learning settings. We observe that the performance of current GNNs deteriorates when the amount of labeled data is limited. DSWA improves performance on few-shot node classification tasks as well as on general node classification tasks.
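The Dirichlet-weighted averaging step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `dswa_average` is hypothetical, and weight snapshots are represented as flat NumPy vectors rather than full GNN state dicts.

```python
import numpy as np

def dswa_average(snapshots, alpha=1.0, rng=None):
    """Combine trained weight snapshots using proportions drawn from a
    symmetric Dirichlet(alpha, ..., alpha) distribution.

    snapshots: list of 1-D weight vectors (one per saved checkpoint).
    alpha: Dirichlet concentration; alpha=1 gives uniform random proportions.
    """
    rng = np.random.default_rng(rng)
    k = len(snapshots)
    # Sample mixing proportions p; entries are nonnegative and sum to 1.
    p = rng.dirichlet(np.full(k, alpha))
    # Return the convex combination of the snapshots.
    return sum(pi * w for pi, w in zip(p, snapshots))

# Toy "checkpoints"; each new Dirichlet draw yields a different averaged
# model, so several draws can be ensembled at inference time.
snapshots = [np.ones(4) * i for i in range(1, 4)]
models = [dswa_average(snapshots, alpha=1.0, rng=seed) for seed in range(5)]
```

Because every Dirichlet draw produces a distinct convex combination of the same trained snapshots, diversity for the ensemble comes for free at inference time, with no additional training cost, which matches the claim in the abstract.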
Graphical abstract
![](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs10489-024-05708-3/MediaObjects/10489_2024_5708_Figd_HTML.png)
Data Availability
The datasets analyzed during the current study are available in the pytorch_geometric repository, https://github.com/pyg-team/pytorch_geometric
Change history
06 December 2024
A Correction to this paper has been published: https://doi.org/10.1007/s10489-024-06086-6
08 February 2025
A Correction to this paper has been published: https://doi.org/10.1007/s10489-024-06099-1
Acknowledgements
This work was supported by the Basic Study and Interdisciplinary R&D Foundation Fund of the University of Seoul (2021).
Funding
The authors declare that there is no funding issue to disclose.
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflicts of Interest
The authors declare that there is no conflict of interest or competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The incorrect graphical abstract was captured.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Park, M., Chang, R. & Song, K. Dirichlet stochastic weights averaging for graph neural networks. Appl Intell 54, 10516–10524 (2024). https://doi.org/10.1007/s10489-024-05708-3