A novel graph oversampling framework for node classification in class-imbalanced graphs

Xia, Riting; Zhang, Chunxu; Zhang, Yan; Liu, Xueyan; Yang, Bo

doi:10.1007/s11432-023-3897-2

Research Paper
Published: 15 April 2024

Volume 67, article number 162101 (2024)

Science China Information Sciences

Riting Xia^1,2,
Chunxu Zhang^1,3,
Yan Zhang^4,
Xueyan Liu^1,3 &
…
Bo Yang^1,3

Abstract

Graph neural networks (GNNs) are a promising approach to analyzing graphs. Most existing GNNs assume class-balanced data and therefore handle class-imbalanced graphs poorly. Oversampling is an effective way to alleviate class imbalance. However, most graph oversampling methods generate synthetic minority nodes and their edges after applying a GNN. They overlook a problem: because the GNN aggregates neighbor information before oversampling, the representations of both the original and the synthetic minority nodes are dominated by majority nodes. In this paper, we propose a novel graph oversampling framework for node classification in class-imbalanced graphs, termed distribution alignment-based oversampling (Graph-DAO). Our framework generates synthetic minority nodes before the GNN to avoid the dominance of majority nodes caused by message passing. Additionally, we introduce a distribution alignment method based on the sum-product network to learn more information about minority nodes. To the best of our knowledge, this is the first work to use the sum-product network to address the class-imbalance problem in node classification. Extensive experiments on four real-world datasets show that our method achieves the best results on the node classification task for class-imbalanced graphs.
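
The ordering argument in the abstract can be illustrated with a toy sketch. This is not the Graph-DAO method: the abstract does not specify how synthetic nodes are generated, so the sketch assumes SMOTE-style interpolation as the oversampler and a single round of mean aggregation as a stand-in for GNN message passing. The graph, the feature values, and the function names `smote_interpolate` and `mean_aggregate` are all hypothetical.

```python
import random


def smote_interpolate(features, minority_ids, num_new, rng):
    """SMOTE-style oversampling (an assumed stand-in, not the paper's generator):
    each synthetic node is a random convex combination of two minority nodes."""
    synthetic = []
    for _ in range(num_new):
        a, b = rng.sample(minority_ids, 2)
        lam = rng.random()
        synthetic.append(
            [(1 - lam) * xa + lam * xb for xa, xb in zip(features[a], features[b])]
        )
    return synthetic


def mean_aggregate(features, neighbors):
    """One round of mean aggregation over each node and its neighbors,
    used here as a simplified proxy for GNN message passing."""
    aggregated = []
    for i, nbrs in enumerate(neighbors):
        group = [i] + list(nbrs)
        dim = len(features[i])
        aggregated.append(
            [sum(features[j][d] for j in group) / len(group) for d in range(dim)]
        )
    return aggregated


# Hypothetical toy graph: nodes 0-3 are majority (features near 0),
# nodes 4-5 are minority (features near 1) with only majority neighbors.
features = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [0.1, 0.1], [1.0, 0.9], [0.9, 1.0]]
neighbors = [[1, 2, 4], [0, 3, 5], [0, 3, 4], [1, 2, 5], [0, 2], [1, 3]]
minority_ids = [4, 5]

rng = random.Random(0)

# Oversample on raw features, before any aggregation.
before = smote_interpolate(features, minority_ids, num_new=3, rng=rng)

# Oversample on aggregated features, after majority neighbors were mixed in.
aggregated = mean_aggregate(features, neighbors)
after = smote_interpolate(aggregated, minority_ids, num_new=3, rng=rng)

print("synthetic nodes, oversampling before aggregation:", before)
print("synthetic nodes, oversampling after aggregation: ", after)
```

In this toy setting, the nodes synthesized from raw features stay within the minority range, while those synthesized after aggregation are pulled toward the majority features, which is the dominance effect the abstract describes.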

Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2021ZD0112500), the National Natural Science Foundation of China (Grant Nos. U22A2098, 62172185, 62202200, 62206105), and the Fundamental Research Funds for the Central Universities, JLU.

Author information

Authors and Affiliations

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
Riting Xia, Chunxu Zhang, Xueyan Liu & Bo Yang
College of Artificial Intelligence, Jilin University, Changchun, 130012, China
Riting Xia
College of Computer Science and Technology, Jilin University, Changchun, 130012, China
Chunxu Zhang, Xueyan Liu & Bo Yang
College of Communication Engineering, Jilin University, Changchun, 130012, China
Yan Zhang

Corresponding authors

Correspondence to Xueyan Liu or Bo Yang.

About this article

Cite this article

Xia, R., Zhang, C., Zhang, Y. et al. A novel graph oversampling framework for node classification in class-imbalanced graphs. Sci. China Inf. Sci. 67, 162101 (2024). https://doi.org/10.1007/s11432-023-3897-2

Received: 23 May 2023
Revised: 07 October 2023
Accepted: 31 October 2023
Published: 15 April 2024
DOI: https://doi.org/10.1007/s11432-023-3897-2
