ABSTRACT
Node classification is an important task for graph neural networks, but most existing studies assume that samples from different classes are balanced. In practice, class imbalance is widespread and can seriously degrade model performance, so reducing its adverse effects on training is crucial. We therefore propose FD-Loss, a new loss function built on the traditional algorithm-level approach to the imbalance problem. First, we define a sample mismeasurement distance and use it to filter out edge-hard samples and simple samples according to the sample distribution. We then derive weight coefficients from the mismeasurement distance and use them as the weighting term of the loss function, so that training focuses only on valuable samples. Experiments on several benchmarks demonstrate that FD-Loss effectively alleviates node-class imbalance and improves classification accuracy by 4% over existing methods on the node classification task.
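The abstract's recipe (measure how far each prediction is from its label, suppress simple and extreme edge-hard samples, and weight the cross-entropy by the remaining distances) can be sketched in plain Python. This is a hypothetical illustration, not the paper's published formula: the exact mismeasurement distance, the thresholds `lo`/`hi`, and the floor weight `eps` are all assumptions made for the sketch.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def mismeasurement_distance(probs, label):
    # One plausible reading: how far the predicted probability of the
    # true class falls short of 1. (Assumed form, not the paper's.)
    return 1.0 - probs[label]

def fd_style_weight(d, lo=0.1, hi=0.9, eps=1e-3):
    # Down-weight "simple" samples (d < lo) and extreme "edge-hard"
    # samples (d > hi); keep full, distance-proportional weight for
    # the valuable middle band. Thresholds are illustrative.
    if d < lo or d > hi:
        return eps
    return d

def fd_style_loss(batch_logits, labels):
    """Weighted cross-entropy where each sample's weight comes from
    its mismeasurement distance."""
    total, norm = 0.0, 0.0
    for logits, y in zip(batch_logits, labels):
        p = softmax(logits)
        d = mismeasurement_distance(p, y)
        w = fd_style_weight(d)
        total += w * -math.log(max(p[y], 1e-12))
        norm += w
    return total / max(norm, 1e-12)
```

For example, a confidently correct sample (logits `[5.0, 0.0, 0.0]`, label `0`) gets distance well below `lo` and is nearly ignored, while a borderline sample (logits `[1.0, 0.0, 0.0]`) keeps its full weight, which is the "focus only on valuable samples" behavior the abstract describes.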
Index Terms
- Imbalanced Nodes Classification for Graph Neural Networks Based on Valuable Sample Mining