DOI: 10.1145/3589334.3645571
Research Article

Exploring Neural Scaling Law and Data Pruning Methods For Node Classification on Large-scale Graphs

Published: 13 May 2024

Abstract

Recently, how model performance scales with training sample size has been studied extensively for large models in the vision and language domains. However, the ubiquitous node classification tasks on web-scale graphs have been overlooked, even though their distinctive traits, such as non-IIDness and the transductive setting, are likely to yield different scaling laws and to motivate novel techniques for beating them. We therefore first explore the neural scaling law of node classification on three large-scale graphs. We then benchmark several state-of-the-art data pruning methods on these tasks, not only validating that the unsatisfactory power law can be improved upon but also distilling a hard-and-representative principle for picking an effective subset of training nodes. Moreover, we exploit the transductive setting to propose a novel data pruning method that instantiates this principle in a test-set-targeted manner. Our method consistently outperforms related methods on all three datasets. We also analyze our method within a PAC-Bayesian framework, extending prior results to account for both hardness and representativeness. Beyond offering a promising way to ease GNN training on web-scale graphs, our study sheds light on the relationship between training nodes and GNN generalization.
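Neural scaling laws of the kind the abstract studies are typically fit as a power law, test error ≈ a · n^(-b), over measured (training-set size, test error) pairs. A minimal illustrative sketch of estimating such an exponent by linear regression in log-log space (not the paper's code; the function name and data are hypothetical):

```python
import math

def fit_power_law(sizes, errors):
    """Fit error ~ a * n^(-b) via least squares in log-log space.

    log(error) = log(a) - b * log(n), so a straight-line fit of
    log(error) against log(n) yields intercept log(a) and slope -b.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (a, b)

# Synthetic check: data generated from error = 2.0 * n^(-0.5)
sizes = [100, 1_000, 10_000, 100_000]
errors = [2.0 * n ** -0.5 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

On exact power-law data the fit recovers a ≈ 2.0 and b ≈ 0.5; on real learning curves the exponent b quantifies how quickly error decays with more training nodes, which is exactly what data pruning tries to improve.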

Supplemental Material

- MP4 File: video presentation
- MP4 File: supplemental video



Published In

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. data pruning
2. graph neural networks
3. neural scaling laws

Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024, Singapore

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

Article Metrics

- Total citations: 0
- Total downloads: 253 (253 in the last 12 months; 16 in the last 6 weeks)

Reflects downloads up to 14 Feb 2025.
