DOI: 10.1145/3589334.3645571
Research Article

Exploring Neural Scaling Law and Data Pruning Methods For Node Classification on Large-scale Graphs

Published: 13 May 2024

Abstract

Recently, how model performance scales with training sample size has been studied extensively for large models in the vision and language domains. However, the ubiquitous node classification tasks on web-scale graphs have been overlooked, even though their distinctive traits, such as non-IIDness and the transductive setting, are likely to yield different scaling laws and to motivate novel techniques for beating them. We therefore first explore the neural scaling law of node classification on three large-scale graphs. We then benchmark several state-of-the-art data pruning methods on these tasks, not only validating that the unsatisfactory power law can be improved upon but also distilling a hard-and-representative principle for picking an effective subset of training nodes. Moreover, we exploit the transductive setting to propose a novel data pruning method that instantiates this principle in a test-set-targeted manner. Our method consistently outperforms related methods on all three datasets. We also analyze our method within a PAC-Bayesian framework, extending prior results to account for both hardness and representativeness. Beyond offering a promising way to ease GNN training on web-scale graphs, our study sheds light on the relationship between training nodes and GNN generalization.
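Neural scaling laws of the kind the abstract studies are typically fit as a power law, test error ≈ a · n^(-b), over measured (training-set size, test error) pairs. A minimal illustrative sketch of estimating such an exponent by linear regression in log-log space (not the paper's code; the function name and data are hypothetical):

```python
import math

def fit_power_law(sizes, errors):
    """Fit error ~ a * n^(-b) via least squares in log-log space.

    log(error) = log(a) - b * log(n), so a straight-line fit of
    log(error) against log(n) yields intercept log(a) and slope -b.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # (a, b)

# Synthetic check: data generated from error = 2.0 * n^(-0.5)
sizes = [100, 1_000, 10_000, 100_000]
errors = [2.0 * n ** -0.5 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

On exact power-law data the fit recovers a ≈ 2.0 and b ≈ 0.5; on real learning curves the exponent b quantifies how quickly error decays with more training nodes, which is exactly what data pruning tries to improve.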

Supplemental Material

- MP4 File: video presentation
- MP4 File: supplemental video



Published In

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. data pruning
2. graph neural networks
3. neural scaling laws

Conference

WWW '24: The ACM Web Conference 2024
May 13-17, 2024, Singapore

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

Article Metrics

- Total citations: 0
- Total downloads: 253 (253 in the last 12 months; 16 in the last 6 weeks)

Reflects downloads up to 14 Feb 2025.
