DOI: 10.1145/3492321.3519557

GNNLab: a factored system for sample-based GNN training over GPUs

Published: 28 March 2022

Abstract

We propose GNNLab, a sample-based GNN training system for a single-machine multi-GPU setup. GNNLab adopts a factored design in which each GPU is dedicated to either graph sampling or model training, accelerating both tasks by eliminating GPU memory contention. To balance GPU workloads, GNNLab applies a global queue to bridge GPUs asynchronously and adopts a simple yet effective method to adaptively allocate GPUs to the two tasks. GNNLab further lets GPUs temporarily switch tasks to avoid idle waiting. In addition, GNNLab proposes a new pre-sampling-based caching policy that takes both the sampling algorithm and the GNN dataset into account, delivering efficient and robust caching performance. Evaluations on three representative GNN models and four real-life graphs show that GNNLab outperforms the state-of-the-art GNN systems DGL and PyG by up to 9.1× (from 2.4×) and 74.3× (from 10.2×), respectively. Moreover, our pre-sampling-based caching policy achieves 90% -- 99% of the optimal cache hit rate in all experiments.
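The factored design with a global queue can be illustrated with a minimal producer/consumer sketch (all names and the threading stand-ins are illustrative assumptions, not GNNLab's actual API): dedicated sampler workers push mini-batches into a shared bounded queue, and trainer workers pop them asynchronously, so neither role blocks on the other's GPU.

```python
import queue
import threading

def run_factored_pipeline(num_samplers=2, num_trainers=2, batches_per_sampler=4):
    """Sketch of a factored pipeline: sampler threads (stand-ins for sampling
    GPUs) feed a global queue consumed by trainer threads (stand-ins for
    training GPUs)."""
    global_queue = queue.Queue(maxsize=8)  # bounded: applies back-pressure
    trained = []
    lock = threading.Lock()

    def sampler(sid):
        for b in range(batches_per_sampler):
            # In GNNLab this would be GPU neighbor sampling; here a dummy batch.
            global_queue.put((sid, b))

    def trainer():
        while True:
            item = global_queue.get()
            if item is None:  # poison pill: no more batches
                break
            with lock:
                trained.append(item)  # stand-in for a training step

    samplers = [threading.Thread(target=sampler, args=(i,)) for i in range(num_samplers)]
    trainers = [threading.Thread(target=trainer) for _ in range(num_trainers)]
    for t in samplers + trainers:
        t.start()
    for t in samplers:
        t.join()
    for _ in trainers:
        global_queue.put(None)  # shut down trainers once sampling is done
    for t in trainers:
        t.join()
    return trained
```

The bounded queue decouples the two roles while capping how far sampling can run ahead of training; adaptive allocation would amount to changing `num_samplers` versus `num_trainers` based on observed throughput.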
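The pre-sampling caching idea can likewise be sketched in a few lines (function name, hop structure, and hotness metric are assumptions for illustration, not GNNLab's implementation): run a few trial sampling epochs first, count how often each vertex is visited under the actual sampling algorithm and dataset, then cache the features of the hottest vertices in GPU memory.

```python
import random
from collections import Counter

def presample_cache(adj, train_seeds, fanout=2, hops=2, epochs=3,
                    cache_ratio=0.25, seed=0):
    """Rank vertices by how often trial sampling epochs touch them, then
    return the hottest fraction to cache (a sketch of pre-sampling caching)."""
    rng = random.Random(seed)
    hotness = Counter()
    for _ in range(epochs):
        for v in train_seeds:
            frontier = [v]
            for _ in range(hops):  # multi-hop neighbor sampling
                nxt = []
                for u in frontier:
                    hotness[u] += 1
                    nbrs = adj.get(u, [])
                    nxt.extend(rng.sample(nbrs, min(fanout, len(nbrs))))
                frontier = nxt
    num_cached = max(1, int(cache_ratio * len(adj)))
    return [v for v, _ in hotness.most_common(num_cached)]
```

Because the hotness ranking is produced by the same sampling algorithm on the same graph that training will use, the resulting cache tracks the actual access distribution rather than a static proxy such as degree.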

References

[1]
2020. DGL: Deep Graph Library. https://www.dgl.ai/.
[2]
2020. Euler 2.0: A Distributed Graph Deep Learning Framework. https://github.com/alibaba/euler.
[3]
2021. Open Graph Benchmark: The MAG240M dataset. https://ogb.stanford.edu/docs/lsc/mag240m/.
[4]
2021. Open Graph Benchmark: The ogbn-papers100M dataset. https://ogb.stanford.edu/docs/nodeprop/#ogbn-papers100M.
[5]
2021. Open Graph Benchmark: The ogbn-products dataset. https://ogb.stanford.edu/docs/nodeprop/#ogbn-products.
[6]
2021. Using GPU for Neighborhood Sampling in DGL Data Loaders. https://docs.dgl.ai/guide/minibatch-gpu-sampling.html.
[7]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI'16). 265--283.
[8]
Lada A Adamic and Bernardo A Huberman. 2000. Power-law distribution of the world wide web. Science 287, 5461 (2000), 2115.
[9]
Paolo Boldi and Sebastiano Vigna. 2004. The WebGraph Framework I: Compression Techniques. In Proceedings of the 13th International Conference on World Wide Web (WWW'04). 595--601.
[10]
Zhenkun Cai, Xiao Yan, Yidi Wu, Kaihao Ma, James Cheng, and Fan Yu. 2021. DGCL: An Efficient Communication Library for Distributed GNN Training. In Proceedings of the 16th European Conference on Computer Systems (EuroSys'21). 130--144.
[11]
Jie Chen, Tengfei Ma, and Cao Xiao. 2018. FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling. In Proceedings of the 6th International Conference on Learning Representations (ICLR'18).
[12]
Jianfei Chen, Jun Zhu, and Le Song. 2018. Stochastic Training of Graph Convolutional Networks with Variance Reduction. In Proceedings of the 35th International Conference on Machine Learning (ICML'18). 941--949.
[13]
Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. PowerLyra: differentiated graph computation and partitioning on skewed graphs. In Proceedings of the Tenth European Conference on Computer Systems (EuroSys'15). 1--15.
[14]
Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. 2015. Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. In Neural Information Processing Systems, Workshop on Machine Learning Systems.
[15]
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui Hsieh. 2019. Cluster-gcn: An efficient algorithm for training deep and large graph convolutional networks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 257--266.
[16]
Wenfei Fan, Tao He, Longbin Lai, Xue Li, Yong Li, Zhao Li, Zhengping Qian, Chao Tian, Lei Wang, Jingbo Xu, et al. 2021. GraphScope: a unified engine for big graph processing. Proceedings of the VLDB Endowment 14, 12 (2021), 2879--2892.
[17]
Matthias Fey and Jan E. Lenssen. 2019. Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds.
[18]
Ronald Aylmer Fisher and Frank Yates. 1963. Statistical Tables for Biological, Agricultural and Medical Research. Edinburgh: Oliver and Boyd.
[19]
Alex Fout, Jonathon Byrd, Basir Shariat, and Asa Ben-Hur. 2017. Protein interface prediction using graph convolutional networks. In Advances in neural information processing systems. 6530--6539.
[20]
Swapnil Gandhi and Anand Padmanabha Iyer. 2021. P3: Distributed Deep Graph Learning at Scale. In Proceedings of the 15th USENIX Conference on Operating Systems Design and Implementation (OSDI'21).
[21]
Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural message passing for quantum chemistry. In International Conference on Machine Learning (ICML'17). PMLR, 1263--1272.
[22]
Joseph E Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI'12). 17--30.
[23]
Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable Feature Learning for Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'16). 855--864.
[24]
Arpan Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, and Jonathan Mace. 2020. Serving DNNs like clockwork: Performance predictability from the bottom up. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI '20). 443--462.
[25]
William L. Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive Representation Learning on Large Graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS'17). 1025--1035.
[26]
Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open Graph Benchmark: Datasets for Machine Learning on Graphs. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS'20).
[27]
Kezhao Huang, Jidong Zhai, Zhen Zheng, Youngmin Yi, and Xipeng Shen. 2021. Understanding and bridging the gaps in current GNN performance optimizations. In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
[28]
Wenbing Huang, Tong Zhang, Yu Rong, and Junzhou Huang. 2018. Adaptive sampling towards fast graph representation learning. Advances in neural information processing systems 31 (2018).
[29]
Ankit Jain, Isaac Liu, Ankur Sarda, and Piero Molino. 2019. Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations. https://eng.uber.com/uber-eats-graph-learning/.
[30]
Abhinav Jangda, Sandeep Polisetty, Arjun Guha, and Marco Serafini. 2021. Accelerating Graph Sampling for Graph Machine Learning using GPUs. In Proceedings of the 16th European Conference on Computer Systems (EuroSys'21). 311--326.
[31]
Zhihao Jia, Sina Lin, Mingyu Gao, Matei Zaharia, and Alex Aiken. 2020. Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with ROC. In Proceedings of the 3rd Machine Learning and Systems (MLSys'20). 187--198.
[32]
Taehyun Kim, KyoungSoo Park, Changho Hwang, Peng Cheng, Youshan Miao, Lingxiao Ma, Zhiqi Lin, and Yongqiang Xiong. 2021. Accelerating GNN Training with Locality-Aware Partial Execution. In Proceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys'21).
[33]
Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In Proceedings of the 5th International Conference on Learning Representations (ICLR'17).
[34]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media?. In Proceedings of the 19th International Conference on World Wide Web (WWW'10). 591--600.
[35]
Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, and Yinlong Xu. 2020. Pagraph: Scaling GNN Training on Large Graphs via Computation-aware Caching. In Proceedings of the 11th ACM Symposium on Cloud Computing (SoCC'20). 401--415.
[36]
Xin Liu, Mingyu Yan, Lei Deng, Guoqi Li, Xiaochun Ye, and Dongrui Fan. 2021. Sampling methods for efficient training of graph convolutional networks: A survey. arXiv preprint arXiv:2103.05872 (2021).
[37]
Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, and Yafei Dai. 2019. NeuGraph: Parallel Deep Neural Network Computation on Large Graphs. In Proceedings of 2019 USENIX Annual Technical Conference (ATC'19). 443--458.
[38]
Jason Mohoney, Roger Waleffe, Henry Xu, Theodoros Rekatsinas, and Shivaram Venkataraman. 2021. Marius: Learning Massive Graph Embeddings on a Single Machine. In Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI'21).
[39]
Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R Devanur, Gregory R Ganger, Phillip B Gibbons, and Matei Zaharia. 2019. PipeDream: generalized pipeline parallelism for DNN training. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP'19). 1--15.
[40]
Santosh Pandey, Lingda Li, Adolfy Hoisie, Xiaoye S Li, and Hang Liu. 2020. C-SAW: A framework for graph sampling and random walk on GPUs. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 1--15.
[41]
Jay H Park, Gyeongchan Yun, Chang M Yi, Nguyen T Nguyen, Seungmin Lee, Jaesik Choi, Sam H Noh, and Young-ri Choi. 2020. HetPipe: Enabling large DNN training on (whimpy) heterogeneous GPU clusters through integration of pipelined model parallelism and data parallelism. In Proceedings of 2020 USENIX Annual Technical Conference (USENIX ATC'20). 307--321.
[42]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. 2019. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019).
[43]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. Deepwalk: Online Learning of Social Representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'14). 701--710.
[44]
Aurick Qiao, Sang Keun Choe, Suhas Jayaram Subramanya, Willie Neiswanger, Qirong Ho, Hao Zhang, Gregory R Ganger, and Eric P Xing. 2021. Pollux: Co-adaptive cluster scheduling for goodput-optimized deep learning. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI'21).
[45]
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, and Christos Kozyrakis. 2007. Evaluating MapReduce for Multicore and Multiprocessor Systems. In Proceedings of the 13th IEEE International Symposium on High Performance Computer Architecture (HPCA'07). 13--24.
[46]
Victor Garcia Satorras and Joan Bruna Estrach. 2018. Few-Shot Learning with Graph Neural Networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR'18).
[47]
Marco Serafini and Hui Guan. 2021. Scalable Graph Neural Network Training: The Case for Sampling. ACM SIGOPS Operating Systems Review 55, 1 (2021), 68--76.
[48]
John Thorpe, Yifan Qiao, Jonathan Eyolfson, Shen Teng, Guanzhou Hu, Zhihao Jia, Jinliang Wei, Keval Vora, Ravi Netravali, Miryung Kim, et al. 2021. Dorylus: affordable, scalable, and accurate GNN training with distributed CPU servers and serverless threads. In 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI'21). 495--514.
[49]
Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph Attention Networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR'18).
[50]
Jeffrey S Vitter. 1985. Random sampling with a reservoir. ACM Transactions on Mathematical Software (TOMS) 11, 1 (1985), 37--57.
[51]
Lei Wang, Qiang Yin, Chao Tian, Jianbang Yang, Rong Chen, Wenyuan Yu, Zihang Yao, and Jingren Zhou. 2021. FlexGraph: A Flexible and Efficient Distributed Framework for GNN Training. In Proceedings of the 16th European Conference on Computer Systems (EuroSys'21). 67--82.
[52]
Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, and Zheng Zhang. 2019. Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. arXiv preprint arXiv:1909.01315 (2019).
[53]
Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding. 2021. GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs. In Proceedings of the 15th USENIX Conference on Operating Systems Design and Implementation (OSDI'21).
[54]
David Wentzlaff and Anant Agarwal. 2009. Factored Operating Systems (fos): The Case for a Scalable Operating System for Multicores. ACM SIGOPS Operating Systems Review 43, 2 (2009), 76--85.
[55]
Wei Wu, Bin Li, Chuan Luo, and Wolfgang Nejdl. 2021. Hashing-Accelerated Graph Neural Networks for Link Prediction. In Proceedings of the Web Conference 2021. 2910--2920.
[56]
Yidi Wu, Kaihao Ma, Zhenkun Cai, Tatiana Jin, Boyang Li, Chenguang Zheng, James Cheng, and Fan Yu. 2021. Seastar: Vertex-centric Programming for Graph Neural Networks. In Proceedings of the 16th European Conference on Computer Systems (EuroSys'21). 359--375.
[57]
Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2020. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems (2020).
[58]
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'18). 974--983.
[59]
Zhitao Ying, Jiaxuan You, Christopher Morris, Xiang Ren, Will Hamilton, and Jure Leskovec. 2018. Hierarchical graph representation learning with differentiable pooling. Advances in neural information processing systems 31 (2018).
[60]
Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, and Ren Chen. 2020. Deep graph neural networks with shallow subgraph samplers. arXiv preprint arXiv:2012.01380 (2020).
[61]
Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, and Viktor Prasanna. 2020. GraphSAINT: Graph Sampling Based Inductive Learning Method. In Proceedings of the 8th International Conference on Learning Representations (ICLR'20).
[62]
Dalong Zhang, Xin Huang, Ziqi Liu, Jun Zhou, Zhiyang Hu, Xianzheng Song, Zhibang Ge, Lin Wang, Zhiqiang Zhang, and Yuan Qi. 2020. AGL: A Scalable System for Industrial-Purpose Graph Machine Learning. Proc. VLDB Endow. 13, 12 (2020), 3125--3137.
[63]
Muhan Zhang and Yixin Chen. 2018. Link prediction based on graph neural networks. Advances in Neural Information Processing Systems 31 (2018), 5165--5175.
[64]
Qingru Zhang, David Wipf, Quan Gan, and Le Song. 2021. A biased graph neural network sampler with near-optimal regret. Advances in Neural Information Processing Systems 34 (2021).
[65]
Ziwei Zhang, Peng Cui, and Wenwu Zhu. 2020. Deep learning on graphs: A survey. IEEE Transactions on Knowledge and Data Engineering (2020).
[66]
Da Zheng, Chao Ma, Minjie Wang, Jinjing Zhou, Qidong Su, Xiang Song, Quan Gan, Zheng Zhang, and George Karypis. 2020. DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs. In Proceedings of the 10th IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms (IA3'20). 36--44.
[67]
Rong Zhu, Kun Zhao, Hongxia Yang, Wei Lin, Chang Zhou, Baole Ai, Yong Li, and Jingren Zhou. 2019. AliGraph: A Comprehensive Graph Neural Network Platform. Proceedings of the VLDB Endowment 12, 12 (2019), 2094--2105.


Published In

EuroSys '22: Proceedings of the Seventeenth European Conference on Computer Systems
March 2022
783 pages
ISBN:9781450391627
DOI:10.1145/3492321

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. caching policy
  2. graph neural networks
  3. sample-based GNN training

Qualifiers

  • Research-article

Conference

EuroSys '22

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%

Article Metrics

  • Downloads (last 12 months): 390
  • Downloads (last 6 weeks): 33
Reflects downloads up to 13 Feb 2025

Cited By

View all
  • (2025) DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training. Proceedings of the ACM on Management of Data 3, 1 (2025), 1-27. DOI: 10.1145/3709738. Online publication date: 11-Feb-2025.
  • (2025) Frugal: Efficient and Economic Embedding Model Training with Commodity GPUs. In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 509-523. DOI: 10.1145/3669940.3707245. Online publication date: 3-Feb-2025.
  • (2025) Survey on Characterizing and Understanding GNNs From a Computer Architecture Perspective. IEEE Transactions on Parallel and Distributed Systems 36, 3 (2025), 537-552. DOI: 10.1109/TPDS.2025.3532089. Online publication date: Mar-2025.
  • (2024) Efficient Training of Graph Neural Networks on Large Graphs. Proceedings of the VLDB Endowment 17, 12 (2024), 4237-4240. DOI: 10.14778/3685800.3685844. Online publication date: 8-Nov-2024.
  • (2024) Eliminating Data Processing Bottlenecks in GNN Training over Large Graphs via Two-level Feature Compression. Proceedings of the VLDB Endowment 17, 11 (2024), 2854-2866. DOI: 10.14778/3681954.3681968. Online publication date: 1-Jul-2024.
  • (2024) NeutronOrch: Rethinking Sample-Based GNN Training under CPU-GPU Heterogeneous Environments. Proceedings of the VLDB Endowment 17, 8 (2024), 1995-2008. DOI: 10.14778/3659437.3659453. Online publication date: 31-May-2024.
  • (2024) FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training. Proceedings of the VLDB Endowment 17, 6 (2024), 1473-1486. DOI: 10.14778/3648160.3648184. Online publication date: 3-May-2024.
  • (2024) DAHA: Accelerating GNN Training with Data and Hardware Aware Execution Planning. Proceedings of the VLDB Endowment 17, 6 (2024), 1364-1376. DOI: 10.14778/3648160.3648176. Online publication date: 3-May-2024.
  • (2024) Comprehensive Evaluation of GNN Training Systems: A Data Management Perspective. Proceedings of the VLDB Endowment 17, 6 (2024), 1241-1254. DOI: 10.14778/3648160.3648167. Online publication date: 3-May-2024.
  • (2024) XGNN: Boosting Multi-GPU GNN Training via Global GNN Memory Store. Proceedings of the VLDB Endowment 17, 5 (2024), 1105-1118. DOI: 10.14778/3641204.3641219. Online publication date: 2-May-2024.
  • Show More Cited By

