Abstract
Sampling is a critical operation in Graph Neural Network (GNN) training that reduces the cost of training on large graphs. Prior work has improved sampling algorithms through mathematical and statistical methods, but a gap remains between sampling algorithms and hardware: without considering hardware, algorithm designers optimize sampling only at the algorithm level, missing the potential to accelerate existing sampling algorithms by exploiting hardware features. In this paper, we first propose a unified programming model for mainstream sampling algorithms, termed GNNSampler, which covers the key steps shared by sampling algorithms of various categories. Second, to leverage hardware features, we take data locality as a case study and explore the locality among nodes and their neighbors in a graph to alleviate the irregular memory accesses incurred by sampling. Third, we implement locality-aware optimizations in GNNSampler for various sampling algorithms to improve the general sampling process. Finally, we conduct experiments on large graph datasets to analyze the relationship among training time, accuracy, and hardware-level metrics. Extensive experiments show that our method generalizes to mainstream sampling algorithms and significantly reduces training time, especially on large-scale graphs.
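To make the locality-aware sampling idea concrete, below is a minimal, hypothetical Python sketch; the abstract does not specify GNNSampler's actual interface, so every name here is illustrative rather than the paper's API. It expresses node-wise neighbor sampling as the generic select-from-neighbors loop of a unified programming model, and adds a BFS relabeling pass as one plausible locality heuristic: a node and its neighbors receive nearby IDs, so the CSR rows touched during sampling are more likely to share cache lines.

import numpy as np
from collections import deque

def build_csr(num_nodes, edges):
    # Build a CSR adjacency (indptr, indices) from a directed edge list.
    indptr = np.zeros(num_nodes + 1, dtype=np.int64)
    for src, _ in edges:
        indptr[src + 1] += 1
    indptr = np.cumsum(indptr)
    indices = np.empty(len(edges), dtype=np.int64)
    fill = indptr[:-1].copy()
    for src, dst in edges:
        indices[fill[src]] = dst
        fill[src] += 1
    return indptr, indices

def bfs_relabel(num_nodes, indptr, indices):
    # Hypothetical locality heuristic: relabel nodes in BFS order so a node
    # and its neighbors get nearby IDs, improving spatial locality of the
    # CSR rows a sampler touches. Returns an old-ID -> new-ID map.
    order, seen = [], np.zeros(num_nodes, dtype=bool)
    for root in range(num_nodes):
        if seen[root]:
            continue
        seen[root] = True
        queue = deque([root])
        while queue:
            v = queue.popleft()
            order.append(v)
            for u in indices[indptr[v]:indptr[v + 1]]:
                if not seen[u]:
                    seen[u] = True
                    queue.append(u)
    new_id = np.empty(num_nodes, dtype=np.int64)
    new_id[order] = np.arange(num_nodes)
    return new_id

def sample_neighbors(indptr, indices, batch, fanout, rng):
    # One step of the generic sampling loop: for each node in the batch,
    # keep at most `fanout` neighbors chosen uniformly (node-wise sampling).
    out = {}
    for v in batch:
        neigh = indices[indptr[v]:indptr[v + 1]]
        if neigh.size > fanout:
            neigh = rng.choice(neigh, size=fanout, replace=False)
        out[int(v)] = neigh.tolist()
    return out

# Toy usage: build a 6-node graph, relabel it for locality, then sample.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 0)]
indptr, indices = build_csr(6, edges)
new_id = bfs_relabel(6, indptr, indices)
indptr, indices = build_csr(6, [(new_id[s], new_id[d]) for s, d in edges])
rng = np.random.default_rng(0)
print(sample_neighbors(indptr, indices, batch=new_id[[0, 3]], fanout=2, rng=rng))

Relabeling is a one-time preprocessing cost, and how much it helps depends on graph structure; the paper evaluates such effects with hardware-level metrics, whereas this sketch only illustrates where a locality pass would slot into the general sampling process.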
Acknowledgment
This work was partly supported by the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDA18000000), National Natural Science Foundation of China (Grant Nos. 61732018 and 61872335), Austrian-Chinese Cooperative R&D Project (FFG and CAS) (Grant No. 171111KYSB20200002), CAS Project for Young Scientists in Basic Research (Grant No. YSBR-029), and CAS Project for Youth Innovation Promotion Association.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Liu, X., et al. (2023). GNNSampler: Bridging the Gap Between Sampling Algorithms of GNN and Hardware. In: Amini, M.-R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol. 13717. Springer, Cham. https://doi.org/10.1007/978-3-031-26419-1_30
DOI: https://doi.org/10.1007/978-3-031-26419-1_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26418-4
Online ISBN: 978-3-031-26419-1