Abstract
Deep learning has achieved superior accuracy on data with a Euclidean structure. Non-Euclidean data, such as graphs, carries richer structural information, and neural networks can be applied to it as well to solve more complex and practical problems. However, real-world graph data follows a power-law distribution, so the adjacency matrix of a graph is random and sparse. Graph processing accelerators (GPAs) were designed to handle these problems, but traditional graph processing operates only on one-dimensional vertex properties, whereas in graph neural networks (GNNs) each vertex carries multi-dimensional feature data. Consequently, GNN execution combines traditional graph processing, which exhibits irregular memory access, with neural network computation, which is regular. Moreover, to capture more information from graph data and achieve better model generalization, GNN models are becoming deeper, so the overhead of both memory access and computation is considerable. GNN accelerators have recently been designed to address these issues. In this paper, we present a systematic survey of the design and implementation of GNN accelerators. Specifically, we review the challenges faced by GNN accelerators and examine existing works that address them in detail. Finally, we evaluate previous works and propose future directions in this booming field.
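To make the two-phase execution pattern described above concrete, the following is a minimal illustrative sketch (ours, not drawn from any surveyed accelerator) of a single GCN layer following the propagation rule of Kipf and Welling. The function name `gcn_layer` and all shapes are our own assumptions. The aggregation step is a sparse-dense matrix multiplication (SpMM) over the normalized adjacency matrix, dominated by irregular memory access; the combination step is a dense GEMM with the weight matrix, dominated by regular computation.

```python
# A minimal sketch of one GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).
# Illustrative only; names and shapes are assumptions, not the paper's API.
import numpy as np
import scipy.sparse as sp

def gcn_layer(adj: sp.csr_matrix, features: np.ndarray,
              weight: np.ndarray) -> np.ndarray:
    n = adj.shape[0]
    a_hat = adj + sp.identity(n)                  # add self-loops
    deg = np.asarray(a_hat.sum(axis=1)).ravel()   # node degrees
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))     # D^-1/2
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization

    # Aggregation: SpMM over the sparse adjacency -> irregular memory access.
    aggregated = a_norm @ features
    # Combination: dense GEMM with the weight matrix -> regular computation.
    return np.maximum(aggregated @ weight, 0.0)   # ReLU

if __name__ == "__main__":
    # Toy 4-node path graph 0-1-2-3 stored in CSR format.
    rows = np.array([0, 1, 1, 2, 2, 3])
    cols = np.array([1, 0, 2, 1, 3, 2])
    a = sp.csr_matrix((np.ones(6), (rows, cols)), shape=(4, 4))
    h = np.random.rand(4, 8)         # 8-dimensional input features
    w = np.random.rand(8, 4)         # weight: 8 -> 4 hidden units
    print(gcn_layer(a, h, w).shape)  # (4, 4)
```

Because the power-law degree distribution makes the nonzeros of `a_norm` highly unbalanced across rows, the SpMM phase is memory-bound and load-imbalanced, while the GEMM phase maps naturally onto regular compute arrays; this mismatch is precisely what motivates dedicated GNN accelerator designs.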
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 62032001 and 61972407).
Ethics declarations
Competing interests: The authors declare that they have no competing interests or financial conflicts to disclose.
Additional information
Jingyu Liu received his master's degree in integrated circuit engineering from the National University of Defense Technology, China in 2021. He is now working toward his PhD degree at the School of Computer, National University of Defense Technology, China. His research interests include computer architecture and graph-based hardware accelerators.
Shi Chen received his bachelor's degree in Computer Science & Technology from the National University of Defense Technology, China in 2021. He is now working toward his PhD degree at the School of Computer, National University of Defense Technology, China. His research interests include computer architecture and graph-based hardware accelerators.
Li Shen received his BS and PhD degrees in Computer Science & Technology from the National University of Defense Technology, China. He is currently a professor at the School of Computer, National University of Defense Technology, China. His research interests include high-performance processor architecture, parallel programming, and performance optimization techniques.
Cite this article
Liu, J., Chen, S. & Shen, L. A comprehensive survey on graph neural network accelerators. Front. Comput. Sci. 19, 192104 (2025). https://doi.org/10.1007/s11704-023-3307-2