GNNIE: GNN inference engine with load-balancing and graph-specific caching

Authors:

Sudipta Mondal,

Susmita Dey Manasi,

Sachin S. Sapatnekar

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

Pages 565 - 570

https://doi.org/10.1145/3489517.3530503

Published: 23 August 2022

Abstract

Graph neural network (GNN) inference involves weighting vertex feature vectors and then aggregating the weighted vectors over each vertex's neighborhood. High and variable sparsity in the input vertex feature vectors, together with high sparsity and power-law degree distributions in the adjacency matrix, can lead to (a) unbalanced loads and (b) inefficient random memory accesses. GNNIE balances load by splitting features into blocks, using a flexible MAC architecture, and employing load (re)distribution. GNNIE's novel caching scheme bypasses the high costs of random DRAM accesses. GNNIE shows high speedups over CPUs/GPUs; it is faster and runs a broader range of GNNs than existing accelerators.
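The two phases named in the abstract, weighting and then aggregation, can be illustrated with a small software sketch. This is not GNNIE's hardware dataflow: the sum-over-neighbors aggregation with a self term, the function names (`weight_features`, `aggregate`, `nonzeros_per_block`), the block size, and the toy graph are all assumptions made for illustration. The per-block nonzero counts hint at why variable feature sparsity can produce uneven work across compute units.

```python
# Illustrative sketch only: a toy GNN layer in plain Python.
# All names and data below are hypothetical, not taken from the paper.

def weight_features(features, weight):
    """Weighting: multiply each vertex feature vector by the weight matrix."""
    out_dim = len(weight[0])
    weighted = []
    for h in features:
        row = [0.0] * out_dim
        for k, x in enumerate(h):
            if x == 0.0:  # zero entries in a sparse vector need no multiply-accumulate
                continue
            for j in range(out_dim):
                row[j] += x * weight[k][j]
        weighted.append(row)
    return weighted


def aggregate(weighted, neighbors):
    """Aggregation: sum weighted vectors over each vertex's neighbors plus itself."""
    result = []
    for v, nbrs in enumerate(neighbors):
        acc = list(weighted[v])
        for u in nbrs:
            acc = [a + b for a, b in zip(acc, weighted[u])]
        result.append(acc)
    return result


def nonzeros_per_block(features, block_size):
    """Count nonzero entries in each fixed-size block of each feature vector."""
    return [
        [sum(1 for x in h[i:i + block_size] if x != 0.0)
         for i in range(0, len(h), block_size)]
        for h in features
    ]


# Hypothetical 3-vertex graph in which vertex 0 is connected to both others.
features = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 0.0, 3.0],
    [1.0, 1.0, 1.0, 1.0],
]
weight = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.5],
]
neighbors = [[1, 2], [0], [0]]

weighted = weight_features(features, weight)
output = aggregate(weighted, neighbors)
block_work = nonzeros_per_block(features, block_size=2)
```

In this toy input, the blocks of the third vertex each hold two nonzeros while one block of the second vertex holds none, so units assigned blocks naively would finish at different times. That is the kind of imbalance the abstract attributes to variable feature sparsity.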

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

July 2022

1462 pages

ISBN:9781450391429

DOI:10.1145/3489517

General Chair:
Rob Oshana
NXP

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGDA: ACM Special Interest Group on Design Automation
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

SRC

Conference

DAC '22

Sponsor:

SIGDA

DAC '22: 59th ACM/IEEE Design Automation Conference

July 10 - 14, 2022

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
