research-article

Workload-Balanced Graph Attention Network Accelerator with Top-K Aggregation Candidates

Authors:

Jae-Joon KimAuthors Info & Claims

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design

Article No.: 39, Pages 1 - 9

https://doi.org/10.1145/3508352.3549343

Published: 22 December 2022 Publication History

Abstract

Graph attention networks (GATs) are gaining attention for various transductive and inductive graph processing tasks due to their higher accuracy than conventional graph convolutional networks (GCNs). The power-law distribution of real-world graph-structured data, on the other hand, causes a severe workload imbalance problem for GAT accelerators. To reduce the degradation of PE utilization due to the workload imbalance, we present algorithm/hardware co-design results for a GAT accelerator that balances workload assigned to processing elements by allowing only K neighbor nodes to participate in aggregation phase. The proposed model selects the K neighbor nodes with high attention scores, which represent relevance between two nodes, to minimize accuracy drop. Experimental results show that our algorithm/hardware co-design of the GAT accelerator achieves higher processing speed and energy efficiency than the GAT accelerators using conventional workload balancing techniques. Furthermore, we demonstrate that the proposed GAT accelerators can be made faster than the GCN accelerators that typically process smaller number of computations.

References

[1]

Kathi Canese and Sarah Weis. 2013.: the bibliographic database. The NCBI handbook 2, 1 (2013).

[2]

Alex Fout et al. 2017. Protein interface prediction using graph convolutional networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 6533--6542.

[3]

Mingyu Gao et al. 2017. Tetris: Scalable and efficient neural network acceleration with 3d memory. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems. 751--764.

[4]

Tong Geng et al. 2020. AWB-GCN: A graph convolutional network accelerator with runtime workload rebalancing. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 922--936.

[5]

Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin Herbordt, Yingyan Lin, and Ang Li. 2021. I-GCN: A graph convolutional network accelerator with runtime locality enhancement through islandization. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture. 1051--1063.

Digital Library

[6]

C Lee Giles, Kurt D Bollacker, and Steve Lawrence. 1998. CiteSeer: An automatic citation indexing system. In Proceedings of the third ACM conference on Digital libraries. 89--98.

Digital Library

[7]

Chuang-Yi Gui et al. 2019. A survey on graph processing accelerators: Challenges and opportunities. Journal of Computer Science and Technology 34, 2 (2019), 339--371.

[8]

Tae Jun Ham et al. 2020. A ^ 3: Accelerating Attention Mechanisms in Neural Networks with Approximation. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 328--341.

[9]

Song Han, Junlong Kang, Huizi Mao, Yiming Hu, Xin Li, Yubin Li, Dongliang Xie, Hong Luo, Song Yao, Yu Wang, et al. 2017. Ese: Efficient speech recognition engine with sparse lstm on fpga. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 75--84.

Digital Library

[10]

Hanhwi Jang et al. 2019. Mnnfast: A fast and scalable system architecture for memory-augmented neural networks. In Proceedings of the 46th International Symposium on Computer Architecture. 250--263.

[11]

Norman P Jouppi et al. 2017. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th annual international symposium on computer architecture. 1--12.

[12]

Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).

[13]

Jiajun Li et al. 2021. Gcnax: A flexible and energy-efficient accelerator for graph convolutional neural networks. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 775--788.

[14]

Sheng Li, Ke Chen, Jung Ho Ahn, Jay B Brockman, and Norman P Jouppi. 2011. CACTI-P: Architecture-level modeling for SRAM-based structures with advanced leakage reduction techniques. In 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 694--701.

Digital Library

[15]

Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. 2000. Automating the construction of internet portals with machine learning. Information Retrieval 3, 2 (2000), 127--163.

Digital Library

[16]

Chris Stark, Bobby-Joe Breitkreutz, Teresa Reguly, Lorrie Boucher, Ashton Breitkreutz, and Mike Tyers. 2006. BioGRID: a general repository for interaction datasets. Nucleic acids research 34, suppl_1 (2006), D535--D539.

[17]

Synopsys. 2018. Design Compiler. https://www.synopsys.com/implementation-and-signoff/rtl-synthesis-test/dc-ultra.html

[18]

Synopsys. 2020. PrimeTime Datasheet. https://www.synopsys.com/content/dam/synopsys/implementation&signoff/datasheets/primetime-ds.pdf

[19]

Petar Veličković et al. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).

[20]

Hanrui Wang et al. 2021. SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning. In 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 97--110.

[21]

Mingyu Yan et al. 2020. Hygcn: A gcn accelerator with hybrid architecture. In 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 15--29.

[22]

Weian Yan et al. 2020. FPGAN: an FPGA accelerator for graph attention networks with software and hardware co-optimization. IEEE Access 8 (2020), 171608--171620.

[23]

Rex Ying et al. 2018. Hierarchical graph representation learning with differentiable pooling. In Proceedings of the 32nd International Conference on Neural Information Processing Systems. 4805--4815.

[24]

Shijie Zhou, Charalampos Chelmis, and Viktor K Prasanna. 2016. High-throughput and energy-efficient graph processing on FPGA. In 2016 IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 103--110.

Cited By

Han DWu MXue RYan MYe XFan D(2024)ADE-HGNN: Accelerating HGNNs Through Attention Disparity ExploitationEuro-Par 2024: Parallel Processing10.1007/978-3-031-69766-1_7(91-106)Online publication date: 26-Aug-2024
https://dl.acm.org/doi/10.1007/978-3-031-69766-1_7

Index Terms

Workload-Balanced Graph Attention Network Accelerator with Top-K Aggregation Candidates
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Recommendations

An FPGA-based accelerator platform implements for convolutional neural network
HP3C '19: Proceedings of the 3rd International Conference on High Performance Compilation, Computing and Communications

In recent years, convolutional neural network (CNN) has become widely universal in large number of applications including computer vision, natural language processing and automatic driving. However, the CNN-based methods are computational-intensive and ...
Workload balancing and adaptive resource management for the swift storage system on cloud

The demand for big data storage and processing has become a challenge in today's industry. To meet the challenge, there is an increasing number of enterprises adopting distributed storage systems. Frequently, in these systems, storage nodes intensively ...
Configurable Network Protocol Accelerator (COPA)
Today's field-programmable gate arrays (FPGAs) offer not only programmable acceleration capabilities but also include advanced features that are on-par with a mainstream processor platform. However, rather than being deployed as autonomous acceleration ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design

October 2022

1467 pages

ISBN:9781450392174

DOI:10.1145/3508352

Conference Chair:
Tulika Mitra
National University of Singapore
,
Program Chairs:
Evangeline Young
The Chinese University of Hong Kong
,
Jinjun Xiong
University at Buffalo (UB)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

In-Cooperation

IEEE-EDS: Electronic Devices Society
IEEE CAS
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 December 2022

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

ICCAD '22

Sponsor:

SIGDA

ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design

October 30 - November 3, 2022

California, San Diego

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
271
Total Downloads

Downloads (Last 12 months)72
Downloads (Last 6 weeks)6

Reflects downloads up to 28 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Han DWu MXue RYan MYe XFan D(2024)ADE-HGNN: Accelerating HGNNs Through Attention Disparity ExploitationEuro-Par 2024: Parallel Processing10.1007/978-3-031-69766-1_7(91-106)Online publication date: 26-Aug-2024
https://dl.acm.org/doi/10.1007/978-3-031-69766-1_7

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten