Research article. DOI: 10.1145/3508352.3549343

Workload-Balanced Graph Attention Network Accelerator with Top-K Aggregation Candidates

Published: 22 December 2022

Abstract

Graph attention networks (GATs) are gaining attention for various transductive and inductive graph processing tasks because they achieve higher accuracy than conventional graph convolutional networks (GCNs). However, the power-law distribution of real-world graph-structured data causes a severe workload imbalance in GAT accelerators. To limit the resulting drop in processing element (PE) utilization, we present algorithm/hardware co-design results for a GAT accelerator that balances the workload assigned to PEs by allowing only K neighbor nodes to participate in the aggregation phase. To minimize the accuracy drop, the proposed model selects the K neighbor nodes with the highest attention scores, which represent the relevance between two nodes. Experimental results show that our algorithm/hardware co-design achieves higher processing speed and energy efficiency than GAT accelerators that use conventional workload balancing techniques. Furthermore, we demonstrate that the proposed GAT accelerators can be made faster than GCN accelerators, which typically perform fewer computations.
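The top-K idea in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design or training procedure: the function names `topk_aggregate` and `workload`, the adjacency-list layout, and the choice to renormalize the softmax over only the kept neighbors are all assumptions made for illustration.

```python
import heapq
import math


def topk_aggregate(features, neighbors, scores, k):
    """Aggregate each node's features from at most k highest-scoring neighbors.

    features:  list of feature vectors, one per node
    neighbors: neighbors[i] lists the neighbor ids of node i
    scores:    scores[i][n] is the raw attention score for neighbors[i][n]
    (Hypothetical data layout, chosen for readability.)
    """
    dim = len(features[0])
    out = []
    for i, nbrs in enumerate(neighbors):
        # Keep only the k neighbors with the largest raw attention scores.
        kept = heapq.nlargest(k, zip(scores[i], nbrs))
        if not kept:
            out.append([0.0] * dim)
            continue
        # Assumption: softmax is taken over the kept scores only.
        # Subtracting the maximum keeps exp() numerically stable.
        top = max(s for s, _ in kept)
        weights = [math.exp(s - top) for s, _ in kept]
        total = sum(weights)
        agg = [0.0] * dim
        for w, (_, j) in zip(weights, kept):
            for d in range(dim):
                agg[d] += (w / total) * features[j][d]
        out.append(agg)
    return out


def workload(neighbors, k=None):
    """Aggregation operations per node: the degree, or min(degree, k) if capped."""
    degrees = [len(n) for n in neighbors]
    return degrees if k is None else [min(d, k) for d in degrees]


# A tiny star graph: node 0 is a hub, the other nodes have a single neighbor.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
neighbors = [[1, 2, 3], [0], [0], [0]]
scores = [[0.0, 5.0, 5.0], [1.0], [1.0], [1.0]]

aggregated = topk_aggregate(features, neighbors, scores, k=2)
print(workload(neighbors))       # [3, 1, 1, 1]
print(workload(neighbors, k=2))  # [2, 1, 1, 1]
```

Capping the hub at K neighbors bounds the per-node aggregation work by K, which is the property that lets work be spread more evenly across PEs. How K is chosen and how the model is trained to tolerate the pruning are not specified in the abstract and are left out here.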


Cited By

  • (2024) ADE-HGNN: Accelerating HGNNs Through Attention Disparity Exploitation. Euro-Par 2024: Parallel Processing, 91-106. DOI: 10.1007/978-3-031-69766-1_7. Online publication date: 26-Aug-2024

Published In

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
October 2022
1467 pages
ISBN: 9781450392174
DOI: 10.1145/3508352

In-Cooperation

  • IEEE-EDS: Electronic Devices Society
  • IEEE CAS
  • IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph attention network
  2. hardware accelerator design
  3. workload balancing

Qualifiers

  • Research-article

Conference

ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design
October 30 - November 3, 2022
San Diego, California

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%

Article Metrics

  • Downloads (last 12 months): 72
  • Downloads (last 6 weeks): 6

Reflects downloads up to 28 Feb 2025.
