CU.POKer: placing DNNs on wafer-scale AI accelerator with optimal kernel sizing

Authors:

Evangeline F. Y. Young

ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design

Article No.: 142, Pages 1 - 9

https://doi.org/10.1145/3400302.3415688

Published: 17 December 2020

Abstract

The tremendous growth in deep learning (DL) applications has created an exponentially growing demand for computing power, which has led to the rise of AI-specific hardware. Targeted at accelerating computation-intensive deep learning applications, AI hardware, including but not limited to GPGPUs, TPUs, and ASICs, has been adopted ubiquitously. As a result, domain-specific CAD tools play an increasingly important role and are deeply involved in both the design and compilation stages of modern AI hardware. Recently, the ISPD 2020 contest introduced a special challenge targeting the physical mapping of neural network workloads onto the largest commercial deep learning accelerator, the CS-1 Wafer-Scale Engine (WSE). In this paper, we propose CU.POKer, a high-performance engine fully customized for the WSE's DNN workload placement challenge. A provably optimal placeable kernel candidate searching scheme and a data-flow-aware placement tool are developed to ensure state-of-the-art quality on real industrial benchmarks. Experimental results on the ISPD 2020 contest evaluation suites [1] demonstrate the superiority of our proposed framework over other contestants.

References

[1] "ISPD 2020 Contest: Wafer-Scale Deep Learning Accelerator Placement," https://www.cerebras.net/ispd-2020-contest/.

[2] D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. Lillicrap, K. Simonyan, and D. Hassabis, "Mastering chess and shogi by self-play with a general reinforcement learning algorithm," 2017.

[3] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., "Language models are few-shot learners," arXiv preprint arXiv:2005.14165, 2020.

[4] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. MICCAI, 2015, pp. 234–241.

[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

[6] D. Amodei and D. Hernandez, "AI and Compute," https://openai.com/blog/ai-and-compute/, May 16, 2018, accessed on 20-May-2020.

[7] M. James, M. Tom, P. Groeneveld, and V. Kibardin, "ISPD 2020 physical mapping of neural networks on a wafer-scale deep learning accelerator," in Proceedings of the 2020 International Symposium on Physical Design, ser. ISPD '20. New York, NY, USA: Association for Computing Machinery, 2020, pp. 145–149.

[8] Y. Lin, S. Dhar, W. Li, H. Ren, B. Khailany, and D. Z. Pan, "DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement," in Proceedings of the 56th Annual Design Automation Conference 2019, 2019, pp. 1–6.

[9] A. Mirhoseini, A. Goldie, M. Yazgan, J. Jiang, E. Songhori, S. Wang, Y.-J. Lee, E. Johnson, O. Pathak, S. Bae et al., "Chip placement with deep reinforcement learning," arXiv preprint arXiv:2004.10746, 2020.

[10] H.-M. Chen, M. D. Wong, H. Zhou, F.-Y. Young, H. H. Yang, and N. Sherwani, "Integrated floorplanning and interconnect planning," in Layout Optimization in VLSI Design. Springer, 2001, pp. 1–18.

[11] E. F. Young, C. C. Chu, and Z. C. Shen, "Twin binary sequences: a nonredundant representation for general nonslicing floorplan," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 22, no. 4, pp. 457–469, 2003.

Information

Published In

ICCAD '20: Proceedings of the 39th International Conference on Computer-Aided Design

November 2020

1396 pages

ISBN:9781450380263

DOI:10.1145/3400302

General Chair:
Yuan Xie
Univ. of California, Santa Barbara, CA

In-Cooperation

IEEE CAS
IEEE CEDA
IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ICCAD '20

Sponsor:

SIGDA

ICCAD '20: IEEE/ACM International Conference on Computer-Aided Design

November 2 - 5, 2020

Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Total Citations: 5
Total Downloads: 282
Downloads (last 12 months): 37
Downloads (last 6 weeks): 7

Reflects downloads up to 16 Feb 2025

Cited By

Hu Y, Lin X, Wang H, He Z, Yu X, Zhang J, Yang Q, Xu Z, Guan S, Fang J, Shang H, Tang X, Dai X, Wei S, Yin S (2024). Wafer-Scale Computing: Advancements, Challenges, and Future Perspectives [Feature]. IEEE Circuits and Systems Magazine, 24:1 (52-81). Online publication date: Sep-2025. https://doi.org/10.1109/MCAS.2024.3349669

Özdemir S, Khasawneh M, Rao S, Madden P, Behjat L, Yang S (2022). Kernel Mapping Techniques for Deep Learning Neural Network Accelerators. Proceedings of the 2022 International Symposium on Physical Design (21-28). Online publication date: 13-Apr-2022. https://dl.acm.org/doi/10.1145/3505170.3506730

Jiang B, Chen J, Liu J, Liu L, Wang F, Zhang X, Young E (2022). CU.POKer: Placing DNNs on WSE With Optimal Kernel Sizing and Efficient Protocol Optimization. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:6 (1888-1901). Online publication date: Jun-2022. https://doi.org/10.1109/TCAD.2021.3096458

He Z, Liao P, Liu S, Ma Y, Lin Y, Yu B (2021). Physical Synthesis for Advanced Neural Network Processors. Proceedings of the 26th Asia and South Pacific Design Automation Conference (833-840). Online publication date: 18-Jan-2021. https://dl.acm.org/doi/10.1145/3394885.3431625

Chou Y, Hsu J, Chang Y, Chen T (2021). VLSI Structure-aware Placement for Convolutional Neural Network Accelerator Units. 2021 58th ACM/IEEE Design Automation Conference (DAC) (1117-1122). Online publication date: 5-Dec-2021. https://doi.org/10.1109/DAC18074.2021.9586294
