DOI: 10.1145/3508352.3549434

GPU-Accelerated Rectilinear Steiner Tree Generation

Published: 22 December 2022

Abstract

Rectilinear Steiner minimum tree (RSMT) generation is a fundamental component of the VLSI design automation flow. Because it is used extensively in circuit design iterations at early design stages such as synthesis, placement, and routing, the performance of RSMT generation is critical for a reasonable design turnaround time. State-of-the-art RSMT generation algorithms, such as fast look-up table estimation (FLUTE), are limited to CPU-based parallelism, which yields only modest runtime improvements. Accelerating RSMT generation on GPUs is an important yet difficult task because of its complex divide-and-conquer computation patterns with recursion. In this paper, we present the first GPU-accelerated RSMT generation algorithm based on FLUTE. By designing GPU-efficient data structures and levelized decomposition, table look-up, and merging operations, we incorporate large-scale data parallelism into the generation of Steiner trees. We achieve up to a 10.47× runtime speed-up over FLUTE running on 40 CPU cores, filling in a critical missing component in today's GPU-accelerated design automation frameworks.
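
To make the problem concrete, the following is a minimal sketch of the easiest RSMT cases only; it is not the paper's GPU algorithm and not FLUTE's look-up tables. It relies on a standard fact: for a net with two or three pins, the RSMT length equals the half-perimeter of the pins' bounding box, and a three-pin net can use the coordinate-wise median point as its single Steiner point. The function names and the flat offset-array layout (one contiguous coordinate array for all nets) are assumptions chosen for illustration; that layout is one common way to expose many independent nets to data-parallel hardware.

```python
def steiner_small_net(pins):
    """Exact RSMT length and Steiner point for a net with 2 or 3 pins.

    pins: list of (x, y) tuples. Returns (length, steiner_point), where
    steiner_point is None for a 2-pin net. Illustrative helper only.
    """
    if not 2 <= len(pins) <= 3:
        raise ValueError("this sketch only handles nets with 2 or 3 pins")
    xs = sorted(x for x, _ in pins)
    ys = sorted(y for _, y in pins)
    # Half-perimeter of the bounding box; exact RSMT length for <= 3 pins.
    length = (xs[-1] - xs[0]) + (ys[-1] - ys[0])
    # For 3 pins, the median x and median y give an optimal Steiner point.
    steiner = (xs[1], ys[1]) if len(pins) == 3 else None
    return length, steiner


def batch_lengths(flat_x, flat_y, offsets):
    """RSMT lengths for many small nets stored in flat arrays.

    Net i owns pins offsets[i] .. offsets[i + 1] - 1 (an assumed layout).
    Each loop iteration is independent of the others, which is the
    property a GPU kernel would exploit with one thread per net.
    """
    lengths = []
    for i in range(len(offsets) - 1):
        lo, hi = offsets[i], offsets[i + 1]
        pins = list(zip(flat_x[lo:hi], flat_y[lo:hi]))
        lengths.append(steiner_small_net(pins)[0])
    return lengths
```

Nets with four or more pins do not reduce to a bounding-box formula in general; that is where the decomposition, table look-up, and merging steps described in the abstract come in.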

Cited By

  • (2025) A Robust FPGA Router With Optimization of High-Fanout Nets and Intra-CLB Connections. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 44:3 (1003-1016). DOI: 10.1109/TCAD.2024.3447218. Online publication date: Mar-2025
  • (2023) Towards Timing-Driven Routing: An Efficient Learning Based Geometric Approach. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD) (1-9). DOI: 10.1109/ICCAD57390.2023.10323981. Online publication date: 28-Oct-2023

Published In

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
October 2022
1467 pages
ISBN: 9781450392174
DOI: 10.1145/3508352

In-Cooperation

  • IEEE-EDS: Electronic Devices Society
  • IEEE CAS
  • IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

ICCAD '22
ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design
October 30 - November 3, 2022
San Diego, California

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%
