ABSTRACT
The VLSI chip design process consists of a sequence of distinct steps such as floor planning, placement, clock tree synthesis and routing. Each of these steps requires solving optimization problems that are often NP-hard, and the state-of-the-art algorithms are not guaranteed to find the optimal solution. Due to the compartmentalization of the design flow into distinct steps, these optimization problems are solved sequentially, with the output of one step feeding into the next. This results in an inherent inefficiency, where the optimization goal of an early step is estimated using a fast, approximate surrogate model of the following steps. Consequently, any improvement in a step-specific optimization algorithm, while obvious at that step, is much smaller when measured at the end of the full design flow. For example, the placement step minimizes wire length. In the absence of routed nets, this wire length might be estimated using a simple wire length model such as the Steiner tree. Thus, any improvement in the placement algorithm is limited by the accuracy of the wire length estimate.
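The gap between a placement-time estimate and the routed wire length can be made concrete with two cheap surrogates. The sketch below is illustrative only. It does not compute an exact Steiner tree (finding a minimum rectilinear Steiner tree is itself NP-hard); instead it brackets that length between the half-perimeter wire length, which is a lower bound, and the rectilinear minimum spanning tree, which is an upper bound. The function names and pin coordinates are hypothetical.

```python
def manhattan(p, q):
    """Rectilinear distance between two pins given as (x, y) tuples."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def hpwl(pins):
    """Half-perimeter wire length: half the perimeter of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def rmst_length(pins):
    """Length of a rectilinear minimum spanning tree (Prim's algorithm)."""
    if len(pins) < 2:
        return 0
    # dist[i] = cheapest connection from pin i to the tree built so far
    dist = {i: manhattan(pins[0], pins[i]) for i in range(1, len(pins))}
    total = 0
    while dist:
        j = min(dist, key=dist.get)   # closest pin not yet in the tree
        total += dist.pop(j)
        for i in dist:                # relax remaining pins against pin j
            dist[i] = min(dist[i], manhattan(pins[j], pins[i]))
    return total


# Hypothetical four-pin net arranged as a cross around the point (1, 1).
CROSS_NET = [(1, 0), (0, 1), (2, 1), (1, 2)]
```

For `CROSS_NET` the two estimates are 4 and 6, while a Steiner tree routed through the center point (1, 1) has length 4. A placer that optimizes either estimate is ranking candidate solutions by proxy, which is the limitation described above.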
Recently, Reinforcement Learning (RL) has emerged as a promising alternative to the state-of-the-art algorithms used to solve optimization problems in placement and routing of a VLSI design [1, 2, 3]. The RL problem setup involves an agent exploring an unknown environment to achieve a goal. RL is based on the hypothesis that all goals can be described by the maximization of expected cumulative reward. The agent must learn to sense and perturb the state of the environment using its actions to derive maximal reward. Many problems in VLSI chip design can be represented as Markov Decision Problems (MDPs), where design optimization objectives are converted into rewards given by the environment and design variables are converted into actions provided to the environment. Recent advances in applying RL to VLSI implementation problems such as floor planning, standard cell layout, synthesis and placement have demonstrated improvements over the state-of-the-art algorithms. However, these improvements continue to be limited by the inaccuracies in the estimate of the optimization goal as described previously.
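As a minimal sketch of how a design problem might be cast in this form, the toy environment below places cells one at a time into slots along a single row. The action is the design variable (the slot chosen for the next cell) and the reward is the negative increase in wire length, so the cumulative reward of an episode equals the negative total wire length. The netlist, class name and method names are hypothetical and are not taken from the cited works.

```python
import random

# Hypothetical toy netlist: each net connects two cells, identified by index.
NETS = [(0, 1), (1, 2), (0, 3)]
NUM_CELLS = 4
NUM_SLOTS = 4  # slots along a single row


def wirelength(placement):
    """Total length of all nets whose two cells are both already placed."""
    return sum(abs(placement[a] - placement[b])
               for a, b in NETS
               if a in placement and b in placement)


class PlacementEnv:
    """Toy episodic environment: cells 0..NUM_CELLS-1 are placed in order."""

    def reset(self):
        self.placement = {}  # cell index -> slot index
        return self._state()

    def _state(self):
        return tuple(sorted(self.placement.items()))

    def legal_actions(self):
        used = set(self.placement.values())
        return [s for s in range(NUM_SLOTS) if s not in used]

    def step(self, slot):
        if slot not in self.legal_actions():
            raise ValueError(f"slot {slot} is out of range or occupied")
        before = wirelength(self.placement)
        self.placement[len(self.placement)] = slot  # place the next cell
        reward = -(wirelength(self.placement) - before)
        done = len(self.placement) == NUM_CELLS
        return self._state(), reward, done


def run_episode(env, policy):
    """Roll out one episode and return the cumulative (undiscounted) reward."""
    env.reset()
    total, done = 0, False
    while not done:
        _, reward, done = env.step(policy(env.legal_actions()))
        total += reward
    return total
```

Any policy can be plugged in as a function from the list of legal slots to a chosen slot, for example `lambda slots: slots[0]` or `random.Random(1).choice`.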
With DSO.ai, we have built a distributed system for the optimization of physical design flows, where multiple iterations of parallel runs are used to search enormous design parameter spaces. In addition to a multiplicative improvement in human productivity, it has unlocked significant performance gains across a wide range of technology nodes. At the heart of DSO.ai's decision engine is an implementation of RL that solves the sequential decision-making and optimization problem spanning the entire design flow. Unlike prior works where RL is used for step-specific optimization within the chip design flow, DSO.ai's RL algorithm wraps around the optimization steps to guide them via parameter choices that depend upon the optimization goal for the full flow. Thus, the quality of the final design generated by DSO.ai is no longer subject to the limitations of the compartmentalized design flow. DSO.ai's RL algorithm views the full chip design flow as one optimization problem, where the design quality at the end of the flow is the only one that matters. To propagate the design through the design flow, DSO.ai makes parameter choices for the underlying optimization steps, which constitute its action space. By tracking the effect of these actions as a function of the design state, DSO.ai can find the optimal sequence of actions to meet the optimization goal at the end of the full flow.
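The idea of choosing step parameters against an end-of-flow objective can be illustrated with a small tabular Q-learning sketch. This is not DSO.ai's algorithm, which the abstract does not specify. The three stages, the two effort settings and every cost number below are invented for illustration. The state is the tuple of parameter choices made so far, and the only reward is the negative end-of-flow cost, delivered after the last stage.

```python
import itertools
import random

STAGES = ["placement", "cts", "routing"]  # hypothetical three-step flow
OPTIONS = ["low", "high"]                 # hypothetical effort setting per step


def placement_surrogate(p):
    """Wire length as estimated at placement time (made-up numbers)."""
    return 10 if p == "low" else 8


def end_of_flow_cost(choices):
    """Made-up final cost: aggressive placement causes congestion later."""
    p, c, r = choices
    congestion = 0 if p == "low" else (4 if r == "high" else 6)
    skew = 1 if c == "high" else 2
    detour = 0 if r == "high" else 1
    return placement_surrogate(p) + congestion + skew + detour


def q_learning(episodes=2000, seed=0):
    """Off-policy Q-learning with uniformly random exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> value
    for _ in range(episodes):
        state = ()
        while len(state) < len(STAGES):
            action = rng.choice(OPTIONS)
            nxt = state + (action,)
            if len(nxt) == len(STAGES):
                target = -end_of_flow_cost(nxt)  # reward only at end of flow
            else:
                target = max(q.get((nxt, a), 0.0) for a in OPTIONS)
            # Learning rate 1 is enough because this toy flow is deterministic.
            q[(state, action)] = target
            state = nxt
    return q


def greedy_flow(q):
    """Follow the highest-valued parameter choice at every stage."""
    state = ()
    while len(state) < len(STAGES):
        state += (max(OPTIONS, key=lambda a: q.get((state, a), 0.0)),)
    return state


def brute_force_best():
    """Exhaustive reference answer, feasible only for this tiny flow."""
    return min(itertools.product(OPTIONS, repeat=len(STAGES)),
               key=end_of_flow_cost)
```

In this toy setup the placement-time surrogate prefers `"high"` effort (estimated wire length 8 versus 10), yet every flow that starts with that choice ends with a cost of at least 13, whereas the learned policy starts with `"low"` and reaches a cost of 11. The difference comes from the congestion term, which the placement surrogate cannot see.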
In this presentation, we demonstrate how DSO.ai provides a flexible framework to integrate with existing design flows and serve the design quality needs throughout the design evolution cycle. We will also highlight how DSO.ai is allowing expert designers at Synopsys to package their knowledge into fully featured toolboxes ready to be deployed by novice designers. We will provide a summary of QoR gains that DSO.ai has delivered on advanced process nodes. Finally, we will show how DSO.ai's decision engine is paving the way to automating the parameter choices in the chip design flow.
[1] Mirhoseini, A., Goldie, A., Yazgan, M., Jiang, J., Songhori, E., Wang, S., ... & Dean, J. (2020). Chip placement with deep reinforcement learning. arXiv preprint arXiv:2004.10746.
[2] Agnesina, A., Chang, K., & Lim, S. K. (2020, November). VLSI placement parameter optimization using deep reinforcement learning. In Proceedings of the 39th International Conference on Computer-Aided Design (pp. 1-9).
[3] Lu, Y. C., Nath, S., Khandelwal, V., & Lim, S. K. (2021, December). RL-Sizer: VLSI gate sizing for timing optimization using deep reinforcement learning. In 2021 58th ACM/IEEE Design Automation Conference (DAC) (pp. 733-738). IEEE.
Index Terms
- DSO.ai - A Distributed System to Optimize Physical Design Flows