Elsevier

Integration

Volume 47, Issue 2, March 2014, Pages 175-183

Fast and scalable parallel layout decomposition in double patterning lithography

https://doi.org/10.1016/j.vlsi.2013.09.002

Highlights

  • A window-based parallel layout decomposition framework is presented for improving both runtime and memory consumption.

  • Two parallel layout decomposition approaches are presented and compared, including maximum independent set-based (MISP) and stochastic optimization-based (SOP) methods.

  • MISP obtains notable runtime and memory improvements. SOP further improves the solution quality using the Cross Entropy method.

  • An overlapping window-based scheme is presented, which avoids large memory consumption for constructing the whole graph of large layouts.

  • The presented parallel methods do not make any assumptions about the machine architecture and hence are very generic.

Abstract

For 32/22 nm technology nodes and below, double patterning (DP) lithography has become the most promising interim solution due to the delay in the deployment of next generation lithography (e.g., EUV). DP requires the partitioning of the layout patterns into two different masks, a procedure called layout decomposition. Layout decomposition is a key computational step that is necessary for double patterning technology. Existing layout decomposition methods are all single-threaded and do not scale in runtime and/or memory for large industrial layouts. This paper presents the first window-based parallel layout decomposition methods for improving both runtime and memory consumption. Experimental results show that the presented parallel layout decomposition methods obtain up to 21× speedup in runtime and up to 7.5× reduction in peak memory consumption with acceptable solution quality.

Introduction

As VLSI technology nodes advance to 32 nm/22 nm and below, next generation lithography (NGL), such as Extreme Ultraviolet (EUV), still faces both technology and cost challenges for mass production. As a result, double patterning (DP) lithography technology has become the most feasible interim solution [1]. DP requires that the dense circuit patterns be partitioned into two separate exposures, which decreases the pattern density in each exposure and thus improves resolution and depth of focus (DOF). DP layout decomposition must satisfy the following requirement: two patterns must be assigned opposite colors (corresponding to different mask exposures) if their spacing is less than the minimum coloring spacing [2].
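The coloring requirement above can be illustrated with a conflict graph: each pattern is a node, and an edge joins two patterns whose spacing is below the minimum coloring spacing. The following is a minimal sketch, assuming patterns are axis-aligned rectangles given as (x1, y1, x2, y2) tuples. The function names are hypothetical, and the breadth-first 2-coloring is only an illustration of the constraint, not the ILP or bipartization algorithms of the cited works.

```python
from collections import deque
from itertools import combinations


def rect_spacing(a, b):
    """Euclidean gap between two axis-aligned rectangles (x1, y1, x2, y2)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def build_conflict_graph(rects, min_spacing):
    """Join every pair of patterns closer than the minimum coloring spacing."""
    graph = {i: set() for i in range(len(rects))}
    for i, j in combinations(range(len(rects)), 2):
        if rect_spacing(rects[i], rects[j]) < min_spacing:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def two_color(graph):
    """BFS 2-coloring. Returns (colors, conflicts), where conflicts lists
    the edges whose two endpoints ended up on the same mask."""
    colors, conflicts = {}, []
    for start in graph:
        if start in colors:
            continue
        colors[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in colors:
                    colors[v] = 1 - colors[u]
                    queue.append(v)
                elif colors[v] == colors[u] and u < v:
                    conflicts.append((u, v))  # recorded once per edge
    return colors, conflicts
```

Three parallel wires whose neighbors are closer than the minimum spacing form a path in the graph and can be colored without conflict. Adding a fourth pattern close to all three creates an odd cycle, so at least one conflict remains, which a real decomposer would try to resolve with a stitch or a layout change.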

There has been a large body of work on single-threaded DP layout decomposition. Kahng et al. proposed different DP layout decomposition and coloring approaches based on integer linear programming (ILP) and graph bipartization algorithms [2], [3], [4]. Yuan et al. developed an ILP algorithm to minimize the number of conflicts and stitches simultaneously based on a grid layout model [5]. Xu and Chu proposed a graph-based method for reducing the problem size and thus speeding up the ILP solution [6]. A simultaneous layout decomposition and migration method for standard cells based on ILP formulation [7] was proposed by Hsu et al. Yang et al. presented a heuristic minimum-cut-based graph bipartization algorithm for DP layout decomposition considering both balanced density and timing optimization [8]. A graph-matching based method on a constructed face graph was proposed by Xu and Chu [9]. Luk and Huang used an SPQR-Tree data structure to reduce the layout decomposition problem size for runtime speedup without degrading the solution quality [10]. Ban et al. presented a layout decomposition framework for spacer-type double patterning [11]. There were also works considering the DP requirements during both the routing stage [12], [13], [14] and post-routing optimization stage (e.g., wire spreading, layer assignment, etc.) [15], [16], [17].

All the above works present single-threaded algorithms for optimizing the DP objectives, such as the number of coloring conflicts and stitches. With the rapid deployment of multi-core CPUs and commodity computational clusters, parallel computation is becoming necessary to achieve high performance and scalability [18], [19], [20]. Although the graph partitioning methods in [6], [10] can be easily extended for parallel processing, they require that the whole layout be loaded and the whole corresponding graph be constructed, which consumes a large amount of memory for large layouts. This paper presents window-based scalable parallel DP layout decomposition methods, which avoid constructing the whole graph and thus reduce memory consumption. The presented methods are generic in the sense that the graph partitioning methods in [6], [10] can still be used in each window for further speedup. Major contributions of the paper are as follows.

  • We present and compare two parallel layout decomposition approaches, including a maximum independent set-based parallel (MISP) method and a stochastic optimization-based parallel (SOP) method, which make the existing single-threaded layout decomposition algorithms more practical and applicable for large industrial layouts.

  • The MISP method partitions the original layout decomposition problem into sets of sub-problems, where the sets are processed sequentially and the sub-problems in each set are processed in parallel. The MISP method obtains notable improvements in both runtime and memory, with slightly degraded solution quality. The SOP method can further improve the solution quality using the Cross Entropy method [21] at the cost of increased runtime. Either parallel method can be selected in real applications depending on the available servers in the computational cluster.

  • The two parallel methods use an overlapping window-based scheme, which avoids large memory consumption for constructing the whole graph of large layouts. The window-based scheme can be used in conjunction with any specific single-threaded layout decomposition algorithm. Besides, the presented parallel methods do not make any assumptions about the machine architecture and hence are very generic.

The rest of the paper is organized as follows. An overview of the double patterning lithography and the associated layout decomposition problem is given in Section 2. Section 3 presents the flows of the parallel methods for layout decomposition. An overview of the window construction process is given in Section 4. Sections 5 and 6 present the details of the MISP and SOP parallel flows, respectively. Experimental results are presented in Section 7. Section 8 concludes the paper with future research directions.


Double patterning technology

There are basically two types of multiple exposure techniques with different lithography patterning steps, i.e., pitch splitting (PS) and spacer patterning (SP) [1]. PS includes double patterning (DP) and double exposure (DE). For patterning a single device layer, the DP process consists of two separate and sequential lithography/etch steps (i.e., litho etch litho etch, LELE). By contrast, the DE process includes two lithographic exposures followed by one etch step. Though one etch step is

Maximum independent set-based parallel flow

Fig. 3 shows the overall parallel layout decomposition flow based on the MISP approach.

Window construction: In large GDS layouts, the large number of polygons makes it hard to load the entire layout into the memory of a single machine. This is addressed by processing the layout in a window-based scheme. In addition, windows can be processed in parallel to reduce runtime and memory consumption. Our parallel methods in this paper are based on rectangular windows. During window construction, the

Window construction

In the OPC process, the layout is divided into overlapping windows, where the size of each window is between 500 nm × 500 nm and 500 μm × 500 μm. These windows can be processed in parallel on a computational grid of a cluster of CPUs. Using the same design partitioning metrics, we can seamlessly integrate the DP coloring process and OPC together. Different window sizes affect the solution quality and runtime, which is analyzed in Section 7. The best window size may be obtained for the given layout with
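The overlapping-window idea can be sketched as follows. This is a hedged illustration, assuming a uniform grid of square core tiles, each enlarged by a fixed halo to create the overlap. The function names, the `halo` parameter, and the core/extended terminology are assumptions made for this example and are not taken from the paper. Coordinates are integers in nm.

```python
def make_windows(layout_bbox, window_size, halo):
    """Tile layout_bbox = (x1, y1, x2, y2) into core tiles and
    halo-extended tiles. Returns {(row, col): (core, extended)}."""
    x1, y1, x2, y2 = layout_bbox
    windows = {}
    row, y = 0, y1
    while y < y2:
        col, x = 0, x1
        while x < x2:
            core = (x, y, min(x + window_size, x2), min(y + window_size, y2))
            # Grow the core by the halo, clipped to the layout boundary.
            extended = (max(core[0] - halo, x1), max(core[1] - halo, y1),
                        min(core[2] + halo, x2), min(core[3] + halo, y2))
            windows[(row, col)] = (core, extended)
            x += window_size
            col += 1
        y += window_size
        row += 1
    return windows


def polygons_in_window(rects, window):
    """Indices of rectangles (x1, y1, x2, y2) that intersect the window.
    Rectangles that only touch the window edge are included."""
    wx1, wy1, wx2, wy2 = window
    return [i for i, (a, b, c, d) in enumerate(rects)
            if a <= wx2 and c >= wx1 and b <= wy2 and d >= wy1]
```

Because each worker only needs the polygons returned by `polygons_in_window` for its own extended tile, no process has to hold the conflict graph of the whole layout. In this sketch the halo would have to be at least the minimum coloring spacing so that every conflicting pair appears together in some window.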

Parallel DP coloring flow in MISP

In the MISP parallel flow, we compute the maximum number of non-adjacent o-windows, whose corresponding e-windows can be colored in parallel. Two o-windows are regarded as adjacent if their corresponding e-windows overlap with each other. Thus, two o-windows can be adjacent in the horizontal, vertical, or diagonal direction. Given a layout, a window graph is constructed as follows: each o-window is represented as a node and an edge is drawn between two nodes if their corresponding o-windows
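The window graph and the resulting parallel batches can be sketched as below. This is an illustration under stated assumptions: e-windows are given as a dictionary from a window id to a rectangle (x1, y1, x2, y2), and a greedy lowest-degree-first heuristic stands in for the maximum independent set computation, so it yields maximal rather than provably maximum sets. All function names are hypothetical.

```python
def overlaps(a, b):
    """True if rectangles a and b share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build_window_graph(e_windows):
    """One node per window; an edge joins two windows whose e-windows overlap."""
    ids = list(e_windows)
    graph = {w: set() for w in ids}
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            if overlaps(e_windows[u], e_windows[v]):
                graph[u].add(v)
                graph[v].add(u)
    return graph


def independent_set_batches(graph):
    """Repeatedly peel off a maximal independent set of windows.
    Windows inside one batch can be colored in parallel; the batches
    themselves are processed one after another."""
    remaining = set(graph)
    batches = []
    while remaining:
        batch, blocked = [], set()
        # Greedy heuristic: visit windows with the fewest remaining neighbors first.
        order = sorted(remaining, key=lambda n: (len(graph[n] & remaining), n))
        for w in order:
            if w not in blocked:
                batch.append(w)
                blocked |= graph[w]
        batches.append(batch)
        remaining -= set(batch)
    return batches
```

On a regular grid where each e-window overlaps its eight neighbors, this schedule needs four batches regardless of the layout size, since any 2 × 2 block of windows is mutually adjacent.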

SOP DP coloring flow

In the SOP parallel DP coloring flow, all the e-windows are colored in parallel. To avoid coloring inconsistency on the polygons in the overlapping area, the colors of these polygons are fixed beforehand as pre-coloring constraints. With these pre-coloring constraints, each e-window can be colored independently. A primitive way to obtain the optimal solution is to enumerate all the possible colorings for the polygons in the overlapping area and compute the corresponding color assignment
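Instead of enumerating every pre-coloring, a stochastic search can sample candidate pre-colorings and concentrate on the promising ones. The sketch below shows a generic Cross Entropy loop over binary vectors, where each bit is the assumed color of one polygon in the overlapping area. It is not the paper's SOP formulation: the `cost` callback, the parameter names, and their default values are assumptions for illustration. In a real flow, `cost` would color every e-window under the sampled pre-coloring and return the resulting number of conflicts and stitches.

```python
import random


def cross_entropy_binary(cost, n_bits, n_samples=100, elite_frac=0.2,
                         smoothing=0.7, n_iters=30, seed=0):
    """Minimize cost(bits) over binary vectors with the Cross Entropy method.
    Returns (best_vector, best_cost) seen during the search."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits               # Bernoulli parameter per bit
    n_elite = max(1, int(n_samples * elite_frac))
    best, best_cost = None, float("inf")
    for _ in range(n_iters):
        # Sample candidate pre-colorings from the current distribution.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(n_samples)]
        samples.sort(key=cost)
        top_cost = cost(samples[0])
        if top_cost < best_cost:
            best, best_cost = samples[0], top_cost
        # Move each probability toward the frequency seen in the elite set.
        elite = samples[:n_elite]
        for k in range(n_bits):
            freq = sum(s[k] for s in elite) / n_elite
            probs[k] = smoothing * freq + (1 - smoothing) * probs[k]
    return best, best_cost


def chain_conflicts(bits):
    """Toy cost: neighboring polygons in a chain conflict if they share a color."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == b)
```

Calling `cross_entropy_binary(chain_conflicts, 4)` searches the 16 possible colorings of a four-polygon chain. Because the evaluations of `cost` for different samples are independent, they could be distributed across workers, which is where the extra runtime of this approach would be spent.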

Experimental results

We have implemented the presented parallel DP methods in C++. Due to the lack of open-source academic DP coloring tools, we implemented an ILP-based DP coloring algorithm. We adopt the connected component based speedup without the graph partitioning methods as in [2], because the graph partitioning heuristic may introduce sub-optimal solutions. Note that any specific DP coloring algorithm can be used in our window-based parallel approaches. We use lp_solve 5.5 [30] for solving the ILP
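The connected component based speedup mentioned above relies on the fact that patterns in different components of the conflict graph never constrain each other, so each component can be handed to the solver separately without losing optimality. A minimal sketch, assuming the conflict graph is an adjacency dictionary (the function name is hypothetical, and the paper's implementation is in C++, not Python):

```python
def connected_components(graph):
    """Split a conflict graph {node: set(neighbors)} into its connected
    components, each returned as a sorted list of nodes."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:                      # iterative depth-first search
            u = stack.pop()
            comp.append(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components
```

Each returned component would then become one small ILP instance in place of a single large one for the whole window.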

Conclusion

This paper presents new parallel DP layout decomposition methods as a complement for existing single-threaded DP layout decomposition and coloring algorithms. The proposed parallel flows use a window-based approach, which avoids large memory consumption and allows for seamless integration of DP and window-based OPC solutions. Experimental results show improved runtime and memory scalability, thereby demonstrating the effectiveness and the practicality of the proposed parallel methods. Future

Acknowledgments

The authors would like to thank Synopsys Inc. for partial funding support of the research collaboration, and Professor Andrew Kahng from UC San Diego for providing the poly-layer layout testcases for the experiments.


References (31)

  • M. Laguna et al.

    Hybridizing the cross-entropy method: an application to the max-cut problem

    Computers and Operations Research

    (2009)
  • J. Guo et al.

    Compression-based fixed-parameter algorithms for feedback vertex set and edge bipartization

    Journal of Computer and System Sciences

    (2006)
  • International Technology Roadmap for Semiconductors, Lithography Chapter, 2009 [Online]. Available from:...
  • A.B. Kahng et al.

    Layout decomposition approaches for double patterning lithography

    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

    (2010)
  • A.B. Kahng, C.-H. Park, X. Xu, H. Yao, Layout decomposition for double patterning lithography, in: Proceedings of the...
  • A.B. Kahng, C.-H. Park, X. Xu, H. Yao, Revisiting the layout decomposition problem for double patterning lithography,...
  • K. Yuan, J.-S. Yang, D.Z. Pan, Double patterning layout decomposition for simultaneous conflict and stitch...
  • Y. Xu, C. Chu, GREMA: Graph reduction based efficient mask assignment for double patterning technology, in: Proceedings...
  • C.-H. Hsu, Y.-W. Chang, S.R. Nassif, Simultaneous layout migration and decomposition for double patterning technology,...
  • J.-S. Yang, K. Lu, M.S. Cho, K. Yuan, D.Z. Pan, A new graph-theoretic, multi-objective layout decomposition framework...
  • Y. Xu, C. Chu, A matching based decomposer for double patterning lithography, in: Proceedings of the International...
  • W.-S. Luk, H. Huang, Fast and lossless graph division method for layout decomposition using SPQR-Tree, in: Proceedings...
  • Y. Ban, K. Lucas, D. Pan, Flexible 2D layout decomposition framework for spacer-type double pattering lithography, in:...
  • M. Cho, Y. Ban, D.Z. Pan, Double patterning technology friendly detailed routing, in: Proceedings of the IEEE/ACM...
  • K. Yuan, K. Lu, D.Z. Pan, Double patterning lithography friendly detailed routing with redundant via consideration, in:...

    Charles Chiang joined Synopsys, Inc. in 2001 after working at IBM and EDA companies for 10 years. His research interests include routing, placement, floorplanning, design for manufacturability (DFM) and 3D IC integration. His main focus is now on the mask synthesis area.

    Dr. Chiang received his Ph.D. degree from the Department of Electrical Engineering and Computer Science, Northwestern University, Illinois, USA, in 1991 and his Bachelor's degree from Tunghai University, Taichung, Taiwan, in 1980. He is the recipient of the 2007 Synopsys Distinguished Inventor Award and co-authored a DFM book, "Design for Manufacturability and Yield for Nano Scale CMOS", in June 2007. He has published more than 60 technical papers and has been granted 26 patents. He has served on the technical program committees of several conferences, including ICCAD and ASP-DAC.

    Dr. Hailong Yao received the B.S. degree in computer science and technology from Tianjin University, P.R. China, in 2002. He then received the M.S. and Ph.D. degrees in computer science and technology from Tsinghua University, P.R. China, in 2007. From 2007 to 2009, he was a Post-Doctoral Research Scholar with the Department of Computer Science and Engineering, University of California at San Diego. Since 2009, he has been an Assistant Professor with the Department of Computer Science and Technology, Tsinghua University. His research interests include very large scale integration physical design, design–manufacturing interface, analog routing, etc. Dr. Yao has published over 30 journal and conference papers. He received two Best Paper Nominations at ICCAD in 2006 and 2008. He also received one ISQED Best Paper Nomination in 2011.

    Subarna Sinha received the B.S. degree from the Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology, Kharagpur, in 1996 and the M.S. and Ph.D. degrees from the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, in 1998 and 2002, respectively.

    She is a research associate in the Department of Computer Science at Stanford University. Before that, she worked at Intel Strategic CAD Labs and Synopsys Advanced Technology Group. She has made numerous contributions in the area of logic synthesis, physical synthesis and design-for-manufacturability. Her current research interests include developing computational methods for mining large genomic datasets.

    Dr. Sinha is the recipient of numerous awards, including the Donald O. Pederson Best Paper Award and the Synopsys Inventor Award in 2009.

    Wei Zhao received the B.S. and M.S. degrees in computer science from Tsinghua University, Beijing, China, in 2010 and 2013, respectively.

    His main research interests include power grid analysis and numerical algorithms.

    Yici Cai received the B.S. degree in Electronic Engineering from Tsinghua University, Beijing, China, in 1983, the M.S. degree in Computer Science and Technology from Tsinghua University, Beijing, China, in 1986, and the Ph.D. degree in Computer Science from the University of Science & Technology of China, Hefei, China, in 2007. She is a professor with the Department of Computer Science & Technology, Tsinghua University. Her research interests include design automation for VLSI integrated circuits algorithms and theory, power/ground distribution network analysis and optimization, high performance clock synthesis, and low power physical design.

    This work is supported in part by the National Natural Science Foundation of China (61274031), Doctoral Fund of Ministry of Education of China (20111011328), and the National Natural Science Foundation of China (61106104).
