# Statistical Timing Analysis of Flip-flops Considering Codependent Setup and Hold Times

Safar Hatami, Hamed Abrishami, Massoud Pedram
Department of Electrical Engineering
University of Southern California
Los Angeles, CA
{shatami, habrisha, pedram}@usc.edu

## **ABSTRACT**

Statistical static timing analysis (SSTA) plays a key role in determining the performance of VLSI circuits implemented in state-of-the-art CMOS technology. A prerequisite for employing SSTA is the characterization of the setup and hold times of the latches and flip-flops in the cell library. This paper presents a methodology to exploit the statistical codependence of the setup and hold times. The approach comprises three steps. In the first step, the codependent setup and hold time (CSHT) contours are approximated with piecewise linear curves, and their probability mass function (pmf) is derived by considering the probability density functions of the sources of variability. In the second step, the pmf of the required setup and hold times for each flip-flop in the design is computed. Finally, these pmf values are used to compute the probability that each individual flip-flop in the design passes the timing constraints and to report the overall pass probability of the flip-flops in the design as a histogram. We applied the proposed method to true single phase clocking flip-flops to generate the piecewise linear curves for CSHT. The characterized flip-flops were instantiated in an example design, on which timing verification was successfully performed.

## **Categories and Subject Descriptors**

B.8.2 [Performance and Reliability]: Performance Analysis and Design Aids

## **General Terms**

Algorithms, Performance, Design, Reliability

## **Keywords**

Probability, process variations, statistical static timing analysis (SSTA), setup time, hold time, codependency, piecewise linear.

## 1. INTRODUCTION

As we move towards 45 nm and smaller minimum feature sizes, process variations are becoming an ever-increasing concern for the design of high-performance integrated circuits [1]. Process variations can cause excessive uncertainty in timing calculation, which in turn calls for sophisticated analysis techniques to reduce the uncertainty. As the number of sources of variation increases, corner-based static timing analysis (STA) techniques become computationally very expensive. Moreover, with decreasing transistor size and interconnect width, the variation of electrical characteristics is getting proportionally higher.

This work was sponsored in part by the Semiconductor Research Corporation (research ID #1423).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

GLSVLSI'08, May 4–6, 2008, Orlando, Florida, USA. Copyright 2008 ACM 978-1-59593-999-9/08/05...\$5.00.

The process corner approach, which used to work well, may thus result in inaccurate estimates and over-constrained designs. Statistical static timing analysis (SSTA) has been developed to address these shortcomings of corner-based STA [2] [3].

Operating frequencies of up to 1 GHz are common in modern integrated circuits. As the clock period decreases, inaccuracy in setup/hold times caused by corner-based STA tools becomes less acceptable. Optimism in setup/hold time calculation can result in circuit failure, while pessimism leads to inferior performance [4]. Therefore, accurate characterization of the setup and hold times of latches and registers is critically important for timing analysis of digital circuits [5].

In today's circuit design flows, setup and hold times are typically characterized independently, since these quantities are assumed to be independent. However, setup and hold times are not independent [4]. In other words, there are multiple pairs of setup and hold times that result in the same clock-to-q delay. Salman et al. [4] presented a methodology to characterize the setup and hold times of sequential circuit elements codependently and to use the resulting multiple pairs in STA. An Euler-Newton curve tracing procedure was used in [5] to efficiently characterize the setup and hold times codependently. The set of all codependent setup/hold time pairs that yield the same clock-to-q delay defines a contour of the clock-to-q surface. The setup/hold time contours are used to evaluate the setup and hold slacks<sup>2</sup>. In conventional static timing analysis, the STA tool reports the percentage of flip-flops that fail the timing constraints in a circuit, based on the number of flip-flops that have negative slack. This information is then used by the circuit designer to determine the clock frequency of the circuit.

With statistical parameter variations becoming more visible in VLSI circuits, the delay of every combinational path in the circuit as well as the setup and hold times of the flip-flops (which serve as the start and end points of the combinational paths) become non-deterministic parameters. Therefore, the values of the setup and hold slacks themselves become random variables. Existing SSTA algorithms consider the impact of process variations only on the delay of combinational paths in the circuit to estimate the slack of the circuit [6]-[7]. However, to perform accurate timing verification and to precisely determine timing violations, the impact of process variations on the clock-to-output delay and the codependent setup/hold times must be considered.

<sup>1</sup> See subsection 2.1 for an explanation of setup and hold times and clock-to-q delay.

<sup>2</sup> See subsection 2.3 for an explanation of setup and hold slacks.

This paper presents a statistical CSHT characterization approach that takes into account the impact of process variations. It proposes to efficiently approximate a setup/hold time contour by using a three-point piecewise linear curve. Moreover, this paper proposes a backward Euler based search (BEBS) method to obtain setup/hold time contours. A probability mass function (pmf) is derived for the positions of the contours in the setup/hold time plane. Another pmf is obtained for the random variables defined on the required setup time (RST) and the required hold time (RHT) of a flip-flop in the circuit. These two pmfs are used to obtain the probability that the slack of a flip-flop is negative and hence violates the timing constraints. The proposed algorithm enables SSTA to report a set of probability values that represent the fraction of the time that a flip-flop fails. In contrast, an STA tool reports a deterministic percentage of flip-flops that fail the setup and hold time constraints, and this value may be optimistic or pessimistic for a circuit whose process and circuit parameters are subject to random change.

The remainder of the paper is organized as follows. Section 2 provides the needed terminology and presents the BEBS algorithm for CSHT characterization. Section 3 describes the statistical CSHT characterization methodology and explains how to use its results in an SSTA tool. Simulation results are presented in Section 4. The paper is concluded in Section 5.

## 2. STATIC CSHT

This section provides some terminology, proposes a backward Euler based search for characterizing the codependent setup/hold time contour for a given clock-to-q delay, and explains how to use this contour in an STA tool for timing verification.

### 2.1 Terminology

Latches and flip-flops are the sequential circuit elements used in synchronous designs. The **setup time** is the *minimum time before* the active edge of the clock that the input data line must be valid for reliable latching. Similarly, the **hold time** represents the *minimum time* that the data input must be held stable *after* the active clock edge. The active clock edge is the transition edge (either low-to-high or high-to-low) at which data transfer/latching occurs. The **clock-to-q** delay is the delay from the 50% transition of the active clock edge to the 50% transition of the output, q, of the latch/register. The **setup skew** refers to the delay from the latest 50% transition edge of the data signal to the 50% active clock transition edge; similarly, the **hold skew** denotes the delay from the 50% active clock transition edge to the earliest 50% transition edge of the data signal. Figure 1 illustrates the setup and hold skews, which are denoted by  $\tau_{sw}$  and  $\tau_{hw}$ , respectively.

A common technique for setup/hold time characterization is to plot the clock-to-q delay,  $t_{c2q}$ , for various setup and hold skews via a series of transient simulations. This process in turn produces a clock-to-q *delay surface*. The setup (hold) time is then taken as a particular setup (hold) skew point on the plot, for which the *characteristic clock-to-q delay*<sup>1</sup>,  $t_{cc2q}$ , increases by say 10%.

As already mentioned, the setup and hold times are not independent quantities, but depend strongly on one another. Typically, the hold time decreases as the setup skew increases. Similarly, the setup time decreases as the hold skew increases. The tradeoff between the setup and hold skews and the hold and setup times is a strong function of the circuit design of the flip-flop [5].



Figure 1. An illustration of the setup and hold skews.

### 2.2 CSHT Characterization

A general method to extract codependent pairs of setup/hold times is to first obtain the clock-to-q delay,  $t_{c2q}$ , as a function of the setup/hold skews. This is followed by extraction of a contour of the setup/hold times corresponding to all points on the  $t_{c2q}$  surface that result in a given increase (*e.g.*, 10%) in the characteristic clock-to-q delay,  $t_{cc2q}$  [5]. Figure 2 depicts an example setup/hold time contour.



Figure 2. A codependent setup and hold time contour, $\tau_h = Z(\tau_s)$, for a given clock-to-q delay.

Definition 1: Let's denote the set of all setup/hold time points which are located on the contour associated with 10% increase in  $t_{cc2q}$  as  $\Gamma = \{\vec{\tau}_c = (\tau_s, \tau_h)\}$  for c = 1,...,n where n denotes the number of the data points on the contour. Alternatively, we write this contour as  $\tau_h = Z(\tau_s)$ .

Definition 2: The slope, $\alpha = \frac{d\tau_h}{d\tau_s} = \frac{dZ(\tau_s)}{d\tau_s}$, of contour $\Gamma$ at point $A = (\tau_s^A, \tau_h^A)$ is approximated as $\alpha = \frac{\tau_h^B - \tau_h^A}{\Delta \tau_s}$, where point B is a previously calculated point on $\Gamma$ such that $\Delta \tau_s = \tau_s^B - \tau_s^A$.

In *Definition* 2, we may want to use as the reference point for slope calculation a point B that lies several steps back, i.e., $\tau_s^B - \tau_s^A = k \Delta \tau_s$ with k > 1, in which case the finite difference is taken over the interval $k \Delta \tau_s$. In our experience, k = 3 is a good value. In the proposed algorithm (see below), we search through the setup/hold pairs starting from the largest setup time and ending with the smallest one. Furthermore, we assume that the slope of $\Gamma$ changes smoothly. This holds as long as the setup time step is chosen small enough that there are no singular values on $\Gamma$. Consequently, we can use the slope at point A to guess the next point $G = (\tau_s^G, \tau_h^G)$ on $\Gamma$ as follows: $\tau_s^G = \tau_s^A - \Delta \tau_s$, $\tau_h^G = \tau_h^A - \alpha \Delta \tau_s$. This may be compared with the approach in [5], where the authors use a nonlinear circuit model to construct the contour. Our approach is simpler, and the contours it produces closely match those of the conventional method (see Section 4.1).

<sup>1</sup> In a flip-flop, if the setup and hold skews are larger than certain values, the clock-to-q delay becomes independent of these skews; this constant clock-to-q delay, which corresponds to large setup and hold skews, is called the "characteristic clock-to-output delay" of the flip-flop.

We next describe a backward Euler based search (BEBS) algorithm to efficiently construct the setup/hold time contour. Let  $\Delta \tau_s$  denote the setup time step resolution that the user intends to have for the CSHT characterization. The BEBS algorithm is as follows:

- 1) Find $t_{cc2q}$ for the flip-flop by doing a transient simulation with large setup and hold skews. Initialize i = 1 and $\tau_s^i$ to the largest setup time for which we want to calculate the corresponding hold time. A good guess for the largest value of setup time is half of the clock period. Next, sweep the hold skew values and determine the hold time, $\tau_h^i$.
- 2) Calculate slope $\alpha^i$ at $(\tau_s^i, \tau_h^i)$ from *Definition* 2. Notice that $\alpha^1 = 0$ because $\Gamma$ is asymptotic to a constant hold time value when $\tau_s \to \infty$.
- 3) Set $\tau_s^{i+1} = \tau_s^i - \Delta \tau_s$ and calculate the first guess for the hold time by using the backward Euler (BE) method as follows (see Figure 2):

$$\tau_{h,init}^{i+1} = \tau_h^i - \alpha^i \Delta \tau_s \tag{1}$$

Sweep the hold skew values in the range of $\tau_{h,init}^{i+1} \pm \alpha^i \Delta \tau_s$ with time step $\Delta \tau_h$ (hold time step resolution) and find the hold time $\tau_h^{i+1}$, i.e., the value of hold skew which results in a clock-to-q delay equal to $1.1 \times t_{cc2q}$.

- 4) Repeat steps 2-3 for i = 2 to n to compute the desired n setup/hold pairs on the contour.
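The BEBS steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments in this paper. The function `simulate_c2q` is a hypothetical analytic stand-in for a transient simulation, and its constants (100 ps, 20 ps, 15 ps) are invented for the example. Because the sweep range in step 3 collapses to zero width when $\alpha^i = 0$, the sketch enforces a minimum window and doubles it until the crossing is bracketed; that safeguard is an addition of the sketch.

```python
import math

T_CC2Q = 100.0  # ps; characteristic clock-to-q delay of the toy model (assumed)


def simulate_c2q(setup_skew, hold_skew):
    """Hypothetical stand-in for one transient simulation (times in ps)."""
    return T_CC2Q * (1.0
                     + 0.5 * math.exp(-setup_skew / 20.0)
                     + 0.5 * math.exp(-hold_skew / 15.0))


def local_sweep(setup_skew, target, center, half_width, dh):
    """Sweep hold skew around `center` and return the first value whose
    clock-to-q delay is at or below `target`. The window is doubled until
    it brackets the crossing (an addition of this sketch)."""
    for _ in range(20):
        lo, hi = center - half_width, center + half_width
        if simulate_c2q(setup_skew, lo) > target >= simulate_c2q(setup_skew, hi):
            h = lo
            while simulate_c2q(setup_skew, h) > target:
                h += dh
            return h
        half_width *= 2.0
    raise ValueError("no hold time found for this setup skew")


def bebs(s_max, ds, dh, n, k=3):
    """Trace n (setup, hold) pairs on the 10% contour, largest setup first."""
    # Step 1: characteristic delay from large skews, then a full hold sweep.
    target = 1.1 * simulate_c2q(1e3, 1e3)
    h = 0.0
    while simulate_c2q(s_max, h) > target:
        h += dh
    points = [(s_max, h)]
    while len(points) < n:
        s_a, h_a = points[-1]
        # Step 2: slope from Definition 2, using a point up to k steps back.
        if len(points) == 1:
            alpha = 0.0
        else:
            s_b, h_b = points[max(0, len(points) - 1 - k)]
            alpha = (h_b - h_a) / (s_b - s_a)
        # Step 3: Euler guess from (1), then a short sweep around it.
        s_next = s_a - ds
        h_init = h_a - alpha * ds
        half_width = max(abs(alpha) * ds, 2.0 * dh)
        points.append((s_next, local_sweep(s_next, target, h_init, half_width, dh)))
    return points


if __name__ == "__main__":
    for s, h in bebs(s_max=100.0, ds=5.0, dh=0.1, n=12):
        print(f"setup = {s:6.1f} ps   hold = {h:6.2f} ps")
```

For this toy model the contour is known in closed form, so each traced hold time can be checked against it; the traced value lies within one hold step $\Delta \tau_h$ above the exact one.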

### 2.3 Application of CSHT in STA

In general, a STA tool reads in a circuit netlist, a cell library, and a clock period T [4]. The tool reports whether the circuit performs as intended. This analysis is accomplished by computing the worst setup slack ( $S_S$ ) and worst hold slack ( $H_S$ ) for each flip-flop. Referring to Figure 3, these slacks are computed as follows:

$$S_S = \min(\tau_{sw}) - \tau_s = T + \min(D_{p2}) - \max(D_{p1} + D_c + t_{c2q}) - \tau_s \tag{2}$$

$$H_S = \min(\tau_{hw}) - \tau_h = \min(D_{p1} + D_c + t_{c2q}) - \max(D_{p2}) - \tau_h \tag{3}$$

where $D_{p1}$, $D_{p2}$, and $D_c$ stand for the delays of local clock signals compared to the global clock, and delay of the combinational logic encased between the input and output flip-flops, respectively, as illustrated in Figure 3.



Figure 3. Definition of  $S_S$  and  $H_S$  in a synchronous data path.

Definition 3: The required setup time (RST) for a given flip-flop is defined as the minimum value of $\tau_{sw}$ for that flip-flop which results in a non-negative setup slack (i.e., the minimum setup skew needed to eliminate setup time violations for the flip-flop). The required hold time (RHT) is defined similarly.

If a slack is negative, it is said to be "violated". If the setup slack $S_S$ is violated, then the circuit can operate correctly only by increasing T. If the hold slack $H_S$ is violated, then the circuit will not function correctly unless delay elements are inserted on the short paths in the combinational logic.
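As a small worked example of (2) and (3), the sketch below evaluates both slacks for every pair on a CSHT contour. All delay values and contour points are hypothetical, and the acceptance rule (the flip-flop passes if at least one contour pair leaves both slacks non-negative) is this sketch's reading of how multiple setup/hold pairs are used in STA [4], not a rule stated explicitly above.

```python
def setup_hold_slacks(T, dp2_min, dp2_max, launch_min, launch_max, tau_s, tau_h):
    """Evaluate (2) and (3). launch_min/launch_max bound D_p1 + D_c + t_c2q.
    All values are in ps."""
    setup_skew = T + dp2_min - launch_max   # min(tau_sw) in (2)
    hold_skew = launch_min - dp2_max        # min(tau_hw) in (3)
    return setup_skew - tau_s, hold_skew - tau_h


def passing_pairs(contour, **timing):
    """Return the contour pairs for which neither slack is violated."""
    ok = []
    for tau_s, tau_h in contour:
        s_s, h_s = setup_hold_slacks(tau_s=tau_s, tau_h=tau_h, **timing)
        if s_s >= 0 and h_s >= 0:
            ok.append((tau_s, tau_h))
    return ok


# Hypothetical path delays (ps) and three hypothetical contour points.
TIMING = dict(T=1000.0, dp2_min=40.0, dp2_max=50.0,
              launch_min=80.0, launch_max=950.0)
CONTOUR = [(100.0, 25.0), (60.0, 30.0), (45.0, 36.0)]

if __name__ == "__main__":
    print(passing_pairs(CONTOUR, **TIMING))
```

With these numbers the available setup skew is 90 ps and the available hold skew is 30 ps, so only the middle pair (60 ps, 30 ps) satisfies both constraints.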

## 3. STATISTICAL CSHT

As mentioned before, process variations greatly affect the timing characteristics of the flip-flop. In SSTA, a key objective is to calculate the probability that each flip-flop in the circuit satisfies its setup and hold times subject to process variations. To do this, we must derive the probability distribution of the CSHT contour for a flip-flop (given its clock-to-q delay) as well as the probability distribution of the required setup and hold times in the setup-hold time plane. This section first shows how variations in the process parameters are translated into pdfs that describe variations in the flip-flop parameters. Next, it describes an approach to quantify the impact of process variations on the CSHT contour of the flip-flops as well as on the required setup and hold times for each individual flip-flop in the design. Finally, by using these probability distributions, the probability of timing violations in any flip-flop in the design is calculated.

### 3.1 Problem Formulation

Let $\bar{Q}$ denote the set of independent random circuit variables, where each $q_i \in \bar{Q}$ (i = 1, 2, ..., M) is a random variable with a normal distribution given by $N(\mu_{q_i},\sigma_{q_i})$. If these circuit variables are not independent, an independent variable set may be generated by applying Principal Component Analysis [2]. In this work, two process parameters are considered as random variables [8]-[9]: transistor length $L_{eff}$ and transistor threshold voltage $V_{th}$. To keep the equations simple, a new set of random variables, $\bar{P}$, is created from $\bar{Q}$, where $p_i = q_i - \mu_{q_i}$. In this way, the distribution of each random variable $p_i \in \bar{P}$ becomes $N(0,\sigma_{p_i})$, with $\sigma_{p_i} = \sigma_{q_i}$.

The goal is to estimate the distribution of pairs $\vec{\tau}$ on the setup/hold plane corresponding to a particular clock-to-q delay. To this end, the contour of setup/hold times is modeled with a three-point piecewise linear curve. To find these points, we consider three different slopes $\alpha_1$, $\alpha_2$ and $\alpha_3$ on the contours. Intuitively, $\alpha_1$, $\alpha_2$ and $\alpha_3$ correspond to the points with lower dependency of the contour on the setup time, equal dependency of the contour on both setup time and hold time, and lower dependency of the contour on the hold time, respectively. In practice, typical values of $\alpha_1$, $\alpha_2$ and $\alpha_3$ are around -8, -1 and 0. The loci of these three points (critical points) when the flip-flop parameters change randomly are approximated by three lines, $\vec{d}_1$, $\vec{d}_2$, and $\vec{d}_3$, as shown in Figure 4.

The problem of finding the pdf of the position of a setup/hold time contour is simplified to that of finding the pdf of the $\vec{\tau}$'s in the directions of these three lines. When the flip-flop parameters change, the perturbation of $\vec{\tau}$ is approximated as a linear function of the deviations of all parameters. To derive the linear function, the sensitivities of $\vec{\tau}$ with respect to each flip-flop parameter are assessed. The sensitivity list is exploited to compute the variance of $\vec{\tau}$ in the direction of the three aforementioned lines. To find the variance, the sensitivity of $\vec{\tau}$ with respect to each $p_i$ is needed.



Figure 4. Partitioning setup/hold time plane into 7 parts, each with a fixed total probability (red lines denote  $\vec{d}_1$ ,  $\vec{d}_2$ ,  $\vec{d}_3$ ).

Definition 4: The sensitivity of $\vec{\tau}$ with respect to $p_i$ in the direction $\vec{d}_j$ is defined as $s_{ji} = \left.\frac{\partial \vec{\tau}}{\partial p_i}\right|_{\vec{d}_j}$.

Definition 5: We define three new random variables $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ associated with the variation of $\vec{\tau}$ in directions $\vec{d}_1$, $\vec{d}_2$ and $\vec{d}_3$, respectively. We call them critical line random variables (CLRVs). A first order Taylor expansion is used to approximate $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ in terms of the process random variables $p_i \in \bar{P}$ as follows:

$$r_{\vec{d}_j} = \vec{\tau}_{d_j} + \sum_{i=1}^{M} s_{ji} p_i \tag{4}$$

where $\vec{\tau}_{d_j}$ denotes the expected value of $r_{\vec{d}_j}$.

Since we have assumed that the random variables in $\bar{P}$ have normal distributions and are independent, the random variables $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ are also normal. From (4), the variances of $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ are computed as $\sigma_{\vec{d}_j}^2 = \sum_{i=1}^{M} s_{ji}^2 \sigma_{p_i}^2$ for j = 1, 2 and 3. Since the mean of each $p_i$ is zero, the expected values of $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ are $\vec{\tau}_{d_1}$, $\vec{\tau}_{d_2}$ and $\vec{\tau}_{d_3}$.

### 3.2 Defining a pmf for the Setup/Hold Plane

To efficiently calculate the probability value of negative or positive slacks, we use a probability mass function (pmf) to specify the probability of setup/hold time CLRV's as explained below.

Recall that a probability mass function (pmf) is a function that gives the probability that a discrete random variable is exactly equal to some value. A pmf differs from a pdf in that a pdf is defined only for continuous random variables. Note that a probability distribution is called discrete if it is characterized by a probability mass function. Thus, the distribution of a random variable X is discrete, and X is then called a discrete random variable, if $\sum_u \Pr(X=u) = 1$ as u runs through the set of all possible values of X.

To explain the approach for partitioning the setup/hold plane into regions using a pmf, consider Figure 4. We would like to partition the plane into  $N_{sh}$  contiguous regions with equal total probability  $P_{sh}=1/N_{sh}$  (that is, for a 1-D pdf, the area under the pdf in each of these  $N_{sh}$  parts is equal to  $1/N_{sh}$ ). By using the standard deviations of  $r_{\vec{d}_1}$  ,  $r_{\vec{d}_2}$  and  $r_{\vec{d}_3}$  , the critical lines  $\vec{d}_1$  ,  $\vec{d}_2$  and  $\vec{d}_3$  are divided into  $N_{sh}$  segments, each with a discrete probability  $P_{sh}$ .
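The equal-probability partitioning of one critical line can be sketched as follows. The sketch assumes that the borderlines along a critical line are simply the $u/N_{sh}$ quantiles of the corresponding normal CLRV, with the last borderline at infinity; the function name and the example mean of 35 ps are hypothetical, while the 3.3 ps standard deviation is taken from Table 2.

```python
from statistics import NormalDist


def equal_probability_borders(mean, sigma, n_parts):
    """Borderlines that split N(mean, sigma) into n_parts segments of equal
    probability 1/n_parts. The last borderline is at infinity."""
    dist = NormalDist(mean, sigma)
    borders = [dist.inv_cdf(u / n_parts) for u in range(1, n_parts)]
    borders.append(float("inf"))
    return borders


if __name__ == "__main__":
    # Hypothetical CLRV: mean position 35 ps along d_1, sigma = 3.3 ps.
    for u, b in enumerate(equal_probability_borders(35.0, 3.3, 7), start=1):
        print(f"borderline {u}: {b:.2f} ps")
```

Each segment between consecutive borderlines then carries the same discrete probability $P_{sh} = 1/N_{sh}$, which is what allows the later pass-probability calculation to reduce to counting.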

Definition 5: Statistical required setup time (SRST) is a random variable defined on the RST of some specific flip-flop in the design when the preceding combinational logic gates and flip-flops are subjected to random process variations. For the right flip-flop shown in Figure 3, the SRST is computed as follows:

$$SRST = T + \min(D_{p2}) - \max(D_{p1} + D_c + t_{c2q}) \tag{5}$$

Definition 6: Statistical required hold time (SRHT) is a random variable defined on the RHT of some specific flip-flop in the design when the preceding combinational logic gates and flip-flops are subjected to random process variations. For the right flip-flop shown in Figure 3, the SRHT is computed as follows:

$$SRHT = \min(D_{p1} + D_c + t_{c2q}) - \max(D_{p2}) \tag{6}$$

Consider random variables $D_{p1}$, $D_{p2}$ and $D_c$ with normal distributions as follows:

$$D_{p1} \sim N(\mu_{p1}, \sigma_{p1}), \quad D_{p2} \sim N(\mu_{p2}, \sigma_{p2}), \quad D_c \sim N(\mu_c, \sigma_c)$$

Assume that $D_{p1}$, $D_{p2}$ and $D_c$ are independent variables. Then, by using the min-max operation, SRST and SRHT are approximated by two normal variables whose means and variances are computed from $\mu_{p1}$, $\mu_{p2}$, $\mu_c$, $\sigma_{p1}$, $\sigma_{p2}$, and $\sigma_c$ [2]. Equation (8) shows an archetypal distribution for SRST and SRHT:

$$SRST \sim N(\mu_s, \sigma_s), \quad SRHT \sim N(\mu_h, \sigma_h) \tag{8}$$

Based on (8), the setup/hold time plane is partitioned into $N_{rsh}$ regions, each with a probability $P_{rsh}$. The partitioning procedure is similar to the one described above for the CSHT contours. Figure 5 shows a typical partitioning of the setup/hold time plane based on the required setup and hold times.
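To make the normal approximation in (8) concrete, the sketch below computes the means and standard deviations of SRST and SRHT from (5) and (6). It rests on several assumptions that are not taken from the text: a single launch path and a single capture clock path (so the min and max in (5) and (6) act on one variable each), a deterministic $t_{c2q}$, and hypothetical delay values. For the case of several competing paths, the helper `clark_max` applies Clark's moment-matching approximation for the maximum of two independent normals; this is one commonly used min-max operator, named here as a stand-in, and not necessarily the operator of [2].

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its pdf and cdf


def clark_max(m1, s1, m2, s2):
    """Mean and sigma of a normal fitted to max(X1, X2), with X1 ~ N(m1, s1)
    and X2 ~ N(m2, s2) independent (Clark's moment matching)."""
    a = math.hypot(s1, s2)
    if a == 0.0:
        return max(m1, m2), 0.0
    x = (m1 - m2) / a
    mean = m1 * _STD.cdf(x) + m2 * _STD.cdf(-x) + a * _STD.pdf(x)
    second = ((m1 * m1 + s1 * s1) * _STD.cdf(x)
              + (m2 * m2 + s2 * s2) * _STD.cdf(-x)
              + (m1 + m2) * a * _STD.pdf(x))
    return mean, math.sqrt(max(second - mean * mean, 0.0))


def clark_min(m1, s1, m2, s2):
    """min(X1, X2) = -max(-X1, -X2)."""
    mean, sigma = clark_max(-m1, s1, -m2, s2)
    return -mean, sigma


def srst_srht(T, dp1, dp2, dc, t_c2q):
    """Single-path case of (5) and (6). dp1, dp2, dc are (mean, sigma)
    tuples of independent normals; t_c2q is treated as a constant."""
    sigma = math.sqrt(dp1[1] ** 2 + dp2[1] ** 2 + dc[1] ** 2)
    launch_mean = dp1[0] + dc[0] + t_c2q
    srst = (T + dp2[0] - launch_mean, sigma)
    srht = (launch_mean - dp2[0], sigma)
    return srst, srht


if __name__ == "__main__":
    # Hypothetical delays in ps.
    print(srst_srht(1000.0, (50.0, 5.0), (55.0, 5.0), (800.0, 30.0), 100.0))
    # Combining two hypothetical launch paths before applying (5).
    print(clark_max(950.0, 30.0, 940.0, 40.0))
```

In the single-path case the variances of the independent delays simply add, which is why both SRST and SRHT share the same standard deviation in this sketch.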



Figure 5. Partitioning the setup/hold time plane into five parts, with a fixed total probability based on SRST and SRHT.

Definition 7: Statistical pass value (SPV) is defined as the probability that some specific flip-flop in the design satisfies the required setup and hold time constraint.

SPV is computed from the joint probability distribution of the parts obtained from statistical CSHT partitioning (SCP) and those obtained from statistical required setup/hold time partitioning (SRP). Details are provided next.

Let the borderlines between adjacent parts in the SRP and SCP solutions be numbered from 1 to $N_{rsh}$ and from 1 to $N_{sh}$, respectively (the last borderline of each partitioning solution is at infinity). The numbering increases from the origin of the setup-hold plane toward infinity. We denote each borderline of SCP by $Bc_u$ and each borderline of SRP by $Br_v$. We construct an array $A_{cr}$ of size $N_{sh}$ by $N_{rsh}$. An element $x_{u,v}$ in the $u^{th}$ row and the $v^{th}$ column of this array is one if $Bc_u$ and $Br_v$ intersect; otherwise it is zero. The last column of this 0-1 matrix is all 1's, while the last row is all 0's (except for its very last entry, which is 1). Since the random variables in SCP and SRP are independent, the value of SPV may be computed as follows:

$$SPV = \text{Prob}(\text{Pass Timing Check}) = P_{sh} P_{rsh} \sum_{u=1}^{N_{sh}} \sum_{v=1}^{N_{rsh}} x_{u,v} \tag{9}$$

As an example consider the setup-hold time plane displayed in Figure 6. In this example  $N_{sh}$  and  $N_{rsh}$  are 7 and 5, respectively. The parts in SCP are separated by black lines while the parts in SRP are separated by red lines. We have:

$$A_{cr} = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

The blue rectangles in Figure 6 correspond to the 'one' values of the array. As an example, the second column of the array states that the second borderline in the SRP solution has intersections with the first and second borderlines of the SCP solution, and thus, the SPV value is increased by  $2P_{sh}P_{rsh}$ .
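Applying (9) to the example array is a matter of counting its ones. The short sketch below does this with exact fractions; the names `A_CR` and `spv` are chosen for the illustration only.

```python
from fractions import Fraction

# Intersection array A_cr of the example (N_sh = 7 rows, N_rsh = 5 columns).
A_CR = [
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]


def spv(a_cr):
    """Evaluate (9) with P_sh = 1/N_sh and P_rsh = 1/N_rsh."""
    n_sh, n_rsh = len(a_cr), len(a_cr[0])
    ones = sum(sum(row) for row in a_cr)
    return Fraction(ones, n_sh * n_rsh)


if __name__ == "__main__":
    print(spv(A_CR))  # 20 ones out of 35 entries
```

The array contains 20 ones, so with $P_{sh} = 1/7$ and $P_{rsh} = 1/5$ the example yields $SPV = 20/35 = 4/7$.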



Figure 6. Example to illustrate *SPV* calculation (borderlines at infinity are not shown).

In this way, an *SPV* is computed in SSTA for each flip-flop in the design. More precisely, SSTA reports the distribution of the probability of meeting the setup/hold time constraints over all flip-flops in the design as a "Pass probability" histogram, instead of a percentage of the flip-flops that meet the timing constraints. The designer uses this histogram to analyze the circuit. If the histogram is heavier on the left side (the expected Pass probability per flip-flop is lower than ½), the designer may decide to lower the clock frequency, resize flip-flops, or resize combinational logic gates. In contrast, the frequency may be increased if the histogram is heavier on the right side (the expected Pass probability is higher than ½).
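A "Pass probability" histogram of this kind can be assembled in a few lines. In the sketch below the per-flip-flop SPV values are hypothetical, the bin count of 10 is an arbitrary choice, and the left-heavy versus right-heavy decision is reduced to comparing the mean SPV with ½.

```python
def pass_histogram(spvs, bins=10):
    """Count flip-flops per pass-probability bin over [0, 1]."""
    counts = [0] * bins
    for p in spvs:
        idx = min(int(p * bins), bins - 1)  # p = 1.0 falls in the last bin
        counts[idx] += 1
    return counts


def expected_pass_probability(spvs):
    """Mean SPV over all flip-flops in the design."""
    return sum(spvs) / len(spvs)


# Hypothetical SPV values for six flip-flops.
SPVS = [0.95, 0.88, 0.62, 0.97, 0.31, 0.74]

if __name__ == "__main__":
    print(pass_histogram(SPVS))
    mean = expected_pass_probability(SPVS)
    print("right-heavy" if mean > 0.5 else "left-heavy", round(mean, 3))
```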

## 4. SIMULATION RESULTS

All experiments were done on a Linux server with a 1.5 GHz CPU and 14 GB of memory. Process and electrical parameters of a typical 130 nm CMOS technology were used. As stated in Section 3.1, the following transistor process parameters are considered as random variables [8]-[9]: transistor length $L_{eff}$ and transistor threshold voltage $V_{th}$. The $3\sigma$ variations for $L_{eff}$ and $V_{th}$ are 45 nm and 50 mV, respectively. We first show that the proposed BEBS algorithm works correctly and efficiently to extract CSHT characteristics. Next, the SPV for a TSPC flip-flop is calculated by using the proposed method and compared with Monte-Carlo simulation.

### 4.1 BEBS Validation

We first characterized the CSHT of a true single phase clocking (TSPC) flip-flop by producing the clock-to-q delay surface. Figure 7 depicts the surface and the constant clock-to-q plane used for characterization. Next, the BEBS algorithm was applied to the TSPC flip-flop to characterize the CSHT. Figure 8 compares the setup/hold time contour obtained by BEBS with that produced by the conventional method. As seen, they match each other very closely. The speedup of BEBS over the conventional method is between 10x and 20x.

### 4.2 Statistical CSHT Characterization

In order to calculate the SPV value for a flip-flop in a circuit, critical points and sensitivity values are calculated. Figure 9 and Figure 10 show the critical points for normally distributed random variables $L_{eff}$ and $V_{th}$ for a TSPC flip-flop. Table 1 reports the sensitivity values of the TSPC flip-flop. The standard deviations of $r_{\vec{d}_1}$, $r_{\vec{d}_2}$ and $r_{\vec{d}_3}$ are calculated and reported in Table 2. The values of $N_{rsh}$ and $N_{sh}$ are set to 3. For *SRST* and *SRHT* with distributions N(55 ps, 27.5 ps) and N(60 ps, 25 ps), respectively, the proposed technique yields an SPV of 2/3. The Monte-Carlo simulation estimated the SPV to be very close to 2/3.
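As a numerical cross-check, the variance expression that follows from (4) can be evaluated with the sensitivities of Table 1. The sketch assumes that the quoted $3\sigma$ values (45 nm and 50 mV) mean three standard deviations, so that $\sigma_{Leff} = 15$ nm and $\sigma_{Vth} \approx 16.7$ mV, and that $L_{eff}$ and $V_{th}$ are independent. Under these assumptions the computed standard deviations agree with the entries of Table 2 to within about 0.1 ps.

```python
import math

# Assumed interpretation: the quoted 3-sigma values are three standard deviations.
SIGMA_LEFF_NM = 45.0 / 3.0
SIGMA_VTH_MV = 50.0 / 3.0

# Table 1 sensitivities per direction j: (S_{j,Vth} in ps/mV, S_{j,Leff} in ps/nm).
SENSITIVITIES = {
    1: (-0.1122, 0.1770),
    2: (-0.1171, 0.2227),
    3: (-0.0495, 0.1493),
}


def clrv_sigmas():
    """sigma_dj = sqrt(sum_i s_ji^2 * sigma_pi^2) for j = 1, 2, 3 (in ps)."""
    result = {}
    for j, (s_vth, s_leff) in SENSITIVITIES.items():
        result[j] = math.sqrt((s_vth * SIGMA_VTH_MV) ** 2
                              + (s_leff * SIGMA_LEFF_NM) ** 2)
    return result


if __name__ == "__main__":
    for j, sigma in clrv_sigmas().items():
        print(f"direction d_{j}: sigma = {sigma:.2f} ps")
```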



Figure 7. CSHT characterization done by generating the clock-to-q delay surface.

## 5. CONCLUSION

This paper proposes a methodology to exploit the statistical codependence of the setup and hold times. The approach comprises two phases. In the first phase, the pmf of the codependent setup and hold time (CSHT) contours is determined by considering the probability density functions (pdfs) of the sources of variability. A numerical backward Euler based search is proposed to characterize CSHT efficiently and accurately. The validity of the numerical algorithms for extracting the contours, critical points, and sensitivity values is verified by applying them to the TSPC flip-flop to generate the piecewise linear contours for CSHT. In the second phase, the piecewise linear curves are used to estimate the timing pass rates in terms of probability values. The characterized flip-flops are instantiated in an example design, on which timing verification is performed. The accuracy of the algorithm is compared with Monte-Carlo simulation.



Figure 8. Comparison of setup/hold time contour characterization for BEBS (solid line) and the conventional (dashed line) methods.



Figure 9. Impact of $L_{eff}$ variation on CSHT, $\alpha_1 = -7$, $\alpha_2 = -1$ and $\alpha_3 = 0$.



Figure 10. Impact of  $V_{th}$  variation on CSHT,  $\alpha_1 = -10$ ,  $\alpha_2 = -1$  and  $\alpha_3 = 0$ .

Table 1. Sensitivity values for a TSPC flip-flop (index j in $S_{j,\cdot}$ refers to direction $\vec{d}_j$)

| Parameter   | Sensitivity (ps/mV) | Parameter    | Sensitivity (ps/nm) |
|-------------|---------------------|--------------|---------------------|
| $S_{1,Vth}$ | -0.1122             | $S_{1,Leff}$ | 0.1770              |
| $S_{2,Vth}$ | -0.1171             | $S_{2,Leff}$ | 0.2227              |
| $S_{3,Vth}$ | -0.0495             | $S_{3,Leff}$ | 0.1493              |

Table 2. Standard deviations of the CLRVs

| Parameter         | Standard deviation (ps) |
|-------------------|-------------------------|
| $\sigma_{1,Leff}$ | 3.3                     |
| $\sigma_{2,Leff}$ | 3.9                     |
| $\sigma_{3,Leff}$ | 2.4                     |

## REFERENCES

- [1] V. Vishvanathan, C. P. Ravikumar, and V. Menezes, "Design Technology Challenges in the Sub-100 Nanometer Era," *VLSI Vision* (periodical of the VLSI Society of India), vol. 1, no. 1, 2005.
- [2] J. Singh and S. Sapatnekar, "Statistical timing analysis with correlated non-Gaussian parameters using independent component analysis," *Proc. DAC*, 2006.
- [3] H. Chang and S. Sapatnekar, "Statistical timing analysis under spatial correlations," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 24, no. 9, September 2005.
- [4] E. Salman, A. Dasdan, F. Taraporevala, K. Kucukcakar, and E. G. Friedman, "Exploiting setup-hold-time interdependence in static timing analysis," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 26, no. 6, June 2007.
- [5] S. Srivastava and J. Roychowdhury, "Interdependent latch setup/hold time characterization via Euler-Newton curve tracing on state-transition equations," *Proc. DAC*, 2007.
- [6] A. Nardi, E. Tuncer, S. Naidu, A. Antonau, S. Gradinaru, T. Lin, and J. Song, "Use of statistical timing analysis on real designs," *Proc. DATE*, 2007.
- [7] C. S. Amin, N. Menezes, K. Killpack, F. Dartu, U. Choudhury, N. Hakim, and Y. I. Ismail, "Statistical static timing analysis: how simple can we get?," *Proc. DAC*, 2005.
- [8] International Technology Roadmap for Semiconductors, Semiconductor Industry Association, 2005, http://public.itrs/net/.
- [9] S. Nassif, "Design for variability in DSM technologies," *Proc. ISQED*, 2000.