# On the Efficacy of Input Vector Control to Mitigate NBTI Effects and Leakage Power

Yu Wang<sup>1</sup>, Xiaoming Chen<sup>1</sup>, Wenping Wang<sup>2</sup>, Varsha Balakrishnan<sup>2</sup>, Yu Cao<sup>2</sup>,

Yuan Xie<sup>3</sup>, Huazhong Yang<sup>1</sup>

<sup>1</sup>Dept. of E.E., TNList, Tsinghua Univ., Beijing, China

<sup>2</sup>Dept. of E.E., Arizona State Univ., USA, <sup>3</sup>Dept. of CSE, Pennsylvania State Univ., USA

<sup>1</sup> Email: yu-wang@mail.tsinghua.edu.cn

Abstract<sup>1</sup>: As technology scales, the aging effect caused by Negative Bias Temperature Instability (NBTI) has become a major reliability concern for circuit designers. Consequently, considerable research effort has gone into NBTI analysis and mitigation techniques. At the same time, reducing leakage power remains one of the major design goals. Both NBTI-induced circuit degradation and standby leakage power depend strongly on the input patterns of circuits. In this paper, we propose a co-simulation flow to study NBTI-induced circuit degradation and leakage power, taking into account the different circuit behaviors during active and standby time. Based on this flow, we evaluate the efficacy of the Input Vector Control (IVC) technique in mitigating circuit aging and reducing standby leakage power, with experiments on benchmark circuits implemented in 90nm, 65nm, and 45nm technology nodes. The IVC technique proves effective in mitigating NBTI-induced circuit degradation, saving up to 56% of the circuit performance degradation at the 65nm technology node and on average 30% across the technology nodes. Meanwhile, the IVC technique can save up to 18% of the worst-case leakage power. Since leakage power and NBTI-induced circuit degradation have different dependencies on the input patterns, we propose to derive Pareto sets for designers to explore trade-offs between lifetime reliability and leakage power.

**Keywords**— Negative Bias Temperature Instability (NBTI), input vector control (IVC), leakage power reduction

## I. INTRODUCTION

As technology scales, Negative Bias Temperature Instability (NBTI) is emerging as one of the major reliability degradation mechanisms [1]. NBTI occurs when PMOS transistors are negatively biased (i.e.,  $V_{gs} = -V_{dd}$ ) at elevated temperature, causing a shift in threshold voltages. Over a long period of time, such  $V_{th}$  shifts can potentially cause a significant increase in the delay of PMOS devices [2], and result in about 10-20% degradation in circuit speed, potentially causing a functional failure [3]. The impact of NBTI on circuit performance has become a key issue with technology scaling [4]. Consequently, it is important to model, analyze, and mitigate the impact of the NBTI effect on the circuit performance.

Early research on NBTI mainly focused on the analysis of the threshold voltage degradation and its impact on the drive current of semiconductor devices [5]. Recently, many researchers have studied NBTI modeling and mitigation techniques at various design abstraction levels. Analytical compact models [6-8] that evaluate NBTI effects using power-law timing degradation were proposed to help designers estimate the performance degradation. Based on these transistor compact models, circuit-level NBTI degradation analysis models were proposed [9,10]. Static timing analysis (STA) techniques considering NBTI degradation were also proposed [11, 12]. Based on these NBTI-aware circuit performance degradation models and STA techniques, researchers have investigated design techniques that can mitigate NBTI effects, such as transistor sizing [13], adjustment of dynamic operating conditions (supply voltage  $(V_{dd})$ , temperature (T), and signal probability (SP)) [11], and bit flipping [14]. Higher-level techniques, such as NBTI-aware synthesis [15], were also studied.

The majority of prior works estimated the NBTI-induced lifetime degradation under the assumption that the circuits operate all the time at the worst-case on-chip temperature. In practice, however, not every application requires the underlying hardware to operate at the highest performance level all the time. Modules whose computation is bursty are often idle, and during these periods the PMOS transistors are under static stress. Many PMOS transistors in both combinational and storage blocks are affected by NBTI when their gate inputs are held at "0" during the standby time, which leads to a larger degradation. Consequently, it is important to accurately estimate the NBTI-induced degradation during the standby time in order to safely guard-band the circuit performance, and to find design techniques that mitigate such degradation.

Input Vector Control (IVC) is a well-studied technique for leakage power reduction [16, 17]. Since NBTI also depends on the input patterns of PMOS devices, IVC can be used to mitigate the NBTI effect during the standby mode. Wang et al. [12] proposed a method to select the best input vectors from the minimum leakage vector set.

978-1-4244-2953-0/09/\$25.00 ©2009 IEEE

10th Int'l Symposium on Quality Electronic Design

<sup>1</sup>This work was supported by National Natural Science Foundation of China (No. 60870001, No. 90207002) and TNList Cross-discipline Foundation. Yu Cao's work was partially supported by GSRA/SRC. Yuan Xie's work was supported in part by grants from NSF 0643902, 0702617, and a grant from SRC.


However, due to the different dependencies of leakage and NBTI on the input patterns, the best input vectors for minimum leakage power may not be the best input vectors to minimize NBTI-induced circuit degradation. Since the work in [12] did not consider the difference in NBTI effects between active and standby time, it reported only a 3% saving in circuit degradation at the 90nm technology node. Jaume et al. [18] used different input vectors to change the zero-probability of internal PMOS transistors, so that the PMOS transistors' degradation is evenly distributed. The effect of this technique was evaluated on an adder; detailed research for random logic is still needed.

In this paper, we first propose a co-simulation flow to estimate NBTI-induced aging effect and leakage power, and then evaluate the potential and efficacy of the IVC techniques. The contributions of this paper can be summarized in the following aspects:

- We propose a co-simulation flow for NBTI-induced aging effect and leakage power. The co-simulation flow integrates transistor level NBTI modeling and leakage modeling, path-based NBTI-aware timing analysis, and gate-level leakage analysis. The simulation flow includes two important factors that affect the accurate estimation of performance degradation: the Ratio of Active time to Standby time (RAS) and the standby time temperature.
- We evaluate IVC techniques for NBTI mitigation. The experimental results show that the IVC technique is effective during the standby time for mitigating NBTI-induced circuit degradation: it saves around 30% of the circuit performance degradation on average from the 90nm to the 45nm technology node, and for some circuits it mitigates up to 56% of the circuit performance degradation at the 65nm node. In addition, we argue that it is possible to perform NBTI optimization and leakage power optimization simultaneously. Since leakage power and NBTI-induced circuit degradation have different dependencies on the input patterns, Pareto sets [19] are derived for designers to explore trade-offs between the circuit lifetime and leakage power consumption.
- IVC techniques are compared with other design techniques including tuning  $V_{dd}$ , tuning  $V_{th}$ , and power gating. The results show that IVC technique has comparable potential on NBTI mitigation, while the design overhead of IVC technique is small.

The rest of the paper is organized as follows. Section II describes our NBTI-induced circuit degradation and leakage power co-simulation flow. Section III presents NBTI and leakage models, and analyzes the different dependency of NBTI and leakage on input patterns. Section IV proposes the IVC technique for NBTI and leakage mitigation. Section V evaluates the experimental results with ISCAS85 and ALU circuits from 90nm to 45nm node; comparison with other techniques is also discussed in Section V. Section VI concludes the paper.



Fig. 1. The proposed NBTI and leakage co-simulation flow.

## II. OVERVIEW OF THE PROPOSED NBTI/LEAKAGE CO-SIMULATION FLOW

Fig. 1 shows the proposed NBTI/leakage co-simulation flow, which is based on the STA framework in [20]. For a given circuit, a commercial static timing analysis tool is first used to generate the Potential Critical Paths (PCPs) using standard timing libraries. When the circuit is in active mode, statistical information on the input Signal Probability (SP) is used to generate the internal node SP. When the circuit is in standby mode, a logic simulator is used to generate the voltage level of each internal node. The active time internal node SP and the standby time internal node states are used to estimate the NBTI-induced  $V_{th}$  degradation through transistor-level NBTI modeling. The leakage power is estimated from input-vector-aware leakage lookup tables. The detailed NBTI model and leakage model are described in the following section. Based on the  $V_{th}$ degradation estimate and the original timing libraries, a fast path-based NBTI-aware timing analysis is performed to evaluate the PCPs and to report the paths that might violate a timing constraint due to NBTI (e.g., a requirement that the degradation stay below 10% of the maximal delay during the circuit's lifetime). Our flow helps evaluate NBTI and leakage mitigation techniques, such as input vector control, tuning  $V_{dd}$ , tuning  $V_{th}$ , and power gating.

## III. NBTI AND LEAKAGE MODELING

## A. Standby time aware NBTI modeling

Depending on the bias condition of the PMOS transistor, NBTI has two phases: a stress phase and a recovery phase. In the stress phase ( $V_g = 0$ ), the holes in the channel weaken the Si-H bonds, which results in the generation of positive interface charges and hydrogen species; correspondingly, the threshold voltage  $(V_{th})$  of the PMOS increases. During the recovery phase  $(V_g = V_{DD})$ , the interface traps can be annealed by the hydrogen species and thus the  $V_{th}$  degradation  $(\Delta V_{th})$  is partially recovered. If a PMOS device is always under the stress condition, this is referred to as *static* NBTI. Otherwise, when both stress and recovery occur during active circuit operation, it is described as *dynamic* NBTI.

Based on the reaction-diffusion mechanism, a real-time NBTI model was developed in [21, 22]; it is summarized in Table I.

#### TABLE I

Summary of the predictive model

| Condition         | $\Delta V_{th}$                                                                                                      |
|-------------------|----------------------------------------------------------------------------------------------------------------------|
| Static            | $A\Big((1+\delta)t_{ox} + \sqrt{C(t-t_0)}\Big)^{2n}$                                                                 |
| Dynamic, stress   | $\left(K_v(t-t_0)^{0.5} + \sqrt[2n]{\Delta V_{th}(t_0)}\right)^{2n}$                                                 |
| Dynamic, recovery | $\Delta V_{th}(t_1) \bigg( 1 - \frac{2\xi_1 t_e + \sqrt{\xi_2 C(t-t_1)}}{(1+\delta)t_{ox} + \sqrt{Ct}} \bigg)$      |



Fig. 2. Threshold voltage degradation model verification for both static and dynamic NBTI.

The proposed model is verified against 65nm and 90nm silicon data, as shown in Fig. 2. The right-hand plot shows that, for dynamic NBTI, there is a sudden change at the beginning of the recovery phase, which has a significant impact on the estimation of NBTI degradation. This sudden drop can be explained by fast diffusion in the gate dielectric or by trapping/detrapping. Using the static NBTI model, which ignores the recovery phase, to predict the  $V_{th}$ degradation of a gate operating under dynamic conditions leads to a dramatic overestimation of the  $V_{th}$  degradation. Therefore, the exact amount of degradation depends on how long the circuit stays in stress or in recovery.

Fig. 3 shows the  $\Delta V_{th}$  prediction obtained with the proposed model. The large difference between static and dynamic NBTI has also been observed in silicon data [23,24]. Therefore, a simple static analysis may give an extremely pessimistic estimate of the NBTI-induced degradation and consequently result in over-margining at the design stage. Conversely, using only the dynamic NBTI model for the total lifetime, without considering the static NBTI effect during the standby time, may underestimate the NBTI-induced performance degradation. In this paper, we use the dynamic NBTI model in the active time and the static NBTI model in the standby time.

Fig. 3. Static and dynamic NBTI degradation for different input signal probabilities.

The delay difference due to  $\Delta V_{th}$  is given by [12, 13]:

$$\Delta d(v) = \alpha \Delta V_{th} / (V_g - V_{th}) \times d(v) \tag{1}$$

where $d(v)$ is the original delay of gate $v$, which can be extracted from commercial STA tools. A gate may contain several PMOS transistors with different  $\Delta V_{th}$  values. In such cases, we select the largest one to calculate the gate delay degradation, which gives the worst-case delay degradation.
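As an illustration of Eq. (1), the following minimal sketch computes the delay shift of a single gate from the largest threshold shift among its PMOS transistors. The function name, the value of $\alpha$, and the numbers in the example are assumptions chosen for illustration; they are not taken from the paper's libraries.

```python
def gate_delay_shift(d_orig, dvth_list, alpha, v_g, v_th):
    """Delay shift of one gate per Eq. (1), using the worst-case PMOS shift.

    d_orig    : original gate delay d(v), e.g. from an STA report (seconds)
    dvth_list : threshold shifts of the PMOS transistors in the gate (volts)
    alpha     : fitting coefficient of Eq. (1) (value is an assumption here)
    v_g, v_th : gate voltage and nominal threshold voltage (volts)
    """
    if not dvth_list:
        return 0.0  # no PMOS transistor, no NBTI-induced shift
    worst_dvth = max(dvth_list)  # largest shift gives the worst-case delay
    return alpha * worst_dvth / (v_g - v_th) * d_orig


# Hypothetical gate: 20 ps nominal delay, three PMOS devices with different shifts.
shift = gate_delay_shift(
    d_orig=20e-12, dvth_list=[0.010, 0.025, 0.018],
    alpha=1.3, v_g=1.0, v_th=0.20,
)
print(f"delay shift = {shift * 1e12:.4f} ps")
```

Only the 25 mV shift enters the result; the smaller shifts of the other two transistors are ignored, which matches the worst-case choice described above.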

## B. Leakage power modeling

A leakage lookup table is created by simulating all the gates in the standard cell library under all possible input patterns. Thus the leakage power  $P_{leak}$  can be expressed as:

$$P_{leak}(v) = V_{dd} \times \sum_{input} I_l(v, input) \times Prob(v, input) \tag{2}$$

where  $I_l(v, input)$  and $Prob(v, input)$ are the leakage current (including subthreshold and gate leakage current) and the probability of gate $v$ under input pattern *input*. Over the circuit lifetime, the circuit leakage power decreases because of the NBTI-induced  $V_{th}$  shifts. We therefore take the leakage power at the start of the circuit's life, which is its maximum value, as the design objective to be optimized.
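A lookup-table evaluation of Eq. (2) might look like the sketch below. The leakage currents are hypothetical placeholder values rather than entries from the libraries used in this work, and the function name is ours.

```python
def gate_leakage_power(vdd, leak_current, prob):
    """Eq. (2): Vdd times the probability-weighted sum of leakage currents.

    leak_current : dict mapping input pattern -> leakage current (amperes)
    prob         : dict mapping input pattern -> probability of that pattern
    """
    missing = set(prob) - set(leak_current)
    if missing:
        raise KeyError(f"no lookup-table entry for patterns: {sorted(missing)}")
    return vdd * sum(leak_current[p] * prob[p] for p in prob)


# Hypothetical lookup-table entry for a 2-input gate (currents in amperes).
nor2_current = {"00": 600e-12, "01": 280e-12, "10": 230e-12, "11": 50e-12}

# Active mode: all patterns equally likely. Standby mode: one fixed pattern.
uniform = {pattern: 0.25 for pattern in nor2_current}
standby = {"11": 1.0}

p_active = gate_leakage_power(1.0, nor2_current, uniform)
p_standby = gate_leakage_power(1.0, nor2_current, standby)
print(p_active, p_standby)
```

In standby mode the probability collapses onto a single pattern, so the gate's leakage is simply the table entry selected by the standby input vector.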

## C. Different dependency of NBTI and leakage on input vectors

Both the NBTI and leakage mechanisms depend on technology and design parameters related to gate drive. In this subsection, we focus on the internal node dependency analysis.

The NBTI effect on a PMOS transistor depends on  $V_{gs}$ and the stress time (duty cycle), which are both related to the input state of a gate. Consequently, the all-1 input is the best input pattern, with the smallest NBTI-induced degradation, for all gate types. On the other hand, both subthreshold leakage and gate leakage depend on the input state of a gate because of the stacking effect [25]. Table II lists the overall leakage power of NOR2, NAND3, and INV gates under different input combinations at the 65nm technology node, at a temperature of 378K. The leakage power varies considerably between input vectors. The best input vectors for leakage power for NOR2, NAND3, and INV are "11", "000", and "0", respectively. We also simulated all the cells (NAND/AND, NOR/OR, INV, BUF) in the library and found that the best-case input pattern for leakage is all 0's for NAND/AND/INV gates and all 1's for NOR/OR/BUF gates.

#### TABLE II

Leakage power comparison under different input vectors (65nm): a) NOR2, b) NAND3, c) INV. The temperature of leakage power estimation is set to 378K,  $V_{dd} = 1V$ .

| Gate     | Input | Leakage (pW) |
|----------|-------|--------------|
| a) NOR2  | 00    | 617.0        |
|          | 01    | 283.2        |
|          | 10    | 230.1        |
|          | 11    | 45.8         |
| b) NAND3 | 000   | 30.1         |
|          | 001   | 54.9         |
|          | 010   | 54.7         |
|          | 011   | 249.1        |
|          | 100   | 55.1         |
|          | 101   | 259.2        |
|          | 110   | 309.8        |
|          | 111   | 703.3        |
| c) INV   | 0     | 633.2        |
|          | 1     | 791.3        |

We can see the discrepancy: for NAND/AND/INV gates, the input pattern for the least leakage leads to the worst NBTI-induced delay degradation, whereas for NOR/OR gates, the input pattern for the least leakage also leads to the best-case NBTI-induced delay degradation. Consequently, when the IVC technique, which takes effect by controlling the internal node voltages, is used to optimize the leakage, the NBTI effect may worsen; conversely, if IVC is used to optimize the NBTI effect, the leakage power may worsen. The input vector for the standby time should be carefully chosen to meet both the leakage power and lifetime requirements.
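The gate-level conflict can be checked directly against the Table II data. The sketch below is illustrative only: it takes the leakage values from Table II and assumes, following the statement earlier in this subsection, that the all-1 input is the NBTI-best pattern for every gate type. The helper names are ours.

```python
# Leakage power in pW at 65nm, 378K, Vdd = 1V (values from Table II).
LEAKAGE_PW = {
    "NOR2": {"00": 617.0, "01": 283.2, "10": 230.1, "11": 45.8},
    "NAND3": {
        "000": 30.1, "001": 54.9, "010": 54.7, "011": 249.1,
        "100": 55.1, "101": 259.2, "110": 309.8, "111": 703.3,
    },
    "INV": {"0": 633.2, "1": 791.3},
}


def min_leakage_input(table):
    """Input pattern with the smallest leakage power in the lookup table."""
    return min(table, key=table.get)


def min_nbti_input(table):
    """All-1 pattern, assumed NBTI-best because no PMOS gate is held at 0."""
    width = len(next(iter(table)))
    return "1" * width


# True where the leakage-best and NBTI-best patterns disagree.
conflicts = {
    gate: min_leakage_input(table) != min_nbti_input(table)
    for gate, table in LEAKAGE_PW.items()
}

for gate, table in LEAKAGE_PW.items():
    print(gate, min_leakage_input(table), min_nbti_input(table), conflicts[gate])
```

For these three gates, only NOR2 has a single input pattern that is best for both objectives; NAND3 and INV require a trade-off.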

## IV. IVC TECHNIQUES FOR NBTI AND LEAKAGE MITIGATION

In this section, we first define the theoretic bounds of IVC techniques for NBTI mitigation and then describe the IVC techniques we evaluated for mitigating NBTI-induced circuit degradation and leakage power.

## A. Theoretic bounds for NBTI-induced circuit degradation

We define the *theoretic upper bound*  $D_{UB}$  for NBTI-induced circuit degradation as the maximum circuit degradation, reached when all the internal nodes are "0" during the standby time, and the *theoretic lower bound*  $D_{LB}$  as the minimum circuit degradation, reached when all the internal nodes are "1" during the standby time.

In a realistic design, no input vector makes the internal nodes all 1's or all 0's, so  $D_{UB}$  and  $D_{LB}$  only bound the NBTI-induced degradation from above and below. We will compare the maximum and minimum circuit degradation induced by different input vectors with these theoretic bounds.

## B. IVC technique for NBTI and leakage mitigation

Different input vectors result in different internal node voltages, and hence in different NBTI-induced circuit degradation. Similar to the definition of Minimum Leakage Vector (MLV), we define the input vectors with the smallest NBTI-induced circuit degradation as Minimum Degradation Vectors (MDVs). Finding an MDV is as hard as finding an MLV, which is an NP-complete problem [17].

We use two input vector selection methods: 1) exhausted search and 2) a probability-based algorithm [12]. Fig. 4 shows the input vector selection flow used in our research.



Fig. 4. The input vector selection flow. The input vector generator can generate new input vectors based on the previous results.

#### B.1 Exhausted search

For exhausted search, we run the input vector selection flow only once. After parsing the benchmark circuits, we generate random input vectors and use the NBTI/leakage co-simulation flow proposed in Section II to evaluate them. We select the input vector with the least NBTI-induced circuit degradation as the MDV and the input vector with the least leakage power as the MLV. We can also obtain a Pareto set for the least degradation and leakage power.
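The single-pass selection can be sketched as follows. This is a minimal illustration, not the C++/Perl implementation used in this work: `evaluate` is a hypothetical stand-in for the co-simulation flow and must return a `(degradation, leakage)` pair for an input vector, and the toy objective at the end exists only to make the example runnable.

```python
import random


def pareto_front(points):
    """Points not dominated in (degradation, leakage); both are minimized."""
    front = []
    for i, (d_i, p_i) in enumerate(points):
        dominated = any(
            d_j <= d_i and p_j <= p_i and (d_j < d_i or p_j < p_i)
            for j, (d_j, p_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((d_i, p_i))
    return sorted(set(front))


def random_search(evaluate, n_inputs, n_vectors, seed=0):
    """Evaluate random input vectors once and pick the MDV, MLV and Pareto set."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_vectors):
        vec = tuple(rng.randint(0, 1) for _ in range(n_inputs))
        results.append((vec, evaluate(vec)))
    mdv = min(results, key=lambda r: r[1][0])  # least degradation
    mlv = min(results, key=lambda r: r[1][1])  # least leakage
    front = pareto_front([objectives for _, objectives in results])
    return mdv, mlv, front


def toy_evaluate(vec):
    """Hypothetical objective: zeros raise degradation, ones raise leakage."""
    return (vec.count(0), vec.count(1))


mdv, mlv, front = random_search(toy_evaluate, n_inputs=6, n_vectors=100)
print(mdv, mlv, front)
```

The quadratic dominance check is adequate for a few thousand sampled vectors; a sort-based sweep would be preferable for much larger samples.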

#### B.2 Probability based algorithm

For the probability based algorithm, we first generate random input vectors and select the best input vector set for the design objective. The input vector generator then generates a new input vector set according to the 0/1 probability of each input node obtained from the previous best input vector set. The iteration continues until the results for the design objective converge. The design objectives include: 1) NBTI-induced circuit degradation only, 2) leakage power only, and 3) NBTI and leakage co-optimization.
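One possible reading of this iteration is sketched below. It is not the algorithm of [12] itself: the population size, the size of the best set, and the convergence test (stop when the best cost no longer improves) are assumptions, and `cost` is a hypothetical scalar objective standing in for degradation, leakage, or a combination of the two.

```python
import random


def probability_based_search(cost, n_inputs, pop_size=50, keep=10,
                             max_iters=30, seed=0):
    """Iteratively resample input vectors from per-input 1-probabilities."""
    rng = random.Random(seed)
    probs = [0.5] * n_inputs  # P(input i = 1); start unbiased
    best_vec, best_cost = None, float("inf")
    for _ in range(max_iters):
        # Generate a new input vector set from the current probabilities.
        population = [
            tuple(1 if rng.random() < p else 0 for p in probs)
            for _ in range(pop_size)
        ]
        # Keep the best vectors and re-estimate each input's 1-probability.
        elite = sorted(population, key=cost)[:keep]
        probs = [sum(vec[i] for vec in elite) / len(elite)
                 for i in range(n_inputs)]
        elite_cost = cost(elite[0])
        if elite_cost >= best_cost:
            break  # no further improvement: treat as converged
        best_vec, best_cost = elite[0], elite_cost
    return best_vec, best_cost


def toy_cost(vec):
    """Hypothetical objective: every 0 input adds one unit of cost."""
    return vec.count(0)


vec, c = probability_based_search(toy_cost, n_inputs=8)
print(vec, c)
```

For co-optimization, `cost` could combine normalized degradation and leakage. How the two are weighted is a design choice that the text above does not fix.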

## V. IMPLEMENTATION AND SIMULATION RESULTS

## A. Implementation and Experiment Setup

We implement the proposed NBTI/leakage co-simulation flow and the input vector selection flow in C++ and Perl. We use a commercial static timing analysis tool, PrimeTime from Synopsys, to perform the timing analysis and generate the timing report, as well as the internal node signal probabilities. Benchmark circuits are synthesized using two industrial libraries (90nm and 65nm) and an open cell library (45nm) [26] that is based on the PTM 45nm transistor model [27]. Table IV lists these benchmark circuits, which include the ISCAS85 benchmarks and some arithmetic component circuits. The circuits "array4x4" and "array8x8" are 4x4 and 8x8 array multipliers; "bkung16" and "bkung32" are 16-bit and 32-bit Brent-Kung adders; "booth9x9" is a 9x9 Booth multiplier; "kogge16" and "kogge32" are 16-bit and 32-bit Kogge-Stone adders; "log32" and "log64" are 32-bit and 64-bit log shifters; "Pmult16" and "Pmult32" are 16x16 and 32x32 parallel multipliers, respectively.

#### TABLE III

DIFFERENT DESIGN PARAMETERS FOR DIFFERENT TECHNOLOGY NODES

|               | 90nm | 65nm | 45nm |
|---------------|------|------|------|
| $V_{dd}$ (V)  | 1.08 | 1.00 | 0.80 |
| $V_{th}$ (V)  | 0.22 | 0.20 | 0.18 |
| $T_{ox}$ (nm) | 1.4  | 1.2  | 1.1  |

Table III shows the design parameters for each technology node. The active time temperature  $T_{active}$  and standby time temperature  $T_{standby}$  are both set to 378K, corresponding to the worst-case NBTI-induced circuit degradation and leakage power. The ratio of active to standby time (RAS) is set to 1:9. Statistical information from real applications could be used for the input probabilities during the active time; for simplicity, we set the input probabilities of all the input nodes to 0.5. The circuit lifetime is set to 10 years.

## B. Theoretic bound analysis

Table IV shows the theoretic bounds  $(D_{UB}, D_{LB})$  across different technology nodes. On average,  $D_{UB}$  increases from 23.74% at the 90nm node to 49.63% at the 45nm node, while  $D_{LB}$  increases from 9.19% to 13.97%. The large variation in  $D_{UB}$  is due to the large difference in the static NBTI models between technology nodes, while the small variation in  $D_{LB}$  arises because the dynamic NBTI model used in the active time does not vary much between technology nodes. However, the relative difference between  $D_{UB}$ and  $D_{LB}$ , $(D_{UB}-D_{LB})/D_{UB}$, which is the potential of standby time NBTI mitigation techniques, stays high at every node: about 62% at 90nm and about 71-72% at 65nm and 45nm.

Using benchmark circuit "C432" as an example, we further analyze the theoretic bounds under different RAS and standby time temperatures in Table V. According to the analysis of how often the combinational blocks are idle in Intel's paper [18], we vary the ratio of the active time to the total time from 10% to 30%.

#### TABLE IV

Theoretic bounds for NBTI-induced circuit degradation of ISCAS85 benchmark circuits and some ALU circuits at 90nm, 65nm, and 45nm technology nodes (%). RAS = 1:9,  $T_{active} = T_{standby} = 378K$ 

| Benchmark circuits | 45nm $D_{UB}$ | 45nm $D_{LB}$ | 65nm $D_{UB}$ | 65nm $D_{LB}$ | 90nm $D_{UB}$ | 90nm $D_{LB}$ |
|-----------|----------|----------|----------|----------|----------|----------|
| c432      | 52.44    | 14.92    | 32.93    | 7.91     | 18.71    | 4.17     |
| c499      | 50.85    | 13.51    | 35.18    | 10.32    | 24.51    | 9.25     |
| c880      | 52.58    | 15.63    | 33.13    | 8.77     | 23.10    | 8.16     |
| c1355     | 47.57    | 11.34    | 32.00    | 7.04     | 19.58    | 4.88     |
| c1908     | 50.33    | 13.21    | 34.54    | 9.47     | 24.94    | 9.71     |
| c2670     | 52.97    | 15.30    | 34.31    | 9.74     | 23.63    | 8.55     |
| c3540     | 51.88    | 15.16    | 33.50    | 9.49     | 22.88    | 8.28     |
| c5315     | 52.56    | 15.27    | 35.04    | 10.33    | 27.06    | 11.77    |
| c6288     | 45.27    | 11.57    | 32.49    | 7.86     | 21.45    | 6.48     |
| c7552     | 52.95    | 15.62    | 28.01    | 7.82     | 22.99    | 9.69     |
| array4x4  | 52.07    | 14.63    | 35.05    | 10.63    | 26.71    | 11.31    |
| array8x8  | 50.71    | 13.44    | 36.51    | 11.05    | 26.80    | 11.30    |
| bkung16   | 52.41    | 15.47    | 35.96    | 10.84    | 26.37    | 11.11    |
| bkung32   | 52.51    | 15.47    | 36.08    | 10.84    | 27.52    | 12.06    |
| booth9x9  | 52.91    | 15.64    | 31.59    | 7.03     | 25.76    | 10.54    |
| kogge16   | 52.34    | 15.47    | 36.24    | 11.13    | 25.03    | 9.98     |
| kogge32   | 52.57    | 15.46    | 37.19    | 11.81    | 27.24    | 11.92    |
| log32     | 39.76    | 11.56    | 23.15    | 6.74     | 18.95    | 7.56     |
| log64     | 33.64    | 9.60     | 21.04    | 6.24     | 16.70    | 6.66     |
| Pmult16   | 49.88    | 13.41    | 36.40    | 11.16    | 24.58    | 9.79     |
| Pmult32   | 49.52    | 13.10    | 37.55    | 12.10    | 24.31    | 9.42     |
| Average   | 49.63    | 13.97    | 33.22    | 9.52     | 23.74    | 9.19     |
| Potential |          | 71.88    |          | 71.43    |          | 61.68    |

We observe that lowering the standby time temperature decreases  $D_{UB}$ , since the  $V_{th}$  degradation in the standby time is mitigated to some extent at a lower standby temperature. If we lower the standby temperature from 378K to 318K,  $D_{UB}$  decreases by about 10% of the original circuit maximum delay, which is about 30% of the  $D_{UB}$  at  $T_{standby} = 378K$ . Increasing the RAS increases both  $D_{UB}$  and  $D_{LB}$ ; however, the difference between  $D_{UB}$  and  $D_{LB}$  decreases with increasing RAS, as shown in Table V.
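The "Potential" figures reported for C432 can be reproduced from the bounds with the ratio $(D_{UB}-D_{LB})/D_{UB}$. The short sketch below uses three of the RAS = 1:9 entries of Table V; the function and variable names are ours, and small last-digit differences from the table are expected because the published bounds are already rounded.

```python
def potential(d_ub, d_lb):
    """Mitigation potential (D_UB - D_LB) / D_UB, in percent."""
    return (d_ub - d_lb) / d_ub * 100.0


# C432 at 65nm, RAS = 1:9 (Table V): D_LB and D_UB per standby temperature.
d_lb = 7.91
d_ub_by_temp = {378: 32.93, 348: 27.87, 318: 23.21}

potentials = {
    temp: round(potential(d_ub, d_lb), 2)
    for temp, d_ub in d_ub_by_temp.items()
}

# Drop in D_UB when the standby temperature falls from 378K to 318K,
# in points of original delay and relative to the 378K value.
drop_points = d_ub_by_temp[378] - d_ub_by_temp[318]
drop_relative = drop_points / d_ub_by_temp[378] * 100.0

print(potentials, round(drop_points, 2), round(drop_relative, 1))
```

The computed drop is a little under 10 points of delay, or a little under 30% of the 378K bound, in line with the figures quoted above.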

#### TABLE V

$D_{UB}$  and  $D_{LB}$  analysis (C432 in ISCAS85) under different RAS and standby temperature  $T_{standby}$  at 65nm node (%).

|               | RAS=1:9  |           | RAS=2:8  |           | RAS=3:7  |           |
|---------------|----------|-----------|----------|-----------|----------|-----------|
| $T_{active}$  |          | $D_{LB}$  |          | $D_{LB}$  |          | $D_{LB}$  |
| 378K          |          | 7.91      |          | 9.01      |          | 9.71      |
| $T_{standby}$ | $D_{UB}$ | Potential | $D_{UB}$ | Potential | $D_{UB}$ | Potential |
| 378K          | 32.93    | 75.98     | 33.64    | 73.22     | 33.84    | 71.31     |
| 368K          | 31.20    | 74.65     | 31.94    | 71.80     | 32.18    | 69.84     |
| 358K          | 29.52    | 73.20     | 30.28    | 70.25     | 30.56    | 68.23     |
| 348K          | 27.87    | 71.62     | 28.66    | 68.57     | 28.97    | 66.49     |
| 338K          | 26.27    | 69.89     | 27.08    | 66.74     | 27.43    | 64.61     |
| 328K          | 24.71    | 67.99     | 25.55    | 64.75     | 25.93    | 62.56     |
| 318K          | 23.21    | 65.91     | 24.07    | 62.58     | 24.48    | 60.34     |

From Table IV and Table V, we can conclude that NBTI effect during the circuit standby time has significant impact on circuit lifetime. Consequently, standby time NBTI mitigation is attractive to improve the circuit lifetime reliability.

## C. The Effectiveness of the IVC techniques

We first use the exhausted search method to search for input vectors. When the number of random input vectors is varied from 2008 to 20000, the optimization results improve by less than 2% of the objective value. Thus we use 2008 random input vectors for all the circuits. The probability based algorithm is used for NBTI and leakage co-optimization. It runs faster and finds input vectors that achieve optimization results comparable (within 1%) to those from the exhausted search with 20000 random input vectors. Since the focus of the discussion is on the efficacy of IVC, the comparison of these two methods is not presented in detail.

Table VI shows the IVC results at the 65nm technology node for ISCAS85 circuits and ALU circuits. Through input vector selection for NBTI mitigation, the worst-case NBTI-induced degradation  $(D_{worst})$  is on average 29.63% of the original circuit delay, while the best NBTI-induced degradation  $(D_{best})$  is on average 19.11%. The capability of IVC is on average 33.89%  $((D_{worst} - D_{best})/D_{worst})$ . Through input vector selection for leakage power reduction, the capability of IVC is on average 9.23%.
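The per-circuit "IVC Capability" columns of Table VI follow from the same kind of ratio, applied to the worst and best values found by input vector selection. The sketch below recomputes them for two rows of Table VI. The function name is ours, and small differences from the table are expected because the leakage values are printed with only three significant digits.

```python
def ivc_capability(worst, best):
    """Relative saving (worst - best) / worst, in percent."""
    return (worst - best) / worst * 100.0


# (D_worst %, D_best %, worst leakage W, best leakage W) from Table VI, 65nm.
rows = {
    "c432": (29.90, 12.99, 2.50e-07, 2.29e-07),
    "array8x8": (36.51, 16.75, 4.18e-07, 3.49e-07),
}

capability = {
    name: (ivc_capability(d_worst, d_best), ivc_capability(p_worst, p_best))
    for name, (d_worst, d_best, p_worst, p_best) in rows.items()
}

for name, (nbti, leak) in capability.items():
    print(f"{name}: NBTI {nbti:.2f}%, leakage {leak:.2f}%")
```

The recomputed values agree with the tabulated 56.55% and 8.45% for c432, and 54.12% and 16.45% for array8x8, to within about 0.1 percentage points.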



Fig. 5. The IV selection results for NBTI at 90nm and 45nm nodes. RAS = 1:9,  $T_{active} = T_{standby} = 378K$ 

Fig. 5 shows the best and worst case degradation using the IVC technique at the 90nm and 45nm technology nodes. The average capabilities of the IVC technique for NBTI mitigation are 33.27% and 29.84% of the worst case value at the 45nm and 90nm nodes, respectively. The average capabilities of the IVC technique for leakage power mitigation are 9.75% and 4.71% of the worst case value at the 45nm and 90nm nodes, respectively (the detailed results are not shown). For the three technology nodes, compared with the theoretic bounds shown in Table IV, the difference between  $D_{worst}$  and  $D_{UB}$  is on average 4% of the original circuit delay, while the difference between  $D_{best}$  and  $D_{LB}$  is about 10%. So there is still potential to further integrate internal node control techniques for standby time NBTI mitigation.

In Table VI, we show the corresponding leakage power values of the input vectors with the best and worst NBTI-induced circuit degradation. These values are not the same as, or even close to, the best or worst case leakage power. The same holds for the degradation values of the input vectors with the best and worst leakage power. The last two columns in the table show the results of co-optimization. For most of the circuits, near-optimal results can be found. The differences from the best leakage power and the best NBTI-induced degradation are both within 3.5% of the best values on average.

As mentioned in Section III.C, leakage power and NBTI-induced degradation have different dependencies on the input patterns at the gate level: for NAND/AND/INV gates, the input patterns for the least leakage result in the worst NBTI-induced delay degradation; on the other hand, for NOR/OR gates, the input patterns for the least leakage result in the best case NBTI-induced delay degradation. Consequently, during the co-optimization of NBTI and leakage through IVC, we can obtain the Pareto sets of input vectors with different NBTI-induced circuit degradation and leakage power.



Fig. 6. The Pareto set for c1908 and Array8x8 at 65nm node.  $RAS=1:9,\,T_{active}=T_{standby}=378K$ 

Fig. 6 shows two examples of Pareto sets for different benchmark circuits. Each point in the figures is an input vector with its corresponding NBTI-induced circuit degradation and leakage power. In the Pareto set of the c1908 benchmark circuit, we can find a near-optimal input vector whose NBTI-induced circuit degradation and leakage power are both within 1% of the optimal results (shown in Table VI). However, in the Pareto set of the array8x8 benchmark circuit, we cannot find an input vector whose degradation and leakage power values are both close to the optimal ones. In that case, the input vector should be chosen according to the design goal: longer circuit lifetime or smaller leakage power. Consequently, we suggest using Pareto sets as a reference to explore the trade-offs between circuit lifetime reliability and leakage power reduction.

## D. Comparison against other techniques

In this section, we compare the IVC techniques against other NBTI/power mitigation techniques. We change the

#### TABLE VI

IVC results for NBTI-induced circuit degradation  $\Delta D(\%)$  and leakage power (W) at 65nm node. RAS = 1:9,  $T_{active} = T_{standby} = 378K$ 

| Benchmark circuits | Worst NBTI $\Delta D$ | Worst NBTI $P_{leak}$ | Best NBTI $\Delta D$ | Best NBTI $P_{leak}$ | Worst Leakage $P_{leak}$ | Worst Leakage $\Delta D$ | Best Leakage $P_{leak}$ | Best Leakage $\Delta D$ | IVC Capability NBTI | IVC Capability Leakage | Co-optimization $\Delta D$ | Co-optimization $P_{leak}$ |
|------------|------------|------------|------------|------------|------------|------------|------------|------------|-------|-----------|------------|------------|
| c432       | 29.90      | 2.37E-07   | 12.99      | 2.44E-07   | 2.50E-07   | 21.12      | 2.29E-07   | 27.75      | 56.55 | 8.45      | 13.01      | 2.41E-07   |
| c499       | 31.41      | 2.22E-07   | 19.18      | 2.29E-07   | 2.39E-07   | 25.34      | 2.16E-07   | 28.15      | 38.95 | 9.62      | 19.71      | 2.26E-07   |
| c880       | 25.23      | 3.89E-07   | 17.96      | 4.00E-07   | 4.13E-07   | 22.36      | 3.79E-07   | 24.05      | 28.84 | 8.24      | 18.03      | 3.98E-07   |
| c1355      | 21.71      | 6.53E-07   | 17.61      | 6.48E-07   | 6.60E-07   | 20.18      | 6.36E-07   | 20.84      | 18.88 | 3.77      | 17.61      | 6.48E-07   |
| c1908      | 26.33      | 6.73E-07   | 20.22      | 6.74E-07   | 6.83E-07   | 21.72      | 6.70E-07   | 24.23      | 23.20 | 1.98      | 20.22      | 6.74E-07   |
| c2670      | 34.31      | 8.66E-07   | 23.62      | 8.70E-07   | 8.84E-07   | 34.31      | 8.52E-07   | 34.31      | 31.15 | 3.59      | 23.62      | 8.70E-07   |
| c3540      | 25.61      | 1.22E-06   | 19.79      | 1.24E-06   | 1.27E-06   | 22.10      | 1.21E-06   | 25.16      | 22.70 | 5.42      | 20.16      | 1.23E-06   |
| c5315      | 34.07      | 1.77E-06   | 24.38      | 1.79E-06   | 1.81E-06   | 28.17      | 1.73E-06   | 34.07      | 28.45 | 4.41      | 25.15      | 1.78E-06   |
| c6288      | 22.24      | 4.67E-06   | 19.42      | 4.67E-06   | 4.69E-06   | 20.40      | 4.63E-06   | 22.24      | 12.65 | 1.28      | 19.43      | 4.67E-06   |
| c7552      | 23.25      | 2.82E-06   | 15.36      | 2.85E-06   | 2.90E-06   | 18.07      | 2.81E-06   | 20.90      | 33.95 | 3.34      | 15.57      | 2.84E-06   |
| array4x4   | 35.05      | 9.09E-08   | 19.21      | 9.15E-08   | 9.27E-08   | 25.88      | 7.54E-08   | 35.05      | 45.19 | 18.65     | 20.11      | 8.53E-08   |
| array8x8   | 36.51      | 3.74E-07   | 16.75      | 4.17E-07   | 4.18E-07   | 32.22      | 3.49E-07   | 36.51      | 54.12 | 16.45     | 19.41      | 4.12E-07   |
| bkung16    | 35.96      | 1.42E-07   | 15.68      | 1.40E-07   | 1.50E-07   | 35.96      | 1.28E-07   | 31.22      | 56.38 | 14.66     | 16.42      | 1.38E-07   |
| bkung32    | 36.08      | 2.87E-07   | 19.80      | 2.84E-07   | 3.01E-07   | 32.51      | 2.68E-07   | 26.96      | 45.12 | 10.81     | 20.67      | 2.77E-07   |
| booth9x9   | 23.21      | 1.11E-06   | 17.23      | 1.13E-06   | 1.14E-06   | 20.96      | 1.09E-06   | 18.96      | 25.77 | 4.47      | 17.23      | 1.10E-06   |
| kogge16    | 36.24      | 1.95E-07   | 22.09      | 1.89E-07   | 2.15E-07   | 33.42      | 1.82E-07   | 33.73      | 39.07 | 15.32     | 22.09      | 1.89E-07   |
| kogge32    | 37.19      | 4.75E-07   | 25.91      | 4.58E-07   | 4.96E-07   | 35.47      | 4.38E-07   | 33.16      | 30.32 | 11.85     | 27.12      | 4.50E-07   |
| log32      | 23.15      | 5.48E-07   | 18.96      | 5.55E-07   | 5.68E-07   | 18.96      | 5.19E-07   | 22.50      | 18.10 | 8.61      | 18.96      | 5.36E-07   |
| log64      | 21.04      | 1.31E-06   | 15.55      | 1.35E-06   | 1.36E-06   | 15.55      | 1.25E-06   | 20.41      | 26.08 | 8.23      | 15.55      | 1.29E-06   |
| Pmult16    | 31.08      | 1.83E-06   | 19.33      | 1.86E-06   | 1.95E-06   | 25.55      | 1.71E-06   | 23.42      | 37.80 | 12.42     | 20.54      | 1.80E-06   |
| Pmult32    | 28.67      | 7.55E-06   | 20.17      | 7.34E-06   | 7.66E-06   | 21.59      | 6.94E-06   | 25.94      | 29.65 | 10.44     | 21.12      | 7.32E-06   |
| Average    | 29.63      | 1.17E-06   | 19.11      | 1.18E-06   | 1.21E-06   | 24.90      | 1.13E-06   | 27.38      | 33.89 | 9.23      | 19.68      | 1.16E-06   |
| Difference |            |            |            |            |            |            |            |            |       |           | 3.01%      | 3.35%      |

corresponding parameters, and achieve the theoretic upper bounds for the following techniques which are adopted during the circuit standby time.
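The two "IVC capability" columns of Table VI are consistent with the relative gap between the worst and the best input vector for each metric. The definition below is inferred from the tabulated numbers rather than stated in this section, so it should be read as an assumption:

$$
\text{IVC capability}_{NBTI} = \frac{\Delta D_{worst} - \Delta D_{best}}{\Delta D_{worst}} \times 100\%
$$

For c432 this gives $(29.90 - 12.99)/29.90 \approx 56.6\%$, which agrees with the tabulated 56.55 up to rounding. Applying the same form to leakage, $(2.50 - 2.29)/2.50 = 8.4\%$, is close to the tabulated 8.45; the small gap is presumably due to the rounding of the printed $P_{leak}$ values.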

##### D.1 Tuning $V_{dd}$

Lowering $V_{dd}$ helps mitigate the NBTI effect. Table VII shows how the theoretical upper bound $D_{UB}$ changes as $V_{dd}$ is varied during the standby time. $D_{UB}$ approaches $D_{LB}$ when $V_{dd}$ is lower than 0.5V, and it already falls below our best input vector result when $V_{dd}$ is 0.7V or lower.

#### TABLE VII

$D_{UB}$ of C432 for different $V_{dd}$ in standby time (65nm). RAS=1:9, $T_{active} = T_{standby} = 378K$. The theoretical lower bound $D_{LB} = 7.91\%$. The best input vector result: $D_{best} = 12.99\%$

| $V_{dd}(V)$  | 1.0   | 0.9   | 0.8   | 0.7   | 0.6  | 0.5  | 0.4  |
|--------------|-------|-------|-------|-------|------|------|------|
| $D_{UB}$ (%) | 32.93 | 19.31 | 13.31 | 10.44 | 9.10 | 8.51 | 8.20 |
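To make the comparison with IVC concrete, the following sketch lists the standby supply voltages in Table VII whose $D_{UB}$ is below the best input vector result, and estimates the crossover voltage. The crossover uses linear interpolation between adjacent table entries, which is an assumption made only for illustration; the helper names are hypothetical.

```python
# Table VII: theoretical upper bound D_UB (%) of c432 versus standby V_dd (V)
VDD_TO_DUB = {1.0: 32.93, 0.9: 19.31, 0.8: 13.31, 0.7: 10.44,
              0.6: 9.10, 0.5: 8.51, 0.4: 8.20}
D_BEST = 12.99  # best input vector result for c432 (%)


def vdd_beating_ivc(table, d_best):
    """Standby supply voltages whose D_UB is below the IVC best case."""
    return sorted((v for v, d in table.items() if d < d_best), reverse=True)


def crossover_vdd(table, d_best):
    """Estimate the V_dd at which D_UB equals d_best.

    Assumption: D_UB varies linearly between adjacent table entries.
    Returns None if d_best is not bracketed by the table.
    """
    volts = sorted(table)  # ascending V_dd; D_UB rises with V_dd
    for lo, hi in zip(volts, volts[1:]):
        d_lo, d_hi = table[lo], table[hi]
        if d_lo <= d_best <= d_hi:
            return lo + (hi - lo) * (d_best - d_lo) / (d_hi - d_lo)
    return None


print(vdd_beating_ivc(VDD_TO_DUB, D_BEST))
print(round(crossover_vdd(VDD_TO_DUB, D_BEST), 3))
```

Under the interpolation assumption, the standby $V_{dd}$ has to drop to a little below 0.8V before the upper bound matches what IVC achieves at the nominal supply.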

##### D.2 Tuning $V_{th}$

Increasing $V_{th}$ helps mitigate the NBTI effect. Table VIII shows how the theoretical upper bound $D_{UB}$ changes as $V_{th}$ is varied during the standby time. Even the smallest $D_{UB}$, 19.31% at $V_{th} = 0.3V$, is larger than our best input vector result of 12.99%.

#### TABLE VIII

 $D_{UB}$  of C432 for different  $V_{th}$  in standby time (65nm). RAS=1:9  $T_{active} = T_{standby} = 378K$ . The theoretic lower bound  $D_{LB} = 7.91\%$ . The best input vector result:  $D_{best} = 12.99\%$ 

| $V_{th}(V)$  | 0.20  | 0.22  | 0.24  | 0.26  | 0.28  | 0.30  |
|--------------|-------|-------|-------|-------|-------|-------|
| $D_{UB}$ (%) | 32.93 | 29.66 | 26.43 | 23.24 | 21.19 | 19.31 |

##### D.3 Power gating

Power gating brings all the PMOS transistors in the circuit into the relaxation phase, so the NBTI-induced degradation during the standby time can be ignored. The total lifetime degradation then equals the theoretical lower bound $D_{LB}$.
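The c432 numbers quoted in this section can be put side by side with a few lines of Python. The sketch below ranks the techniques by the smallest lifetime degradation each one reaches in Tables VI to VIII. Note the caveat that the $V_{dd}$ and $V_{th}$ entries are theoretical upper bounds at the most aggressive setting in each table, while the IVC entry is an actual input vector result, so the ranking is only indicative.

```python
D_LB = 7.91  # theoretical lower bound for c432 (%)

# Smallest lifetime degradation (%) each technique reaches for c432
best_case = {
    "IVC": 12.99,          # best input vector, Table VI
    "Tuning V_dd": 8.20,   # D_UB at V_dd = 0.4 V, Table VII
    "Tuning V_th": 19.31,  # D_UB at V_th = 0.30 V, Table VIII
    "Power gating": D_LB,  # standby degradation ignored, so D = D_LB
}

# Sort technique names from least to most degradation
ranking = sorted(best_case, key=best_case.get)
for name in ranking:
    gap = best_case[name] - D_LB
    print(f"{name:13s} {best_case[name]:6.2f} %  (+{gap:.2f} points above D_LB)")
```

The resulting order (power gating, tuning $V_{dd}$, IVC, tuning $V_{th}$) matches the efficacy ranking summarized in Table IX.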

##### D.4 Design overhead

From the above analysis, we find that all these techniques have mitigation capability similar to that of IVC. However, each technique incurs extra design overhead. The timing and area overhead of IVC, caused by the flip-flops (or extra memory) that store the optimal input vector at the primary inputs of the circuit, is negligible for a large digital design. Tuning $V_{dd}$ requires extra power rails and level converters. Raising $V_{th}$ requires body biasing, which in turn requires a triple-well process; moreover, body biasing is not applicable to scaled CMOS technology because of the rapid increase in junction leakage. Power gating, the most effective technique for NBTI and leakage mitigation, needs extra sleep transistors, which slow the circuit during the active time. The extra area, power, and design effort for the sleep signal control logic are further critical issues for power gating. Consequently, IVC is an economical method that still achieves considerable NBTI and leakage mitigation.

In summary, the comparison is shown in Table IX. Note that IVC can be combined with $V_{dd}$ and $V_{th}$ tuning to achieve even better NBTI mitigation.

#### TABLE IX

Comparison of different techniques for standby time NBTI mitigation.

|          | IVC         | Tuning $V_{dd}$ | Tuning $V_{th}$ | Power gating |
|----------|-------------|-----------------|-----------------|--------------|
| Efficacy | Medium high | High            | Medium          | Highest      |
| Overhead | Low         | Medium          | Medium          | High         |

### VI. CONCLUSION AND FUTURE WORK

In this paper, we evaluated the efficacy of the IVC technique for mitigating circuit aging and reducing leakage power, based on a proposed co-simulation flow that estimates NBTI-induced circuit degradation and leakage power for various input patterns. The experimental results show that IVC is effective during the standby time for mitigating NBTI-induced circuit degradation, saving around 30% of the circuit performance degradation across the 90nm to 45nm technology nodes. Furthermore, NBTI and leakage power can be optimized simultaneously. Since leakage power and NBTI-induced circuit degradation have different dependencies on the input patterns, Pareto sets were derived for designers to explore trade-offs between lifetime reliability and leakage power reduction.

For future work, we plan to 1) further investigate the internal node control technique to get better control of the internal nodes; 2) combine IVC with other standby time techniques, such as tuning  $V_{dd}$  and  $V_{th}$  to further mitigate NBTI and leakage; and 3) adopt a probabilistic framework considering the process variation to investigate the impact of the NBTI-induced  $V_{th}$  degradation in the circuit optimization.

#### References

- [1] V. Huard, M. Denais, and C. Parthasarathy, "NBTI degradation: From physical mechanisms to modelling," *Microelectron. Reliab.*, vol. 46, no. 1, pp. 1–23, 2006.
- [2] D. K. Schroder and J. A. Babcock, "Negative bias temperature instability: Road to cross in deep submicron silicon semiconductor manufacturing," *Journal of Applied Physics*, vol. 94, no. 1, pp. 1–18, 2003.
- [3] S. Borkar, "Electronics beyond nano-scale CMOS," in *Proc. DAC*, 2006, pp. 807–808.
- [4] M. Agarwal, B. C. Paul, Z. Ming, and S. Mitra, "Circuit failure prediction and its application to transistor aging," in *Proc. 25th IEEE VLSI Test Symposium*, 2007, pp. 277–286.
- [5] S. Mahapatra and M. Alam, "A predictive reliability model for PMOS bias temperature degradation," in *IEDM Tech. Dig.*, 2002, pp. 505–508.
- [6] M. Alam, "A critical examination of the mechanics of dynamic NBTI for PMOSFETs," in *IEDM Tech. Dig.*, 2003, pp. 14.4.1–14.4.4.
- [7] R. Vattikonda, W. Wang, and Y. Cao, "Modeling and minimization of PMOS NBTI effect for robust nanometer design," in *Proc. DAC*, Jul. 2006, pp. 1047–1052.
- [8] W. Wang, V. Reddy, A. T. Krishnan, R. Vattikonda, S. Krishnan, and Y. Cao, "An integrated modeling paradigm of circuit reliability for 65nm CMOS technology," in *CICC '07*, 2007, pp. 511–514.
- [9] B. Paul, K. Kang, H. Kufluoglu, M. Alam, and K. Roy, "Impact of NBTI on the temporal performance degradation of digital circuits," *IEEE Electron Device Lett.*, vol. 26, no. 8, pp. 560–562, 2005.
- [10] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "An Analytical Model for Negative Bias Temperature Instability," in *Proc. IEEE/ACM ICCAD*, 2006.

- [11] W. Wang, S. Yang, S. Bhardwaj, R. Vattikonda, S. Vrudhula, F. Liu, and Y. Cao, "The impact of NBTI on the performance of combinational and sequential circuits," in *Proc. DAC*, 2007, pp. 364–369.
- [12] Y. Wang, H. Luo, K. He, R. Luo, H. Yang, and Y. Xie, "Temperature-aware NBTI modeling and the impact of input vector control on performance degradation," in *Proc. DATE*, 2007, pp. 546–551.
- [13] B. Paul, K. Kang, H. Kufluoglu, M. Alam, and K. Roy, "Temporal Performance Degradation under NBTI: Estimation and Design for Improved Reliability of Nanoscale Circuits," in *Proc. DATE*, vol. 1, 2006, pp. 1–6.
- [14] S. Kumar, C. Kim, and S. Sapatnekar, "Impact of NBTI on SRAM Read Stability and Design for Reliability," in *Proc. ISQED*, 2006, pp. 210–218.
- [15] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "NBTI-aware synthesis of digital circuits," in *Proc. DAC*, 2007, pp. 370–375.
- [16] A. Abdollahi, F. Fallah, and M. Pedram, "Leakage current reduction in CMOS VLSI circuits by input vector control," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 12, no. 2, pp. 140–154, 2004.
- [17] L. Yuan and G. Qu, "A combined gate replacement and input vector control approach for leakage current reduction," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 14, no. 2, pp. 173–182, 2006.
- [18] J. Abella, X. Vera, and A. Gonzalez, "Penelope: The NBTI-aware processor," in *MICRO 2007*, 2007, pp. 85–96.
- [19] J. G. Lin, "Three methods for determining pareto-optimal solutions of multiple-objective problems," *Directions in Large-Scale Systems*, pp. 117–138, 1975.
- [20] W. Wang, Z. Wei, S. Yang, and Y. Cao, "An efficient method to identify critical gates under circuit aging," in *ICCAD 2007*, 2007, pp. 735–740.
- [21] W. Wang, V. Reddy, A. Krishnan, R. Vattikonda, S. Krishnan, and Y. Cao, "Compact modeling and simulation of circuit reliability for 65nm CMOS technology," *IEEE Transactions on Device and Materials Reliability*, vol. 7, no. 4, pp. 509–517, 2007.
- [22] S. Bhardwaj, W. Wang, R. Vattikonda, and Y. Cao, "Predictive modeling of the NBTI effect for reliable design," in *IEEE Custom Integrated Circuits Conference*, 2006, pp. 189–192.
- [23] V. Huard, C. Parthasarathy, N. Rallet, C. Guerin, M. Mammase, D. Barge, and C. Ouvrard, "New characterization and modeling approach for NBTI degradation from transistor to product level," in *IEDM 2007*, Dec. 2007, pp. 797–800.
- [25] D. Lee, W. Kwong, D. Blaauw, and D. Sylvester, "Analysis and minimization techniques for total leakage considering gate oxide leakage," in *Proc. of Design Automation Conference*, 2003, pp. 175–180.
- [26] Nangate open cell library. [Online]. Available: http://www.nangate.com/
- [27] Nanoscale Integration and Modeling (NIMO) Group, ASU. Predictive Technology Model (PTM). [Online]. Available: http://www.eas.asu.edu/~ptm/