# Enhancing the Tolerance to Power-Supply Instability in Digital Circuits

J. Semião<sup>1</sup>, J. Freijedo<sup>2,3</sup>, J.J. Rodríguez Andina<sup>3</sup>, F. Vargas<sup>4</sup>, M. B. Santos<sup>2</sup>, I. C. Teixeira<sup>2</sup>, J. P. Teixeira<sup>2</sup>

<sup>1</sup> Univ. of Algarve – UAlg / EST, Campus da Penha, 8005-139 Faro, Portugal
<sup>2</sup> IST (TUL) / INESC-ID Lisboa, R. Alves Redol, 9, 3°, 1000-029 Lisboa, Portugal
<sup>3</sup> Univ. of Vigo, Dept. de Tecnología Electrónica, Campus Universitario, 36310 Vigo, Spain
<sup>4</sup> PUCRS, Electrical Engineering Dept., Av. Ipiranga, 6681, 90619-900 Porto Alegre, Brazil

paulo.teixeira@ist.utl.pt, jsemiao@ualg.pt, fvargas@computer.org, juanjo@uvivo.es

Abstract - As IC technology scales down, power supply instability may dramatically contribute to signal integrity loss. In this paper, we propose a new methodology to enhance circuit tolerance to local power-supply voltage (V<sub>DD</sub>) variations without degrading circuit performance. The underlying idea is to add tolerance to the triggering edge of the clock signal driving specific memory cells. The clock duty-cycle (CDC) is thus dynamically modulated according to V<sub>DD</sub>. Two architectures are presented, and one of them is shown to be effective. The key module is a Clock Stretching Logic (CSL) block, used to increase CDC according to V<sub>DD</sub>-V<sub>SS</sub> variations. Moreover, when clock frequency reduction is inevitable, circuit tolerance is enhanced when disturbances start to occur, giving the clock generator time to react and reduce its frequency. Experimental results (SPICE simulations) for combinational, pipeline and finite-state machine (FSM) circuits are used to demonstrate the usefulness of the proposed methodology.

*Index Terms*: signal integrity; tolerance to power-supply variations; clock skew; noise-adaptive clock duty-cycle.

### 1. INTRODUCTION

Signal integrity is becoming a significant problem for high-performance IC products, namely high-speed gigahertz nanometer System-on-Chip (SoC) [1]. Problems such as crosstalk, overshoot/undershoot (a signal momentarily rising above the power supply voltage (V<sub>DD</sub>) or falling below the ground (V<sub>SS</sub>) line) [2][3], reflection, electromagnetic interference (EMI), signal skew (differences in arrival time at different receivers) [4][5] and power-supply noise [3] can lead to functional errors, which may cause yield loss (at the production stage) and/or dependability loss (during product lifetime). Both the power grid and the clock distribution network need to be carefully designed and tested [6].

In this work, we focus on power supply instability, regardless of its origin, and its impact on digital circuit performance. Both  $V_{DD}$  depletion and ground elevation cause an increase in the propagation delay of signal paths. The shrinking of nominal  $(V_{DD}$  -  $V_{SS})$  due to technology scaling and the use of advanced power management techniques (to reduce power consumption) increase the severity of the problem. A common such technique is Dynamic Voltage Scaling (DVS) [7]. We refer to the minimum supply voltage that results in correct operation as the *critical* supply voltage. This voltage must be sufficient to ensure correct operation in the presence of a number of environmental and process-related variations (local or global) that can impact circuit performance [8]. In order to guarantee correct operation under all assumed variations, safety margins are added to the critical voltage to account for the inaccuracy of circuit models and for the worst-case combination of parametric variations.

It is assumed that well-known techniques [9] are used to limit the impact of power supply disturbances on clock generation and distribution performance. However, if a signal propagation delay becomes large enough during circuit operation, it will induce a *de-synchronization effect* due to the increased difference between the critical path propagation delay and the clock distribution network delay. Hence, for example, 10% power supply voltage fluctuations may translate into more than 10% timing inaccuracy, causing a functional error [5]. Interestingly, research results on delay fault detection and diagnosis can be reused to enhance delay fault *tolerance* (and thus signal integrity) to power supply voltage transients [10].

Many design solutions have been proposed to enhance signal integrity in the presence of power grid activity. However, these are static solutions, which may be insufficient when the time budget is too limited. Some of the proposed solutions include: 3-D layout modeling and parasitic extraction [11], accurate RLC simulation of the on-chip power grid [4], the use of decoupling capacitors [12] to limit resistive voltage drop (IR-drop) [4][13], insertion of buffers on the grid [11], wire shielding [14], and buffer insertion plus transistor resizing methods to achieve better power-delay and area-delay trade-offs [15][16]. Additionally, self-test methodologies and on-chip probes to monitor intra-packaging EM-emission activities [17] have been developed to test signal integrity in high-speed SoC. Finally, other techniques are used in industry, namely, the reduction of the maximum distance between supply pins and circuit supply connections [18]. With a different objective, a technique addressing the problem of de-synchronization of memory elements (for pipeline-based circuits) has been proposed [7]. However, its purpose is to correct (not to prevent) errors caused by aggressive DVS techniques and to reduce power consumption.

In order to guarantee correct timing performance in the presence of undesired  $V_{\text{DD}}$  transients, the ultimate solution is to reduce the clock frequency. However, for some applications, such performance degradation cannot be tolerated. The goal of this paper is to present and discuss a new, patented [19] solution, which does not require clock frequency reduction. The underlying principle is to locally and dynamically adapt the clock duty-cycle (CDC) of the clock signal driving a limited subset of memory elements, according to the signal propagation delay through the logic whose power supply voltage is being disturbed. The target memory cells (referred to as critical memory cells) are the ones connected to the output of critical paths, for which the smallest time slack occurs. In this work, the time slack,  $t_s$ , is defined as the difference between the clock period and the propagation delay of the critical path in the slowest combinational module between registers.

This paper is organized as follows: Section 2 shows how stretching CDC can enhance circuit tolerance to V<sub>DD</sub> disturbances. Section 3 describes the proposed methodology and possible implementations. In Section 4, experimental results based on SPICE simulations of two circuit examples are presented. Finally, Section 5 outlines the main conclusions and directions for future work.

### 2. CLOCK DUTY CYCLE STRETCHING

As mentioned above, lowering  $V_{DD}$  increases the propagation delay of signal paths. If the observation pace is kept invariant (at-speed circuit operation, with nominal clock period,  $t_{CLK}$ ), this reduces circuit time margins, which may induce system functional errors. Power supply voltage instability may also result from *ground bounce*. This phenomenon occurs when internal nodes of a logic device change state. When this happens, the charge remaining in the internal nodes ( $C_L$ ) is drained through the ground grid, inducing a local  $V_{SS}$  variation. Worst-case conditions exist when a large number of nodes switch simultaneously, which is operation-dependent.

For simplicity, the *size of the disturbance* on  $V_{\rm DD}/V_{\rm SS}$  interconnects voltage is normalized using a *gamma parameter*:

$$\gamma(V_{DD}) = \frac{\Delta V_{DD}}{V_{DDnom}} \qquad \text{or} \qquad \gamma(GND) = \frac{\Delta GND}{V_{DDnom}}$$

where  $\Delta V_{DD}$  is the difference between the nominal  $V_{DD}$  ( $V_{DDnom}$ ) and the depleted  $V_{DD}$ , and  $\Delta GND = \Delta V_{SS}$  is the difference between the elevated ground and  $V_{SS} = 0$  Volt.

In order to show the *de-synchronization effect*, a SPICE simulation was run for a 77-inverter chain designed in the 130 nm UMC CMOS technology ( $V_{DDnom}=1.2V$ ). The inverter chain mimics a long switching signal path. The clock frequency is 1 GHz ( $t_{CLK}=1$  ns). The inverter chain is terminated by a level-sensitive D-type flip-flop (D-FF). In this case, the circuit collapses functionally at 83.33% of  $V_{DDnom}$  ( $\gamma(V_{DD})=16.67\%$ ), with CDC = 50%. When the clock duty cycle is increased to CDC = 80%, the circuit functional collapse occurs only at 73.33% of  $V_{DDnom}$  ( $\gamma(V_{DD}) = 26.67\%$ ).

Thus, by increasing CDC, the circuit is rendered more robust to power interconnect noise.
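As a quick check of these figures, the following minimal Python sketch evaluates the normalized disturbance at the two collapse points. The function name `gamma` is illustrative, and the absolute voltages (1.0 V and 0.88 V) are not quoted in the text but derived here from the stated percentages of  $V_{DDnom}=1.2V$.

```python
def gamma(v_nom: float, v_disturbed: float) -> float:
    """Normalized supply disturbance: (V_DDnom - V_DD) / V_DDnom."""
    return (v_nom - v_disturbed) / v_nom


V_DD_NOM = 1.2  # volts, nominal supply of the 130 nm inverter-chain example

# Collapse voltages derived from the percentages quoted in the text.
collapse_cdc50 = 1.0   # 83.33% of V_DDnom, with CDC = 50%
collapse_cdc80 = 0.88  # 73.33% of V_DDnom, with CDC = 80%

print(f"{gamma(V_DD_NOM, collapse_cdc50):.2%}")  # 16.67%
print(f"{gamma(V_DD_NOM, collapse_cdc80):.2%}")  # 26.67%
```

Under these numbers, stretching CDC from 50% to 80% lets the chain tolerate roughly ten more percentage points of normalized  $V_{DD}$  depletion.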

The dependence of propagation delay on  $V_{DD}$  reduction,  $T_{pd}$  ( $V_{DD}$ ), can be described using a simple analytical model [5][10]. This model can also be used to compute the amount of CDC stretching needed for a given power-supply variation.

### 3. PROPOSED METHODOLOGY

Consider a synchronous IP core, with different modules, each fed by a subsection of the power grid infrastructure (Fig. 2). If circuit operation induces  $V_{\rm DD}$  variations in module i, its timing performance is distorted, typically delayed. In order to allow the combinational blocks to finish their job, more time has to be given to the signals switching in critical paths. Hence, the underlying idea is to dynamically delay their capture by the *critical memory cells* in the presence of  $V_{\rm DD}$  variation. Therefore, for a limited subset of the module's registers ( $m \ll k$ ), a CDC modulation block must be added to accommodate such delay.



Fig. 2: Synchronous IP core under local  $V_{\text{DD}}$  disturbance

In a synchronous circuit, CDC is generally set at 50% to minimize *jitter* and *process variations*, and to allow a *weighted-time distribution* for circuits designed with both rise- and fall-edge triggered flip-flops (FF). In order to prevent logic errors, we assume that CDC may be stretched up to 80%.

We refer to the module that implements this added functionality as the CDCM (CDC Modulation) module. Locally, the CDCM module monitors  $V_{\rm DD}$  variations and triggers CDC variations accordingly. For each IP module, static timing analysis is used to identify the critical paths, allowing us to determine how many CDC modulators should be inserted (and where). Here, we illustrate the methodology using *one* power grid partition, *one* functional module and *one* CDCM system.
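The selection of critical memory cells from timing-analysis data can be sketched as follows. This is only an illustration under stated assumptions: the function `critical_cells`, the register names, the slack values and the threshold are all hypothetical, and the paper does not prescribe a specific selection rule.

```python
def critical_cells(slack_by_register: dict[str, float],
                   slack_threshold: float) -> list[str]:
    """Return registers whose time slack is below the threshold, tightest first."""
    selected = [reg for reg, slack in slack_by_register.items()
                if slack < slack_threshold]
    return sorted(selected, key=lambda reg: slack_by_register[reg])


# Hypothetical time slacks (ns), as a static timing analysis report might list them.
slacks = {"Q1": 1.8, "Q2": 0.2, "Q3": 0.9, "Q4": 0.4}

# Registers below the (assumed) 0.5 ns threshold are candidates for a CDCM module.
print(critical_cells(slacks, slack_threshold=0.5))  # ['Q2', 'Q4']
```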

The proposed CDCM architecture contains a <u>c</u>lock <u>s</u>tretching <u>l</u>ogic (CSL) block, to enhance the CDC delivered by the phase-locked loop (PLL) block (the clock generator) to the controlled logic. Optionally, it may contain a <u>b</u>uilt-<u>i</u>n <u>p</u>ower supply voltage transient <u>s</u>ensor (BIPS) block to monitor power grid activity and control the use of the modulated clock signal provided by the CSL block.

Fig. 3 depicts two possible architectures of the CDCM module. In architecture (a) (the *core architecture*), the CSL block performs both disturbance monitoring and CDC stretching, using a simple circuit and introducing a limited clock delay,  $\tau_o$  (referred to as the *intrinsic CDCM delay*). The modulated clock signal adds a minimum stretch at nominal  $V_{DD}$  (CDC slightly greater than 50%). When  $V_{DD}$  decreases, the CSL block stretches CDC according to the  $V_{DD}$  reduction. In this case, the time slack is increased by the amount  $\tau_o(V_{DD})$ , i.e.,  $t_s' = t_s + \tau_o(V_{DD})$ .
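To make the slack relation concrete, here is a small Python sketch with purely hypothetical timing values in picoseconds (none of these numbers come from the simulations). It applies the time slack definition from Section 1 and the relation  $t_s' = t_s + \tau_o(V_{DD})$ ; the function names are illustrative.

```python
def time_slack(t_clk_ps: int, t_pd_critical_ps: int) -> int:
    """Time slack t_s: clock period minus critical-path propagation delay."""
    return t_clk_ps - t_pd_critical_ps


def stretched_slack(t_s_ps: int, tau_o_ps: int) -> int:
    """Slack seen by a CDCM-controlled cell: t_s' = t_s + tau_o(V_DD)."""
    return t_s_ps + tau_o_ps


T_CLK = 5000          # hypothetical clock period (ps)
T_PD_NOMINAL = 4600   # hypothetical critical-path delay at nominal V_DD (ps)
T_PD_DEPLETED = 5100  # hypothetical critical-path delay under V_DD depletion (ps)
TAU_O_DEPLETED = 300  # hypothetical stretch delivered by the CSL at that V_DD (ps)

print(time_slack(T_CLK, T_PD_NOMINAL))    # 400: timing is met at nominal V_DD
t_s = time_slack(T_CLK, T_PD_DEPLETED)
print(t_s)                                # -100: capture would fail without CDCM
print(stretched_slack(t_s, TAU_O_DEPLETED))  # 200: capture succeeds with stretching
```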



Fig. 3: Basic architectures of the CDCM system: (a) core architecture; (b) enhanced architecture

Architecture (b) (the enhanced architecture) is more complex, and has been envisaged to add flexibility to the solution by separating the V<sub>DD</sub> monitoring and CDC modulation functions into two blocks, BIPS and CSL respectively. Assume that the designer only wants to modify the clock signal when  $V_{DD} < V_{DDth}$  (a user-defined V<sub>DD</sub> threshold). Under normal operation, MUX 2X1 is set to the PLL output (CLK\_CDCM=CLK). When the BIPS block detects  $V_{DD}$  depletion below  $V_{DDth}$ , the CSL block input of the MUX 2X1 is selected. A D-FF is used to guarantee that the switching from the PLL clock to the CSL clock (and vice-versa) is performed without glitches at the MUX output. The D-FF is triggered on the clock edge opposite to the one controlled by the CSL block. When the V<sub>DD</sub> voltage transient fades away, the CSL block gradually reestablishes the original clock duty-cycle (50%), and the BIPS switches the MUX 2X1 selection signal.
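The selection logic of the enhanced architecture can be summarized with a purely behavioural Python model. This is a sketch under assumptions, not the circuit itself: the class and method names are hypothetical, delays and analog behaviour are ignored, and the 2.7 V threshold is only an illustrative value.

```python
from dataclasses import dataclass


@dataclass
class EnhancedCdcm:
    """Behavioural (not electrical) model of the enhanced CDCM architecture."""
    vdd_th: float             # user-defined threshold V_DDth, in volts
    select_csl: bool = False  # D-FF output driving the MUX 2X1 select input

    def bips(self, vdd: float) -> bool:
        """BIPS output: High when V_DD is depleted below V_DDth."""
        return vdd < self.vdd_th

    def sync_edge(self, vdd: float) -> None:
        """D-FF samples the BIPS output on its triggering clock edge."""
        self.select_csl = self.bips(vdd)

    def clk_cdcm(self, clk_pll: int, clk_csl: int) -> int:
        """MUX 2X1: forward the CSL clock only while the registered select is set."""
        return clk_csl if self.select_csl else clk_pll


cdcm = EnhancedCdcm(vdd_th=2.7)  # illustrative threshold
cdcm.sync_edge(vdd=3.3)
print(cdcm.clk_cdcm(clk_pll=0, clk_csl=1))  # 0: PLL clock is forwarded
cdcm.sync_edge(vdd=2.5)
print(cdcm.clk_cdcm(clk_pll=0, clk_csl=1))  # 1: CSL clock is forwarded
```

Registering the BIPS output before it reaches the MUX is what keeps the clock source from changing between synchronizing edges in this model.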

The enhanced architecture requires a more complex circuit and adds a considerably larger delay to the modulated clock when  $V_{DD} < V_{DDth}$ . Under normal operation, the intrinsic delay in the clock path is now only the MUX delay,  $\tau_M$  (typically,  $\tau_M < \tau_o$ ). When the CSL output is selected, however, the clock signal at the MUX output (CLK\_CDCM) is delayed by  $\tau_o + \tau_M$ . Hence, as will be shown in Section 4, the enhanced architecture can only be used when this delay is negligible compared to the clock period.

Several architectures for the CSL and BIPS blocks can be implemented. One possible implementation in static CMOS technology is shown in Fig. 4. The CSL block (Fig. 4(a)) modulates CDC by delaying *one* of the switching transitions (in this case, the High-to-Low transition). The M1-M2 inverter implements the CDC modulator core. Its pull-up PMOS transistor (M2) has a slow driving capability, due to M3. The result is a modulated clock signal, with identical clock period, but with enhanced CDC in the presence of a reduction of  $V_{\rm DD}$ . The output signal is buffered to restore the fast switching capability. For circuits with both rising- and falling-edge triggered clocks, two types of CSL circuits must be implemented.





Fig. 4: Typical architectures for (a) CSL and (b) BIPS blocks.

In Fig. 4(b) a possible implementation of the BIPS block is shown. The M1-M3 voltage divider biases the M5 NMOS transistor near cut-off. When  $V_{\rm DD}\text{-}V_{\rm SS}$  decreases below a pre-defined threshold value  $(V_{\rm DDth})$ , M5 is driven into the cut-off region, and the output of the pseudo-NMOS M4-M5 inverter is driven to  $V_{\rm DD}$ . The two additional inverters act as a buffer. The output signal  $(V_{\rm DD}$  sensor) is used to generate the MUX 2X1 selection signal (through the D-FF).



Fig. 5: CSL robustness to power supply voltage variations (200 MHz,  $V_{DD} \in (3.3; 1.0) \text{ V}$ )

In Fig. 5, simulation results (AMS 350 nm CMOS technology) show the CSL block capability to stretch CDC proportionally to the depletion of  $V_{DD}$  from nominal  $V_{DD}$  (3.3 Volt) to  $V_{DD}$  = 1.0 Volt. Note that the stretched CDC is always larger than 50% and lower than 100%.

When  $V_{DD}$  is depleted to the lowest possible value, 1.1 V, CDC reaches 79%. Below 1.1 V, correct functionality collapses. Simulation results for the BIPS block, using the same technology, show that for  $V_{DD}$  < 2.7 V the BIPS detects an abnormal  $V_{DD}$  reduction and signals it by driving its output High.

#### APPLICATIONS

Let us analyze some applications where the proposed methodology is useful. One useful application is when we need to synchronize data between different logic blocks. The use of the proposed methodology in the receiver can improve signal integrity and maintain the blocks synchronized.

High-speed applications usually make use of *pipeline* solutions. For this type of application too, the methodology is likely to reduce the occurrence of errors, by relaxing local timing constraints in the presence of  $V_{\text{DD}}$  variations.

The proposed methodology may also be applied to *sequential circuits*, namely finite-state machines (FSM). The combinational critical paths limit the maximum clock frequency. Thus, adding circuit tolerance to  $V_{DD}$  variations will improve signal integrity in the circuit.

#### LIMITATIONS

However, the methodology has some limitations. First, limiting CDC variation (from 50 to 80%) also limits the range of  $\Delta V_{DD}$  or  $\Delta V_{SS}$  values for which signal integrity holds. Second, in pipeline circuits, methodology application depends on circuit architecture and on the relative location of the critical paths in the pipeline. For example, if a critical path is followed by another critical path in the pipeline, the methodology still applies, but with a smaller margin for  $\Delta V_{DD}$  and  $\Delta V_{SS}$ .

Third, in FSM, for some topologies the proposed methodology cannot be applied inside the FSM. This is the case when combinational critical paths occur between the output and the input of feedback memory cells. Such loops make the circuit insensitive to CDC variations.

In general, caution is required when two or more combinational critical paths follow one another (separated by memory cells). In some cases, such as pipelines, the methodology can be applied using different CSL blocks for different memory cells. However, in such cases, the margin of improvement of the methodology is limited. In other cases, such as FSMs, the amount of hardware necessary to make the methodology work may become impractical.

There is also another limitation associated with the presence of *short paths*. The use of a delayed clock edge trigger in a memory element raises the possibility that a short path in its input combinational logic cone will corrupt the data in the memory element. To prevent this, a minimum-path length constraint should be added at the input of each CDC controlled flip-flop in the design. As done in other design solutions [6][8], these minimum-path constraints result in the addition of buffers during logic synthesis to slow down fast paths. The minimum-path constraint is equal to the clock edge trigger delay of the new CSL clock plus the propagation delay of the CSL block, i.e.,

$$\min t_{pd} = |CDC_{clock} - CDC_{CSL}| \times T + t_{pdCSL}$$
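The constraint can be evaluated numerically as in the following sketch. It assumes that  $T$  denotes the clock period; the function name and the example values (a 5 ns period, CDC stretched from 50% to 80%, and a 0.2 ns CSL propagation delay) are hypothetical and chosen only for illustration.

```python
def min_path_delay(cdc_clock: float, cdc_csl: float,
                   period: float, t_pd_csl: float) -> float:
    """Minimum-path constraint: |CDC_clock - CDC_CSL| * T + t_pdCSL.

    CDC values are fractions (0.5 for 50%); period and t_pd_csl share one time unit.
    """
    return abs(cdc_clock - cdc_csl) * period + t_pd_csl


# Hypothetical example: T = 5 ns, CDC stretched from 50% to 80%, t_pdCSL = 0.2 ns.
constraint_ns = min_path_delay(cdc_clock=0.5, cdc_csl=0.8, period=5.0, t_pd_csl=0.2)
print(f"{constraint_ns:.2f} ns")  # 1.70 ns
```

In this example, any path ending at the CDC-controlled flip-flop with a delay shorter than about 1.7 ns would need buffers added to slow it down.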

### 4. EXPERIMENTAL RESULTS

Extensive SPICE simulations (AMS 350nm CMOS technology) with the Cadence<sup>TM</sup> framework have been performed, in order to demonstrate the effectiveness of the proposed methodology.



Fig. 6: Pipeline circuit example: (a) basic circuit; (b) basic circuit with CDCM applied to Q2.

First, results are shown for a pipeline example circuit (Fig. 6(a)). The pipeline circuit has two memory cells, 4 inputs and 1 output, with 7 logic gates. As we can see, the critical path candidates all end up in the second memory cell, Q2. Hence, to implement the proposed methodology we add a CDCM module (enhanced architecture) to control the clock at Q2 (Fig. 6(b)).

Figure 7 shows SPICE results for the pipeline circuit of Fig. 6. This circuit was stimulated with pseudo-random vectors. The flip-flops have fall-edge trigger activation, except the D flip-flop used to synchronize the BIPS signal. As the critical combinational paths of the circuit end at Q2, with a lower V<sub>DD</sub> this cell will capture the wrong signal value. In the first graph of Fig. 7(a) we see the correct operation of Q2. The second graph depicts the assumed  $V_{DD}$  variation (voltage ramp down and up).  $V_{DD}$  varies between 3.3 V and 2.7 V. In the third graph, the erroneous Q2 signal, caused by the V<sub>DD</sub> depletion, is visible. Applying our methodology to control the clock driving Q2, we introduce the BIPS, D-FF, MUX 2X1 and CSL blocks (Fig. 3). In the first graph of Fig. 7(b) the BIPS block identifies the need for correction. With the correction circuitry, the MUX 2X1 will switch the Q2 clock to a corrected clock (generated by the CSL) on the next rising edge of the clock after the BIPS detection. The second graph shows the clock applied to Q2: during BIPS detection, the CDC is stretched. This allows Q2 to capture the correct value during V<sub>DD</sub> depletion (third graph).



Fig. 7: SPICE results for the Pipeline circuit example: (a) Without CDCM; (b) With CDCM.



Fig. 8: SPICE results for the Pipeline circuit example with CDCM (detail).

However, the detail depicted in Fig. 8 shows that this enhanced architecture may have limited application, due to the extra delay it incurs. In fact, as can be seen, when  $V_{DD} < V_{DDth}$ , the BIPS output goes to '1' (on the next rising edge of the clock), and a  $(\tau_o + \tau_M)$  delay is now introduced. When  $V_{DD}$  rises again, and the BIPS output goes to '0', the switching of the MUX 2X1 leads to a narrow pulse. In this case, it was still sufficient to capture the data correctly. However, this requires a very delicate design to avoid problems. Hence, we concentrate on the alternative, the core architecture, which is more robust. This is shown in the next example.

Consider an FSM benchmark example, the B02 ITC'99 benchmark [20]. B02 is an FSM that recognizes BCD numbers. B02 is implemented with 28 gates, 4 flip-flops (3 of them are feedback memory elements), 1 input and 1 output. In this case, the combinational critical path ends on flip-flop Q2, which also has a short path in its input combinational cone, coming from the single primary input. Therefore, to implement the proposed methodology we added one CDCM module (core architecture) to control the clock driving Q2 and added one buffer to the short path. In order to use the same CSL block as in the previous example, we opted to implement the flip-flops with fall-edge trigger activation.

Figure 9 shows SPICE results for the B02 circuit benchmark, stimulated with random vectors. As shown in Fig. 9(a), an incorrect value is captured at Q2 when  $V_{DD}$  is decreased. When we introduce the CDCM module, the clock duty-cycle applied to Q2 is changed to 57% (due to  $\tau_{o}$ , CDC is different from 50%). When  $V_{DD}$  is depleted (down to  $\gamma(V_{DD})$ =21%), the dynamic CDC change allows Q2 to capture the correct signal in the presence of  $V_{DD}$  variation, as shown in Fig. 9(b).



Fig. 9: SPICE results for B02 ITC'99 benchmark: (a) Without CDCM; (b) With CDCM.

### 5. CONCLUSIONS

A new methodology has been proposed in this work to enhance circuit tolerance to power-supply instability in a synchronous part of an IP core or SoC, without degrading its performance. The goal is to keep circuit synchronism and performance (i.e., to keep at-speed operation) even in the presence of power line voltage disturbances that, due to increased signal path delays, may cause a de-synchronization effect and disrupt signal integrity.

The methodology is based on a CDC modulation (CDCM) module, which performs V<sub>DD</sub> sensing and CDC modulation. Two architectures have been proposed, the core and the enhanced architecture. The core architecture implements a <u>c</u>lock <u>s</u>tretching <u>l</u>ogic (CSL) block, which performs both functions with low area overhead and low intrinsic delay ( $\tau_o$ ). As  $V_{DD}$  fluctuates, the clock duty cycle is adapted on-line, according to the supply voltage variation. The enhanced architecture induces CDC variations only beyond a user-defined power supply voltage threshold, V<sub>DDth</sub>, separating the two functions into two blocks: the CSL and a <u>b</u>uilt-<u>i</u>n <u>p</u>ower supply transient <u>s</u>ensor (BIPS) block. The amount of CDC variation, as a function of V<sub>DD</sub> variation, can be computed from a simple semi-empirical delay model described in [5][10]. This computation drives the design of the CDCM module. Basically, for the worst-case situation, for which the stretched CDC keeps the user-defined time slack, CDC variation is a replica of the propagation delay variation along the combinational paths. The domain of application and the limitations of the proposed methodology have also been identified.

Simulation results demonstrate the *effectiveness* of the proposed methodology in sensing abnormal power supply voltage activity and reacting to this disturbance by stretching the clock signal driving the critical memory cells. Simulation results have also shown the *intrinsic robustness* of the BIPS and CSL blocks to the considered V<sub>DD</sub> variations. As shown in Section 4, by carefully adjusting internal BIPS/CSL block parameters it is possible to reach self-tolerance to power-supply variation up to 50%.

Simulation results also show that the *core architecture* is clearly better suited than the enhanced architecture. This is especially true for high-performance products, for which high clock frequency, low time slack and a limited difference between the propagation delays of the critical paths and of the clock signal paths are expected. Hence, the core architecture will be selected for future applications.

Future work includes the procedures for the introduction of the proposed methodology in a VLSI design flow. Here, two directions of work emerge. First, the automation of critical path identification, selection of critical memory cells, CDCM cell library definition, and insertion. Second, the availability of simulation tools, at logic level, which may simulate, at this level of abstraction, the impact of  $V_{DD}$  fluctuations on the circuit timing behavior, and the effectiveness of CDCM insertion to enhance circuit tolerance. Both directions are under scrutiny. The work will be reported in the future.

### REFERENCES

- [1] M. Nourani, A. Attarha. "Signal Integrity: Fault Modeling and Testing in High-Speed SoCs". J. of Electronic Testing: Theory and Applications (JETTA), pp. 539-554, 2002.
- [2] S. Bobba, T. Thorp, K. Aingaran, D. Liu, "IC Power Distribution Challenges", Proc. IEEE VLSI Test Symposium (VTS), pp. 643-650, 2001.
- [3] A. Krstic, Y-M. Jiang, K.-T. Cheng, "Pattern Generation for Delay Testing and Dynamic Timing Analysis Considering Power-Supply Noise Effects" IEEE Transactions on CAD, vol. 20, no. 3, pp. 416-425, 2001.
- [4] H. Chen, L. Wang. "Design for Signal Integrity: The New Paradigm for Deep-Submicron VLSI Design". Proc. Int. Symposium on VLSI Technology, pp. 329-333, 1997.
- [5] M. Rodriguez-Irago et al., "Dynamic Fault Test and Diagnosis in Digital Systems Using Multiple Clock Schemes and Multi-VDD Test". Proc. IEEE Int. On-Line Testing Symposium (IOLTS), pp. 281-286, 2005.
- [6] C. Metra, S. Di Francescantonio, T.M. Mak, "Implications of Clock Distribution Faults and Issues with Screening Them during Manufacturing Testing", IEEE Transactions on Computers, vol. 53, no. 5, pp. 531-546, May 2004.
- [7] D. Ernst, N. Sung Kim, S. Das, S. Pant, T. Pham, R. Rao, C. Ziesler, D. Blaauw, T. A. Trevor Mudge, "Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation", Micro Conference, December, 2003.
- [8] D. Ernst, S. Das, S. Lee, D. Blaauw, T. Austin, T. Mudge, N. Sung Kim, K. Flautner, "Razor: Circuit-Level Correction of Timing Errors for Low-Power Operation", IEEE Micro, 24(6):10-20, November 2004.
- [9] K. Bernstein, K.M. Carrig, Ch.M. Durham, P.R. Hansen, D. Hogenmiller, E.J. Nowak, N.J. Rohrer, "High Speed CMOS Design Styles", Kluwer 1999.
- [10] D. Barros Júnior et al., "Fault Modeling and Simulation of Power Supply Voltage Transients in Digital Systems on a Chip", JETTA, vol.21, pp. 349-363, Kluwer, August 2005.
- [11] L. Green. "Simulation, Modeling and Understanding the Importance of Signal Integrity". IEEE Circuit and Devices Magazine, pp. 7-10, Nov. 1999.
- [12] R. Downing, P. Gebler, G. Katopis. "Decoupling Capacitor Effects on Switching Noise". IEEE Trans. on Components, Hybrids and Manufacturing Technology, vol.16, no.5, pp. 484-489, Aug. 1993.
- [13] R. Saleh, D. Overhauser, S. Taylor. "Full-Chip Verification of UDSM Designs" Proc. Int. Conf. on Computer Aided Design (ICCAD), pp. 453-460, 1998.
- [14] A. Kahng, S. Muddu, E. Sarto. "Interconnect Optimization Strategies for High-Performance VLSI Designs". Proc. Int. Conf. on VLSI Design, pp. 464-469, Aug. 1999.
- [15] G. Tellez, M. Sarrafzadeh. "Minimal Buffer Insertion in Clock Trees with Skew and Slew Rate Constraints." IEEE Trans. on CAD, vol. 16, no. 4, pp. 333-342, April 1997.
- [16] Y. Jiang, S. Sapatnekar, C. Bamji and J. Kim. "Interleaving Buffer Insertion and Transistor Resizing into a Single Optimization". IEEE Trans. on VLSI Systems, vol. 6, no. 4, pp. 625-633, Dec. 1998.
- [17] T. Steinecke, W. John Koehne, M. Schmidt. "EMC Modeling and Simulation on Chiplevel". IEEE Int. Symp. on Electromagnetic Compatibility (EMC), 14-16 Aug. 2001.
- [18] S. Ben Dia, M. Ramdani, E. Sicard, editors, "Electromagnetic Compatibility of Integrated Circuits – Techniques for Low Emission and Susceptibility", Springer, 2006.
- [19] INESC-ID, U. Vigo, PUCRS, Portuguese National Patent Nº. 103436, "Método para Aumentar a Tolerância Dinâmica da Parte Digital de um Sistema Electrónico Integrado, a Variações de Tensão Eléctrica de Alimentação e de Temperatura", Boletim da Propriedade Industrial (BPI) 12/2006, October 2006 (in Portuguese).
- [20] S. Davidson, "ITC'99 Benchmark Circuits Preliminary Results", Proc. IEEE International Test Conf (ITC), pp. 1125, 1999; available at <a href="http://www.cerc.utexas.edu/itc99-benchmarks/bench.html">http://www.cerc.utexas.edu/itc99-benchmarks/bench.html</a>.