# Soft Error Rate Determination for Nanoscale Sequential Logic

Fan Wang, Juniper Networks, Inc., Sunnyvale, CA 94086, USA, fanw@juniper.net

Vishwani D. Agrawal, Auburn University, Dept. of ECE, Auburn, AL 36849, USA, vagrawal@eng.auburn.edu

## Abstract

We analyze the neutron-induced soft error rate (SER) by modeling the induced error pulse with two parameters: its occurrence frequency and the probability density function of its width. We extend the analysis to sequential logic and latches and calculate the failures in time (FIT) rate. The analysis builds on the available, experimentally determined background neutron flux data. This data, along with the device characteristics, gives the induced pulse parameters. A gate-level algorithm propagates the pulse parameters through logic gates and correctly models the logic masking of error pulses. We introduce the concept of a latching window, which accurately models temporal masking by sequential elements, and present an algorithm for SER analysis of sequential logic.

## Keywords

FIT rate, SEU, SER, soft error, sequential circuits

## 1. Introduction

Soft errors are radiation-induced errors in microelectronic circuits. They occur when charged particles strike a sensitive region in the silicon of a circuit. Soft errors used to be a major concern for avionics and space mission applications. Today they are considered a bottleneck in the evolution of next-generation electronics because nano-devices tend to be more sensitive. Meanwhile, the VLSI industry continues to scale down transistor size, threshold voltage, and oxide thickness to meet the growing demand for higher levels of integration and performance. However, issues such as leakage power dissipation, large variability in device parameters, and soft error reliability arise with such scaling [5, 14].

Studies show that environmental failure mechanisms exhibit product failure rates on the order of $1\sim100$ FIT (failures in time, 1 FIT = 1 failure in $10^9$ hours). On the other hand, the soft error rate of a low-voltage embedded SRAM can easily be 1000 FIT/Mbit [2]. Soft error reliability thus becomes the Achilles' heel of large modern computing systems. Errors caused by cosmic rays and alpha particles will remain the prevalent causes of failure in electronic systems because errors caused by physical defects are significantly reduced by advanced design and manufacturing techniques. Neutrons have been shown to be the principal cause of error transients among all cosmic particles for ground-level electronics [24], and hence we consider only neutron-induced soft errors in this work.

Obtaining sufficient reliability information, especially the soft error rate (SER), before a chip is manufactured is critical for chip engineers. A certain level of error protection may be added to the design if it does not meet the customer's reliability requirements. An accurate prediction of SER needs SER simulation using actual circuit models, which include device, process, technology and environmental parameters.

Most integrated circuits are tested at particle accelerators using accelerated testing methods. The purpose of accelerated life tests is to identify and quantify the failures and failure mechanisms that cause products to wear out before the end of their working life. Unfortunately, accelerated life testing is very expensive because multiple runs are normally needed for a sufficient number of samples under test to fail and for the data to be statistically meaningful. The test time may typically vary from a few weeks to a few months [4].

Analytical approaches to calculating the circuit soft error rate are good alternatives. Recent work includes Asadi et al. [1], Rao et al.
[13], Zhang and Shanbhag [22], Miskov-Zivanov and Marculescu [8] and Wang and Agrawal [18]. There are also recently published books on the soft error phenomenon as it applies to memories [23] and processor systems [10]. A comprehensive text of a general nature on soft errors is expected soon [12].

Logic circuits have specific masking effects on single event transients (SETs). The masking factors are modeled as electrical masking, logic masking and temporal masking [11]. In addition, environmental factors such as location, altitude, and longitude also play important roles in determining logic circuit SER. Accurate estimation of logic circuit SER requires a comprehensive model that considers both circuit characteristics and environmental factors, and it continues to be a major challenge.

Figure 1. Problem statement.

Section 2 reviews an environment-dependent soft error model, which is based on both the error occurrence rate, represented as a probability, and the SET pulse width density, represented as a probability density function [18, 19, 20]. In Section 3, we apply the soft error rate analysis method to latches. In Section 4, we estimate the failures in time (FIT) rate for sequential logic circuits. The proposed method consists of two steps: (1) categorize the soft errors and (2) propagate the error probability through the logic, depending on the error category, using either a combinational SER analysis method or a time-frame expansion method. In Section 5, we conclude.

## 2. Review of Soft Error Models

### 2.1 An Environment-Based Model [17]

A single event transient pulse is induced when a neutron strikes a sensitive region of the circuit with sufficient energy. The resulting voltage pulse propagates through an activated path in the logic circuit. When the pulse is captured by a clock edge, a soft error occurs. Otherwise, the error pulse is just a transient. A neutron-induced SET has distinctive characteristics such as polarity, waveform shape, amplitude and duration.
These characteristics depend on particle impact location, particle energy, device technology, device supply voltage and output load [9]. We model the neutron-induced SET pulse by two factors based on the natural characteristics of the environmental neutron flux.

The transient current pulse created by a particle strike for each given linear energy transfer (LET) can be modeled as a double exponential equation for any given neutron energy [7]. LET is a measure of the energy transferred to the device per unit length as a particle travels through the material. Through charging and discharging of the circuit node capacitance, the transient current pulse is converted to a transient voltage pulse. We consider all neutron flux energy components in our soft error model and average the error occurrence probability per particle hit for each circuit node. The flux energy (LET) density is converted to an SET width density.

Figure 2 illustrates a neutron-induced soft error model for logic circuits. This probabilistic soft error model is based on two factors: (1) the SEU occurrence rate, expressed as a probability, and (2) once an SEU occurs, its effect on various nodes of the logic circuit as SET pulses whose widths are represented by probability density functions. Both factors are strongly related to the environmental parameters and the circuit material, and can be obtained from the experimentally measured data available in the literature [7].

Figure 2. A probabilistic soft error model based on background neutron flux.

The error occurrence frequency $p$ for each circuit node is given by Equation 1, where $flux$ is the background neutron flux expressed in $m^{-2} \cdot s^{-1}$, $A$ is the sensitive area of the node in $m^2$, and $p_{perhit}$ is the probability of an error being induced by each particle. From [24], the total neutron flux at sea level is $56.5\ m^{-2} s^{-1}$.

$$p = p_{perhit} \times flux \times A \tag{1}$$

### 2.2 Gate Level Propagation Algorithm [20]

A soft error model is derived for the error probability and the SET width density as the induced pulse propagates through a logic gate. The problem is first stated as follows: given $f_X(x)$ and a function $g$ with $Y = g(X)$, where $g$ is differentiable and increasing so that $g'$ and $g^{-1}$ exist, find $f_Y(y)$. Here $X$ and $Y$ are random variables:

- $X$: input pulse width
- $Y$: output pulse width
- $f_X(x)$: probability density function of $X$
- $f_Y(y)$: probability density function of $Y$

More specifically, the function $g$ is expressed as $Y = g(X, \text{pMOS } W/L, \text{nMOS } W/L, C_{load}, \text{technology})$. From the theory of random functions, with $x = g^{-1}(y)$, we have:

$$f_Y(y) = \frac{f_X(x)}{g'(x)}. \tag{2}$$

Load capacitances are generally determined from the layout. For estimating the SER before the circuit is physically laid out, we used a wire-load capacitance model [16]. Wire-load models estimate the capacitance of a net from its pin count and the technology data. The load capacitance of a gate can be simply estimated as the technology-dependent nominal gate delay multiplied by $(1 + \text{number of fanouts})$.

When an input pulse passes through a generic logic gate, it obeys the following rules:

1. Propagation with no attenuation, if $D_{in} \geq 2\tau_p$.
2. Propagation with attenuation, if $\tau_p < D_{in} < 2\tau_p$.
3. Non-propagation, if $D_{in} \leq \tau_p$.

where

- $D_{in}$: input pulse width, also represented by the random variable $X$.
- $D_{out}$: output pulse width (to be determined), also represented by the random variable $Y$.
- $\tau_p$: gate input-to-output delay.

The pulse width propagation function $g$ for each individual gate is non-linear but can be approximated by the three-interval piecewise-linear function of Equation 3, where $\tau_p$ is the average of the rise delay and fall delay for the gate output.
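
The three propagation rules can be sketched in a few lines of Python, taking the attenuation interval to be the linear ramp $2(D_{in} - \tau_p)$ of Equation 3. This is only an illustrative sketch: the function names `propagate_pulse_width` and `propagate_along_path` are hypothetical, and chaining the rule gate by gate along a sensitized path is an assumption about how the rule would be applied, not a description of the authors' simulator.

```python
def propagate_pulse_width(d_in: float, tau_p: float) -> float:
    """Return the output SET width for input width d_in at a gate of delay tau_p."""
    if d_in <= tau_p:
        # Rule 3: the pulse is too narrow and does not propagate.
        return 0.0
    if d_in < 2.0 * tau_p:
        # Rule 2: propagation with attenuation (linear ramp of Equation 3).
        return 2.0 * (d_in - tau_p)
    # Rule 1: wide pulses propagate with no attenuation.
    return d_in


def propagate_along_path(d_in: float, gate_delays: list[float]) -> float:
    """Apply the single-gate rule to each gate on a sensitized path, in order.

    Assumption for illustration: every gate on the path is sensitized,
    so only electrical (width) filtering is modeled here.
    """
    width = d_in
    for tau_p in gate_delays:
        width = propagate_pulse_width(width, tau_p)
        if width == 0.0:
            break  # the pulse has been filtered out; later gates see nothing
    return width
```

For instance, with hypothetical gate delays of 60 and 40 (in the same time unit as the pulse width), `propagate_along_path(90.0, [60.0, 40.0])` narrows a 90-unit pulse to 60 after the first gate and to 40 after the second.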
Comparison with HSPICE simulation [20] has validated this piecewise-linear model as a reasonable approximation.

$$D_{out} = \begin{cases} 0 & \text{if } D_{in} \leq \tau_p \\ 2(D_{in} - \tau_p) & \text{if } \tau_p < D_{in} < 2\tau_p \\ D_{in} & \text{if } D_{in} \geq 2\tau_p \end{cases} \tag{3}$$

In summary, given an error occurrence probability $p$ and an SET pulse width density function $f_X(x)$, after propagation through a logic gate with transfer function $g$, the error probability at the gate output, $p_{out}$, is given by Equation 4. Here, $p_{non\text{-}controlling}(i)$ is the probability that the $i$-th input has a non-controlling value and so lets the error pulse pass through.

$$p_{out} = p \cdot \int_{x>0} f_X(x)\,dx \cdot \underbrace{\prod_{i} p_{non\text{-}controlling}(i)}_{\text{Logic masking}} \tag{4}$$

## 3. Latching Window Masking Model

In this section, we introduce a masking factor to account for temporal masking, which was not considered in the combinational logic SER analysis, in order to extend our analysis to sequential logic. We use a simple latching window masking model [15] borrowed from the published literature on temporal masking analysis.

When SET pulses survive the combinational logic (logic masking and electrical masking) and arrive at latches, only the pulses of sufficient amplitude and width positioned around the latch closing edge will be captured. A latching window $(t_{lw})$ is a duration bounded by the setup time $(t_{su})$ and the hold time $(t_h)$ around the active clock edge of a flip-flop. For simplicity, we assume that SET pulses have very small rise and fall times, so that issues like the SET hold time and SET setup time defined in [6] are not considered in our framework. Also, in our probabilistic analysis, we exclude the possibility that an SET error pulse delays the correct input signal from arriving at the latch input and thereby causes an error.

Figure 3. Latching window masking.

SPICE simulation can be used to determine whether the SET has sufficient amplitude and duration to be captured by the latch. The simulation is performed by keeping the rise and fall times constant while varying the SET pulse width, to determine the minimum duration of a pulse that can be latched [15]. If the SET pulse width exceeds this minimum duration, the soft error occurs with a certain probability.

As defined in [15], an SET pulse that is present at the latch input throughout the entire latching window will be latched and cause a soft error. Suppose the latching window has length $w$ and an SET pulse of width $d$ arrives at a random position within a clock cycle of length $c$. The probability of the pulse causing a soft error is computed as the probability that a randomly placed interval of length $d$ covers a fixed interval of length $w$ within an overall interval of length $c$. This probability is given by Equation 5 and shown in Figure 3 [15].

$$p_{latch} = \begin{cases} 0, & \text{if } d < w \\ \dfrac{d-w}{c}, & \text{if } w \le d \le c + w \\ 1, & \text{if } d > c + w \end{cases} \tag{5}$$

where

- $d$ is the width of the SET pulse on arrival,
- $w$ is the size of the latching window, and
- $c$ is the clock cycle time.

Building on the SER analysis of combinational logic, suppose we know the SET pulse width probability density function $f(x)$ and the error occurrence frequency $p_{comb}$ upon arrival at the latch. From the latching window model, with the integration variable $x$ playing the role of the pulse width $d$ in Equation 5, we then have:

$$p_{error} = p_{comb} \times p_{latch} = \left[ \int_{w}^{c+w} \frac{x - w}{c}\, f(x)\,dx + \int_{c+w}^{\infty} f(x)\,dx \right] \times p_{comb} \tag{6}$$

Electrical masking is ignored in Equation 6 because latching window masking is the dominant factor for latches.

## 4. Proposed Methodology and Results

Unlike combinational circuits, sequential circuits can have feedback among flip-flops. Depending on the circuit topology and the particle hit position, some soft errors affect only the primary outputs (PO), while other errors go through the feedback path from pseudo primary outputs (PPO) to pseudo primary inputs (PPI) and propagate through the combinational logic again. Therefore, we classify soft errors into three categories. In Figure 4, we categorize the soft errors based on how they affect POs and PPOs.

Figure 4. Soft error categories depend on the topological effects on circuit outputs.

Suppose $F^i$ is the set of unprocessed soft errors and $A^i$, $B^i$ and $C^i$ are subsets of $F^i$. These subsets are mutually exclusive and we have $F^i = A^i \cup B^i \cup C^i \cup \delta$, where $\delta$ is the subset of circuit positions from which a soft error has zero probability of propagating to either a PO or a PPO if a particle strikes there, and the index $i$ denotes the $i$-th stage in the time-frame expansion method presented below. Thus,

- $F^i$ is the set of unprocessed soft errors,
- $A^i$ is the subset of soft errors affecting only PO,
- $B^i$ is the subset of soft errors affecting both PO and PPO, and
- $C^i$ is the subset of soft errors affecting only PPO.

The initial sets $A^1$, $B^1$ and $C^1$ can easily be found by circuit simulation or path analysis. For a soft error $e$, if $e \in A^i$, we simply use the SER analysis method for combinational logic presented in [21] to get the circuit SER. If $e \in B^i \cup C^i$, we introduce a time-frame expansion method to analyze the cyclic behavior of errors in the sequential logic. The concept is borrowed from time-frame expansion test generation for sequential logic [3]. In Figure 5, the whole sequential circuit, both the combinational block and the flip-flops, is duplicated $n$ times as stages 1 through $n$.

Figure 5. Time frame expansion method for sequential logic SER analysis.
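
The path analysis used to find the initial sets $A^1$, $B^1$ and $C^1$ can be sketched as a reachability check on the combinational block of a single time frame. The sketch below is a minimal illustration under stated assumptions, not the authors' C simulator: the netlist is given as a hypothetical `fanout` dictionary that maps every node to the nodes it drives, and the classification is purely structural, so it ignores logic masking (a site with a structural path to an output may still have zero propagation probability).

```python
from collections import deque


def downstream_nodes(fanout, start):
    """Return every node reachable from start (including start) via fanout edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in fanout.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def categorize_sites(fanout, primary_outputs, pseudo_primary_outputs):
    """Split strike sites into A (PO only), B (PO and PPO), C (PPO only), delta.

    fanout: dict mapping each node of the combinational block to the nodes
            it drives (hypothetical representation; every node must be a key).
    """
    categories = {"A": set(), "B": set(), "C": set(), "delta": set()}
    for site in fanout:
        cone = downstream_nodes(fanout, site)
        reaches_po = not cone.isdisjoint(primary_outputs)
        reaches_ppo = not cone.isdisjoint(pseudo_primary_outputs)
        if reaches_po and reaches_ppo:
            categories["B"].add(site)
        elif reaches_po:
            categories["A"].add(site)
        elif reaches_ppo:
            categories["C"].add(site)
        else:
            # No structural path to any PO or PPO: the site belongs to delta.
            categories["delta"].add(site)
    return categories
```

Calling `categorize_sites` once per strike site costs one breadth-first search each; for large netlists a single backward traversal from the POs and another from the PPOs would give the same sets with less work, at the cost of a slightly longer sketch.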

The soft error is introduced only in stage 1 (the initial stage), and the time-frame expansion analyzes the cumulative contribution of the induced error to the circuit SER. We use the error model $(p, f)$ and the gate-level propagation algorithm introduced in Section 2, and the latching window model of Section 3. The overall error rate of a sequential circuit is given by Equation 7. Figure 6 shows how the soft error sets $A^i$, $B^i$ and $C^i$ change from stage to stage. The size of $F^i$ keeps shrinking and $SER_i$ keeps approaching the final SER, as follows:

$$SER = \sum_{i=1}^{\infty} \left( \sum_{e \in A^i \cup B^i \cup C^i} SER(e) \right) \tag{7}$$

Figure 6. Soft error set transformation under different stages.

After the error has propagated through stage 1, soft errors in stages 2 through $n$ occur only on pseudo primary inputs. So, for stages 2 through $n$, the subsets $A^i$, $B^i$ and $C^i$ can be obtained by assuming that errors occur only on PPIs. We further assume that after the $n$-th stage the error probability on the PPIs is small enough that its contribution to the circuit SER can be neglected in the analysis. Thus, Equation 7 can be simplified to:

$$SER \approx \widetilde{SER} = \underbrace{\sum_{e \in A^1 \cup B^1 \cup C^1} SER(e)}_{\text{Stage 1}} + \underbrace{\sum_{i=2}^{n} \left( \sum_{e \in \overline{A^i} \cup \overline{B^i} \cup \overline{C^i}} SER(e) \right)}_{\text{Stage 2 to } n} \tag{8}$$

where $\overline{A^i} \cup \overline{B^i} \cup \overline{C^i} = \{PPI_1, PPI_2, \ldots\}$.

Figure 7. Trend of SER(i) and F(i) at the $i$-th stage for s27.

We analyzed ISCAS'89 benchmark circuits using a simulator developed in the C programming language. For simplicity, we assume that all circuits are working at ground level and that the probability of an SEU per particle hit is $10^{-4}$. We assume that the SET width density per circuit node follows a normal distribution with mean $\mu=150$ and standard deviation $\sigma=50$. From [24], the total neutron flux at sea level is $56.5\ m^{-2}s^{-1}$.
For a CMOS circuit in TSMC035 technology, we assume the sensitive region to be $10\ \mu m^2$ for each circuit node and set the clock frequency to 1 GHz. Using the proposed algorithm for sequential logic (Equation 8), the SER values obtained for the ISCAS'89 circuits are shown in Table 2. Figure 7 shows that the SER for the s27 circuit is 380 FIT.

Figure 8. ISCAS'89 benchmark circuit s27.

Table 1. An example of soft error categories at stage 1 for s27.

| $A^i$ | $B^i$ | $C^i$ |
|-------|-------|-------|
| G8 | I1, I2, G1, G2, G4, G5, G7, G8 | I3, G6, I4, G3, G10 |

**Discussion** We examine some recent SER analysis methods for applicability to sequential circuits and list open problems for future research.

- (1) A comprehensive analysis of the soft error sample size and the number of clock cycles in Monte Carlo (MC) simulation is needed. It is pointless to compare run time against Monte Carlo simulation without considering the accuracy of the MC simulation, whose sample size must be statistically meaningful to achieve any level of confidence. We find that this part seems to be missing in most of the published literature.
- (2) For Monte Carlo simulation methods and probabilistic estimation approaches [1] that require SPICE simulations taking several days to complete, the error probabilities for circuit nodes are found to be much higher, possibly by several orders of magnitude, than real-world measured values. The proposed work may prove to be more relevant when compared with real measured SER data. For other methods, as discussed in [21], the estimated SER can be $10^9$ times that of measured data.

## 5. Conclusion

We have extended an environment-based soft error model to sequential logic circuit SER analysis. The soft error model is characterized by the error occurrence rate, the SET pulse width density, and a new latching window model. The temporal masking factor is taken into account for sequential logic.
To estimate the sequential SER, we have developed a repetitive two-phase method: (1) categorize the soft errors by their impact on primary outputs and pseudo-primary outputs, and (2) depending on the categorization, use either combinational logic analysis methods or the time-frame expansion method to analyze the error effects in a cyclic structure. After the first time frame of combinational logic, the analysis is simplified by considering soft errors only on pseudo-primary inputs. Results show that with an increasing number of stages, the SER converges to realistic values as the unprocessed soft error set $F^i$ keeps shrinking.

Table 2. SER estimation for ISCAS'89 benchmark sequential circuits.

| Circuit | # PI | # PO | # Gates | # FF | CPU (s) | SER ($\times 10^3$ FIT) |
|---------|------|------|---------|------|---------|-------------------------|
| s27 | 4 | 1 | 10 | 3 | 0.01 | 0.38 |
| s298 | 3 | 6 | 119 | 14 | 0.19 | 1.05 |
| s386 | 7 | 7 | 159 | 6 | 0.22 | 2.86 |
| s444 | 3 | 6 | 181 | 21 | 0.51 | 4.93 |
| s526 | 3 | 6 | 193 | 21 | 0.82 | 4.51 |
| s832 | 18 | 19 | 287 | 5 | 2.07 | 12.79 |
| s1196 | 14 | 14 | 529 | 18 | 3.78 | 34.62 |
| s1494 | 8 | 19 | 647 | 6 | 8.96 | 46.20 |
| s5378 | 35 | 49 | 2779 | 179 | 12.19 | 102.36 |

## References

- [1] G. Asadi and M. B. Tahoori, "An Accurate SER Estimation Method Based on Propagation Probability," in *Proc. Design Automation and Test in Europe Conf.*, 2005, pp. 306–307.
- [2] R. Baumann, "Technology Scaling Trends and Accelerated Testing for Soft Errors in Commercial Silicon Devices," in *Proc. 9th On-Line Testing Symposium*, 2003, p. 4.
- [3] M. L. Bushnell and V. D. Agrawal, *Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits*. Springer, 2000.
- [4] C. Croarkin, P. Tobias, and C. Zey, *Engineering Statistics Handbook*. NIST and SEMATECH, USA, 2001.
- [5] C. Hawkins, K. Baker, K. M. Butler, J. Fiquera, M. Nicolaidis, V. B. Rao, R. Roy, and T.
Welsher, "IC Reliability and Test: What Will Deep Submicron Bring?," *IEEE Design & Test of Computers*, vol. 16, no. 2, pp. 84–91, 1999.
- [6] M. Hosseinabady, P. Lotfi-Kamran, G. Di Natale, S. Di Carlo, A. Benso, and P. Prinetto, "Single-Event Upset Analysis and Protection in High Speed Circuits," in *Proc. Eleventh IEEE European Test Symposium (ETS'06)*, 2006, pp. 29–34.
- [7] G. C. Messenger and M. Ash, *Single Event Phenomena*. Chapman & Hall, 1997.
- [8] N. Miskov-Zivanov and D. Marculescu, "Soft Error Rate Analysis for Sequential Circuits," in *Proc. Conference on Design, Automation and Test in Europe*, EDA Consortium, San Jose, CA, USA, 2007, pp. 1436–1441.
- [9] S. Mitra, N. Kee, and S. Kim, "Robust System Design with Built-In Soft-Error Resilience," *IEEE Design & Test Computers*, vol. 38, no. 2, pp. 43–52, 2005.
- [10] S. Mukherjee, *Architecture Design for Soft Errors*. Morgan-Kaufmann, 2008.
- [11] H. T. Nguyen and Y. Yagil, "A Systematic Approach to SER Estimation and Solutions," in *Proc. 41st Annual IEEE International Reliability Physics Symposium*, 2003, pp. 60–70.
- [12] M. Nicolaidis, editor, *Soft Errors in Modern Electronic Systems*. Springer, 2010. To be published.
- [13] R. R. Rao, K. Chopra, D. Blaauw, and D. Sylvester, "An Efficient Static Algorithm for Computing the Soft Error Rates of Combinational Circuits," in *Proc. Conference on Design, Automation and Test in Europe*, 2006, pp. 164–169.
- [14] K. Roy, T. M. Mak, and K. T. Cheng, "Test Consideration for Nanometer-Scale CMOS Circuits," *IEEE Design & Test of Computers*, vol. 23, no. 2, pp. 128–136.
- [15] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, "Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic," in *Proc. International Conference on Dependable Systems and Networks*, 2002, pp. 389–398.
- [16] M. J. S. Smith, *Application-Specific Integrated Circuits*. Reading, Massachusetts: Addison-Wesley, 1997.
- [17] F.
Wang, *Soft Error Rate Determination for Nanometer CMOS VLSI Circuits*. Master's thesis, Auburn University, Dept. of ECE, May 2008.
- [18] F. Wang and V. D. Agrawal, "Probabilistic Soft Error Rate Determination from Statistical SEU Parameters," in *Proc. 17th IEEE North Atlantic Test Workshop*, May 2008.
- [19] F. Wang and V. D. Agrawal, "Single Event Upset: An Embedded Tutorial," in *Proc. 21st International Conference on VLSI Design*, 2008, pp. 429–434.
- [20] F. Wang and V. D. Agrawal, "Soft Error Rate Determination for Nanometer CMOS VLSI Circuits," in *Proc. 40th Southeastern Symposium on System Theory*, Mar. 2008, pp. 324–328.
- [21] F. Wang and V. D. Agrawal, "Soft Error Rates with Inertial and Logical Masking," in *Proc. 22nd International Conference on VLSI Design*, 2009, pp. 459–464.
- [22] M. Zhang and N. R. Shanbhag, "A Soft Error Rate Analysis (SERA) Methodology," in *Proc. IEEE/ACM International Conference on Computer-Aided Design*, 2004, pp. 111–118.
- [23] J. Ziegler and H. Puchner, *SER History, Trends, and Challenges: A Guide for Designing Memory ICs*. Cypress Online Store, www.cypress.com/support, 2004.
- [24] J. F. Ziegler, "Terrestrial Cosmic Rays," *IBM Journal of Research and Development*, vol. 40, no. 1, pp. 19–39, 1996.