Characterizing, modeling, and analyzing soft error propagation in asynchronous and synchronous digital circuits

https://doi.org/10.1016/j.microrel.2014.09.025

Highlights

  • Analysis of soft error propagation in asynchronous and synchronous digital designs.

  • Modeling electrical and logical masking at gate level by utilizing MDG and GP sets.

  • Identification of critical input sequences for soft error propagation.

  • Faster, efficient, and more accurate estimation of the soft error rate (SER).

Abstract

Soft errors caused by cosmic radiation are one of the major challenges for reliable VLSI designs. In this paper, we present a symbolic framework to model soft errors in both synchronous and asynchronous designs. The proposed methodology utilizes Multiway Decision Graphs (MDGs) and glitch-propagation sets (GP sets) to estimate the soft error rate (SER) at gate level. This work helps mitigate design for testability (DFT) issues related to identifying the controllable and observable circuit nodes when the circuit is subject to soft errors. The methodology also allows designers to apply radiation tolerance techniques to reduced sets of internal nodes. To demonstrate the effectiveness of our technique, several ISCAS89 sequential and combinational benchmark circuits and multiple asynchronous handshake circuits have been analyzed. Results indicate that the proposed technique is on average 4.29 times faster than the best contemporary state-of-the-art techniques. The proposed technique can exhaustively identify soft error glitch propagation paths, which are then used to estimate the SER. To the best of our knowledge, this is the first time that a decision-diagram-based soft error identification approach has been proposed for asynchronous circuits.

Introduction

With the scaling of process technologies, the ratio between the charge required to introduce a current pulse that changes a logic state, i.e. the critical charge (QCRIT), and the charge collection efficiency (QS) has been reduced significantly [1], [2]. Therefore, the possibility of transient pulse generation, when an energetic particle hits one of the sensitive sites of a combinational logic block or of an asynchronous circuit, has considerably increased in modern deep sub-micron (DSM) technologies. This phenomenon is known as a single event transient (SET). These transients may cause reversible (soft) and irreversible (hard) faults in digital designs, which are often called Single Event Effects (SEEs). Soft faults are more difficult to trace because they are hard to reproduce. If a soft fault changes the state of a system, it leads to a soft error. Hence, with technology evolution, the soft error rate (SER) has become an important reliability metric, and there is a growing need for fast, accurate, and efficient estimation of that metric.

Soft faults and the resulting soft errors are traditionally studied at the physical design level. It is desirable, however, to develop a technique for SER estimation at a higher abstraction level. Recently, some groups have developed methodologies to perform SER estimation at the gate level or higher. One of those techniques is simulation with fault injection based on random vector generation [3], [4], [5]. However, the simulation-based approach has serious shortcomings, as fault simulations can be very time consuming for large designs with many primary inputs and sequential states. When complexity forces the use of sampling techniques, the accuracy of fault simulation degrades as the ratio of the simulated sample size to the total vector space size shrinks.
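The trade-off between exhaustive and sampled fault injection can be illustrated with a minimal Python sketch. The three-input toy circuit, the injected node `n1`, and all function names are assumptions made for illustration; they are not taken from the cited tools. The sketch flips one internal node and measures how often the flip reaches the output, first over the whole vector space and then over a random sample.

```python
import itertools
import random

def circuit(a, b, c, flip_n1=False):
    """Toy circuit (assumed): out = (a AND b) OR c; flip_n1 inverts node n1."""
    n1 = a & b
    if flip_n1:
        n1 ^= 1  # injected transient fault on the internal node n1
    return n1 | c

def propagates(vec):
    """True if flipping n1 changes the primary output for this input vector."""
    return circuit(*vec) != circuit(*vec, flip_n1=True)

def exhaustive_rate():
    """Fraction of all 2^3 input vectors for which the fault reaches the output."""
    vectors = list(itertools.product((0, 1), repeat=3))
    return sum(propagates(v) for v in vectors) / len(vectors)

def sampled_rate(samples, seed=0):
    """Same estimate from randomly generated input vectors."""
    rng = random.Random(seed)
    hits = sum(
        propagates(tuple(rng.randint(0, 1) for _ in range(3)))
        for _ in range(samples)
    )
    return hits / samples
```

For three inputs the exhaustive sweep is trivial, but the vector space doubles with every added primary input or state bit, which is why sampling becomes unavoidable and its coverage becomes the limiting factor.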

Another technique used for soft error modeling and estimation is based on binary decision diagrams (BDDs) [6], [7], [8]. BDDs notoriously suffer from a state space explosion problem. In [9], the authors proposed a combination of reduced ordered binary decision diagrams (ROBDDs) and algebraic decision diagrams (ADDs) to simultaneously model and analyze the effects of logical, electrical, and time masking. However, the use of two decision diagrams makes this technique more complex, and it consumes a considerable amount of memory. Recently, a new technique was proposed in [10], which leverages the concepts of Boolean satisfiability and uses SAT (satisfiability) solvers. Despite the use of very efficient SAT solvers, this method is time consuming and resource hungry, partly because it requires unrolling copies of the circuit when modeling sequential designs.

Notably, none of the above-mentioned approaches deals with asynchronous handshaking circuits, which have recently become a popular solution for clock domain crossing (CDC) interfaces. The utilization of asynchronous handshakes will increase once emerging arbitration schemes are fully evolved [1]. Moreover, 25% of the global signals in integrated circuits will use asynchronous handshakes [28]. Since asynchronous interfaces use combinational logic and feedback, they are more vulnerable than combinational logic to soft error glitches that may result from poor signal integrity. In certain cases, errors occurring in asynchronous circuits can have catastrophic effects due to the event ordering constraints that may cause circuit failures or deadlocks [11]. Therefore, analyzing the sensitivity to soft errors in asynchronous circuits early in the design cycle is a growing concern. As a step in that direction, a new simulation-based methodology to analyze Quasi-Delay Insensitive (QDI) asynchronous circuits was proposed in [30]. In that method, probability distributions and confidence intervals are used to decrease the total number of simulations.
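Why feedback makes a glitch more harmful can be sketched with a Muller C-element, a state-holding cell commonly used in asynchronous handshake logic. The choice of cell, the input traces, and the function names are assumptions for illustration only; the text above does not name a specific circuit. The same transient is applied to a plain AND gate and to a C-element.

```python
def c_element(a, b, prev):
    """Muller C-element: output follows the inputs when they agree, else holds."""
    return a if a == b else prev

def run_c_element(trace, state=0):
    """Apply a sequence of (a, b) input pairs and record the output each step."""
    outputs = []
    for a, b in trace:
        state = c_element(a, b, state)
        outputs.append(state)
    return outputs

def run_and_gate(trace):
    """Memoryless reference: a plain AND gate driven by the same inputs."""
    return [a & b for a, b in trace]

clean = [(1, 0), (1, 0), (1, 0)]
glitchy = [(1, 0), (1, 1), (1, 0)]  # transient 0 -> 1 -> 0 on input b
```

In this sketch the AND gate output returns to 0 once the transient ends, while the C-element keeps the erroneous 1 because its feedback holds the last agreed value.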

In order to overcome these shortcomings, a new methodology to characterize, model, and analyze soft error propagation at gate level is proposed. This work is distinct in the following ways:

  • (1) A new technique is proposed to model soft error glitch propagation in digital designs using Multiway Decision Graphs (MDGs) [12]. MDGs are chosen over other types of decision graphs because they allow defining several data types, which are used to capture different characteristics related to glitches. The enumerated data type in MDG facilitates modeling both soft error glitch width variation (electrical masking) and the sensitization path (logical masking) in a single decision diagram.

  • (2) It is demonstrated that the proposed technique models the expected circuit behavior in case of soft error glitches utilizing only gate level information. This methodical approach is fully automated in a new tool called SEGP-Finder. This technique can be used by designers to facilitate insertion of error-mitigation mechanisms on selected control and data paths. Moreover, this work identifies the set of conditions that may lead to soft error propagation; hence it reduces the requirement of having controllable or observable nodes, which may be necessary from a design for testability (DFT) perspective. Consequently, considering these conditions may reduce the need for complete and more expensive redundancy techniques, such as systematic triple modular redundancy (TMR).

  • (3) In this methodology, we combine the adopted SER estimation algorithm (similar to [10]) with our proposed soft error glitch propagation analysis. This combination leads to a more accurate analysis that requires less memory than contemporary techniques (such as [10]). This improvement is mainly due to the use of the MDG model checker as a verification engine in our algorithm. The MDG tool is applicable to different kinds of systems, some of which are as large as 11,400 gates, as shown by different case studies [14], [15], [16].
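The idea in item (1), tracking glitch width and sensitization together through one enumerated value, can be sketched in plain Python. This is not the MDG tool or MDG-HDL: the four width levels, the one-level-per-gate attenuation rule, the toy circuit, and every name below are assumptions chosen only to show the two masking effects side by side.

```python
import itertools
from enum import IntEnum

class Width(IntEnum):
    """Assumed enumerated glitch widths; NONE means the glitch is masked."""
    NONE = 0
    NARROW = 1
    MEDIUM = 2
    WIDE = 3

def attenuate(w):
    """Assumed electrical masking: each gate narrows the glitch by one level."""
    return Width(max(w - 1, 0))

def and_gate(glitch, side):
    """Glitch on one AND input; a side input of 0 is controlling (logical masking)."""
    return attenuate(glitch) if side == 1 else Width.NONE

def or_gate(glitch, side):
    """Glitch on one OR input; a side input of 1 is controlling (logical masking)."""
    return attenuate(glitch) if side == 0 else Width.NONE

def propagate(initial, b, c):
    """Glitch enters on input a of the toy circuit out = (a AND b) OR c."""
    return or_gate(and_gate(initial, b), c)

def propagating_inputs(initial):
    """Side-input assignments (b, c) under which the glitch reaches the output."""
    return {
        (b, c)
        for b, c in itertools.product((0, 1), repeat=2)
        if propagate(initial, b, c) != Width.NONE
    }
```

Under these assumptions a WIDE glitch reaches the output only for b = 1, c = 0, and arrives NARROW, while a MEDIUM glitch is electrically masked even on the sensitized path. The set returned by `propagating_inputs` plays the role of a glitch-propagation set for this toy circuit, obtained here by brute-force enumeration rather than symbolically.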

The rest of this paper is organized as follows. Section 2 reviews the literature on the MDG decision graph, the MDG model checker, and asynchronous interfaces. Section 3 explains in detail how the principles of GP sets and MDGs are extended to model both electrical and logical masking effects of soft errors. Section 4 explains the proposed methodology. In Section 5, applications of the proposed methodology to combinational logic, sequential logic, and asynchronous circuits are explained. Section 6 describes the proposed automated tool supporting this methodology. Section 7 provides results and a discussion. Finally, Section 8 concludes this paper.

Section snippets

Preliminaries

We choose MDGs to model soft error glitch propagation paths in digital designs utilizing glitch propagation (GP) sets [13]. This section provides background on the terminology frequently used in this paper:

The proposed concurrent modeling of electrical and logical masking

Before discussing our methodology, we describe the procedure for modeling soft error glitches under the joint effect of logical and electrical masking.

The proposed methodology to quantify the effect of soft error glitch propagation

In this section, we elaborate on our proposed methodology to analyze and estimate soft error glitch propagation. The flowchart of our proposed methodology is shown in Fig. 2a. This methodology requires the structural specification of the considered logic circuit as an input, which in our case is its MDG-HDL description. Based on the above discussion, we must either assume a value of N or determine this value through circuit simulations. The main steps of the proposed methodology in Fig. 2

Implementation of the proposed methodology

This section discusses the applications of the proposed methodology on combinational, sequential, and asynchronous circuits.

Automating soft error glitch propagation analysis with SEGP-Finder

In order to introduce glitches in complex circuits with several thousand nodes, the proposed methodology needs to be automated. An in-house tool is developed for this purpose. It is called the Soft Error Glitch Propagation (SEGP)-Finder. Its main components are illustrated in Fig. 7. This tool serves two purposes: (1) If a gate level MDG-model (equivalent to a netlist) is available for a design, this tool automatically modifies the netlist to produce equivalent descriptions in MDG-HDL to

Results and discussion

We performed our verification on a workstation with an Intel Core i7 running at 3 GHz and 24 GB RAM. To characterize the efficiency of the SEGP-Finder tool, different combinational designs (ISCAS 85 benchmarks) and sequential designs (ISCAS 89) have been analyzed.

Conclusion and future work

We proposed a novel method to identify paths that can propagate soft faults causing soft errors in digital designs. The proposed technique reduces error injection requirements. It can also help address design for testability issues in asynchronous designs by providing a set of controllable and observable points. It allows designers to apply soft error mitigation techniques to limited parts of data paths.

A new tool for automating the analysis of the sensitivity to soft errors is presented.

References (35)

  • ITRS 2011,...
  • Shivakumar P, Kistler M, Keckler SW, Burger D, Alvisi L. Modeling the effect of technology trends on the soft error...
  • Arlat J. Error injection for dependability validation: a methodology and some applications. IEEE Trans Software Eng (1990)
  • Dhillon YS, Diril AU, Chatterjee A. Soft error tolerance analysis and optimization of nanometer circuits. In:...
  • Holcomb D, Li W, Seshia SA. Design as you see fit: system-level soft error analysis of sequential circuits. In:...
  • Miskov-Zivanov N, et al. Circuit reliability analysis using symbolic techniques. IEEE Trans Comput-Aided Des Integrated Circuit Syst (2006)
  • Zhang B, Orshansky M. Symbolic simulation of the propagation and filtering of transient faulty pulses. In: Proceedings...
  • Zhang B, et al. FASER: fast analysis of soft error susceptibility for cell-based designs. ISQED (2006)
  • Miskov-Zivanov N, Marculescu D. MARS-C: modeling and reduction of soft errors in combinational circuits. In: Design...
  • Shazli SZ, Tahoori M. A framework based on boolean satisfiability for soft error rate computation in early...
  • Almukhaizim S, et al. Seamless integration of SER in rewiring-based design space exploration. Test Conference,...
  • Corella F, et al. Multiway decision graphs for automated hardware verification. Formal Methods Syst Des, Kluwer (1997)
  • Hasan SR, et al. Crosstalk glitch propagation modeling for asynchronous interfaces in globally asynchronous locally synchronous systems. IEEE Trans Circuits Syst (2010)
  • Balakrishnan S. A Hierarchical Approach to the Formal Verification of Embedded Systems Using MDGs. Master’s Thesis,...
  • Tahar S, et al. Modeling and verification of the Fairisle ATM switch fabric using MDGs. IEEE Trans CAD Integrated Circuit Syst (1999)
  • Zobair MH. Modeling and Formal Verification of a Telecom System Block using MDGs. M.A.Sc. Thesis, Concordia University,...
  • Xu Y, et al. Model checking for first-order temporal logic using multiway decision graphs
