A Hybrid Fault-Tolerant Architecture for Highly Reliable Processing Cores

Journal of Electronic Testing

Abstract

The increasing vulnerability of transistors and interconnects due to scaling continuously challenges the reliability of future microprocessors. Lifetime reliability is gaining attention over performance as a design factor, even for lower-end commodity applications. In this work we present a low-power hybrid fault-tolerant architecture that improves the reliability of pipelined microprocessors by protecting their combinational logic. By combining different types of redundancy, the architecture can handle a broad spectrum of faults with little impact on performance. Moreover, it addresses the problems of error propagation in nonlinear pipelines and error detection in pipeline stages with memory interfaces. Our case-study implementation of a fault-tolerant MIPS microprocessor highlights four main advantages of the proposed solution. In comparison with triple modular redundancy (TMR) architectures, it offers (i) an 11.6 % power saving, (ii) improved transient error detection capability, (iii) improved lifetime reliability, and (iv) more effective handling of fault accumulation effects. We also present a gate-level fault-injection framework that models physical defects and transient faults with high fidelity.

Author information

Corresponding author

Correspondence to Arnaud Virazel.

Additional information

Responsible Editor: M. Abadir

About this article

Cite this article

Wali, I., Virazel, A., Bosio, A. et al. A Hybrid Fault-Tolerant Architecture for Highly Reliable Processing Cores. J Electron Test 32, 147–161 (2016). https://doi.org/10.1007/s10836-016-5578-0

  • DOI: https://doi.org/10.1007/s10836-016-5578-0
