System on chip failure rate assessment using the executable model of a system

Computing

Abstract

Statistical data from many application fields confirm that System-on-Chip (SoC) products implemented in modern deep-submicron technologies are becoming increasingly sensitive to transient errors such as soft errors. Although a thorough understanding of the services that an SoC provides is an important step toward meeting stringent system requirements, designers can no longer ignore emerging safety and reliability issues in nanoscale devices. Proper actions should be taken at various stages of system design to mitigate the effect of such errors and enhance the safety of SoCs in fault-prone environments. SoC designs can therefore benefit from knowing the soft-error rate (SER) of individual cores, as well as the failure rate of the whole system, at a very early stage of development. Such data enable companies and designers to decide, at the right time, how much error protection each module needs. This paper proposes a new quantitative method to estimate the SER of different modules inside an SoC by means of an executable model. The executable model of a system is based on the Unified Modeling Language Real-Time standard and is exercised by the actual workload. Experimental results show that the proposed quantitative method is 17% more accurate than previous error estimation techniques.
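The abstract gives no formulas, so the following is only a minimal sketch of how per-module SER figures might be rolled up into a system failure rate; it is not the paper's method. It assumes that soft errors in different modules are independent and that any module failure fails the system, so the rates simply add. The `Module` type, the `vulnerable_fraction` derating factor, and all numbers are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One SoC module with illustrative (hypothetical) reliability figures."""
    name: str
    raw_ser_fit: float          # raw soft-error rate in FIT (failures per 1e9 hours)
    vulnerable_fraction: float  # assumed share of upsets that affect the output, 0..1


def effective_ser(module: Module) -> float:
    """Derate the raw SER by the assumed workload-dependent vulnerable fraction."""
    if not 0.0 <= module.vulnerable_fraction <= 1.0:
        raise ValueError("vulnerable_fraction must lie in [0, 1]")
    return module.raw_ser_fit * module.vulnerable_fraction


def system_failure_rate(modules: list[Module]) -> float:
    """Sum the per-module rates (series-system, independent-error assumption)."""
    return sum(effective_ser(m) for m in modules)


# Hypothetical modules; none of these numbers come from the paper.
modules = [
    Module("cpu", raw_ser_fit=100.0, vulnerable_fraction=0.25),
    Module("memory", raw_ser_fit=400.0, vulnerable_fraction=0.125),
    Module("bus", raw_ser_fit=20.0, vulnerable_fraction=0.5),
]
total_fit = system_failure_rate(modules)
```

In a sketch like this, the per-module breakdown is what would guide how much protection each module receives: the module contributing the largest effective rate is the first candidate for hardening.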




Author information

Correspondence to M. H. Neishaburi.

About this article

Cite this article

Neishaburi, M.H., Zilic, Z. System on chip failure rate assessment using the executable model of a system. Computing 97, 611–629 (2015). https://doi.org/10.1007/s00607-013-0372-7

  • DOI: https://doi.org/10.1007/s00607-013-0372-7
