
Developing Inherently Resilient Software Against Soft-Errors Based on Algorithm Level Inherent Features

Journal of Electronic Testing

Abstract

A notable property of software systems is that a large number of soft-errors are inherently derated (masked) at the software level. The error-derating rate may depend on the algorithms and data structures used in the software. This paper investigates how the underlying algorithms of programs affect the error-derating rate. Eight benchmark programs were used in the study; each was implemented with four different algorithm design techniques: divide-and-conquer, dynamic programming, backtracking and branch-and-bound. About 10,000 errors were injected into each program to quantify and analyze the error-derating capabilities of the different algorithm design techniques. The results reveal that about 40.0 % of errors in the dynamic programming algorithms are derated; the corresponding figures for the backtracking, branch-and-bound and divide-and-conquer algorithms are 39.5 %, 38.1 % and 28.8 %, respectively. These results can help software designers and programmers select the most efficient algorithms for developing inherently resilient programs. Furthermore, a one-way ANOVA of the results confirmed that the differences in resiliency between the algorithm design techniques are statistically significant at the 95 % confidence level.
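As an illustration of what an error-derating rate measures, the following minimal sketch injects single-bit flips into the input data of a toy program and counts how often the output is unaffected. It is not the paper's injection tool or fault model: the helper names (`flip_bit`, `max_of`, `derating_rate`), the 32-bit word width and the choice to corrupt only in-memory data are assumptions made for this example.

```python
import random


def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Flip one bit of a width-bit unsigned integer (assumed fault model)."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def max_of(values):
    """Toy workload (hypothetical): return the largest element."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


def derating_rate(program, data, trials, rng, width=32):
    """Fraction of injected single-bit flips that leave the output unchanged."""
    golden = program(list(data))  # fault-free reference output
    masked = 0
    for _ in range(trials):
        faulty = list(data)
        i = rng.randrange(len(faulty))  # pick a random data word
        faulty[i] = flip_bit(faulty[i], rng.randrange(width), width)
        if program(faulty) == golden:  # the error was masked (derated)
            masked += 1
    return masked / trials
```

Calling `derating_rate(max_of, [3, 9, 4], 10_000, random.Random(0))` gives the share of flips that `max_of` masks for that input. Running the same campaign over several implementations of one benchmark would yield per-algorithm rates comparable in spirit to the percentages quoted above.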

Notes

  1. The terms “software” and “program” are used interchangeably in this paper.

Author information

Corresponding author

Correspondence to Bahman Arasteh.

Additional information

Responsible Editor: M. Sonza Reorda

About this article

Cite this article

Arasteh, B., Miremadi, S.G. & Rahmani, A.M. Developing Inherently Resilient Software Against Soft-Errors Based on Algorithm Level Inherent Features. J Electron Test 30, 193–212 (2014). https://doi.org/10.1007/s10836-014-5438-8

  • DOI: https://doi.org/10.1007/s10836-014-5438-8
