A dynamic approach to tolerate soft errors

Cluster Computing

Abstract

A dynamic implementation of a software-based soft error tolerance method can protect more types of code and can therefore cover more soft errors. This paper explores soft error tolerance with a dynamic software-based method and proposes a new approach of this kind. In our approach, the object of protection is the dynamic program. For the protected dynamic binary code, the approach ensures correct control flow and correct data flow to a significant extent. It duplicates every data value and performs every operation twice to ensure that the data stored to memory are correct. Additionally, it ensures that every branch instruction jumps to the correct address by checking both the branch condition and the destination address. The approach is implemented with dynamic binary instrumentation; specifically, our tool is built on the Valgrind framework, a heavyweight dynamic binary instrumentation tool. Our experimental results demonstrate that our approach achieves higher reliability for dynamic software than approaches implemented with static program protection methods. However, because our approach also sacrifices more software performance than static program protection methods, it is suitable only for systems with strict reliability requirements.
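The duplicate-and-compare idea described in the abstract can be illustrated with a small sketch. It is not the authors' Valgrind tool, which instruments binary code at run time. It is a plain Python model, and every name in it (SoftErrorDetected, checked_add, checked_store, checked_branch, flip_bit) is hypothetical. It assumes each value has a primary copy and a shadow copy, that each operation runs once on each copy, and that the copies are compared before a store and before a branch.

```python
class SoftErrorDetected(Exception):
    """Raised when the primary and shadow copies disagree (hypothetical name)."""


def checked_add(a, a_shadow, b, b_shadow):
    """Perform the addition twice, once on each copy of the operands."""
    return a + b, a_shadow + b_shadow


def checked_store(memory, addr, value, value_shadow):
    """Compare both copies before the value is allowed to reach memory."""
    if value != value_shadow:
        raise SoftErrorDetected(f"data mismatch before store to {addr:#x}")
    memory[addr] = value


def checked_branch(cond, cond_shadow, dest, dest_shadow, fallthrough):
    """Check the branch condition and the destination address before jumping."""
    if cond != cond_shadow:
        raise SoftErrorDetected("branch condition mismatch")
    if dest != dest_shadow:
        raise SoftErrorDetected("branch destination mismatch")
    return dest if cond else fallthrough


def flip_bit(value, bit):
    """Model a soft error as a single bit flip in an integer value."""
    return value ^ (1 << bit)


if __name__ == "__main__":
    memory = {}

    # Fault-free case: both copies agree, so the store goes through.
    total, total_shadow = checked_add(2, 2, 3, 3)
    checked_store(memory, 0x10, total, total_shadow)

    # Faulty case: one copy is corrupted, so the store is rejected.
    try:
        checked_store(memory, 0x14, total, flip_bit(total_shadow, 1))
    except SoftErrorDetected as err:
        print("detected:", err)
```

In this model a single bit flip in either copy is caught at the next store or branch. The price is roughly doubled work per operation, which is in line with the performance cost the abstract acknowledges.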


Author information

Corresponding author

Correspondence to Lei Xiong.

About this article

Cite this article

Xiong, L., Tan, Q. A dynamic approach to tolerate soft errors. Cluster Comput 16, 359–366 (2013). https://doi.org/10.1007/s10586-011-0196-1


  • DOI: https://doi.org/10.1007/s10586-011-0196-1
