Exploiting Narrow Data-Width to Mask Soft Errors in Register Files

  • Conference paper
Computer Safety, Reliability, and Security (SAFECOMP 2014)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8666)

Abstract

The dependability of computing systems, threatened by soft errors, has become a growing design concern in safety-critical systems. Because register files (RFs) are accessed very frequently and errors occurring in them propagate quickly to other components, RFs are among the major contributors to reduced system reliability. Current protection techniques usually incur significant power and performance penalties. This paper proposes a lightweight, software-implemented method for mitigating soft errors in RFs. It builds on the observation that many register values have a narrow data width, which leaves a large fraction of register bits unused. Masking operations are inserted to clear possible errors in these unused bits, reducing the window of vulnerability of RFs. To improve effectiveness, the effect of each masking range is calculated, and a covered-masks analysis removes unnecessary masks without sacrificing error coverage. Under a user-defined overhead constraint, the most cost-effective masking operations are selected automatically. Experimental results from several benchmarks indicate that program reliability improves by 16.8% on average with only 3.3% performance overhead.
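The masking idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The 32-bit register width, the 8-bit value width, and the helper names `flip_bit`, `narrow_mask`, `apply_mask`, and `masked_fraction` are all assumptions made for illustration. The sketch models a soft error as a single bit flip and shows that an AND with a narrow-width mask clears flips in the unused upper bits but cannot repair flips inside the live bits.

```python
WORD_BITS = 32  # assumed register width; the abstract does not state one


def flip_bit(value: int, bit: int) -> int:
    """Model a soft error by inverting one bit of a register value."""
    return value ^ (1 << bit)


def narrow_mask(width: int) -> int:
    """Mask that keeps the low `width` bits and clears all higher bits."""
    return (1 << width) - 1


def apply_mask(value: int, width: int) -> int:
    """Masking operation: AND the register with the narrow-width mask."""
    return value & narrow_mask(width)


def masked_fraction(width: int) -> float:
    """Share of single-bit flip positions that the mask would clear."""
    return (WORD_BITS - width) / WORD_BITS


original = 0x5A  # hypothetical value known to fit in 8 bits

# A flip in an unused upper bit is removed by the mask.
upper_flip = flip_bit(original, 20)
print(hex(apply_mask(upper_flip, 8)))  # 0x5a

# A flip inside the live low bits survives the mask.
lower_flip = flip_bit(original, 3)
print(hex(apply_mask(lower_flip, 8)))  # 0x52

# For an 8-bit value in a 32-bit register, 24 of 32 bit positions are covered.
print(masked_fraction(8))  # 0.75
```

In this toy model the benefit of a mask grows as the value gets narrower, while each mask costs one extra instruction. That trade-off is what the abstract's selection under an overhead constraint weighs.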

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, J., Tan, Q., Shao, Z., Ning, H. (2014). Exploiting Narrow Data-Width to Mask Soft Errors in Register Files. In: Bondavalli, A., Di Giandomenico, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2014. Lecture Notes in Computer Science, vol 8666. Springer, Cham. https://doi.org/10.1007/978-3-319-10506-2_9

  • DOI: https://doi.org/10.1007/978-3-319-10506-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10505-5

  • Online ISBN: 978-3-319-10506-2

  • eBook Packages: Computer Science, Computer Science (R0)
