Abstract
Soft errors threaten the dependability of computing and have become a growing design concern in safety-critical systems. Because register files (RFs) are accessed very frequently and errors that occur in them propagate quickly to other components, RFs are among the major contributors to reduced system reliability. Current protection techniques usually incur significant power and performance penalties. This paper proposes a lightweight, software-implemented method for mitigating soft errors in RFs. Many register values have a narrow data width, which means that a large fraction of the bits in a register are unused. Based on this observation, masking operations are inserted to clear possible errors in these bits, reducing the window of vulnerability of the RF. To improve effectiveness, the effect of each masking range is calculated, and a covered-masks analysis removes unnecessary masks without sacrificing error coverage. Under a user-defined overhead constraint, the most cost-effective masking operations are selected automatically. Experimental results on several benchmarks indicate that program reliability improves by 16.8% on average with only 3.3% performance overhead.
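The masking idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the 32-bit register width, the helper names (`narrow_mask`, `flip_bit`, `apply_mask`, `select_masks`), the restriction to non-negative values, and the greedy benefit-per-cost selection rule are all assumptions made for illustration. The first part shows that an AND with a narrow-width mask clears a bit flip in the unused upper bits but cannot repair a flip in the bits that carry the value. The second part shows one plausible way to pick masks under an overhead budget.

```python
WORD_BITS = 32  # assumed register width; not stated in the abstract


def narrow_mask(value_bits: int) -> int:
    """Mask that keeps the low `value_bits` bits and clears all higher bits."""
    return (1 << value_bits) - 1


def flip_bit(word: int, bit: int) -> int:
    """Model a single-bit soft error in a register by flipping one bit."""
    assert 0 <= bit < WORD_BITS
    return word ^ (1 << bit)


def apply_mask(word: int, value_bits: int) -> int:
    """The inserted masking operation: AND with the narrow-width mask.

    Assumes the register holds a non-negative value that fits in
    `value_bits` bits, so every higher bit should be zero.
    """
    return word & narrow_mask(value_bits)


def select_masks(candidates, budget):
    """Greedy selection by benefit per unit cost under an overhead budget.

    `candidates` is a list of (name, benefit, cost) tuples with cost > 0.
    This rule is an assumed stand-in; the abstract does not say how the
    most cost-effective masks are chosen.
    """
    chosen, spent = [], 0.0
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for name, benefit, cost in ranked:
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent


value = 0x5A  # a narrow value that fits in 8 bits

# A flip in an unused upper bit is cleared by the mask.
assert apply_mask(flip_bit(value, 20), 8) == value

# A flip inside the live 8 bits survives the mask.
assert apply_mask(flip_bit(value, 3), 8) != value
```

In this sketch a mask only shortens the time during which an upper-bit error can be consumed; it offers no protection for the bits that hold the value, which is consistent with the abstract's framing of reducing the window of vulnerability rather than eliminating it.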
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, J., Tan, Q., Shao, Z., Ning, H. (2014). Exploiting Narrow Data-Width to Mask Soft Errors in Register Files. In: Bondavalli, A., Di Giandomenico, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2014. Lecture Notes in Computer Science, vol 8666. Springer, Cham. https://doi.org/10.1007/978-3-319-10506-2_9
DOI: https://doi.org/10.1007/978-3-319-10506-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10505-5
Online ISBN: 978-3-319-10506-2
eBook Packages: Computer Science, Computer Science (R0)