Abstract
Incessant and rapid technology scaling has brought us to a point where today's and future transistors are susceptible to soft errors: transient errors induced by energy-carrying particles. Within a processor, the sheer size of the caches and the nature of the data they hold make them the component most vulnerable to such errors. Data in the cache is vulnerable to corruption by soft errors for as long as it remains unused in the cache. Write-through and early-write-back [Li et al. 2004] cache configurations reduce the time that vulnerable data spends in the cache, at the cost of more memory writes and therefore more energy. We propose a smart cache cleaning methodology that copies only specific vulnerable cache blocks to memory at chosen times, thereby protecting the data cache with minimal memory writes. In this work, we first propose a hybrid (software-hardware) methodology. We then propose an improved software solution that uses the cache write-back functionality available in commodity processors, thereby reducing the hardware overhead required to implement smart cache cleaning on such systems. The parameters of our Smart Cache Cleaning (SCC) technique provide a means of customizable, energy-efficient soft error reduction in the L1 data cache. Given the system's reliability requirements, its power budget, and the runtime priority of the application, the SCC parameters can be tuned to trade off power consumption against L1 data cache reliability. Our experiments on the LINPACK and Livermore benchmarks demonstrate a 26% lower energy-vulnerability product (energy-efficient vulnerability reduction) than hardware-based cache reliability techniques. Our software-only solution achieves the same level of reliability with an additional 28% performance improvement.
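The trade-off described in the abstract can be illustrated with a toy model of a single cache block. This is only a sketch under stated assumptions, not the paper's SCC implementation: it assumes a block counts as vulnerable while it is dirty (its only up-to-date copy is in the cache), that a cleaning operation costs exactly one memory write, and that the block is evicted once at the end. The names `simulate`, `Result`, `write_times`, `clean_times`, and `evict_time` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    vulnerable_cycles: int  # cycles the block spent dirty (assumed vulnerable)
    memory_writes: int      # write-backs to memory, a proxy for energy cost


def simulate(write_times, clean_times, evict_time):
    """Toy model of one cache block (assumption: dirty == vulnerable).

    write_times: cycles at which the processor writes the block.
    clean_times: cycles at which the block is copied back to memory.
    evict_time:  cycle at which the block leaves the cache.
    """
    # Order events by time; at equal times a write is handled before a clean,
    # so a clean issued together with a write covers that write.
    events = sorted(
        [(t, 0) for t in write_times] + [(t, 1) for t in clean_times]
    )
    dirty_since = None
    vulnerable = 0
    memory_writes = 0
    for t, kind in events:
        if t >= evict_time:
            break
        if kind == 0:  # processor write: block becomes dirty if it was clean
            if dirty_since is None:
                dirty_since = t
        elif dirty_since is not None:  # clean: write back only a dirty block
            vulnerable += t - dirty_since
            memory_writes += 1
            dirty_since = None
    if dirty_since is not None:  # a dirty block is written back on eviction
        vulnerable += evict_time - dirty_since
        memory_writes += 1
    return Result(vulnerable, memory_writes)


writes = [10, 20, 30]
write_back = simulate(writes, [], 1000)         # clean only on eviction
write_through = simulate(writes, writes, 1000)  # clean on every write
one_clean = simulate(writes, [30], 1000)        # one clean after the last write
```

In this trace, plain write-back leaves the block dirty for 990 cycles with one memory write, write-through removes the dirty time but issues three memory writes, and a single cleaning after the last write gives 20 dirty cycles with one memory write. Choosing which blocks to clean and when is the part the paper addresses; the model above only shows why the choice matters.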
References
- AMD Corporation. 2007a. AMD Athlon Processor Product Data Sheet. support.amd.com/us/Processorn TechDocs/43042.pdf.
- Anderson, E. 1999. LAPACK Users' Guide. Vol. 9, SIAM, Philadelphia, PA.
- ARM. 2007. ARMv5 Architecture Reference Manual. infocenter.arm.com.
- Baumann, R., Hossain, T., Murata, S., and Kitagawa, H. 1995. Boron compounds as a dominant source of alpha particles in semiconductor devices. In Proceedings of the 33rd Annual IEEE International Reliability Physics Symposium. IEEE, 297--302. DOI: http://dx.doi.org/10.1109/RELPHY.1995.513695.
- Burger, D. and Austin, T. M. 1997. The SimpleScalar tool set, version 2.0. SIGARCH Comput. Archit. News 25, 3, 13--25. DOI: http://dx.doi.org/10.1145/268806.268810.
- Cannon, E. H., Reinhardt, D. D., Gordon, M. S., and Makowenskyj, P. S. 2004. SRAM SER in 90, 130 and 180 nm bulk and SOI technologies. In Proceedings of the 42nd Annual Reliability Physics Symposium. IEEE, 300--304. DOI: http://dx.doi.org/10.1109/RELPHY.2004.1315341.
- Chen, G., Kandemir, M., Irwin, M. J., and Memik, G. 2005. Compiler-directed selective data protection against soft errors. In Proceedings of the Conference on Asia South Pacific Design Automation (ASP-DAC'05). ACM Press, New York, 713--716. DOI: http://dx.doi.org/10.1145/1120725.1121000.
- Hamming, R. W. 1950. Error detecting and error correcting codes. Bell System Tech. J. 29, 2, 147--160.
- Hareland, S., Maiz, J., Alavi, M., Mistry, K., Walsta, S., and Dai, C. 2001. Impact of CMOS process scaling and SOI on the soft error rates of logic processes. In Proceedings of the Symposium on VLSI Technology. IEEE, 73--74. DOI: http://dx.doi.org/10.1109/VLSIT.2001.934953.
- Hung, L. D., Irie, H., Goshima, M., and Sakai, S. 2007. Utilization of SECDED for soft error and variation-induced defect tolerance in caches. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE'07). ACM, 1--6. DOI: http://dx.doi.org/10.1109/DATE.2007.364447.
- Ibe, E., Taniguchi, H., Yahagi, Y., Shimbo, K.-I., and Toba, T. 2010. Impact of scaling on neutron-induced soft error in SRAMs from a 250 nm to a 22 nm design rule. IEEE Trans. Electron Devices 57, 7, 1527--1538. DOI: http://dx.doi.org/10.1109/TED.2010.2047907.
- Intel Corporation. 2000. Intel XScale technology overview. intel.com/design/intelxscale.
- Intel Corporation. 2007b. Intel IA-32 Developer's Manuals. intel.com/products/processor/manuals/.
- Kayali, S. 2000. Reliability considerations for advanced microelectronics. In Proceedings of the Pacific Rim International Symposium on Dependable Computing (PRDC'00). IEEE, 99--. http://portal.acm.org/citation.cfm?id=826038.826937.
- Lee, K., Shrivastava, A., Dutt, N., and Venkatasubramanian, N. 2010. Partitioning techniques for partially protected caches in resource-constrained embedded systems. ACM Trans. Des. Autom. Electron. Syst. 15, 4, Article 30. DOI: http://dx.doi.org/10.1145/1835420.1835423.
- Lee, K., Shrivastava, A., Issenin, I., Dutt, N., and Venkatasubramanian, N. 2009. Partially protected caches. IEEE Trans. VLSI Syst. 17, 9, 1343--1347. DOI: http://dx.doi.org/10.1109/TVLSI.2008.2002427.
- Li, J.-F. and Huang, Y.-J. 2005. An error detection and correction scheme for RAMs with partial-write function. In Proceedings of the IEEE International Workshop on Memory Technology, Design, and Testing (MTDT'05). IEEE, 115--120. DOI: http://dx.doi.org/10.1109/MTDT.2005.16.
- Li, L., Degalahal, V., Vijaykrishnan, N., Kandemir, M., and Irwin, M. J. 2004. Soft error and energy consumption interactions: a data cache perspective. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED'04). IEEE, 132--137. DOI: http://dx.doi.org/10.1109/LPE.2004.1349323.
- May, T. C. and Woods, M. H. 1979. Alpha-particle-induced soft errors in dynamic memories. IEEE Trans. Electron Devices 26, 1, 2--9. DOI: http://dx.doi.org/10.1109/T-ED.1979.19370.
- McMahon, F. H. 1993. L. L. N. L. Fortran Kernels Test: MFLOPS. www.netlib.org/benchmark/livermorec.
- Mukherjee, S. S., Weaver, C., Emer, J., Reinhardt, S. K., and Austin, T. 2003. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor. In Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, 29--40. DOI: http://dx.doi.org/10.1109/MICRO.2003.1253181.
- Mukherjee, S. S., Emer, J., Fossum, T., and Reinhardt, S. K. 2004. Cache scrubbing in microprocessors: Myth or necessity? In Proceedings of the 10th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC'04). IEEE, 37--42.
- Naseer, R., Boulghassoul, Y., Draper, J., Dasgupta, S., and Witulski, A. 2007. Critical charge characterization for soft error rate modeling in 90nm SRAM. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS'07). IEEE, 1879--1882. DOI: http://dx.doi.org/10.1109/ISCAS.2007.378282.
- Rockett Jr, L. R. 1992. Simulated SEU hardened scaled CMOS SRAM cell design using gated resistors. IEEE Trans. Nucl. Sci. 39, 5, 1532--1541. DOI: http://dx.doi.org/10.1109/23.173239.
- Shrivastava, A., Issenin, I., and Dutt, N. 2005. Compilation techniques for energy reduction in horizontally partitioned cache architectures. In Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES'05). ACM, New York, 90--96. DOI: http://dx.doi.org/10.1145/1086297.1086310.
- Shrivastava, A., Lee, J., and Jeyapaul, R. 2010. Cache vulnerability equations for protecting data in embedded processor caches from soft errors. In Proceedings of the ACM SIGPLAN/SIGBED 2010 Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'10). ACM, New York, 143--152. DOI: http://dx.doi.org/10.1145/1755888.1755910.
- Slayman, C. 2010. Alpha Particle or Neutron SER-What will dominate in future IC technology. ewh.ieee.org/soc/cpmt/presentations/cpmt0910e.pdf.
- Sridharan, V., Asadi, H., Tahoori, M. B., and Kaeli, D. 2006. Reducing data cache susceptibility to soft errors. IEEE Trans. Dependable Secure Comput. 3, 4, 353--364. DOI: http://dx.doi.org/10.1109/TDSC.2006.55.
- Zhang, W. 2009. Computing and minimizing cache vulnerability to transient errors. IEEE Des. Test Comput. 26, 2, 44--51. DOI: http://dx.doi.org/10.1109/MDT.2009.29.
Index Terms
- Enabling energy efficient reliability in embedded systems through smart cache cleaning