Mitigating the effects of large multiple cell upsets (MCUs) in memories

Published: 27 October 2011
Abstract

Reliability is a critical issue for memories. Radiation particles that hit the device can cause errors in some cells, which can lead to data corruption. To avoid this problem, memories are protected with per-word error correction codes (ECCs). Typically, single-error correction and double-error detection (SEC-DED) codes are used. As technology scales, errors caused by radiation particles in memories tend to affect more than one cell, which is known as a multiple cell upset (MCU). To ensure that only a single cell is affected in each word, interleaving is used. With interleaving, cells that belong to the same word are placed far enough apart that an MCU affects only a single cell in each word. The use of interleaving significantly increases the cost of the device. Moreover, determining the interleaving distance (ID) required to prevent MCUs from causing double errors is not trivial. Typically, accelerated radiation experiments with a limited number of particle hits are used. These experiments provide a lower bound on the required ID, but larger MCUs may still occur with low probability. Even if the percentage of such large MCUs is very low, the impact on reliability can be significant. This article presents a technique to mitigate the effects on memory reliability of large MCUs, that is, those that exceed the ID. The proposed approach is able to correct most double errors caused by large MCUs by exploiting the locality of the errors within an MCU.
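
The role of the interleaving distance can be illustrated with a small sketch. It is not the technique proposed in the article. It assumes a simplified model in which an MCU is a contiguous run of upset cells in a single memory row, and in which the cell at physical column c belongs to word c mod ID. The function names and the example values (ID = 4, MCU widths 4 and 5) are hypothetical and chosen only for illustration.

```python
from collections import Counter

def cell_to_word_bit(column: int, interleaving_distance: int) -> tuple[int, int]:
    """Map a physical column to (word index, bit index within that word).

    Assumed layout: consecutive columns belong to different words, so two
    bits of the same word are interleaving_distance columns apart.
    """
    return column % interleaving_distance, column // interleaving_distance

def errors_per_word(start: int, width: int, interleaving_distance: int) -> Counter:
    """Count upset cells per word for an MCU modeled as a contiguous run."""
    hits = Counter()
    for column in range(start, start + width):
        word, _bit = cell_to_word_bit(column, interleaving_distance)
        hits[word] += 1
    return hits

ID = 4  # hypothetical interleaving distance

# An MCU no wider than the ID upsets at most one cell per word,
# which a SEC-DED code can correct.
small = errors_per_word(start=5, width=4, interleaving_distance=ID)

# An MCU one cell wider than the ID upsets two cells of the same word,
# which a SEC-DED code can detect but not correct by itself.
large = errors_per_word(start=5, width=5, interleaving_distance=ID)

print(max(small.values()), max(large.values()))
```

In this model the two upset cells of the doubly hit word sit at columns 5 and 9, that is, at neighboring bit positions 1 and 2 of word 1. This closeness is one simple instance of the error locality that the abstract refers to.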

          • Published in

            ACM Transactions on Design Automation of Electronic Systems, Volume 16, Issue 4
            October 2011
            326 pages
            ISSN:1084-4309
            EISSN:1557-7309
            DOI:10.1145/2003695

            Copyright © 2011 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 27 October 2011
            • Accepted: 1 June 2011
            • Revised: 1 March 2011
            • Received: 1 December 2010

            Qualifiers

            • research-article
            • Research
            • Refereed
