Error Correction Schemes with Erasure Information for Fast Memories

Journal of Electronic Testing

Abstract

Two error correction schemes are proposed for word-oriented binary memories that can be affected by erasures, i.e. errors with known location but unknown value. The erasures considered here are due to the drifting of the electrical parameter used to encode information outside the normal ranges associated with a logic 0 or a logic 1 value. For example, a dielectric breakdown in a magnetic memory cell may reduce its electrical resistance significantly below the levels which correspond to logic 0 and logic 1 values stored in healthy memory cells. Such deviations can be sensed during memory read operations and the acquired information can be used to boost the correction capability of an error-correcting code (ECC). The proposed schemes enable the correction of double-bit errors by combining erasure information with single-bit error correction and double-bit error detection (SEC-DED) codes or with shortened single-bit error correction (SEC) codes. The correction of single-bit errors is always guaranteed. Ways to increase the number of double-bit and triple-bit errors that can be detected by shortened SEC and SEC-DED codes are considered in order to augment the error correction capability of the proposed solutions.

References

  1. Chase D (1972) A class of algorithms for decoding block codes with channel measurement information. IEEE Trans Inf Theory IT-18(1):170–182

  2. Chen CL, Hsiao MY (1984) Error-correcting codes for semiconductor memory applications: a state of the art review. IBM J Res Dev 28(2):124–134

  3. Chishti Z, Alameldeen AR, Wilkerson C, Wu W, Lu S-L (2009) Improving cache lifetime reliability at ultra-low voltages. IEEE/ACM International Symposium on Microarchitecture, MICRO-42, New York City

  4. Dell TJ (1997) A white paper on the benefits of chipkill-correct ECC for PC server main memory. IBM Microelectronics Division

  5. Dimitrov DV, Gao Z, Wang X, Jung W, Lou X, Heinonen OG (2009) Dielectric breakdown of MgO magnetic tunnel junctions. Appl Phys Lett 94:123110

  6. Dong G, Xie N, Zhang T (2011) On the use of soft-decision error-correcting codes in NAND flash memory. IEEE Trans Circ Syst 58(2):429–439

  7. Forney GD Jr (1965) On decoding BCH codes. IEEE Trans Inf Theory IT-11(4):549–557

  8. Forney GD Jr (1966) Generalized minimum distance decoding. IEEE Trans Inf Theory IT-12(2):125–131

  9. Gherman V, Evain S, Seymour N, Bonhomme Y (2011) Generalized parity-check matrices for SEC-DED codes with fixed parity. IEEE International On-Line Testing Symposium (IOLTS), pp 200–203

  10. Gherman V, Evain S, Bonhomme Y (2013) Memory reliability improvement based on maximized error-correcting codes. J Electron Test Theory Appl 29(4):601–608

  11. Godard B, Daga J-M, Torres L, Sassatelli G (2008) Hierarchical code correction and reliability management in embedded NOR flash memories. IEEE European Test Symposium (ETS), pp 84–90

  12. Hamming RW (1950) Error correcting and error detecting codes. Bell Syst Tech J 29:147–160

  13. Hsiao MY (1970) A class of optimal minimum odd-weight-column SEC-DED codes. IBM J Res Dev 14:395–401

  14. Kim D, Kim T, Kim S, Kong JH, Yu Y, Char K (2003) Evolution of electrical properties of magnetic tunnel junction through successive dielectric breakdowns. Jpn J Appl Phys 42(3):1242–1245

  15. Lin S, Costello DJ (1983) Error control coding: fundamentals and applications. Prentice-Hall, Inc., Englewood Cliffs

  16. Panagopoulos G, Augustine C, Roy K (2011) Modeling of dielectric breakdown-induced time-dependent STT-MRAM performance degradation. Device Research Conference, pp 125–126

  17. Richter M, Oberlaender K, Goessel M (2008) New linear SEC-DED codes with reduced triple error miscorrection probability. IEEE International On-Line Testing Symposium (IOLTS), pp 37–42

  18. Savin V (2008) Self-corrected min-sum decoding of LDPC codes. IEEE International Symposium on Information Theory, pp 146–150

  19. Stapper CH, Lee H-S (1992) Synergistic fault-tolerance for memory chips. IEEE Trans Comput 41(9):1078–1087

  20. Tang DD, Lee Y-J (2010) Magnetic memory: fundamentals and technology. Cambridge University Press, pp 96–98, ISBN: 0521449642

  21. Wang J, Courtade T, Shankar H, Wesel RD (2011) Soft information for LDPC decoding in flash: mutual-information optimized quantization. IEEE Global Telecommunications Conference, pp 1–6

  22. Weng M-I, Lee L-N (1983) Weighted erasure codec for the (24,12) extended Golay code. United States Patent, 4397022

Acknowledgements

We are very grateful to Ken Mackay and Jérémy Alvarez-Hérault from Crocus Technology, and to Nicolas Ravot from CEA for their valuable advice.

We would like to thank Michael Richter for his assistance in deriving tighter limits in Appendix 2.

The first and last authors acknowledge partial support from the EMYR (Enhancement of MRAM Memory Yield and Reliability) project funded by the French National Research Agency.

The second author acknowledges funding from the European Union’s FP7 Programme, under Grant Agreement number 309129 (i-RISC project).

Author information

Corresponding author

Correspondence to Valentin Gherman.

Additional information

Responsible Editor: G. Di Natale

This paper is based on a presentation at the 18th IEEE European Test Symposium, Avignon, France, May 27-31, 2013.

Appendixes

1.1 Appendix 1

A linear block SEC code can be defined with the help of a binary matrix, called parity-check matrix or H-matrix, such that a binary column vector V is a code word if and only if it fulfils the relation below [15]:

$$ H\cdot V=0 $$

Each H-matrix column corresponds to a particular bit position in the code words. The number of H-matrix rows is equal to the check-bit number. During the decoding process of a vector V, a syndrome S is calculated with the expression below:

$$ S=H\cdot V $$
(1)

If S is an all-zero syndrome, the binary vector V is assumed to be an error-free code word. A single-bit error generates a syndrome identical to the H-matrix column that corresponds to the corrupted bit position, while a double-bit error produces a syndrome equal to the bitwise modulo-2 sum of the H-matrix columns that correspond to the erroneous bit positions. SEC capability can be ensured if the columns of the H-matrix are different from each other and from the all-zero column. The maximum number of bits n in a SEC code word is 2^r - 1 if r is the check-bit number.
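The syndrome properties described above can be checked on a toy example. The sketch below is only an illustration: it assumes a full-length SEC code with r = 3 and n = 7 whose H-matrix columns are stored as the integers 1 to 7, and the names `syndrome`, `H` and `codeword` are hypothetical, not taken from the paper.

```python
def syndrome(h_columns, word):
    """S = H.V over GF(2): XOR of the H-matrix columns selected by the 1-bits of V."""
    s = 0
    for column, bit in zip(h_columns, word):
        if bit:
            s ^= column
    return s

# Assumed toy H-matrix: column i is the r-bit vector with integer value i + 1.
# All columns are distinct and non-zero, which is the SEC condition.
H = [1, 2, 3, 4, 5, 6, 7]

# 1 ^ 2 ^ 3 == 0, so this vector satisfies H.V = 0 and is a code word.
codeword = [1, 1, 1, 0, 0, 0, 0]

# Single-bit error at position 4: the syndrome equals column H[4].
single = list(codeword)
single[4] ^= 1
single_syndrome = syndrome(H, single)

# Double-bit error at positions 1 and 5: the syndrome equals H[1] ^ H[5].
double = list(codeword)
double[1] ^= 1
double[5] ^= 1
double_syndrome = syndrome(H, double)
```

In this full-length code the double-bit error syndrome H[1] ^ H[5] = 4 coincides with column H[3], so the decoder would miscorrect it as a single-bit error; a shortened code can instead reserve such unused values for detection.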

Usually, the number of data-bits accessed in a memory read or write operation is a power of 2. As a result, shortened SEC codes are used in which the H-matrix columns do not exploit all possible combinations of r-bit vectors. In such a case, r-bit vectors which are different from the H-matrix columns and the all-zero vector can be used as syndromes for double-bit error detection (DED) [17].

In a linear block SEC code, all bit positions corrupted by two different double-bit errors that generate the same syndrome are necessarily different. Consequently, the maximum number of double-bit errors that can generate the same syndrome is given by the floor function ⌊n/2⌋. The product between this number and the number of syndromes available for DED, (2^r - 1 - n), can be used as an upper limit for the number of double-bit errors that can be detected with a shortened SEC code.
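As a numerical illustration of this upper limit, the sketch below evaluates ⌊n/2⌋ · (2^r - 1 - n) for a given code length. The function name `ded_upper_limit` and the example (n, r) pairs are assumptions chosen for this illustration (16, 32 and 64 data bits with the smallest r that allows SEC); they are not figures quoted from the paper.

```python
from math import comb

def ded_upper_limit(n, r):
    """Upper limit on the double-bit errors detectable by a shortened SEC code."""
    spare_syndromes = 2**r - 1 - n  # non-zero r-bit vectors not used as H columns
    if spare_syndromes < 0:
        raise ValueError("n exceeds the maximum SEC code length 2**r - 1")
    # At most floor(n / 2) disjoint bit pairs can share one syndrome.
    return (n // 2) * spare_syndromes

# Assumed examples: (n, r) = (k + r, r) for k = 16, 32, 64 data bits.
for n, r in [(21, 5), (38, 6), (71, 7)]:
    print(n, r, ded_upper_limit(n, r), "of", comb(n, 2), "double-bit errors")
```

The limit is only an upper bound: whether a given H-matrix reaches it depends on how the spare syndromes are chosen.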

Each syndrome S_i available for DED can be used to define a linear function F_{S_i} on the syndrome space {0,1}^r as below:

$$ {F}_{S_i}:{\left\{0,1\right\}}^r\to {\left\{0,1\right\}}^r,\qquad {F}_{S_i}(X)={S}_i\oplus X $$

where the symbol ‘⊕’ stands for the bitwise modulo-2 sum.

Maximum DED capability implies that F_{S_i} maps a maximum number of syndromes X used for SEC onto other syndromes used for SEC. Since F_{S_i} is a bijection, it will then also map a maximum number of syndromes available for DED onto each other. The latter syndromes can be grouped in zero-sum triples (S_i, S_j, S_l) that contain the syndrome S_i as follows:

$$ {F}_{S_i}\left({S}_j\right)={S}_l\iff {S}_i\oplus {S}_j\oplus {S}_l=0 $$

A maximum DED capability requires a maximum number of such triples. Moreover, this property should hold for every function F_{S_j} defined as above with the help of a syndrome S_j available for DED. Consequently, a maximum number of zero-sum triples should be formed by all the syndromes available for DED.

We use a greedy algorithm to find 2^r - 1 - n r-bit vectors which are different from the all-zero syndrome and define as many zero-sum triples as possible. The remaining non-zero r-bit vectors are used as the H-matrix columns of an n-bit SEC code. Subsequently, the H-matrix density can be reduced by performing linear operations on the H-matrix rows such that the resulting H-matrix defines the same SEC code [10].
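The text does not detail the greedy rule, so the following is only one possible sketch under stated assumptions: at each step it adds the non-zero vector that closes the most new zero-sum triples with the vectors already chosen, breaking ties by smallest integer value. The function names and the integer encoding of r-bit vectors are hypothetical choices for this example.

```python
from itertools import combinations

def greedy_ded_syndromes(r, n):
    """Pick 2**r - 1 - n non-zero r-bit vectors that form many zero-sum triples."""
    ded = set()
    for _ in range(2**r - 1 - n):
        best, best_gain = None, -1
        for v in range(1, 2**r):
            if v in ded:
                continue
            # Each new triple (v, a, v ^ a) is seen twice, once from a and once from v ^ a.
            gain = sum(1 for a in ded if (v ^ a) in ded) // 2
            if gain > best_gain:
                best, best_gain = v, gain
        ded.add(best)
    return ded

def count_detected_double_errors(r, n):
    """Build the H-matrix columns from the leftover vectors and count detected double errors."""
    ded = greedy_ded_syndromes(r, n)
    columns = [v for v in range(1, 2**r) if v not in ded]
    # A double-bit error is detected when the XOR of its two columns is a DED syndrome.
    detected = sum(1 for a, b in combinations(columns, 2) if (a ^ b) in ded)
    return ded, columns, detected
```

For r = 3 and n = 4, this rule selects the DED syndromes {1, 2, 3}, which form a single zero-sum triple, and all 6 double-bit errors of the resulting 4-bit code are detected. For larger codes the result depends on the tie-breaking rule and is not guaranteed to be optimal.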

1.2 Appendix 2

Consider a shortened linear block SEC-DED code with k data-bits and r check-bits (n = k + r) per code word. Consider also that all code words have the same parity [9, 12, 13]. During the decoding process, r-bit vectors, called syndromes, are calculated as shown in (1). One can identify the following sets of syndromes:

  1. a set composed of the all-zero syndrome used to identify error-free code words,

  2. a SEC-set which contains all syndromes used for single-bit error correction,

  3. a DED-set which contains all syndromes used for double-bit error detection, i.e. which is disjoint from the first two sets,

  4. a TED-set which contains all syndromes that can be used for triple-bit error detection, i.e. which is disjoint from the first two sets.

Due to the fixed parity of the SEC-DED code, the DED and TED sets are also disjoint.

With respect to each syndrome S_j in the DED-set, the SEC-set can be partitioned into two sub-sets:

  • a sub-set of syndromes S_i that correspond to single-bit errors which affect bit positions that can also be affected by a double-bit error identified by the syndrome S_j,

  • a sub-set of syndromes S_l that correspond to single-bit errors which affect bit positions that cannot be affected by any of the double-bit errors identified by the syndrome S_j.

By definition, the bitwise modulo-2 sum of the syndromes S_l and S_j is distinct from any syndrome that corresponds to a single-bit error. It is also different from the all-zero syndrome and from any syndrome that corresponds to a double-bit error, due to the fixed parity of the code words. Consequently, this sum corresponds to the syndrome of a detectable triple-bit error. Given x_j, the number of double-bit errors that can be detected by the syndrome S_j, there are n - 2x_j single-bit errors which affect bit positions that cannot be affected by a double-bit error identified by the syndrome S_j. Hence, there are x_j(n - 2x_j) triple-bit errors that can be detected by a syndrome equal to a combination of S_l and S_j syndromes.

The total number of detectable triple-bit errors can be expressed as below:

$$ \# TED=\sum_{DED\text{-}set}\frac{x_j\left(n-2{x}_j\right)}{3}=\frac{n^2\left(n-1\right)}{6}-\frac{2}{3}\sum_{DED\text{-}set}{x}_j^2 $$
(2)

where the sum over the x_j's is replaced by n(n-1)/2, the total number of double-bit errors, all of which are detected by a SEC-DED code, and the divisor 3 is introduced to account for the fact that a triple-bit error can be generated by three different combinations of a double-bit error and a single-bit error.
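Written out step by step, the substitution that leads to the closed form in (2) is the following worked expansion, which uses only the quantities defined above:

$$ \begin{aligned} \sum_{DED\text{-}set}\frac{x_j\left(n-2x_j\right)}{3} &= \frac{n}{3}\sum_{DED\text{-}set}x_j-\frac{2}{3}\sum_{DED\text{-}set}x_j^2 \\ &= \frac{n}{3}\cdot\frac{n\left(n-1\right)}{2}-\frac{2}{3}\sum_{DED\text{-}set}x_j^2 \\ &= \frac{n^2\left(n-1\right)}{6}-\frac{2}{3}\sum_{DED\text{-}set}x_j^2 \end{aligned} $$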

Relation (2) shows that the number of detectable triple-bit errors can be maximized if the square sum is minimized. Since the sum over x_j is constant, the square sum can be minimized by (a) increasing the number of syndromes in the DED-set and (b) balancing the x_j values.

In the case of linear block SEC-DED codes with fixed parity, the first possibility is not an option as the DED-set cardinality is 2^{r-1} - 1. Hence, the upper limit of (2) can be calculated by replacing the x_j's with integer values which are as balanced as possible. The obtained upper limits, which are the complement of the lower limits illustrated in Table 6, are similar to the limits reported in [17].
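The balancing argument can be turned into a short calculation. The sketch below is an illustration under stated assumptions: it distributes the n(n-1)/2 double-bit errors as evenly as possible over the 2^{r-1} - 1 DED syndromes, evaluates relation (2) in integer arithmetic, and rounds down when the result is not an integer. The function names are hypothetical, and the printed values are derived from relation (2) only, not copied from Table 6.

```python
from math import comb

def ted_upper_limit(n, r):
    """Upper limit of relation (2) for a fixed-parity SEC-DED code."""
    ded_size = 2**(r - 1) - 1          # DED-set cardinality with fixed parity
    double_errors = n * (n - 1) // 2   # sum of the x_j values
    q, rem = divmod(double_errors, ded_size)
    # Balanced integers: rem syndromes take q + 1 double-bit errors, the rest take q.
    square_sum = rem * (q + 1)**2 + (ded_size - rem) * q**2
    # n^2 (n - 1) / 6 - (2 / 3) * square_sum, written over the common denominator 6.
    return (n * n * (n - 1) - 4 * square_sum) // 6

def min_undetected_triples(n, r):
    """Complementary lower limit on undetectable (miscorrected) triple-bit errors."""
    return comb(n, 3) - ted_upper_limit(n, r)

print(ted_upper_limit(72, 8), min_undetected_triples(72, 8))
```

As a sanity check, a full-length code with n = 8 and r = 4 has no spare odd-parity syndromes, and the function accordingly returns 0 detectable triple-bit errors.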

Table 6 Limits for the minimal numbers of undetectable (miscorrected) triple-bit errors in shortened SEC-DED codes with fixed parity

Expression (2) can also be applied to linear block SEC-DED codes without fixed parity. In such a case, the DED and TED sets are not necessarily disjoint and the cardinality of the DED-set can be increased in order to improve the upper limit of (2). Explorations of such an enlarged search space have been performed in [17] and are beyond the scope of the present study.

Cite this article

Evain, S., Savin, V. & Gherman, V. Error Correction Schemes with Erasure Information for Fast Memories. J Electron Test 30, 183–192 (2014). https://doi.org/10.1007/s10836-014-5440-1

  • DOI: https://doi.org/10.1007/s10836-014-5440-1