Error Correction Schemes with Erasure Information for Fast Memories

Evain, Samuel; Savin, Valentin; Gherman, Valentin

doi:10.1007/s10836-014-5440-1

Published: 21 March 2014

Volume 30, pages 183–192, (2014)
Journal of Electronic Testing

Abstract

Two error correction schemes are proposed for word-oriented binary memories that can be affected by erasures, i.e. errors with known location but unknown value. The erasures considered here are due to the drifting of the electrical parameter used to encode information outside the normal ranges associated with a logic 0 or a logic 1 value. For example, a dielectric breakdown in a magnetic memory cell may reduce its electrical resistance significantly below the levels which correspond to logic 0 and logic 1 values stored in healthy memory cells. Such deviations can be sensed during memory read operations and the acquired information can be used to boost the correction capability of an error-correcting code (ECC). The proposed schemes enable the correction of double-bit errors based on the combination of erasure information with single-bit error correction and double-bit error detection (SEC-DED) codes or shortened single-bit error correction (SEC) codes. The correction of single-bit errors is always guaranteed. Ways to increase the number of double-bit and triple-bit errors that can be detected by shortened SEC and SEC-DED codes are considered in order to augment the error correction capability of the proposed solutions.

References

1. Chase D (1972) A class of algorithms for decoding block codes with channel measurement information. IEEE Trans Inf Theory IT-18(1):170–182
2. Chen CL, Hsiao MY (1984) Error-correcting codes for semiconductor memory applications: a state of the art review. IBM J Res Dev 28(2):124–134
3. Chishti Z, Alameldeen AR, Wilkerson C, Wu W, Lu S-L (2009) Improving cache lifetime reliability at ultra-low voltages. IEEE/ACM International Symposium on Microarchitecture, MICRO-42, New York City
4. Dell TJ (1997) A white paper on the benefits of chipkill-correct ECC for PC server main memory. IBM Microelectronics Division
5. Dimitrov DV, Gao Z, Wang X, Jung W, Lou X, Heinonen OG (2009) Dielectric breakdown of MgO magnetic tunnel junctions. Appl Phys Lett 94:123110
6. Dong G, Xie N, Zhang T (2011) On the use of soft-decision error-correcting codes in NAND flash memory. IEEE Trans Circ Syst 58(2):429–439
7. Forney GD Jr (1965) On decoding BCH codes. IEEE Trans Inf Theory IT-11(4):549–557
8. Forney GD Jr (1966) Generalized minimum distance decoding. IEEE Trans Inf Theory IT-12(2):125–131
9. Gherman V, Evain S, Seymour N, Bonhomme Y (2011) Generalized parity-check matrices for SEC-DED codes with fixed parity. IEEE International On-Line Testing Symposium (IOLTS), pp 200–203
10. Gherman V, Evain S, Bonhomme Y (2013) Memory reliability improvement based on maximized error-correcting codes. J Electron Test Theory Appl 29(4):601–608
11. Godard B, Daga J-M, Torres L, Sassatelli G (2008) Hierarchical code correction and reliability management in embedded NOR flash memories. IEEE European Test Symposium (ETS), pp 84–90
12. Hamming RW (1950) Error correcting and error detecting codes. Bell Syst Tech J 29:147–160
13. Hsiao MY (1970) A class of optimal minimum odd-weight-column SEC-DED codes. IBM J Res Dev 14:395–401
14. Kim D, Kim T, Kim S, Kong JH, Yu Y, Char K (2003) Evolution of electrical properties of magnetic tunnel junction through successive dielectric breakdowns. Jpn J Appl Phys 42(3):1242–1245
15. Lin S, Costello DJ (1983) Error control coding: fundamentals and applications. Prentice-Hall, Inc., Englewood Cliffs
16. Panagopoulos G, Augustine C, Roy K (2011) Modeling of dielectric breakdown-induced time-dependent STT-MRAM performance degradation. Device Research Conference, pp 125–126
17. Richter M, Oberlaender K, Goessel M (2008) New linear SEC-DED codes with reduced triple error miscorrection probability. IEEE International On-Line Testing Symposium (IOLTS), pp 37–42
18. Savin V (2008) Self-corrected min-sum decoding of LDPC codes. IEEE International Symposium on Information Theory, pp 146–150
19. Stapper CH, Lee H-S (1992) Synergistic fault-tolerance for memory chips. IEEE Trans Comput 41(9):1078–1087
20. Tang DD, Lee Y-J (2010) Magnetic memory: fundamentals and technology. Cambridge University Press, pp 96–98, ISBN: 0521449642
21. Wang J, Courtade T, Shankar H, Wesel RD (2011) Soft information for LDPC decoding in flash: mutual-information optimized quantization. IEEE Global Telecommunications Conference, pp 1–6
22. Weng M-I, Lee L-N (1983) Weighted erasure codec for the (24,12) extended Golay code. United States Patent, 4397022

Acknowledgements

We are very grateful to Ken Mackay and Jérémy Alvarez-Hérault from Crocus Technology, and to Nicolas Ravot from CEA for their valuable advice.

We would like to thank Michael Richter for his assistance in deriving tighter limits in Appendix 2.

The first and last authors acknowledge partial support from the EMYR (Enhancement of MRAM Memory Yield and Reliability) project funded by the French National Research Agency.

The second author acknowledges funding from the European Union’s FP7 Programme, under Grant Agreement number 309129 (i-RISC project).

Author information

Authors and Affiliations

CEA, LIST, Saclay Nano-INNOV, PC 172, 91191, Gif sur Yvette Cedex, France
Samuel Evain & Valentin Gherman
CEA, LETI, MINATEC Campus, 17 rue des Martyrs, 38054, Grenoble Cedex 9, France
Valentin Savin

Corresponding author

Correspondence to Valentin Gherman.

Additional information

Responsible Editor: G. Di Natale

This paper is based on a presentation at the 18th IEEE European Test Symposium, Avignon, France, May 27-31, 2013.

Appendixes

1.1 Appendix 1

A linear block SEC code can be defined with the help of a binary matrix, called parity-check matrix or H-matrix, such that a binary column vector V is a code word if and only if it fulfils the relation below [15]:

$$ H\cdot V=0 $$

Each H-matrix column corresponds to a particular bit position in the code words. The number of H-matrix rows is equal to the check-bit number. During the decoding process of a vector V, a syndrome S is calculated with the expression below:

$$ S=H\cdot V \tag{1} $$

If S is the all-zero syndrome, the binary vector V is assumed to be an error-free code word. A single-bit error generates a syndrome identical to the H-matrix column that corresponds to the corrupted bit position, while a double-bit error produces a syndrome equal to the bitwise modulo-2 sum of the H-matrix columns that correspond to the erroneous bit positions. SEC capability can be ensured if the columns of the H-matrix are different from each other and from the all-zero column. The maximum number of bits n in a SEC code word is 2^r - 1, where r is the check-bit number.
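The syndrome behaviour described above can be illustrated with a minimal sketch. It assumes the unshortened (7,4) Hamming code as an example (r = 3, n = 7) and stores each H-matrix column as an r-bit integer; the names `H_COLUMNS`, `syndrome` and `flip` are illustrative and do not come from the paper.

```python
from functools import reduce
from operator import xor

# Assumed example: the (7,4) Hamming code with r = 3 check bits and
# n = 2**3 - 1 = 7. Each H-matrix column is stored as an r-bit integer;
# here the column for bit position j is simply j + 1.
H_COLUMNS = [1, 2, 3, 4, 5, 6, 7]

def syndrome(columns, v):
    """Return S = H.V over GF(2): the XOR of the columns selected by the 1-bits of v."""
    return reduce(xor, (col for col, bit in zip(columns, v) if bit), 0)

def flip(v, *positions):
    """Return a copy of v with the bits at the given positions inverted."""
    w = list(v)
    for p in positions:
        w[p] ^= 1
    return w

code_word = [1, 1, 1, 0, 0, 0, 0]  # columns 1 ^ 2 ^ 3 == 0, so H.V = 0

s_clean = syndrome(H_COLUMNS, code_word)                # all-zero syndrome
s_single = syndrome(H_COLUMNS, flip(code_word, 4))      # equals column 4
s_double = syndrome(H_COLUMNS, flip(code_word, 1, 4))   # equals column 1 XOR column 4
```

In this unshortened code every non-zero r-bit vector is an H-matrix column, so the double-bit error above yields a syndrome that coincides with the column of another bit position and would be mistaken for a single-bit error.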

Usually, the number of data-bits accessed in a memory read or write operation is a power of 2. As a result, shortened SEC codes are used in which the H-matrix columns do not exploit all possible combinations of r-bit vectors. In such a case, r-bit vectors which are different from the H-matrix columns and the all-zero vector can be used as syndromes for double-bit error detection (DED) [17].

In a linear block SEC code, two different double-bit errors that generate the same syndrome necessarily affect disjoint pairs of bit positions. Consequently, the maximum number of double-bit errors that can generate the same syndrome is ⌊n/2⌋. The product of this number and the number of syndromes available for DED, 2^r - 1 - n, can be used as an upper limit for the number of double-bit errors that can be detected with a shortened SEC code.
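As a small worked example of this upper limit, the sketch below evaluates the product for an assumed word size of 32 data-bits protected by 6 check-bits (n = 38, r = 6). The function name `ded_upper_limit` and the chosen code size are illustrative only.

```python
from math import comb

def ded_upper_limit(n, r):
    """Upper limit on the double-bit errors detectable by a shortened SEC code:
    floor(n / 2) errors per syndrome times (2**r - 1 - n) spare syndromes."""
    spare_syndromes = 2**r - 1 - n
    if spare_syndromes < 0:
        raise ValueError("n exceeds the maximum SEC code length 2**r - 1")
    return (n // 2) * spare_syndromes

n, r = 38, 6                      # assumed example: 32 data-bits + 6 check-bits
limit = ded_upper_limit(n, r)     # 19 * 25 = 475
all_double_errors = comb(n, 2)    # 703 possible double-bit errors
```

For this assumed code size the limit stays below the total number of double-bit errors, so a shortened SEC code of this size cannot detect all of them.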

Each syndrome S_i available for DED can be used to define a linear function F_{S_i} on the syndrome space {0,1}^r as below:

$$ F_{S_i}:\{0,1\}^r\to\{0,1\}^r,\qquad F_{S_i}(X)=S_i\oplus X $$

where the symbol ‘⊕’ stands for the bitwise modulo-2 sum.

Maximum DED capability implies that F_{S_i} maps as many syndromes X used for SEC as possible onto other syndromes used for SEC. Since F_{S_i} is a bijection, it will then also map a maximum number of syndromes available for DED onto each other. The latter syndromes can be grouped in zero-sum triples (S_i, S_j, S_l) that contain the syndrome S_i as follows:

$$ F_{S_i}\left(S_j\right)=S_l\iff S_i\oplus S_j\oplus S_l=0 $$

A maximum DED capability requires a maximum number of such triples. Moreover, this property should be imposed on any linear function F_{S_j} defined as above with the help of each syndrome S_j available for DED. Consequently, a maximum number of zero-sum triples should be formed by all the syndromes available for DED.

We use a greedy algorithm to find 2^r - 1 - n r-bit vectors which are different from the all-zero syndrome and define as many zero-sum triples as possible. The remaining non-zero r-bit vectors are used to fill the H-matrix columns for an n-bit SEC code. Subsequently, the H-matrix density can be reduced by performing linear operations on the H-matrix rows such that the resulting H-matrix defines the same SEC code [10].
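The text does not detail the greedy algorithm, so the following is only one possible sketch of such a selection, not the authors' procedure. It assumes that, at each step, the non-zero vector closing the most new zero-sum triples with the already selected vectors is added (ties go to the smallest value). The function names and the example size (r = 4, n = 12) are hypothetical.

```python
from itertools import combinations

def greedy_ded_syndromes(r, n):
    """Greedily pick 2**r - 1 - n non-zero r-bit vectors forming many zero-sum triples."""
    chosen = set()
    candidates = set(range(1, 2**r))
    for _ in range(2**r - 1 - n):
        def gain(v):
            # A new triple {a, a ^ v, v} is seen twice: once from a, once from a ^ v.
            return sum(1 for a in chosen if (a ^ v) in chosen) // 2
        best = max(sorted(candidates), key=gain)  # first maximum wins on ties
        chosen.add(best)
        candidates.remove(best)
    return sorted(chosen)

def count_zero_sum_triples(vectors):
    """Count the triples {a, b, c} with a ^ b ^ c == 0 inside a set of vectors."""
    s = set(vectors)
    return sum(1 for a, b in combinations(sorted(s), 2) if b < (a ^ b) and (a ^ b) in s)

def detected_double_errors(columns, ded_syndromes):
    """Count the column pairs whose modulo-2 sum is a syndrome reserved for DED."""
    ded = set(ded_syndromes)
    return sum(1 for a, b in combinations(columns, 2) if (a ^ b) in ded)

r, n = 4, 12                                   # assumed example: 8 data-bits + 4 check-bits
ded = greedy_ded_syndromes(r, n)               # [1, 2, 3]
columns = [v for v in range(1, 2**r) if v not in ded]
detected = detected_double_errors(columns, ded)
```

In this small case the three selected vectors form one zero-sum triple and the resulting code detects 18 double-bit errors, which matches the upper limit ⌊n/2⌋ · (2^r - 1 - n) = 6 · 3. A greedy choice is not guaranteed to reach the limit for other sizes.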

1.2 Appendix 2

Consider a shortened linear block SEC-DED code with k data-bits and r check-bits (n = k + r) per code word. Consider also that all code words have the same parity [9, 12, 13]. During the decoding process, r-bit vectors, called syndromes, are calculated as shown in (1). One can identify the following sets of syndromes:

1. a set composed of the all-zero syndrome, used to identify error-free code words,
2. a SEC-set which contains all syndromes used for single-bit error correction,
3. a DED-set which contains all syndromes used for double-bit error detection, i.e. which is disjoint from the first two sets,
4. a TED-set which contains all syndromes that can be used for triple-bit error detection, i.e. which is disjoint from the first two sets.

Due to the fixed parity of the SEC-DED code, the DED and TED sets are also disjoint.

With respect to each syndrome S_j in the DED-set, the SEC-set can be partitioned into two sub-sets:

1. a sub-set of syndromes S_i that correspond to single-bit errors which affect bit positions that can also be affected by a double-bit error identified by the syndrome S_j,
2. a sub-set of syndromes S_l that correspond to single-bit errors which affect bit positions that cannot be affected by any of the double-bit errors identified by the syndrome S_j.

By definition, the bitwise modulo-2 sum of the syndromes S_l and S_j is distinct from any syndrome that corresponds to a single-bit error. It is also different from the all-zero syndrome and from any syndrome that corresponds to a double-bit error, due to the fixed parity of the code words. Consequently, this sum corresponds to the syndrome of a detectable triple-bit error. Given x_j, the number of double-bit errors that can be detected by the syndrome S_j, there are n - 2x_j single-bit errors which affect bit positions that cannot be affected by a double-bit error identified by the syndrome S_j. Hence, there are x_j(n - 2x_j) triple-bit errors that can be detected by a syndrome equal to a combination of S_l and S_j syndromes.
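The counting argument can be checked by brute force on a small code. The sketch below assumes a hypothetical fixed-parity SEC-DED code with r = 5 and n = 13 whose H-matrix columns are distinct odd-weight vectors (in the style of [13]); the particular column choice is arbitrary and is not taken from the paper. Each detectable triple-bit error is counted three times by the products x_j(n - 2x_j), hence the division by 3.

```python
from collections import Counter
from itertools import combinations
from math import comb

def weight(v):
    """Number of 1-bits in the integer v."""
    return bin(v).count("1")

r, n = 5, 13  # assumed example: 8 data-bits + 5 check-bits

# Arbitrary illustrative choice: the n odd-weight r-bit vectors of lowest weight.
odd_vectors = sorted((v for v in range(1, 2**r) if weight(v) % 2 == 1),
                     key=lambda v: (weight(v), v))
columns = odd_vectors[:n]
column_set = set(columns)

# x[S_j] = number of double-bit errors that produce the DED syndrome S_j.
x = Counter(a ^ b for a, b in combinations(columns, 2))

# Count via the products x_j * (n - 2 * x_j), each triple being counted 3 times.
ted_formula = sum(xj * (n - 2 * xj) for xj in x.values()) // 3

# Direct count: triple-bit errors whose syndrome is not an H-matrix column.
ted_brute = sum(1 for a, b, c in combinations(columns, 3)
                if (a ^ b ^ c) not in column_set)
```

Under these assumptions both counts agree, and the x_j values sum to n(n-1)/2 because every double-bit error maps to exactly one syndrome of the DED-set.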

The total number of detectable triple-bit errors can be expressed as below:

$$ \#TED=\sum_{DED\text{-}set}\frac{x_j\left(n-2x_j\right)}{3}=\frac{n^2\left(n-1\right)}{6}-\frac{2}{3}\sum_{DED\text{-}set}x_j^2 \tag{2} $$

where the sum over the x_j’s is replaced by n(n-1)/2 and the divisor 3 is introduced to account for the fact that a triple-bit error can be generated by three different combinations of a double-bit error and a single-bit error.
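For readers who want the intermediate step, relation (2) can be expanded as follows. The only ingredient is that every one of the n(n-1)/2 double-bit errors is identified by exactly one syndrome of the DED-set, so the x_j values sum to n(n-1)/2:

$$ \sum_{DED\text{-}set}\frac{x_j\left(n-2x_j\right)}{3}=\frac{n}{3}\sum_{DED\text{-}set}x_j-\frac{2}{3}\sum_{DED\text{-}set}x_j^2=\frac{n}{3}\cdot\frac{n\left(n-1\right)}{2}-\frac{2}{3}\sum_{DED\text{-}set}x_j^2=\frac{n^2\left(n-1\right)}{6}-\frac{2}{3}\sum_{DED\text{-}set}x_j^2 $$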

Relation (2) shows that the number of detectable triple-bit errors can be maximized if the square sum is minimized. Since the sum over the x_j values is constant, the square sum can be minimized by (a) increasing the number of syndromes in the DED-set and (b) balancing the x_j values.

In the case of linear block SEC-DED codes with fixed parity, the first possibility is not an option as the DED-set cardinality is 2^(r-1) - 1. Hence, the upper limit of (2) can be calculated by replacing the x_j’s with integer values which are as balanced as possible. The obtained upper limits, which are the complement of the lower limits illustrated in Table 6, are similar to the limits reported in [17].
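The balancing step can be sketched as below. This is a direct evaluation of (2) with the n(n-1)/2 double-bit errors spread as evenly as possible over the 2^(r-1) - 1 syndromes of the DED-set; it is an illustration under that assumption only, and the values it returns are not claimed to reproduce Table 6, whose limits may be tighter. The function name and the (39,32) example size are hypothetical.

```python
from math import comb

def ted_upper_limit(n, r):
    """Upper limit of relation (2) for a fixed-parity SEC-DED code,
    using x_j values that are as balanced as possible."""
    ded_set_size = 2**(r - 1) - 1
    double_errors = comb(n, 2)                    # constant sum of the x_j values
    q, rem = divmod(double_errors, ded_set_size)
    # rem syndromes receive q + 1 double-bit errors, the others receive q.
    square_sum = rem * (q + 1)**2 + (ded_set_size - rem) * q**2
    # (2) multiplied by 3 is n * sum(x_j) - 2 * sum(x_j**2); the count is an
    # integer, so the floor of the real-valued limit is still an upper limit.
    return (n * double_errors - 2 * square_sum) // 3

n, r = 39, 7                                      # assumed example: 32 data-bits + 7 check-bits
detectable_limit = ted_upper_limit(n, r)          # 3815
miscorrected_limit = comb(n, 3) - detectable_limit  # 9139 - 3815 = 5324
```

The complement with respect to the total number of triple-bit errors gives a lower limit on the undetectable (miscorrected) ones, which is the quantity the table caption refers to.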

Table 6 Limits for the minimal numbers of undetectable (miscorrected) triple-bit errors in shortened SEC-DED codes with fixed parity

Expression (2) can also be applied to linear block SEC-DED codes without fixed parity. In such a case, the DED and TED sets are not necessarily disjoint and the cardinality of the DED-set can be increased in order to improve the upper limit of (2). Explorations of such an enlarged search space have been performed in [17] and are beyond the scope of the present study.

About this article

Cite this article

Evain, S., Savin, V. & Gherman, V. Error Correction Schemes with Erasure Information for Fast Memories. J Electron Test 30, 183–192 (2014). https://doi.org/10.1007/s10836-014-5440-1

Received: 25 September 2013
Accepted: 25 February 2014
Published: 21 March 2014
Issue Date: April 2014
DOI: https://doi.org/10.1007/s10836-014-5440-1
