Synonyms
Definition of Entry
Conventional or well-established redundancy methods for preventing data loss, unavailability, or corruption can be used to protect big data, but they must be adapted to remain efficient when applied to large data sets.
Overview
Data stored in memory devices, in storage networks, on the Web, or in the Cloud must be protected against loss, accidental contamination, or deliberate adulteration. Data are valuable assets that can be lost to negligence or theft (for illicit use or to exchange for ransom). Over the years, many methods of data protection have been devised by researchers in the field of dependable and fault-tolerant computing (Jalote 1994; Parhami 2018), all of which entail introducing redundancy to make data robust and recoverable in the event of loss or corruption. As data assume ever more important roles in the proper functioning of systems that affect our daily lives, greater...
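The redundancy idea mentioned above can be illustrated with a minimal sketch. The code is not taken from the entry; the helper names xor_parity and recover_missing, the block contents, and the block size are all hypothetical choices made for illustration. It contrasts replication (keeping a full copy) with a simple encoding: a single XOR parity block of the kind used in RAID-style storage, which allows any one lost block to be rebuilt from the survivors.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity block."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three equal-length data blocks (illustrative values only).
data = [b"big ", b"data", b"sets"]

# Replication: a full second copy survives the loss of the first,
# at the cost of 100% extra storage.
replica = list(data)

# Encoding: one parity block protects all three data blocks,
# at the cost of roughly 33% extra storage in this example.
parity = xor_parity(data)

# Suppose block 1 is lost; rebuild it from blocks 0 and 2 and the parity.
rebuilt = recover_missing([data[0], data[2]], parity)
assert rebuilt == data[1]
```

The trade-off shown here is only the simplest case: the parity block costs far less storage than a replica but tolerates just one missing block, and recovery requires reading all surviving blocks.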
References
Arazi B (1987) A commonsense approach to the theory of error-correcting codes. MIT Press, Cambridge, MA
Armbrust M et al (2010) A view of cloud computing. Commun ACM 53(4):50–58
Avizienis A (1971) Arithmetic error codes: cost and effectiveness studies for application in digital system design. IEEE Trans Comput 20(11):1322–1331
Avizienis A (1973) Arithmetic algorithms for error-coded operands. IEEE Trans Comput 22(6):567–572
Beimel A (2011) Secret-sharing schemes: a survey. In: Proceedings of international conference coding and cryptology, Springer LNCS no. 6639, Berlin, pp 11–46
Benedetto S, Montorsi G (1996) Unveiling turbo codes: some results on parallel concatenated coding schemes. IEEE Trans Inf Theory 42(2):409–428
Berger JM (1961) A note on error detection codes for asymmetric channels. Inf Control 4:68–73
Budhiraja N, Marzullo K, Schneider FB, Toueg S (1993) The primary-backup approach. Distrib Syst 2:199–216
Caulfield AM et al (2016) A cloud-scale acceleration architecture. In: Proceedings of 49th IEEE/ACM international symposium on microarchitecture, Taipei, Taiwan, pp 1–13
Chen CLP, Zhang C-Y (2014) Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf Sci 275:314–347
Chen PM, Lee EK, Gibson GA, Katz RH, Patterson DA (1994) RAID: high-performance reliable secondary storage. ACM Comput Surv 26(2):145–185
Dimakis AG, Ramachandran K, Wu Y, Suh C (2011) A survey on network codes for distributed storage. Proc IEEE 99(3):476–489
Dullmann D et al (2001) Models for replica synchronization and consistency in a data grid. In: Proceedings of 10th IEEE international symposium on high performance distributed computing, San Francisco, CA, pp 67–75
Feng G-L, Deng RH, Bao F, Shen J-C (2005) New efficient MDS array codes for RAID – part I: Reed-Solomon-like codes for tolerating three disk failures. IEEE Trans Comput 54(9):1071–1080; Part II: Rabin-like codes for tolerating multiple (≥ 4) disk failures. IEEE Trans Comput 54(12):1473–1483
Gallager R (1962) Low-density parity-check codes. IRE Trans Inf Theory 8(1):21–28
Garner HL (1966) Error codes for arithmetic operations. IEEE Trans Electron Comput 5:763–770
Garrett P (2004) The mathematics of coding theory. Prentice Hall, Upper Saddle River
Guerraoui R, Schiper A (1997) Software-based replication for fault tolerance. IEEE Comput 30(4):68–74
Guruswami V, Rudra A (2009) Error correction up to the information-theoretic limit. Commun ACM 52(3):87–95
Hamming RW (1950) Error detecting and error correcting codes. Bell Labs Tech J 29(2):147–160
Hankerson R et al (2000) Coding theory and cryptography: the essentials. Marcel Dekker, New York
Hilbert M, López P (2011) The world’s technological capacity to store, communicate, and compute information. Science 332:60–65
Hu H, Wen Y, Chua T-S, Li X (2014) Toward scalable systems for big data analytics: a technology tutorial. IEEE Access 2:652–687
Iyengar A, Cahn R, Garay JA, Jutla C (1998) Design and implementation of a secure distributed data repository. IBM Thomas J. Watson Research Division, Yorktown Heights
Jalote P (1994) Fault tolerance in distributed systems. Prentice Hall, Englewood Cliffs
Knuth DE (1986) Efficient balanced codes. IEEE Trans Inf Theory 32(1):51–53
Lin S, Costello DJ (2004) Error control coding, 2nd edn. Prentice Hall, Upper Saddle River
McAuley AJ (1994) Weighted sum codes for error detection and their comparison with existing codes. IEEE/ACM Trans Networking 2(1):16–22
Parhami B (2018) Dependable computing: a multi-level approach. Draft of book manuscript, available on-line at: http://www.ece.ucsb.edu/~parhami/text_dep_comp.htm
Parhami B, Avizienis A (1978) Detection of storage errors in mass memories using arithmetic error codes. IEEE Trans Comput 27(4):302–308
Petascale Data Storage Institute (2012) Analyzing failure data. Project Web site: http://www.pdl.cmu.edu/PDSI/FailureData/index.html
Peterson WW, Brown DT (1961) Cyclic codes for error detection. Proc IRE 49(1):228–235
Peterson WW, Weldon EJ Jr (1972) Error-correcting codes, 2nd edn. MIT Press, Cambridge, MA
Pless V (1998) Bose-Chaudhuri-Hocquenghem (BCH) codes. In: Introduction to the theory of error-correcting codes, 3rd edn. Wiley, New York, pp 109–222
Rabin M (1989) Efficient dispersal of information for security, load balancing, and fault tolerance. J ACM 36(2):335–348
Rao TRN, Fujiwara E (1989) Error-control coding for computer systems. Prentice Hall, Upper Saddle River, NJ
Reed I, Solomon G (1960) Polynomial codes over certain finite fields. SIAM J Appl Math 8:300–304
Schroeder B, Gibson GA (2007) Understanding disk failure rates: what does an MTTF of 1,000,000 hours mean to you? ACM Trans Storage 3(3):Article 8, 31 pp
Sklar B, Harris FJ (2004) The ABCs of linear block codes. IEEE Signal Process 21(4):14–35
Stanford University (2012) 21st century computer architecture: a community white paper. On-line: http://csl.stanford.edu/~christos/publications/2012.21stcenturyarchitecture.whitepaper.pdf
Wakerly JF (1978) Error detecting codes, self-checking circuits and applications. North Holland, New York
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this entry
Cite this entry
Parhami, B. (2018). Data Replication and Encoding. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_174-1
DOI: https://doi.org/10.1007/978-3-319-63962-8_174-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering