Abstract
Because they consume little storage, erasure codes have been widely adopted by storage systems such as HDFS, Ceph, and Microsoft Azure to provide reliability. As stored data volumes grow rapidly, protecting against co-occurring failures becomes increasingly important. RAID-6 codes can handle multi-disk faults to some extent, but they are prone to catastrophic data loss if two disks reside on the same shelf and that shelf suffers an unrecoverable failure. Triple disk failure tolerant (3DFT) codes, which tolerate more concurrent faults, are therefore increasingly popular. Existing 3DFT codes have either low computational efficiency, e.g., Reed–Solomon (RS) codes and minimum storage regenerating (MSR) codes, or high write overhead, e.g., Cauchy RS (CRS) codes, STAR codes, and RTP codes. In this paper, we propose a family of lowest-density XOR-based 3DFT codes, named Thou codes, which have the lowest storage overhead, high encoding and decoding performance, and optimal write overhead. Thou codes are constructed by an efficient search over a cluster of matrices derived from Liberation codes, a class of lowest-density RAID-6 codes. Experimental results demonstrate that Thou codes have high encoding and decoding efficiency and outperform three state-of-the-art 3DFT codes in write performance. In experiments under real I/O workloads, Thou codes reduce write overhead by 21.9% and 26.8% on average compared with STAR codes and RTP codes, respectively.
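To make the notion of write overhead concrete, the following minimal sketch counts how many parity elements must be rewritten when a single data element is updated in an XOR-based code. It assumes the code is described by a binary matrix with one row per parity element and one column per data element. The function names and the toy matrix are hypothetical illustrations, not the Thou code construction.

```python
def parity_updates_per_data_element(parity_rows):
    """Count, for each data element, the parity elements that depend on it.

    parity_rows[i][j] == 1 means parity element i XORs in data element j,
    so updating data element j forces parity element i to be rewritten.
    """
    num_data = len(parity_rows[0])
    return [sum(row[j] for row in parity_rows) for j in range(num_data)]


def average_write_overhead(parity_rows):
    """Average number of parity updates per single data-element update."""
    counts = parity_updates_per_data_element(parity_rows)
    return sum(counts) / len(counts)


# Toy layout (hypothetical): 4 data elements d0..d3, 3 parity elements p0..p2.
toy = [
    [1, 1, 1, 1],  # p0 = d0 ^ d1 ^ d2 ^ d3
    [1, 0, 1, 0],  # p1 = d0 ^ d2
    [0, 1, 1, 0],  # p2 = d1 ^ d2
]

per_element = parity_updates_per_data_element(toy)  # [2, 2, 3, 1]
avg = average_write_overhead(toy)                   # 2.0
```

In this toy layout, updating d2 touches all three parity elements while updating d3 touches only one. A lower-density matrix, with fewer ones per column, gives a lower average.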
Notes
Write performance is characterized by the average number of parity elements that must be modified when one data element, or several consecutive data elements, are updated. This count is the write overhead (also called write complexity), and write performance is inversely related to it.
In this paper, "disk" and "node" are used interchangeably.
A permutation matrix is a square binary matrix with exactly one 1 in each row and each column, and 0s elsewhere.
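The permutation-matrix definition in the last note can be checked mechanically. The sketch below is only an illustration with hypothetical helper names. It tests the one-1-per-row-and-column property and builds a cyclic shift matrix, a common example of a permutation matrix.

```python
def is_permutation_matrix(m):
    """Return True if m is square, binary, and has exactly one 1 per row and column."""
    n = len(m)
    if n == 0 or any(len(row) != n for row in m):
        return False
    if any(v not in (0, 1) for row in m for v in row):
        return False
    rows_ok = all(sum(row) == 1 for row in m)
    cols_ok = all(sum(m[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


def cyclic_shift_matrix(n, shift):
    """n x n matrix whose row i has its single 1 in column (i + shift) mod n."""
    return [[1 if j == (i + shift) % n else 0 for j in range(n)] for i in range(n)]


shifted = cyclic_shift_matrix(5, 2)
ok = is_permutation_matrix(shifted)                   # True
not_ok = is_permutation_matrix([[1, 0], [1, 0]])      # False: column 1 has no 1
```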
Acknowledgements
This research has been supported by the China National Key R&D Program during the 13th Five-year Plan Period (Grant No. 2018YFB1700405).
Cite this article
Liang, N., Zhang, X., Chen, H. et al. Thou code: a triple-erasure-correcting horizontal code with optimal update complexity. J Supercomput 78, 10088–10117 (2022). https://doi.org/10.1007/s11227-021-04271-9
DOI: https://doi.org/10.1007/s11227-021-04271-9