On Fault Tolerance, Locality, and Optimality in Locally Repairable Codes

Published: 22 May 2020

Abstract

Erasure codes in large-scale storage systems allow recovery of data from a failed node. A recently developed class of codes, locally repairable codes (LRCs), offers tradeoffs between storage overhead and repair cost. LRCs facilitate efficient recovery scenarios by adding parity blocks to the system. However, these additional blocks may eventually increase the number of blocks that must be reconstructed. Existing LRCs differ in their use of the parity blocks, in their locality semantics, and in their parameter space. Thus, existing theoretical models cannot directly compare different LRCs to determine which code offers the best recovery performance, and at what cost.
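
As a minimal illustration of the storage-versus-repair tradeoff described above, the sketch below assumes a hypothetical layout that is not taken from the article: six data blocks split into two local groups of three, each protected by a single XOR parity. It shows only that a lost block can be rebuilt from its own group by reading three blocks, where conventional repair in an MDS code with six data blocks (such as Reed-Solomon) would read six. It does not model Xorbas, Azure's LRCs, or Optimal-LRCs, and it does not compute the article's metrics.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical layout (an assumption for illustration only):
# 6 data blocks, 2 local groups of 3, one XOR local parity per group.
data = [bytes([i] * 4) for i in range(1, 7)]
groups = [data[0:3], data[3:6]]
local_parities = [xor_blocks(group) for group in groups]


def repair_locally(lost_index, group, parity):
    """Rebuild one lost block from the survivors of its group plus the parity.

    Returns the rebuilt block and the number of blocks that had to be read.
    """
    survivors = [block for i, block in enumerate(group) if i != lost_index]
    return xor_blocks(survivors + [parity]), len(survivors) + 1


# Lose the second block of the first group and rebuild it locally.
recovered, blocks_read = repair_locally(1, groups[0], local_parities[0])
print(recovered == data[1], blocks_read)
```

The two local parities are the extra storage paid for the cheaper repair. Any global parities needed to tolerate multiple failures would come on top of that and are omitted here.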

We perform the first systematic comparison of existing LRC approaches. We analyze Xorbas, Azure’s LRCs, and Optimal-LRCs in light of two new metrics: average degraded read cost and normalized repair cost. We show the tradeoff between these costs and the code’s fault tolerance, and that different approaches offer different choices in this tradeoff. Our experimental evaluation on a Ceph cluster further demonstrates the different effects of realistic system bottlenecks on the benefit from each LRC approach. Despite these differences, the normalized repair cost metric can reliably identify the LRC approach that would achieve the lowest repair cost in each setup.

Published in

ACM Transactions on Storage, Volume 16, Issue 2
SOSP 2019 Special Section and Regular Papers
May 2020, 194 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3399155
Editor: Sam H. Noh

Copyright © 2020 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 22 May 2020
• Online AM: 7 May 2020
• Accepted: 1 February 2020
• Revised: 1 December 2019
• Received: 1 August 2019

Qualifiers

• research-article
• Research
• Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format .

        View HTML Format