research-article

STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures

Published: 31 October 2014
Abstract

Practical storage systems often adopt erasure codes to tolerate device failures and sector failures, both of which are prevalent in the field. However, traditional erasure codes employ device-level redundancy to protect against sector failures, and hence incur significant space overhead. Recent sector-disk (SD) codes are available only for limited configurations. By making a relaxed but practical assumption, we construct a general family of erasure codes called STAIR codes, which efficiently and provably tolerate both device and sector failures without any restriction on the size of a storage array and the numbers of tolerable device failures and sector failures. We propose the upstairs encoding and downstairs encoding methods, which provide complementary performance advantages for different configurations. We conduct extensive experiments on STAIR codes in terms of space saving, encoding/decoding speed, and update cost. We demonstrate that STAIR codes not only improve space efficiency over traditional erasure codes, but also provide better computational efficiency than SD codes based on our special code construction. Finally, we present analytical models that characterize the reliability of STAIR codes, and show that the support of a wider range of configurations by STAIR codes is critical for tolerating sector failure bursts discovered in the field.
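As a rough illustration of the space argument above, the sketch below compares the redundancy cost per stripe of covering sector failures with whole parity devices against the cost of covering them with individual parity sectors. The parameter names (`r` sectors per device chunk, `m` tolerated device failures, `m_extra` additional parity devices that a device-level scheme would dedicate to sector failures, `s` parity sectors) and the function names are assumptions chosen for illustration. They do not come from the abstract, and the accounting is a simplification, not the exact STAIR layout.

```python
def device_level_overhead(r: int, m: int, m_extra: int) -> int:
    """Redundant sectors per stripe when sector failures are covered
    by m_extra whole parity devices on top of m parity devices."""
    return (m + m_extra) * r


def sector_level_overhead(r: int, m: int, s: int) -> int:
    """Redundant sectors per stripe when sector failures are covered
    by s individual parity sectors on top of m parity devices."""
    return m * r + s


def sectors_saved(r: int, m: int, m_extra: int, s: int) -> int:
    """Sectors saved per stripe by sector-level over device-level redundancy."""
    return device_level_overhead(r, m, m_extra) - sector_level_overhead(r, m, s)


if __name__ == "__main__":
    # Hypothetical configuration: 16 sectors per chunk, 1 tolerated device
    # failure, and either 1 extra parity device or 2 parity sectors.
    print(sectors_saved(r=16, m=1, m_extra=1, s=2))  # prints 14
```

Under these assumed numbers, the device-level scheme spends 32 redundant sectors per stripe and the sector-level scheme spends 18, so the saving grows with the chunk size `r`.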


                Published in

                ACM Transactions on Storage, Volume 10, Issue 4
                Special Issue on USENIX FAST 2014
                October 2014
                102 pages
                ISSN: 1553-3077
                EISSN: 1553-3093
                DOI: 10.1145/2685385
                Editor: Darrell Long

                Copyright © 2014 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States

                Publication History

                • Published: 31 October 2014
                • Accepted: 1 July 2014
                • Received: 1 June 2014

                Qualifiers

                • research-article
                • Research
                • Refereed
