Rebuild processing in RAID5 with emphasis on the supplementary parity augmentation method [37]

Author:

Alexander Thomasian

ACM SIGARCH Computer Architecture News, Volume 40, Issue 2

Pages 18 - 27

https://doi.org/10.1145/2234336.2234340

Published: 31 May 2012

Abstract

The rotated parity RAID5 disk array tolerates single disk failures: it continues to operate through on-demand reconstruction of data blocks of the failed disk until the rebuild process completes the systematic reconstruction of the failed disk's contents on a spare disk. Supplementary Parity Augmentation (SPA), unlike the pyramid code, which has two parities each covering half of the array's disks, extends RAID5's P parity with an additional S parity, which covers half of the disks. The extra load with respect to RAID5 of updating the S parity for one half of the disks is compensated by more efficient on-demand reconstruction and rebuild processing when a disk fails. Although SPA has the same disk space redundancy level as RAID6, unlike RAID6 it can only deal with roughly half of all possible double disk failure cases for eight disks. For rebuild processing SPA reads half of the disks required by RAID5, and this leads to a higher Mean Time to Data Loss than RAID5, since fewer Latent Sector Errors are encountered. We review performance and reliability modeling of RAID5 arrays to provide insights into SPA's performance and reliability, which cannot be gained from numerical results alone. SPA is outperformed by the Intra-Disk Redundancy schemes combined with RAID5, which result in RAID6's reliability and RAID5's performance.
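
To make the read-count argument concrete, the following is a minimal sketch of single-block reconstruction in one stripe. It is a toy model, not the layout from [37]: it assumes eight data blocks, a P parity over all of them, and an S parity over the first half only, with blocks modeled as small integers. The function names (`make_stripe`, `rebuild_raid5`, `rebuild_spa`) are hypothetical and chosen for illustration; parity rotation and the placement of S across disks are ignored.

```python
from functools import reduce
from operator import xor


def xor_all(blocks):
    """XOR together a sequence of integer-valued blocks."""
    return reduce(xor, blocks, 0)


def make_stripe(data):
    """Return (P, S) for one stripe.

    Assumption: P covers every data block, S covers only the first half.
    """
    half = len(data) // 2
    return xor_all(data), xor_all(data[:half])


def rebuild_raid5(data, p, failed):
    """Reconstruct data[failed] from all surviving data blocks plus P."""
    reads = [b for i, b in enumerate(data) if i != failed] + [p]
    return xor_all(reads), len(reads)


def rebuild_spa(data, p, s, failed):
    """Reconstruct data[failed] using the smaller parity group where possible."""
    half = len(data) // 2
    if failed < half:
        # Failed block is covered by S: read the rest of the first half and S.
        reads = [b for i, b in enumerate(data[:half]) if i != failed] + [s]
    else:
        # P xor S equals the parity of the second half, so the first half
        # does not need to be read at all.
        reads = [b for i, b in enumerate(data) if i >= half and i != failed]
        reads += [p, s]
    return xor_all(reads), len(reads)
```

In this toy stripe, rebuilding a block in the S-covered half reads four blocks and rebuilding a block in the other half reads five, against eight for plain RAID5. These counts follow only from the assumed layout and are not results from the paper.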

References

[1]

M. Blaum, J. Brady, J. Bruck, and J. Menon. "EVENODD: An efficient scheme for tolerating double disk failures in RAID architectures", IEEE Trans. Computers 44(2): 192--202 (February 1995).

[2]

A. Blum, A. Goyal, P. Heidelberger, S. S. Lavenberg, M. Nakayama, and P. Shahabuddin. "Modeling and analysis of system dependability using the system availability estimator", In Proc. 24th IEEE Ann'l Int'l Symp. on Fault-Tolerant Computing Systems (FTCS), Austin, TX, June 1994, 137--141.

[3]

P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson. "RAID: High-performance, reliable secondary storage", ACM Computing Surveys 26(2): 145--185 (June 1994).

[4]

P. F. Corbett, R. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar. "Row-diagonal parity for double disk failure correction", Proc. USENIX Conf. on File and Storage Technologies (FAST'04), San Francisco, CA, March-April 2004, 1--14.

[5]

A. Dholakia, E. Eleftheriou, X.-Y. Hu, I. Iliadis, J. Menon, and K. K. Rao. "A new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors", ACM Trans. on Storage (TOS) 4(1): article 1 (May 2008).

[6]

G. A. Gibson. Redundant Disk Arrays: Reliable, Parallel Secondary Storage, The MIT Press, 1992.

[7]

M. Holland, G. A. Gibson, and D. P. Siewiorek. "Architectures and algorithms for on-line failure recovery in redundant disk arrays", Distributed and Parallel Databases 2(3): 295--335 (July 1994).

[8]

R. Y. Hou, J. Menon, and Y. N. Patt. "Balancing I/O response time and disk rebuild time in a RAID5 disk array", In Proc. 26th Hawaii Int'l Conf. System Sciences (HICSS 26), Vol. I, Honolulu, HI, January 1993, 70--79.

[9]

C. Huang, M. Chen, and J. Li. "Pyramid codes: Flexible schemes to trade space for access efficiency", Proc. 6th Int'l Symp. on Network Computing and Applications (NCA 2007), Cambridge, MA, July 2007, 79--86.

[10]

I. Iliadis, R. Haas, X.-Y. Hu, and E. Eleftheriou. "Disk scrubbing versus intra-disk redundancy for high-reliability RAID storage systems", ACM Trans. on Storage (TOS) 7(2): article 5 (July 2011).

[11]

B. L. Jacob, S. W. Ng, and D. T. Wang. Memory Systems: Cache, DRAM, Disk, Elsevier, 2008.

[12]

H. H. Kari. "Latent Sector Faults and Reliability of Disk Arrays", PhD Thesis, University of Helsinki, Finland, 1997.

[13]

L. Kleinrock. Queueing Systems, Vol. I: Theory; Vol. II: Applications to Computer Systems, Wiley-Interscience, 1975/76.

[14]

E. Krevat, J. Tucek, and G. R. Ganger. "Disks are like snowflakes: No two are alike", In Proc. 13th Workshop on Hot Topics in Operating Systems (HotOS 2011), Napa Valley, CA, May 2011.

[15]

J. Y. B. Lee and J. C. S. Lui. "Automatic recovery from disk failure in continuous-media servers", IEEE Trans. Parallel Distributed Systems (TPDS) 13(5): 499--515 (May 2002).

[16]

S. W. Ng and R. L. Mattson. "Uniform parity group distribution in disk arrays with multiple disk failures", IEEE Trans. Computers 43(4): 501--506 (April 1994).

[17]

V. Nicola, M. Nakayama, P. Heidelberger, and A. Goyal. "Fast simulation of highly dependable systems with general failure and repair processes", IEEE Trans. Computers 42(12): 1440--1452 (December 1993).

[18]

J. Menon. "Performance of RAID5 disk arrays with read and write caching", Distributed and Parallel Databases 2(3): 261--293 (July 1994).

[19]

A. Merchant and P. S. Yu. "Analytic modeling of clustered RAID with mapping based on nearly random permutation", IEEE Trans. Computers 45(3): 367--373 (March 1996).

[20]

R. R. Muntz and J. C. S. Lui. "Performance analysis of disk arrays under failure", Proc. 16th Int'l Conf. on Very Large Data Bases (VLDB'90), Brisbane, Queensland, Australia, August 1990, 162--173.

[21]

Y. W. Ng and A. Avizienis. "A unified reliability model for fault-tolerant computers", IEEE Trans. Computers 29(11): 1002--1011 (November 1980).

[22]

K. K. Ramakrishnan, P. Biswas, and R. Karedla. "Analysis of file I/O traces in commercial computing environments", Proc. ACM SIGMETRICS/PERFORMANCE'92 Int'l Conf. on Measurement and Modeling of Computer Systems, Newport, Rhode Island, June 1992, 78--90.

[23]

K. K. Rao, J. L. Hafner, and R. A. Golding. "Reliability for networked storage nodes", IEEE Trans. Dependable Secure Computing (TDSC) 8(3): 404--418 (May-June 2011).

[24]

B. Schroeder and G. A. Gibson. "Understanding disk failure rates: What does an MTTF of 1,000,000 hours mean to you?", ACM Trans. on Storage (TOS) 3(3): article 8 (October 2007).

[25]

B. Schroeder, S. Damouras, and P. Gill. "Understanding latent sector errors and how to protect against them", ACM Trans. on Storage (TOS) 6(3): article 2 (2010).

[26]

H. Takagi. Queueing Analysis: A Foundation of Performance Evaluation Vacation and Priority Systems, Part 1, North-Holland, 1991.

[27]

A. Thomasian and J. Menon. "Performance analysis of RAID5 disk arrays with a vacationing server model for rebuild mode operation", Proc. 10th IEEE Int'l Conf. on Data Engineering (ICDE), Houston, TX, February 1994, 111--119.

[28]

A. Thomasian. "Rebuild options in RAID5 disk arrays", Proc. 7th IEEE Symp. Parallel and Distributed Systems, San Antonio, TX, October 1995, 511--518.

[29]

A. Thomasian and J. Menon. "RAID5 performance with distributed sparing", IEEE Trans. Parallel and Distributed Systems 8(6): 640--657 (June 1997).

[30]

A. Thomasian. "Clustered RAID arrays and their access costs", The Computer Journal 48(6): 702--713 (November 2005).

[31]

A. Thomasian, G. Fu, and C. Han. "Performance of two-disk failure-tolerant disk arrays", IEEE Trans. Computers 56(6): 799--814 (June 2007).

[32]

A. Thomasian, G. Fu, and S. W. Ng. "Analysis of rebuild processing in RAID5 disk arrays", The Computer Journal 50(2): 217--231 (March 2007).

[33]

A. Thomasian and M. Blaum. "Higher reliability redundant disk arrays: Organization, operation, and coding", ACM Trans. on Storage 5(3): article 7 (November 2009).

[34]

A. Thomasian. "Survey and analysis of disk scheduling methods", ACM SIGARCH Computer Architecture News 39(2): 8--25 (May 2011).

[35]

A. Thomasian and J. Xu. "X-code double parity array operation with two disk failures", Information Processing Letters (IPL) 111(12): 568--574 (June 2011).

[36]

A. Thomasian and J. Xu. "Data allocation in heterogeneous disk arrays", Proc. 6th Int'l Conf. on Networking, Architecture, and Storage (NAS'11), Dalian, China, July 2011, 82--91.

[37]

L. Tian, Q. Cao, H. Jiang, D. Feng, C. Xie, and Q. Xin. "Online availability upgrades for parity-based RAIDs through supplementary parity augmentations", ACM Trans. on Storage (TOS) 6(4): article 17 (May 2011).

[38]

K. S. Trivedi. Probability and Statistics with Reliability, Queuing, and Computer Science Applications, Wiley, 2001.

[39]

Z. Wang, A. G. Dimakis, and J. Bruck. "Rebuilding for array codes in distributed storage systems", Proc. IEEE GLOBECOM Workshop on Application of Communication Theory to Emerging Memory Technologies, Miami, FL, December 2010, 1995--1999.

[40]

B. Welch, M. Unangst, Z. Abbasi, G. A. Gibson, B. Mueller, J. Small, J. Zelenka, and B. Zhou. "Scalable performance of the Panasas parallel file system", In Proc. 6th USENIX Conf. on File and Storage Technologies (FAST'08), San Jose, CA, February 2008, 17--33.

[41]

L. Xiang, Y. Xu, J. C. S. Lui, and Q. Chang. "Optimal recovery of single disk failure in RDP code storage systems", Proc. ACM SIGMETRICS Int'l Conf. on Measurement and Modeling of Computer Systems, New York, NY, June 2010, 119--130.

[42]

L. Xu and J. Bruck. "X-code: MDS array codes with optimal encoding", IEEE Trans. Information Theory 45(1): 272--276 (January 1999).

Cited By

"Bibliography", Storage Systems, 641--693 (2022). https://doi.org/10.1016/B978-0-32-390796-5.00023-1

A. Thomasian. "Redundant Arrays of Independent Disks - RAID", Storage Systems, 269--336 (2022). https://doi.org/10.1016/B978-0-32-390796-5.00014-0

A. Thomasian. "Disk arrays with multiple RAID levels", ACM SIGARCH Computer Architecture News 41(5): 6--24 (online 18 June 2014). https://dl.acm.org/doi/10.1145/2641361.2641364



Published In

ACM SIGARCH Computer Architecture News, Volume 40, Issue 2 (May 2012), 49 pages

ISSN: 0163-5964

DOI: 10.1145/2234336

Copyright © 2012 Author.

Publisher: Association for Computing Machinery, New York, NY, United States
