
Multi-level RAID for very large disk arrays

Published: 01 March 2006

Abstract

Very Large Disk Arrays (VLDAs) have been developed to cope with the rapid increase in the volume of generated data that requires ultrareliable storage. Bricks, or Storage Nodes (SNs), holding a dozen or more disks are cost-effective VLDA building blocks, since they cost less than traditional disk arrays. We utilize the Multilevel RAID (MRAID) paradigm to protect both SNs and their disks. Each SN is a k-disk-failure-tolerant (kDFT) array, while replication or an l-node-failure-tolerant (lNFT) paradigm is applied at the SN level. For example, RAID1(M)/5(N) denotes a RAID1 at the higher level with degree of replication M, in which each virtual disk is an SN configured as a RAID5 with N physical disks. We provide the data layout for RAID5/5 and RAID6/5 MRAIDs and give examples of updating data and recovering lost data. Updating data requires storage transactions to ensure the atomicity of storage updates. We discuss some weaknesses of reliability modeling for RAID5 and give examples of an asymptotic expansion method for comparing the reliability of several MRAID organizations. We outline the reliability analysis of Markov chain models of VLDAs and briefly report conclusions drawn from simulation results. We conclude by outlining areas for further research.
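The upper/lower notation used in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the names `Level`, `raid1`, `raid5`, `raid6`, `mraid_efficiency`, and `min_disk_failures_for_loss` are hypothetical, and the array widths in the usage lines are arbitrary. It assumes the textbook properties of each level (RAID1 with M copies tolerates M-1 member failures, RAID5 tolerates one, RAID6 tolerates two) and counts disk failures only, ignoring whole-node failures and the reliability models the paper develops.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Level:
    """One level of an MRAID: `width` members, `tolerated` member failures survivable."""
    width: int
    tolerated: int

    def efficiency(self) -> Fraction:
        # Fraction of raw capacity holding user data at this level.
        return Fraction(self.width - self.tolerated, self.width)


def raid1(m: int) -> Level:
    # M-way replication: one useful copy out of M, survives M-1 member losses.
    return Level(width=m, tolerated=m - 1)


def raid5(n: int) -> Level:
    # Single parity: survives one member loss.
    return Level(width=n, tolerated=1)


def raid6(n: int) -> Level:
    # Double parity: survives two member losses.
    return Level(width=n, tolerated=2)


def mraid_efficiency(upper: Level, lower: Level) -> Fraction:
    # Redundancy overheads multiply across the two levels.
    return upper.efficiency() * lower.efficiency()


def min_disk_failures_for_loss(upper: Level, lower: Level) -> int:
    # Data is lost only when (upper.tolerated + 1) nodes have each lost data,
    # and each such node needs (lower.tolerated + 1) failed disks.
    return (upper.tolerated + 1) * (lower.tolerated + 1)


# RAID1(2)/5(8): two replicas, each an 8-disk RAID5 storage node.
print(mraid_efficiency(raid1(2), raid5(8)))            # 7/16
print(min_disk_failures_for_loss(raid1(2), raid5(8)))  # 4
```

The product form reflects only the best-case count of disk failures needed for data loss; it says nothing about how likely such failure patterns are, which is what the reliability analysis in the paper addresses.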



Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 33, Issue 4
Design, implementation, and performance of storage systems
March 2006
45 pages
ISSN: 0163-5999
DOI: 10.1145/1138085

Publisher

Association for Computing Machinery
New York, NY, United States

