RAID-6Plus: A Fast and Reliable Coding Scheme Aided by Multi-failure Degradation

Deng, Ming-Zhu; Ou, Yang; Xiao, Nong; Yu, Song-Ping; Chen, Wei; Chen, Zhi-Guang; Liu, Fang

doi:10.1007/978-3-319-26979-5_15

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9464)

Included in the following conference series:

Asia-Pacific Services Computing Conference

Abstract

Existing triple-failure-tolerant codes assume that failures are independent and instantaneous. Such assumptions overlook the underlying mechanism of multi-failure occurrences and ignore the effect of the reconstruction window, so these codes are poorly matched to the failure patterns seen in real-world applications. As a result, the third parity drive sits almost idle: it is reserved for the triple-failure scenario only, while lower-level failure situations go unattended. Furthermore, single-failure rebuild becomes slower as disk capacity grows, which lowers system reliability and impairs user experience. To address these problems, this study develops RAID-6Plus, a fast-reconstructable coding scheme extended from RAID-6. RAID-6Plus maintains a smaller reconstruction window by recoding the third parity drive. Existing codes provide absolute reliability for triple failures via full combinations. In contrast, RAID-6Plus builds the third parity drive from short combinations, which allow overlapped elements to be heavily reused during reconstruction. The short combinations shorten the reconstruction window of a single failure, which avoids further failures overlapping within that window. This capability of multi-failure degradation gives RAID-6Plus (1) better system performance than RTP and STAR and (2) higher reliability than RAID-6.
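The intuition behind short combinations can be illustrated with a minimal sketch. It is not the RAID-6Plus construction itself, which the abstract does not specify: the stripe layout, the `short_group` membership, and the helper names `xor_all` and `rebuild` are all assumptions made for illustration. The sketch only shows that a parity computed over fewer elements lets a single lost element be rebuilt with fewer reads.

```python
from functools import reduce
from operator import xor


def xor_all(values):
    """XOR a sequence of integers; each integer stands in for one block."""
    return reduce(xor, values, 0)


def rebuild(lost, group, data, parity):
    """Recover data[lost] from a parity computed over the indices in `group`.

    Returns the recovered value and the number of blocks read
    (the surviving group members plus the parity block itself).
    """
    survivors = [data[i] for i in group if i != lost]
    return xor_all(survivors) ^ parity, len(survivors) + 1


# Hypothetical stripe of six data blocks.
data = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66]

# A "full" combination covers every data block in the stripe.
full_group = list(range(6))
# A "short" combination covers only a subset (membership is an assumption).
short_group = [0, 1, 2]

full_parity = xor_all(data[i] for i in full_group)
short_parity = xor_all(data[i] for i in short_group)

# Rebuild block 1 both ways and compare how many blocks each path reads.
value_full, reads_full = rebuild(1, full_group, data, full_parity)
value_short, reads_short = rebuild(1, short_group, data, short_parity)
```

Both paths recover the same block, but the short combination reads fewer blocks, which is the sense in which a shorter combination can shrink the reconstruction window in this toy model.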

Acknowledgment

We are grateful to our anonymous reviewers for their suggestions to improve this paper. This work is supported by the National Natural Science Foundation of China under Grant Nos. 61232003, 61332003, 61202121, 61402503, 61303073.

Author information

Authors and Affiliations

State Key Laboratory of High Performance Computing, College of Computer, National University of Defense Technology, Changsha, 410073, China
Ming-Zhu Deng, Yang Ou, Nong Xiao, Song-Ping Yu, Wei Chen, Zhi-Guang Chen & Fang Liu

Corresponding author

Correspondence to Ming-Zhu Deng.

Editor information

Editors and Affiliations

School of Computer Science, Level 4, University of Adelaide, Adelaide, South Australia, Australia
Lina Yao
School of Computer Science & Technology, Huazhong University of Science & Technology, Wuhan, China
Xia Xie
School of Software Technology, Dalian University of Technology, Dalian, China
Qingchen Zhang
Department of Computer Science, St. Francis Xavier University, Antigonish, Nova Scotia, Canada
Laurence T. Yang
School of Information Technologies, University of Sydney, Sydney, New South Wales, Australia
Albert Y. Zomaya
School of Comp. Science and Technology, Huazhong Univ. of Science and Technology, Wuhan, China
Hai Jin

About this paper

Cite this paper

Deng, MZ. et al. (2015). RAID-6Plus: A Fast and Reliable Coding Scheme Aided by Multi-failure Degradation. In: Yao, L., Xie, X., Zhang, Q., Yang, L., Zomaya, A., Jin, H. (eds) Advances in Services Computing. APSCC 2015. Lecture Notes in Computer Science, vol 9464. Springer, Cham. https://doi.org/10.1007/978-3-319-26979-5_15

DOI: https://doi.org/10.1007/978-3-319-26979-5_15
Published: 09 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26978-8
Online ISBN: 978-3-319-26979-5
eBook Packages: Computer Science, Computer Science (R0)
