A Rack-Aware Pipeline Repair Scheme for Erasure-Coded Distributed Storage Systems

Authors:
Tong Liu, Temple University, USA
Shakeel Alibhai, Temple University, USA
Xubin He, Temple University, USA

ICPP '20: Proceedings of the 49th International Conference on Parallel Processing, August 2020, Article No. 37, Pages 1–11. https://doi.org/10.1145/3404397.3404444

Published: 17 August 2020

ABSTRACT

Modern industrial data centers employ erasure codes to provide reliability for large amounts of data at low cost. Although erasure codes offer optimal storage efficiency, they incur much higher repair costs than traditional three-way replication: when data is lost in a data center, repairing it consumes substantial disk I/O and network bandwidth across nodes and racks. In this paper, we propose RPR, a rack-aware pipeline repair scheme for erasure-coded distributed storage systems. RPR is the first scheme to exploit the rack structure and the connection between the node level and the rack level to improve repair performance when a single failure or multiple failures occur in a data center. Evaluation results on several common RS code configurations show that, for single-block failures, RPR reduces the total repair time by up to 81.5% compared to the traditional RS code repair method and by up to 50.2% compared to the state-of-the-art CAR algorithm. For multi-block failures, RPR reduces the total repair time and cross-rack data transfer traffic by up to 64.5% and 50%, respectively, compared to traditional repair.
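The trade-off the abstract describes, between storage efficiency and repair cost, can be illustrated with a back-of-the-envelope sketch. It is not part of the paper and does not implement RPR or CAR. It only encodes two standard facts about RS(k, m) codes: they store (k + m) / k raw bytes per byte of user data, and conventional repair of one lost block reads k surviving blocks, whereas a replica is restored by copying a single block. The function names and the (k, m) pairs are illustrative assumptions; the abstract does not list the configurations that were evaluated.

```python
# Illustrative comparison of three-way replication and RS(k, m) erasure coding.
# Not the paper's method: names and (k, m) values below are assumptions.

REPLICATION_COPIES = 3  # three-way replication keeps three full copies


def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data under RS(k, m)."""
    return (k + m) / k


def conventional_repair_reads(k: int) -> int:
    """Blocks read to rebuild one lost block with conventional RS repair."""
    return k


# Hypothetical example configurations, chosen only for illustration.
for k, m in [(4, 2), (6, 3), (10, 4)]:
    print(
        f"RS({k},{m}): overhead {storage_overhead(k, m):.2f}x "
        f"(replication: {REPLICATION_COPIES}.00x), "
        f"repair reads {conventional_repair_reads(k)} blocks "
        f"(replication: 1 block)"
    )
```

Under these assumptions, the storage saving grows with k while the number of blocks read per repair grows with it as well, which is the cost that repair schemes such as the one proposed here aim to reduce.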

Published in

ICPP '20: Proceedings of the 49th International Conference on Parallel Processing
August 2020
844 pages
ISBN: 9781450388160
DOI: 10.1145/3404397

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data reliability
distributed storage system
erasure coding
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 91 of 313 submissions, 29%