DOI: 10.1145/3404397.3404444
research-article

A Rack-Aware Pipeline Repair Scheme for Erasure-Coded Distributed Storage Systems

Published: 17 August 2020

ABSTRACT

Modern industrial data centers employ erasure codes to provide reliability for large amounts of data at low cost. Although erasure codes provide optimal storage efficiency, they suffer from high repair costs compared to traditional three-way replication: when data is lost in a data center, repairing it with erasure codes requires heavy disk usage and network bandwidth consumption across nodes and racks. In this paper, we propose RPR, a rack-aware pipeline repair scheme for erasure-coded distributed storage systems. RPR is the first to investigate rack-level insights, and it explores the connection between the node level and the rack level to improve repair performance when a single failure or multiple failures occur in a data center. Evaluation results on several common RS code configurations show that, for single-block failures, RPR reduces the total repair time by up to 81.5% compared to the traditional RS code repair method and 50.2% compared to the state-of-the-art CAR algorithm. For multi-block failures, RPR reduces the total repair time and cross-rack data transfer traffic by up to 64.5% and 50%, respectively, over traditional repair.
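To illustrate why rack awareness can reduce cross-rack traffic, the following is a minimal sketch that counts the blocks crossing a rack boundary during a single-block repair. It is not the RPR algorithm, whose details the abstract does not give. It only assumes the general property that an RS repair is a linear combination of helper blocks, so helpers in the same rack can be combined into one partial block before being sent. The function name `cross_rack_transfers` and the RS(6, 3) block layout are hypothetical.

```python
from collections import Counter


def cross_rack_transfers(helper_racks, target_rack, aggregate):
    """Count blocks that cross a rack boundary to repair one lost block.

    helper_racks: rack id of each helper block read for the repair.
    target_rack:  rack id of the node that rebuilds the lost block.
    aggregate:    if True, each remote rack combines its helper blocks into
                  one partial block before sending (assumes the repair is a
                  linear combination of the helper blocks).
    """
    # Helpers already in the target rack never cross a rack boundary.
    per_rack = Counter(r for r in helper_racks if r != target_rack)
    if aggregate:
        return len(per_rack)       # one partial block per remote rack
    return sum(per_rack.values())  # every remote helper block is shipped


# Hypothetical layout: RS(6, 3) with three blocks per rack; one block in
# rack 0 is lost, so the six helpers are two survivors from each rack.
helpers = [0, 0, 1, 1, 2, 2]
print(cross_rack_transfers(helpers, target_rack=0, aggregate=False))  # 4
print(cross_rack_transfers(helpers, target_rack=0, aggregate=True))   # 2
```

In this toy layout, intra-rack aggregation halves the number of cross-rack block transfers. The actual savings depend on the code parameters and on how blocks are placed across racks.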


  • Published in

    ICPP '20: Proceedings of the 49th International Conference on Parallel Processing
    August 2020
    844 pages
    ISBN: 9781450388160
    DOI: 10.1145/3404397

    Copyright © 2020 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 91 of 313 submissions, 29%
