Skip to main content
Log in

Storage and repair bandwidth tradeoff for heterogeneous cluster distributed storage systems

  • Research Paper
  • Published:
Science China Information Sciences Aims and scope Submit manuscript

Abstract

The storage and repair bandwidth tradeoff is an important issue in distributed storage systems (DSSs) where large scale data are stored in multiple nodes with erasure coding to ensure reliability. There are lots of studies on the DSS model with multiple clusters where the repair bandwidths from intra-cluster and cross-cluster nodes are differentiated to improve repair efficiency based on the realistic network topological structures. At the same time, separate nodes are also prevalent due to the variety of practical networks, but the work on the cluster DSS model with multiple separate nodes is insufficient, which is a main motivation of this paper. We formulate the tradeoff bound between storage repair bandwidth for a heterogeneous DSS model consisting of clusters and separate nodes by analyzing the min-cuts of heterogeneous information flow graphs corresponding to the orders of failed nodes. Additionally, the tradeoff bounds are investigated in multiple aspects when the repair bandwidth constraints and the amount of separate nodes vary, respectively. Moreover, a class of regenerating codes are illustrated to achieve the tradeoff in the heterogeneous cases.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Hu Y C, Li X L, Zhang M, et al. Optimal repair layering for erasure-coded data centers. ACM Trans Storage, 2017, 13: 1–24

    Article  Google Scholar 

  2. Huang C, Chen M H, Li J. Pyramid codes: flexible schemes to trade space for access efficiency in reliable data storage systems. ACM Trans Storage, 2013, 9: 3

    Article  Google Scholar 

  3. Zhang H Y, Li H, Li S Y R. Repair tree: fast repair for single failure in erasure-coded distributed storage systems. IEEE Trans Parallel Distrib Syst, 2017, 28: 1728–1739

    Article  Google Scholar 

  4. Balaji S B, Krishnan M N, Vajha M, et al. Erasure coding for distributed storage: an overview. Sci China Inf Sci, 2018, 61: 100301

    Article  Google Scholar 

  5. Hou H X, Han Y S. A class of binary MDS array codes with asymptotically weak-optimal repair. Sci China Inf Sci, 2018, 61: 100302

    Article  MathSciNet  Google Scholar 

  6. Dimakis A G, Godfrey P B, Wu Y N, et al. Network coding for distributed storage systems. IEEE Trans Inform Theor, 2010, 56: 4539–4551

    Article  MATH  Google Scholar 

  7. Li S Y R, Yeung R W, Cai N. Linear network coding. IEEE Trans Inform Theor, 2003, 49: 371–381

    Article  MathSciNet  MATH  Google Scholar 

  8. Yang B, Tang X, Li J. A systematic piggybacking design for minimum storage regenerating codes. IEEE Trans Inform Theor, 2015, 61: 5779–5786

    Article  MathSciNet  MATH  Google Scholar 

  9. Goparaju S, Fazeli A, Vardy A. Minimum storage regenerating codes for all parameters. IEEE Trans Inform Theor, 2017, 63: 6318–6328

    Article  MathSciNet  MATH  Google Scholar 

  10. Rashmi K V, Shah N B, Kumar P V. Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction. IEEE Trans Inform Theor, 2011, 57: 5227–5239

    Article  MathSciNet  MATH  Google Scholar 

  11. Rashmi K V, Shah N B, Ramchandran K. A piggybacking design framework for read-and download-efficient distributed storage codes. IEEE Trans Inform Theor, 2017, 63: 5802–5820

    MathSciNet  MATH  Google Scholar 

  12. Tang X, Yang B, Li J, et al. A new repair strategy for the hadamard minimum storage regenerating codes for distributed storage systems. IEEE Trans Inform Theor, 2015, 61: 5271–5279

    Article  MathSciNet  MATH  Google Scholar 

  13. Ford D, Labelle F, Popovici F I, et al. Availability in globally distributed storage systems. In: Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation, Vancouver, 2010. 61–74

  14. Kubiatowicz J, Bindel D, Chen Y, et al. OceanStore: an architecture for global-scale persistent storage. ACM SIGOPS Oper Syst Rev, 2000, 34: 190–201

    Article  Google Scholar 

  15. Ahmad F, Chakradhar S T, Raghunathan A, et al. Shufflewatcher: shuffle-aware scheduling in multi-tenant mapreduce clusters. In: Proceedings of 2014 USENIX Annual Technical Conference, Philadelphia, 2014. 1–13

  16. Prakash N, Abdrashitov V, Medard M. The storage versus repair-bandwidth trade-off for clustered storage systems. IEEE Trans Inform Theor, 2018, 64: 5783–5805

    Article  MathSciNet  MATH  Google Scholar 

  17. Sohn J Y, Choi B, Yoon S W, et al. Capacity of clustered distributed storage. IEEE Trans Inform Theor, 2019, 65: 81–107

    Article  MathSciNet  MATH  Google Scholar 

  18. Hou H X, Lee P P C, Shum K W, et al. Rack-aware regenerating codes for data centers. IEEE Trans Inform Theor, 2019, 65: 4730–4745

    Article  MathSciNet  MATH  Google Scholar 

  19. Hu Y C, Lee P P C, Zhang X. Double regenerating codes for hierarchical data centers. In: Proceedings of 2016 IEEE International Symposium on Information Theory (ISIT), 2016. 245–249

  20. Abdrashitov V, Prakash N, Medard M. The storage vs repair bandwidth trade-off for multiple failures in clustered storage networks. In: Proceedings of 2017 IEEE Information Theory Workshop (ITW), 2017. 46–50

  21. Akhlaghi S, Kiani A, Ghanavati M R. Cost-bandwidth tradeoff in distributed storage systems. Comput Commun, 2010, 33: 2105–2115

    Article  Google Scholar 

  22. Wang J Z, Wang T H, Luo Y. Storage and repair bandwidth tradeoff for distributed storage systems with clusters and separate nodes. Sci China Inf Sci, 2018, 61: 100303

    Article  MathSciNet  Google Scholar 

  23. Pernas J, Yuen C, Gastón B, et al. Non-homogeneous two-rack model for distributed storage systems. In: Proceedings of 2013 IEEE International Symposium on Information Theory (ISIT), 2013. 1237–1241

  24. Sohn J Y, Choi B, Yoon S W, et al. Capacity of clustered distributed storage. In: Proceedings of 2017 IEEE International Conference on Communications (ICC), Paris, 2017

  25. Ernvall T, El Rouayheb S, Hollanti C, et al. Capacity and security of heterogeneous distributed storage systems. IEEE J Sel Areas Commun, 2013, 31: 2701–2709

    Article  Google Scholar 

  26. Golrezaei N, Dimakis A G, Molisch A F. Wireless device-to-device communications with distributed caching. In: Proceedings of IEEE International Symposium on Information Theory, Cambridge, 2012

  27. Yu Q, Shum K W, Sung C W. Tradeoff between storage cost and repair cost in heterogeneous distributed storage systems. Trans Emerging Tel Tech, 2015, 26: 1201–1211

    Article  Google Scholar 

  28. Wang J Z, Wang T H, Luo Y, et al. Capacity of distributed storage systems with clusters and separate nodes. 2019. ArXiv: 1901.03000

  29. Dimakis A G, Ramchandran K, Wu Y N, et al. A survey on network codes for distributed storage. Proc IEEE, 2011, 99: 476–489

    Article  Google Scholar 

  30. Ahlswede R, Cai N, Li S Y R, et al. Network information flow. IEEE Trans Inform Theor, 2000, 46: 1204–1216

    Article  MathSciNet  MATH  Google Scholar 

  31. Ahuja R K, Magnanti T L, Orlin J B. Network Flows: Theory, Algorithms, and Applications. Upper Saddle River: Prentice Hall, 1993. 77–78

    MATH  Google Scholar 

  32. Zorgui M, Wang Z Y. Code constructions for multi-node exact repair in distributed storage. Sci China Inf Sci, 2018, 61: 100304

    Article  MathSciNet  Google Scholar 

  33. Wu Y N, Dimakis A G. Reducing repair traffic for erasure coding-based storage via interference alignment. In: Proceedings of 2009 IEEE International Symposium on Information Theory (ISIT), 2009. 2276–2280

Download references

Acknowledgements

This work was partially supported by National Natural Science Foundation of China (Grant No. 61571293), China Program of International S&T Cooperation (Grant No. 2016YFE0100300), and SJTU-CUHK Joint Research Collaboration Fund 2018.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yuan Luo.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, J., Luo, Y. & Shum, K.W. Storage and repair bandwidth tradeoff for heterogeneous cluster distributed storage systems. Sci. China Inf. Sci. 63, 122301 (2020). https://doi.org/10.1007/s11432-019-9937-7

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s11432-019-9937-7

Keywords

Navigation