Abstract
Since demand for data in cloud storage systems (CSSs) is highly heterogeneous, nodes storing hot data suffer traffic congestion. In erasure-coded CSSs, this congestion can be alleviated by degraded reads, which sacrifice the bandwidth of surviving nodes. Local reconstruction codes (LRCs) reduce the bandwidth consumption of degraded reads but cannot provide a skewed throughput gain for hot data. In this paper, we propose a scalable local reconstruction code (SLRC) that builds on LRCs while offering greater flexibility in improving the throughput of a specific data block. First, we develop the local maximum throughput (LMT) metric, which measures the maximum throughput of hot data blocks by analyzing the actual read arrival rate of LRCs. We then elaborate the structure of SLRC and analyze its performance metrics, including storage overhead, reconstruction cost, and LMT. To select an appropriate code, we present minimum reconstruction cost, minimum storage overhead, and minimum penalty algorithms. Finally, we conduct extensive experiments with several typical SLRCs on the Hadoop distributed file system. Compared with RS codes and LRCs, SLRCs provide higher LMT and lower bandwidth consumption for degraded reads of hot data blocks in CSSs.
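To make the bandwidth argument concrete, the following is a minimal illustrative sketch (not the paper's SLRC construction): a toy LRC with six data blocks split into two local groups, each protected by an XOR local parity. A degraded read of one lost block touches only the surviving members of its local group, rather than k blocks as a Reed-Solomon code would.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# Six 4-byte data blocks, grouped into two local groups of three.
data = [bytes([i] * 4) for i in range(6)]
groups = [data[0:3], data[3:6]]
local_parity = [xor_blocks(g) for g in groups]  # one XOR parity per group

# Degraded read: block 1 (in group 0) is unavailable.
# Read the two surviving group members plus the group's local parity.
survivors = [data[0], data[2], local_parity[0]]
reconstructed = xor_blocks(survivors)

assert reconstructed == data[1]
# Only 3 blocks were read; a (6, 3) RS code would read all 6 data blocks.
```

The group size here is an arbitrary choice for illustration; the paper's contribution is precisely in how SLRC tunes such group structure to skew throughput toward hot blocks.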
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61831008, 62027802, 62271165), Guangdong Science and Technology Planning Project (Grant No. 2021A1515011572), Shenzhen Natural Science Fund (Grant No. JCYJ20200109112822953), Shenzhen Natural Science Fund (Stable Support Plan Program) (Grant No. GXWD20201230155427003-20200824081029001), and Major Key Project of PCL (Grant No. PCL2021A03-1).
Cite this article
Zhang, Z., Gu, S. & Zhang, Q. Scalable local reconstruction code design for hot data reads in cloud storage systems. Sci. China Inf. Sci. 65, 222303 (2022). https://doi.org/10.1007/s11432-021-3421-6