DOI: 10.1145/3357223.3362713
research-article

Coupling Decentralized Key-Value Stores with Erasure Coding

Published: 20 November 2019

Abstract

Modern decentralized key-value stores often replicate and distribute data via consistent hashing for availability and scalability. Compared to replication, erasure coding is a promising redundancy approach that provides availability guarantees at much lower cost. However, when combined with consistent hashing, erasure coding incurs many parity updates during scaling (i.e., adding or removing nodes) and cannot efficiently handle degraded reads caused by scaling. In this paper, we propose a novel erasure coding model called FragEC, which incurs no parity updates during scaling. We further extend consistent hashing with multiple hash rings so that erasure coding can seamlessly serve degraded reads during scaling. We realize our design as an in-memory key-value store called ECHash and conduct testbed experiments on different scaling workloads in both local and cloud environments. We show that ECHash achieves better scaling performance (in terms of scaling throughput and degraded read latency during scaling) than the baseline erasure coding implementation, while maintaining high basic I/O and node repair performance.
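
To make the scaling problem concrete, the following is a minimal sketch of plain consistent hashing, not of FragEC or of ECHash's multi-ring design, whose details the abstract does not give. It shows that adding a node changes the owner of a subset of keys; in an erasure-coded store, such remapping is the kind of event that the abstract associates with parity updates and degraded reads. The class `HashRing`, the helper `moved_keys`, the node names, and the choice of MD5 with 64 virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a ring position (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Plain consistent-hash ring with virtual nodes (hypothetical, illustrative)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place `vnodes` positions for the node on the ring."""
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key):
        """Return the node owning the first position at or after the key's hash."""
        idx = bisect.bisect_left(self._points, ring_hash(key))
        return self._owner[self._points[idx % len(self._points)]]


def moved_keys(keys, before, after):
    """Keys whose owning node differs between two placement snapshots."""
    return [k for k in keys if before[k] != after[k]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add_node("node-d")  # scale out by one node
after = {k: ring.lookup(k) for k in keys}
moved = moved_keys(keys, before, after)
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

In this sketch every remapped key lands on the newly added node, which is the property that keeps scaling cheap for replicated data. With erasure coding, a remapped item may also belong to a stripe whose parity was computed over the old placement, which is the cost the paper targets.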

Published In

SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
November 2019
503 pages
ISBN: 9781450369732
DOI: 10.1145/3357223
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Erasure coding
  2. Key-value stores
  3. Scaling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SoCC '19: ACM Symposium on Cloud Computing
November 20 - 23, 2019
Santa Cruz, CA, USA

Acceptance Rates

SoCC '19 Paper Acceptance Rate 39 of 157 submissions, 25%;
Overall Acceptance Rate 169 of 722 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 25
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 23 Feb 2025

Cited By

  • (2025) A Survey of the Past, Present, and Future of Erasure Coding for Storage Systems. ACM Transactions on Storage, 21:1 (1-39). DOI: 10.1145/3708994. Online publication date: 8-Jan-2025
  • (2025) Optimizing encoding and repair for wide-stripe minimum bandwidth regenerating codes in in-memory key-value stores. Journal of Systems Architecture, 161 (103369). DOI: 10.1016/j.sysarc.2025.103369. Online publication date: Apr-2025
  • (2024) ELECT. Proceedings of the 22nd USENIX Conference on File and Storage Technologies (293-310). DOI: 10.5555/3650697.3650715. Online publication date: 27-Feb-2024
  • (2024) Achieving Tunable Erasure Coding with Cluster-Aware Redundancy Transitioning. ACM Transactions on Architecture and Code Optimization. DOI: 10.1145/3672077. Online publication date: 10-Jun-2024
  • (2024) Enabling Efficient Erasure Coding in Disaggregated Memory Systems. IEEE Transactions on Parallel and Distributed Systems, 35:1 (154-168). DOI: 10.1109/TPDS.2023.3332782. Online publication date: Jan-2024
  • (2024) Elastic Reed-Solomon Codes for Efficient Redundancy Transitioning in Distributed Key-Value Stores. IEEE/ACM Transactions on Networking, 32:1 (670-685). DOI: 10.1109/TNET.2023.3303865. Online publication date: Feb-2024
  • (2024) Advanced Elastic Reed–Solomon Codes for Erasure-Coded Key–Value Stores. IEEE Internet of Things Journal, 11:3 (4747-4762). DOI: 10.1109/JIOT.2023.3299574. Online publication date: 1-Feb-2024
  • (2024) Storage Reliability. Data Storage Architectures and Technologies (225-270). DOI: 10.1007/978-981-97-3534-1_9. Online publication date: 28-Aug-2024
  • (2023) Toward Optimal Repair and Load Balance in Locally Repairable Codes. Proceedings of the 52nd International Conference on Parallel Processing (725-735). DOI: 10.1145/3605573.3605635. Online publication date: 7-Aug-2023
  • (2023) Towards Practical Auditing of Dynamic Data in Decentralized Storage. IEEE Transactions on Dependable and Secure Computing, 20:1 (708-723). DOI: 10.1109/TDSC.2022.3142611. Online publication date: 1-Jan-2023
