research-article

An in-memory object caching framework with adaptive load balancing

Authors:

Ali R. Butt

EuroSys '15: Proceedings of the Tenth European Conference on Computer Systems

Article No.: 4, Pages 1 - 16

https://doi.org/10.1145/2741948.2741967

Published: 17 April 2015

Abstract

The extreme latency and throughput requirements of modern web applications are driving the use of distributed in-memory object caches such as Memcached. While extant caching systems scale out seamlessly, their use in the cloud, with its distinctive cost and multi-tenancy dynamics, presents unique opportunities and design challenges.

In this paper, we propose MBal, a high-performance in-memory object caching framework with adaptive Multiphase load Balancing, which supports not only horizontal (scale-out) but also vertical (scale-up) scalability. MBal makes efficient use of available cloud resources through its fine-grained, partitioned, lockless design. This design also lends itself naturally to adaptive load balancing, both within a server and across the cache cluster, through an event-driven, multi-phased load balancer. While individual load balancing approaches are already leveraged in in-memory caches, MBal goes beyond extant systems and offers a holistic solution wherein the load balancing model tracks hotspots and applies different strategies based on imbalance severity: key replication, server-local data migration, or cross-server coordinated data migration. Performance evaluation on an 8-core commodity server shows that, compared to a state-of-the-art approach, MBal scales with the number of cores and executes 2.3x and 12x more queries/second for GET and SET operations, respectively.
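The idea of applying different strategies as imbalance grows can be illustrated with a minimal sketch. This is not MBal's implementation: the imbalance metric (hottest partition's load divided by the mean), the three thresholds, and the names `Phase`, `imbalance`, and `choose_phase` are all hypothetical, and the escalation order simply follows the order in which the abstract lists the strategies.

```python
from enum import Enum
from statistics import mean


class Phase(Enum):
    NONE = "no action"
    KEY_REPLICATION = "key replication"
    LOCAL_MIGRATION = "server-local data migration"
    COORDINATED_MIGRATION = "cross-server coordinated data migration"


# Hypothetical thresholds; the abstract gives no numeric values.
REPLICATE_ABOVE = 1.2
LOCAL_MIGRATE_ABOVE = 1.5
COORDINATE_ABOVE = 2.0


def imbalance(loads):
    """Ratio of the hottest partition's load to the mean load (1.0 = balanced)."""
    avg = mean(loads)
    return max(loads) / avg if avg > 0 else 1.0


def choose_phase(loads):
    """Escalate to the most aggressive strategy whose threshold is exceeded."""
    ratio = imbalance(loads)
    if ratio > COORDINATE_ABOVE:
        return Phase.COORDINATED_MIGRATION
    if ratio > LOCAL_MIGRATE_ABOVE:
        return Phase.LOCAL_MIGRATION
    if ratio > REPLICATE_ABOVE:
        return Phase.KEY_REPLICATION
    return Phase.NONE
```

For example, with per-partition request counts of `[130, 90, 90, 90]` the ratio is 1.3, so this sketch would select key replication, while `[250, 50, 50, 50]` (ratio 2.5) would escalate to cross-server coordinated migration. A real balancer would also need hotspot tracking over time and hysteresis to avoid oscillating between phases, both of which are omitted here.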

Supplementary Material

MP4 File (a4-sidebyside.mp4), 916.67 MB

Published In

EuroSys '15: Proceedings of the Tenth European Conference on Computer Systems

April 2015

503 pages

ISBN:9781450332385

DOI:10.1145/2741948

General Chair: Laurent Réveillère (LaBRI, University of Bordeaux, France)

Program Chairs: Tim Harris (Oracle Labs, UK), Maurice Herlihy (Brown University)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

EuroSys '15

Sponsor:

SIGOPS

EuroSys '15: Tenth EuroSys Conference 2015

April 21 - 24, 2015

Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%
