DOI: 10.1145/3626183.3659968
research-article
Open access

Distributed Load Balancing in the Face of Reappearance Dependencies

Published: 17 June 2024

Abstract

We consider the problem of load balancing in distributed databases. We assume that data is divided into chunks and each chunk can be replicated on a constant number d of servers. When a request arrives, it is routed to one of the servers that contains the relevant chunk. Each server may store outstanding requests in a bounded queue, and requests may be rejected if the queue is full. The goal is to design strategies for data distribution and request routing that minimize both the rejection rate and the average request latency.
What makes this problem technically difficult is reappearance dependencies: if a chunk x is accessed at multiple different time steps, then the set of d servers that it can be routed to is the same each time it is accessed. This is a substantial departure from classical balls-and-bins settings where each ball arrival introduces fresh randomness into the system.
We show that, with new algorithmic and analytical approaches, it is possible to overcome reappearance dependencies and construct algorithms with optimal rejection rate, latency, and queue size.
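
To make the model concrete, the following is a minimal simulation sketch. It is not the algorithm from the paper: it uses a plain greedy baseline (route to the least-loaded replica), and the time model (batched arrivals, one request served per server per step), the uniform access pattern, and all names and parameter values are assumptions chosen for illustration. The point is only to show where the reappearance dependency enters: each chunk's d servers are drawn once and reused on every access.

```python
import random
from collections import deque


def simulate(n_servers=50, n_chunks=500, d=2, queue_cap=4,
             steps=200, requests_per_step=40, seed=0):
    """Toy model: fixed replica sets, greedy routing, bounded FIFO queues."""
    rng = random.Random(seed)

    # Each chunk's d replica servers are chosen ONCE and reused on every
    # access. This is the reappearance dependency: a repeated chunk brings
    # no fresh randomness into the system.
    replicas = [rng.sample(range(n_servers), d) for _ in range(n_chunks)]

    queues = [deque() for _ in range(n_servers)]
    accepted = rejected = served = 0
    total_latency = 0

    for t in range(steps):
        # Arrivals (assumed: uniformly random chunk per request).
        for _ in range(requests_per_step):
            chunk = rng.randrange(n_chunks)
            # Greedy baseline: pick the replica with the shortest queue.
            target = min(replicas[chunk], key=lambda s: len(queues[s]))
            if len(queues[target]) >= queue_cap:
                # The shortest queue is full, so all d replicas are full.
                rejected += 1
            else:
                queues[target].append(t)  # remember the arrival step
                accepted += 1

        # Service (assumed: each server completes one request per step).
        for q in queues:
            if q:
                total_latency += t - q.popleft() + 1
                served += 1

    return {
        "accepted": accepted,
        "rejected": rejected,
        "served": served,
        "rejection_rate": rejected / (accepted + rejected),
        "avg_latency": total_latency / served if served else 0.0,
        "replicas": replicas,
    }


if __name__ == "__main__":
    result = simulate()
    print(result["rejection_rate"], result["avg_latency"])
```

Varying d, queue_cap, or the access distribution (for example, a skewed distribution in which a few chunks are requested repeatedly) gives a rough feel for how fixed replica sets affect the rejection rate and latency in this toy setting.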

      Published In

      SPAA '24: Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures
      June 2024
      510 pages
      ISBN:9798400704161
      DOI:10.1145/3626183
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. balls into bins
      2. distributed key-value stores
      3. load balancing
      4. power of $d$ choices

      Qualifiers

      • Research-article

      Funding Sources

      • Harvard Rabin Postdoctoral Fellowship
      • National Science Foundation

      Conference

      SPAA '24
      Acceptance Rates

      Overall Acceptance Rate 447 of 1,461 submissions, 31%

