DOI: 10.1145/3609021.3609307
research-article

HEELS: A Host-Enabled eBPF-Based Load Balancing Scheme

Published: 10 September 2023

ABSTRACT

Layer 4 (L4) load balancing is crucial in cloud computing and elastic microservices. Existing L4 load balancer designs can be split into two main categories: centralized designs using a hardware or software middlebox, and decentralized designs in which every node can play the role of the load balancer. Centralized designs offer better scheduling policies and easier worker node management, but suffer from I/O and CPU limitations. Decentralized designs scale better, but are harder to manage. We introduce HEELS, a novel load balancing scheme designed for internal cloud workloads and microservices, achieving the best of both worlds. HEELS uses the load balancer only during the connection establishment and allows clients and servers to communicate directly after that. Supporting general L4 load balancers and requiring no kernel changes, HEELS is readily deployable on the public cloud. We implement HEELS as a set of eBPF programs split across the client and server. Our evaluation shows that HEELS introduces minimal overheads, works with off-the-shelf load balancers (e.g., Katran by Meta), and significantly reduces the costs of cloud load balancers.
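The central idea in the abstract, consulting the load balancer only at connection establishment and then letting client and server talk directly, can be illustrated with a toy simulation. This is a minimal sketch and not the paper's eBPF implementation: the class names, the round-robin policy, and the client-side `direct` table are hypothetical, and the abstract does not say how the client learns the chosen server's address.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class LoadBalancer:
    """Toy L4 load balancer: round-robin policy (an assumption), counts packets it handles."""
    backends: list
    packets_seen: int = 0

    def __post_init__(self):
        self._next = cycle(self.backends)

    def pick_backend(self):
        # Every packet that reaches the load balancer is counted here.
        self.packets_seen += 1
        return next(self._next)


@dataclass
class Client:
    """Toy client host that remembers which backend serves each connection."""
    lb: LoadBalancer
    # Hypothetical per-connection state: connection id -> backend address.
    direct: dict = field(default_factory=dict)

    def connect(self, conn_id):
        # Only the connection-establishment step traverses the load balancer.
        backend = self.lb.pick_backend()
        self.direct[conn_id] = backend
        return backend

    def send(self, conn_id, payload):
        # Later packets go straight to the recorded backend, bypassing the load balancer.
        return (self.direct[conn_id], payload)


if __name__ == "__main__":
    lb = LoadBalancer(["10.0.0.1", "10.0.0.2"])
    client = Client(lb)
    client.connect("conn-a")
    client.connect("conn-b")
    for _ in range(1000):
        client.send("conn-a", b"data")
        client.send("conn-b", b"data")
    print("packets handled by the load balancer:", lb.packets_seen)
```

In this model the load balancer's packet count grows with the number of connections rather than with the volume of data sent, which is the property the abstract attributes to HEELS. A conventional middlebox design would instead count every data packet.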

Published in

eBPF '23: Proceedings of the 1st Workshop on eBPF and Kernel Extensions
September 2023
96 pages
ISBN: 9798400702938
DOI: 10.1145/3609021

Copyright © 2023 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

eBPF '23 paper acceptance rate: 12 of 21 submissions, 57%. Overall acceptance rate: 12 of 21 submissions, 57%.