ABSTRACT
Stateful load balancers (LBs) are essential services in cloud data centers, where they improve the availability and capacity of applications. Numerous studies have proposed methods to improve the throughput, connections per second, and concurrent flows of a single LB. For instance, with the advancement of programmable switches, hardware-based load balancers (HLBs) have become mainstream due to their high efficiency. However, programmable switches still have limited registers and table entries, which prevents them from fully meeting the performance requirements of data centers. In this paper, rather than focusing solely on enhancing individual HLBs, we introduce SlimeMold, which enables HLBs to work collaboratively at scale as an integrated LB system in data centers.
First, we design a novel HLB building block capable of performing load balancing and exchanging states with other building blocks in the data plane. Next, we decouple forwarding and state operations, organizing the states using our proposed 2-level mapping mechanism. Finally, we optimize the system with flow caching and table entry balancing. We implement a real HLB building block using the Broadcom 56788 SmartToR chip, which attains line rate for state reads and more than 1M operations per second (OPS) for flow writes. Our simulation demonstrates full scalability in large-scale experiments, supporting 454 million concurrent flows with 512 state-hosting building blocks.
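The abstract does not spell out the 2-level mapping or the flow cache, so the following is only an illustrative sketch of how decoupled forwarding and state hosting could fit together, not SlimeMold's actual design. It assumes a first level that maps a flow hash to a bucket and a second level that maps each bucket to a state-hosting building block; the names `TwoLevelMapper`, `StateBlock`, `flow_hash`, and the bucket count are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

NUM_BUCKETS = 1024  # hypothetical bucket count, not taken from the paper


def flow_hash(five_tuple) -> int:
    """Deterministic 64-bit hash of a flow's 5-tuple."""
    key = "|".join(map(str, five_tuple)).encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


@dataclass
class StateBlock:
    """Stand-in for a state-hosting building block (assumed structure)."""
    name: str
    flows: dict = field(default_factory=dict)  # flow 5-tuple -> backend


class TwoLevelMapper:
    def __init__(self, block_names, backends, num_buckets=NUM_BUCKETS):
        self.blocks = [StateBlock(n) for n in block_names]
        self.backends = list(backends)
        # Level 2: bucket -> block index, spread round-robin so that
        # table entries are roughly balanced across blocks.
        self.bucket_to_block = [i % len(self.blocks) for i in range(num_buckets)]
        # Flow cache on the forwarding side (assumed to be a plain dict).
        self.cache = {}

    def owner(self, flow) -> StateBlock:
        # Level 1: flow hash -> bucket; level 2: bucket -> hosting block.
        bucket = flow_hash(flow) % len(self.bucket_to_block)
        return self.blocks[self.bucket_to_block[bucket]]

    def lookup(self, flow) -> str:
        # Fast path: forwarding decision served from the local cache.
        if flow in self.cache:
            return self.cache[flow]
        # Slow path: state read from the block that hosts this flow.
        block = self.owner(flow)
        backend = block.flows.get(flow)
        if backend is None:
            # New flow: pick a backend and write the state once.
            backend = self.backends[flow_hash(flow) % len(self.backends)]
            block.flows[flow] = backend
        self.cache[flow] = backend
        return backend


if __name__ == "__main__":
    mapper = TwoLevelMapper(["b0", "b1", "b2", "b3"], ["10.0.0.1", "10.0.0.2"])
    flow = ("1.2.3.4", 1234, "5.6.7.8", 80, "tcp")
    chosen = mapper.lookup(flow)
    mapper.backends.append("10.0.0.3")  # backend pool changes later
    mapper.cache.clear()                # even with a cold cache...
    print(mapper.lookup(flow) == chosen)  # ...the stored state pins the flow
```

In this sketch the stored per-flow state, rather than the hash of the current backend pool, decides where an established connection goes, which is the property a stateful LB needs when the pool changes. For scale, the reported 454 million concurrent flows over 512 state-hosting building blocks works out to roughly 887 thousand flows per block.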
Index Terms
- SlimeMold: Hardware Load Balancer at Scale in Datacenter