Accelerator-Aware In-Network Load Balancing for Improved Application Performance


Abstract:

The end of Moore's law has sparked a surge of interest in programmable accelerators (e.g., SmartNICs, TPUs) for continued scaling of application performance. However, despite the great success in offloading tasks from the CPU, we still lack proper mechanisms for balancing load among the multiple computing units present in current systems. On the one hand, traditional load balancers (either software- or hardware-based) have no visibility into the different accelerators in a server and can only dispatch requests at a per-server granularity. On the other hand, emerging offloading engines can assign tasks at a finer granularity (e.g., per-accelerator), but are hosted by the accelerator itself and thus waste precious resources on balancing load rather than processing it. This paper presents P4Mite, an accelerator-aware in-network load balancing system. P4Mite is based on two key insights: i) using programmable switches to load balance traffic among the different accelerators (and also the CPU) located in the same server; and ii) collecting statistics from each accelerator on demand for increased load visibility. We implement a P4Mite prototype on top of an Intel Tofino switch and a Mellanox SmartNIC and evaluate it using real-world applications, including machine learning inference (VGG-16) and DNS. Our results show that P4Mite reduces flow latency by up to 50% and also lets the system handle 10-20% more load compared to standard server-level load balancing approaches. Moreover, it can process at least an order of magnitude more requests than a SmartNIC-based load balancer, with negligible latency and memory footprint.
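
The difference between per-server and per-accelerator dispatch can be illustrated with a small sketch. It is not P4Mite's actual algorithm: the abstract does not specify the selection policy, so a simple least-loaded rule is assumed here, and the `Unit` type, the unit labels, and the load metric (outstanding requests) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    server: str   # hypothetical server identifier
    name: str     # hypothetical label, e.g. "cpu" or "smartnic"
    load: int     # assumed metric: last reported outstanding requests


def pick_server(units):
    """Server-level baseline: sees only the total load of each server."""
    totals = {}
    for u in units:
        totals[u.server] = totals.get(u.server, 0) + u.load
    return min(totals, key=totals.get)


def pick_unit(units):
    """Accelerator-aware choice (assumed policy): least-loaded unit overall."""
    return min(units, key=lambda u: u.load)


units = [
    Unit("s1", "cpu", 8),
    Unit("s1", "smartnic", 1),
    Unit("s2", "cpu", 4),
    Unit("s2", "smartnic", 4),
]

chosen_server = pick_server(units)   # s2 has the lower total (8 vs 9)
chosen_unit = pick_unit(units)       # s1's smartnic has the lowest load (1)
print(chosen_server, chosen_unit.server, chosen_unit.name)
```

In this toy input the server-level view sends the request to `s2`, while the per-unit view finds the nearly idle SmartNIC on `s1`, which is the kind of visibility gap the abstract describes.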
Date of Conference: 13-16 June 2022
Date Added to IEEE Xplore: 22 July 2022
ISBN Information:
Electronic ISSN: 1861-2288
Conference Location: Catania, Italy
