Processing math: 100%
Toward Full-Coverage and Low-Overhead Profiling of Network-Stack Latency | IEEE Journals & Magazine | IEEE Xplore

Toward Full-Coverage and Low-Overhead Profiling of Network-Stack Latency


Abstract:

In modern data center networks (DCNs), network-stack processing denotes a large portion of the end-to-end latency of TCP flows. So profiling network-stack latency anomali...Show More

Abstract:

In modern data center networks (DCNs), network-stack processing denotes a large portion of the end-to-end latency of TCP flows. So profiling network-stack latency anomalies has been considered as a crucial part in DCN performance diagnosis and troubleshooting. In particular, such profiling requires full coverage (i.e., profiling every TCP packet) and low overhead (i.e., profiling should avoid high CPU consumption in end-hosts). However, existing solutions rely on system calls or tracepoints in end-hosts to implement network-stack latency profiling, leading to either low coverage or high overhead. We propose Torp, a framework that offers full-coverage and low-overhead profiling of network-stack latency. Our key idea is to offload as much of the profiling from costly system calls or tracepoints to the Torp agent built on eBPF modules, and further to include a Torp handler on the ToR switch to accelerate the remaining profiling operations. Torp efficiently coordinates the ToR switch and the Torp agent on end-hosts to jointly execute the entire latency profiling task. We have implemented Torp on 32\times 100 Gbps Tofino switches. Testbed experiments indicate that Torp achieves full coverage and orders of magnitude lower host-side overhead compared to other solutions.
Published in: IEEE/ACM Transactions on Networking ( Volume: 32, Issue: 5, October 2024)
Page(s): 4441 - 4455
Date of Publication: 03 July 2024

ISSN Information:

Funding Agency:


Contact IEEE to Subscribe

References

References is not available for this document.