Abstract:
In-band network telemetry (INT) enables fine-grained network monitoring without requiring communication with the controller at each hop. Existing INT-based network-wide telemetry systems achieve low-overhead monitoring with non-overlapping path planning algorithms. However, these systems do not constrain the length of the generated probing paths, which leads to packet loss when the size of a packet carrying the collected telemetry data exceeds the MTU limit. To address this issue, we propose MTU-adaptive path segmentation algorithms in several steps. First, we present two single-path planning algorithms: the INT-optimize algorithm, which produces a single path that covers the entire network with the lowest southbound communication overhead, and the INT-low-cost algorithm, which further accelerates INT-optimize. Next, to account for the MTU limit, we propose the single-MTU adaptive INT-Segment algorithm, which divides the single long path generated in the previous step into multiple path segments. In addition, we generalize the MTU-adaptive network telemetry problem and propose a multi-MTU adaptive INT-Segment solution to achieve high-performance network telemetry in networks with multiple MTU settings. Extensive evaluations demonstrate that our proposed MTU-adaptive solutions achieve sub-second network-wide telemetry for large-scale networks, taking less than 2.9 ms to calculate the probing paths for an 18-pod FatTree. Furthermore, our multi-MTU adaptive INT-Segment solution reduces the number of INT Sinks and INT Sources by 13.25%-42.39% when deployed in multi-MTU networks while maintaining stable telemetry data collection time. Compared with the state-of-the-art INT-path, our solution adapts the probing path to the network MTU limit, improving telemetry data collection efficiency by 10%-94%.
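The segmentation idea can be illustrated with a minimal sketch. It is not the INT-Segment algorithm itself. It assumes a single MTU, a fixed per-hop telemetry size, and a probe that collects data at every node it visits. The function name `segment_path` and the byte counts are hypothetical, chosen only for illustration.

```python
def segment_path(path, mtu, base_bytes, per_hop_bytes):
    """Split one long probing path into segments whose probes fit in the MTU.

    Assumptions (illustrative, not from the paper): each visited node appends
    per_hop_bytes of telemetry, and base_bytes covers headers and payload.
    Consecutive segments share their boundary node so that every link of the
    original path is still traversed by some probe.
    """
    # Largest number of nodes one probe can visit without exceeding the MTU.
    max_nodes = (mtu - base_bytes) // per_hop_bytes
    if max_nodes < 2:
        raise ValueError("MTU too small to cover even a single link")
    if len(path) < 2:
        return [list(path)] if path else []

    segments = []
    start = 0
    while start < len(path) - 1:
        end = min(start + max_nodes, len(path))
        segments.append(list(path[start:end]))
        start = end - 1  # reuse the last node as the next segment's first node
    return segments


# Hypothetical sizes: 1500-byte MTU, 100 bytes of headers, 350 bytes per hop.
segments = segment_path(list(range(10)), mtu=1500, base_bytes=100, per_hop_bytes=350)
for seg in segments:
    print(seg)
```

Under this sketch, the first node of each segment acts as an INT Source and the last as an INT Sink, so fewer segments mean fewer sources and sinks to deploy.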
Published in: IEEE/ACM Transactions on Networking ( Volume: 32, Issue: 3, June 2024)