Abstract:
Network failures are common in wide area networks (WANs). Failure recovery in a software-defined WAN takes minutes or longer, because the controller must calculate a new traffic engineering solution and update the forwarding rules across all switches. This severely degrades application performance. Existing reactive and proactive approaches inevitably lead to transient congestion or bandwidth underutilization, which impairs the efficiency of running expensive WANs. We present Sentinel, a novel failure recovery system for traffic engineering in software-defined WANs. Sentinel pre-computes and installs backup tunnels to accelerate failure recovery. When a link fails, switches locally redirect traffic to backup tunnels and recover immediately in the data plane, substantially reducing transient congestion compared to reactive rescaling. At the same time, Sentinel completely avoids the bandwidth headroom required by existing proactive approaches. Extensive experiments on Mininet and numerical simulations show that, like the state-of-the-art FFC, Sentinel reduces congestion by 45% compared with rescaling, while its algorithm runs much faster than FFC's. Sentinel introduces only a small number of additional forwarding rules and can be readily implemented on today's OpenFlow switches.
Published in: IEEE/ACM Transactions on Networking (Volume: 27, Issue: 5, October 2019)