Deadlock-free local fast failover for arbitrary data center networks

IEEE Conference Publication

Abstract:

Today, given the size of data center networks and their bursty workloads, it is likely that at any moment some type of failure in the network is causing packet loss. This paper focuses on the two most common types of data center network failures: congestion and routing failures. Recently, there has been demand for lossless Ethernet (Data Center Bridging, DCB) in data center networks as a solution to congestion failures. However, DCB complicates fault tolerance by introducing a new type of failure: deadlock. If DCB is enabled, then all routing must be deadlock-free. To the best of our knowledge, this paper describes the first deadlock-free approaches to local fast failover that can be combined with DCB: DF-FI resilience and DF-EDST resilience. Moreover, the evaluation shows that DF-EDST resilience, the paper's main contribution, can improve fault tolerance without adversely impacting performance when compared to a state-of-the-art approach to deadlock-free routing. If, however, a small reduction in aggregate throughput is acceptable, then it is possible to build routes such that only 0.00001% of the total flows in the network are likely to fail given 16 edge failures on networks with 1K-4K hosts.
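
To make the deadlock-freedom requirement concrete, the sketch below checks a set of routes using the classic sufficient condition that the channel dependency graph is acyclic. It is a generic illustration and does not reproduce DF-FI or DF-EDST resilience, which the abstract does not describe in detail. The function names `channel_dependencies` and `is_deadlock_free`, and the representation of a route as a list of node names, are assumptions made for this example.

```python
from collections import defaultdict


def channel_dependencies(routes):
    """Build a channel dependency graph from routes given as node lists.

    A channel is a directed link (u, v). A route that uses (u, v) and then
    (v, w) makes (u, v) depend on (v, w): a packet holding a buffer on the
    first link may wait for a buffer on the second.
    """
    deps = defaultdict(set)
    for path in routes:
        channels = list(zip(path, path[1:]))  # directed links along the path
        for held, wanted in zip(channels, channels[1:]):
            deps[held].add(wanted)
    return deps


def is_deadlock_free(routes):
    """Return True if the channel dependency graph contains no cycle."""
    deps = channel_dependencies(routes)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    state = defaultdict(int)

    def has_cycle(start):
        # Iterative depth-first search; reaching a GREY channel closes a cycle.
        state[start] = GREY
        stack = [(start, iter(deps.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if state[nxt] == GREY:
                    return True
                if state[nxt] == WHITE:
                    state[nxt] = GREY
                    stack.append((nxt, iter(deps.get(nxt, ()))))
                    break
            else:
                state[node] = BLACK
                stack.pop()
        return False

    return not any(state[c] == WHITE and has_cycle(c) for c in list(deps))


# Hypothetical example: two-hop clockwise routes on a 4-node ring form a
# cyclic dependency, while two opposite routes on a line do not.
ring_routes = [["A", "B", "C"], ["B", "C", "D"], ["C", "D", "A"], ["D", "A", "B"]]
line_routes = [["A", "B", "C"], ["C", "B", "A"]]
print(is_deadlock_free(ring_routes))
print(is_deadlock_free(line_routes))
```

In a lossless network, a cycle of this kind means each buffer in the cycle can be waiting on the next one indefinitely, which is why failover routes must be checked together with the primary routes rather than in isolation.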
Date of Conference: 10-14 April 2016
Date Added to IEEE Xplore: 28 July 2016
Conference Location: San Francisco, CA, USA
