DOI: 10.1145/1851182.1851218

R3: resilient routing reconfiguration

Published: 30 August 2010

Abstract

Network resiliency is crucial to IP network operations. Existing techniques to recover from one or a series of failures do not offer performance predictability and may cause serious congestion. In this paper, we propose Resilient Routing Reconfiguration (R3), a novel routing protection scheme that is (i) provably congestion-free under a large number of failure scenarios; (ii) efficient by having low router processing overhead and memory requirements; (iii) flexible in accommodating different performance requirements (e.g., handling realistic failure scenarios, prioritized traffic, and the trade-off between performance and resilience); and (iv) robust to both topology failures and traffic variations. We implement R3 on Linux using a simple extension of MPLS, called MPLS-ff. We then conduct extensive Emulab experiments and simulations using realistic network topologies and traffic demands. Our results show that R3 achieves near-optimal performance and is at least 50% better than the existing schemes under a wide range of failure scenarios.

    Published In

    SIGCOMM '10: Proceedings of the ACM SIGCOMM 2010 conference
    August 2010
    500 pages
    ISBN:9781450302012
    DOI:10.1145/1851182
    • ACM SIGCOMM Computer Communication Review, Volume 40, Issue 4
      SIGCOMM '10
      October 2010
      481 pages
      ISSN:0146-4833
      DOI:10.1145/1851275
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. network resiliency
    2. routing
    3. routing protection

    Qualifiers

    • Research-article

    Conference

    SIGCOMM '10
    SIGCOMM '10: ACM SIGCOMM 2010 Conference
    August 30 - September 3, 2010
    New Delhi, India

    Acceptance Rates

    Overall Acceptance Rate 462 of 3,389 submissions, 14%
