DOI: 10.1145/1254882.1254922
Article

Oblivious routing for fat-tree based system area networks with uncertain traffic demands

Published: 12 June 2007

Abstract

Fat-tree based system area networks have been widely adopted in high performance computing clusters. In such systems, the routing is often deterministic and the traffic demand is usually uncertain and changing. In this paper, we study routing performance on fat-tree based system area networks with deterministic routing under the assumption that the traffic demand is uncertain. The performance of a routing algorithm under uncertain traffic demands is characterized by the oblivious performance ratio that bounds the relative performance of the routing algorithm and the optimal routing algorithm for any given traffic demand. We consider both single path routing where the traffic between each source-destination pair follows one path, and multi-path routing where multiple paths can be used for the traffic between a source-destination pair. We derive lower bounds of the oblivious performance ratio of any single path routing scheme for fat-tree topologies and develop single path oblivious routing schemes that achieve the optimal oblivious performance ratio for commonly used fat-tree topologies. These oblivious routing schemes provide the best performance guarantees among all single path routing algorithms under uncertain traffic demands. For multi-path routing, we show that it is possible to obtain a scheme that is optimal for any traffic demand (an oblivious performance ratio of 1) on the fat-tree topology. These results quantitatively demonstrate that single path routing cannot guarantee high routing performance while multi-path routing is very effective in balancing network loads on the fat-tree topology.
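The gap between single path and multi-path routing described above can be illustrated with a minimal sketch. It is not the paper's routing scheme. It assumes that performance is measured by maximum link load (the abstract does not name the metric), uses a hypothetical two-level fat tree with two hosts per leaf switch and two top-level switches, takes destination-mod-k selection as a stand-in deterministic single path scheme, and takes even splitting across all top-level switches as the multi-path scheme. All names are illustrative.

```python
from collections import defaultdict

HOSTS_PER_LEAF = 2  # hypothetical toy size
NUM_TOPS = 2        # top-level switches (equal to HOSTS_PER_LEAF: full bisection)

def leaf_of(host):
    """Leaf switch that a host attaches to."""
    return host // HOSTS_PER_LEAF

def path_links(src, dst, top):
    """Directed links used by src -> dst when routed through top switch `top`."""
    ls, ld = leaf_of(src), leaf_of(dst)
    if ls == ld:  # same leaf switch: no top-level hop needed
        return [("h", src, "l", ls), ("l", ld, "h", dst)]
    return [("h", src, "l", ls), ("l", ls, "t", top),
            ("t", top, "l", ld), ("l", ld, "h", dst)]

def max_link_load(demand, split):
    """demand: {(src, dst): amount}; split(src, dst) -> {top: fraction}."""
    load = defaultdict(float)
    for (src, dst), amount in demand.items():
        for top, frac in split(src, dst).items():
            for link in path_links(src, dst, top):
                load[link] += amount * frac
    return max(load.values())

def host_link_bound(demand):
    """No routing can beat the busiest host link, so this lower-bounds the optimum."""
    sent, received = defaultdict(float), defaultdict(float)
    for (src, dst), amount in demand.items():
        sent[src] += amount
        received[dst] += amount
    return max(max(sent.values()), max(received.values()))

def single_path(src, dst):
    """Assumed deterministic scheme: the destination picks one top switch."""
    return {dst % NUM_TOPS: 1.0}

def multi_path(src, dst):
    """Assumed multi-path scheme: split evenly over every top switch."""
    return {top: 1.0 / NUM_TOPS for top in range(NUM_TOPS)}

# Hosts 0 and 1 share a leaf switch; both destinations map to top switch 0.
demand = {(0, 2): 1.0, (1, 4): 1.0}
bound = host_link_bound(demand)
single_ratio = max_link_load(demand, single_path) / bound
multi_ratio = max_link_load(demand, multi_path) / bound
print(single_ratio, multi_ratio)
```

On this demand the single path scheme sends both flows over the same uplink and ends up at twice the host-link bound, while even splitting meets the bound. The oblivious performance ratio is the worst such ratio over all demands, so one bad demand only gives a lower bound on the ratio for the single path scheme.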



    Published In

    SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
    June 2007
    398 pages
    ISBN:9781595936394
    DOI:10.1145/1254882
    • ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 1
      SIGMETRICS '07 Conference Proceedings
      June 2007
      382 pages
      ISSN:0163-5999
      DOI:10.1145/1269899

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fat-tree
    2. oblivious routing
    3. system area networks

    Qualifiers

    • Article

    Conference

    SIGMETRICS07

    Acceptance Rates

    Overall Acceptance Rate 459 of 2,691 submissions, 17%

    Citations

    Cited By

    • (2017) Traffic Engineering With Equal-Cost-MultiPath. IEEE/ACM Transactions on Networking 25:2 (779-792). DOI: 10.1109/TNET.2016.2614247. Online publication date: 1-Apr-2017
    • (2016) Discharging the network from its flow control headaches. IEEE/ACM Transactions on Networking 24:1 (15-28). DOI: 10.1109/TNET.2014.2378012. Online publication date: 1-Feb-2016
    • (2015) Responsive multipath TCP in SDN-based datacenters. 2015 IEEE International Conference on Communications (ICC) (5296-5301). DOI: 10.1109/ICC.2015.7249165. Online publication date: Jun-2015
    • (2014) Multi-homed Fat-Tree Routing with InfiniBand. Proceedings of the 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (122-129). DOI: 10.1109/PDP.2014.22. Online publication date: 12-Feb-2014
    • (2014) Traffic engineering with Equal-Cost-Multipath: An algorithmic perspective. IEEE INFOCOM 2014 - IEEE Conference on Computer Communications (1590-1598). DOI: 10.1109/INFOCOM.2014.6848095. Online publication date: Apr-2014
    • (2013) LABERIO. Proceedings of the 2013 IEEE 27th International Conference on Advanced Information Networking and Applications (290-297). DOI: 10.1109/AINA.2013.7. Online publication date: 25-Mar-2013
    • (2011) On Nonblocking Folded-Clos Networks in Computer Communication Environments. Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium (188-196). DOI: 10.1109/IPDPS.2011.27. Online publication date: 16-May-2011
    • (2009) LID Assignment in InfiniBand Networks. IEEE Transactions on Parallel and Distributed Systems 20:4 (484-497). DOI: 10.1109/TPDS.2008.144. Online publication date: 1-Apr-2009
    • (2008) Low jitter guaranteed-rate communications for cluster computing systems. International Journal of Communication Networks and Distributed Systems 1:2 (140-160). DOI: 10.1504/IJCNDS.2008.020258. Online publication date: 1-Sep-2008
    • (2021) Deadlock-free local fast failover for arbitrary data center networks. IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications (1-9). DOI: 10.1109/INFOCOM.2016.7524356. Online publication date: 10-Mar-2021