Routing on the Dependency Graph: A New Approach to Deadlock-Free High-Performance Routing

Published: 31 May 2016

Abstract

Lossless interconnection networks are omnipresent in high-performance computing systems, data centers, and network-on-chip architectures. Such networks require efficient and deadlock-free routing functions to utilize the available hardware. Topology-aware routing functions are becoming increasingly inapplicable because many topologies are irregular, either by design or as a result of hardware failures. Existing topology-agnostic routing methods either suffer from poor load balancing or place no bound on the number of virtual channels needed to resolve deadlocks in the routing tables. We propose a novel topology-agnostic routing approach that implicitly avoids deadlocks during the path calculation instead of treating path calculation and deadlock resolution as separate problems. We present a model implementation, called Nue, of a destination-based and oblivious routing function. Nue routing heuristically optimizes the load balancing while enforcing deadlock-freedom without exceeding a given number of virtual channels, which we demonstrate based on the InfiniBand architecture.
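
As an illustration of the deadlock-freedom requirement (not of Nue itself), the following minimal Python sketch checks a set of fixed paths against the classic channel-dependency criterion: build a graph whose nodes are channels and whose edges link consecutive channels on some path; if that graph is acyclic, the routing cannot deadlock. The function names, the representation of paths as lists of node IDs, and the four-switch ring are assumptions made for this example. Splitting the paths into two groups stands in for assigning them to two virtual channels.

```python
from collections import defaultdict


def channel_dependency_graph(paths):
    """Map each channel (u, v) to the channels that directly follow it on some path."""
    cdg = defaultdict(set)
    for path in paths:
        channels = list(zip(path, path[1:]))
        for channel in channels:
            cdg[channel]  # register every channel as a node, even without successors
        for held, requested in zip(channels, channels[1:]):
            cdg[held].add(requested)
    return dict(cdg)


def is_acyclic(graph):
    """Iterative three-colour depth-first search; False as soon as a back edge appears."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    for start in graph:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return False  # cycle of channel dependencies: deadlock is possible
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                colour[node] = BLACK  # all successors explored
                stack.pop()
    return True


# Hypothetical example: two-hop clockwise paths on a ring of four switches.
ring_paths = [[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]]
single_layer_ok = is_acyclic(channel_dependency_graph(ring_paths))
# Assigning the paths to two virtual layers breaks the cycle in each layer.
layer_0_ok = is_acyclic(channel_dependency_graph(ring_paths[:2]))
layer_1_ok = is_acyclic(channel_dependency_graph(ring_paths[2:]))
```

With all four paths on one layer the dependencies form a cycle around the ring, so `single_layer_ok` is `False`; each of the two layers on its own is acyclic. A check like this is applied after paths are chosen, whereas the abstract describes avoiding such cycles during the path calculation itself, under a fixed budget of virtual channels.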

Published In

HPDC '16: Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing
May 2016
302 pages
ISBN: 9781450343145
DOI: 10.1145/2907294

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deadlock-free
  2. destination-based
  3. routing
  4. virtual channels

Qualifiers

  • Research-article

Conference

HPDC '16

Acceptance Rates

HPDC '16 Paper Acceptance Rate 20 of 129 submissions, 16%;
Overall Acceptance Rate 166 of 966 submissions, 17%

Cited By

  • (2024) A high-performance design, implementation, deployment, and evaluation of the slim fly network. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, pp. 1025-1044. DOI: 10.5555/3691825.3691882. Online publication date: 16-Apr-2024
  • (2022) Review, Analysis, and Implementation of Path Selection Strategies for 2D NoCs. IEEE Access, 10, pp. 129245-129268. DOI: 10.1109/ACCESS.2022.3227460. Online publication date: 2022
  • (2022) Performance evaluation of multi-exaflops machines using Equality network topology. The Journal of Supercomputing, 79:8, pp. 8729-8753. DOI: 10.1007/s11227-022-05005-1. Online publication date: 26-Dec-2022
  • (2021) High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks. IEEE Transactions on Parallel and Distributed Systems, 32:4, pp. 943-959. DOI: 10.1109/TPDS.2020.3035761. Online publication date: 1-Apr-2021
  • (2019) HyperX topology. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-23. DOI: 10.1145/3295500.3356140. Online publication date: 17-Nov-2019
  • (2019) Deadlock-Free Layered Routing for Infiniband Networks. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW), pp. 84-90. DOI: 10.1109/CANDARW.2019.00023. Online publication date: Nov-2019
  • (2019) Extended Routing Table Generation Algorithm for the Angara Interconnect. Supercomputing, pp. 573-583. DOI: 10.1007/978-3-030-36592-9_47. Online publication date: 10-Dec-2019
  • (2018) High-Performance, Low-Complexity Deadlock Avoidance for Arbitrary Topologies/Routings. Proceedings of the 2018 International Conference on Supercomputing, pp. 129-138. DOI: 10.1145/3205289.3205307. Online publication date: 12-Jun-2018
  • (2018) Modular routing design for chiplet-based systems. Proceedings of the 45th Annual International Symposium on Computer Architecture, pp. 726-738. DOI: 10.1109/ISCA.2018.00066. Online publication date: 2-Jun-2018
  • (2018) Topological Response to Deadlock Detection and Resolution in Real-Time Database Systems. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1880-1887. DOI: 10.1109/Cybermatics_2018.2018.00312. Online publication date: Jul-2018
