A fault tolerant routing scheme for hypercubes

Telecommunication Systems

Abstract

An efficient distributed fault-tolerant routing algorithm for the hypercube is proposed, based on the existence of a complete set of node-disjoint paths between any two nodes. Node failures and repairs may occur dynamically, provided that the total number of faulty nodes at any time is less than n, the node-connectivity of the n-cube. Each node records, for each possible destination, which of the associated node-disjoint paths to use. When a message is blocked by a node failure, the source node is notified and asked to switch to a different node-disjoint path. The methods used to identify the paths, to propagate node-failure information to source nodes, and to switch from one routing path to another incur little communication and computation overhead. We show that if faults occur reasonably far apart in time, all messages are routed on optimal or near-optimal paths. In the unlikely case where many faults occur within a short period, the algorithm still delivers all messages, though possibly via longer paths. We discuss an extension of the algorithm that handles link failures in addition to node failures, and we show how to adapt the algorithm to k-ary n-cube networks. The algorithm can be adapted similarly to any interconnection network for which a simple characterization of the node-disjoint paths between its nodes exists.
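The idea of source-side path switching can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not spell out. The paths are built with the classical rotation construction for the n-cube: one shortest path per differing dimension, plus one detour path of length k + 2 per agreeing dimension, where k is the Hamming distance. The failure notification is simulated by a local check at the source rather than by messages travelling back through the network. The names `disjoint_paths` and `SourceRouter` are hypothetical and are not taken from the paper.

```python
def disjoint_paths(src, dst, n):
    """Return n internally node-disjoint paths from src to dst in the n-cube.

    Nodes are integers in range(2**n); two nodes are adjacent when they
    differ in exactly one bit. Assumes src != dst.
    """
    if src == dst:
        raise ValueError("src and dst must differ")
    diff = [d for d in range(n) if (src ^ dst) >> d & 1]  # dimensions to correct
    same = [d for d in range(n) if d not in diff]         # dimensions already equal

    def walk(dims):
        # Flip the listed dimensions one after another, starting at src.
        path = [src]
        for d in dims:
            path.append(path[-1] ^ (1 << d))
        return path

    # Shortest paths: correct the differing dimensions in each cyclic rotation.
    paths = [walk(diff[i:] + diff[:i]) for i in range(len(diff))]
    # Detour paths: leave along an agreeing dimension, correct, then come back.
    paths += [walk([d] + diff + [d]) for d in same]
    return paths


class SourceRouter:
    """Per-destination record of which node-disjoint path is currently in use."""

    def __init__(self, src, n):
        self.src, self.n = src, n
        self.choice = {}  # destination -> index of the path in use

    def route(self, dst, faulty):
        """Return a fault-free path to dst, switching paths when one is blocked.

        `faulty` is a set of failed nodes (assumed to exclude src and dst).
        Checking it here stands in for the failure notification that the
        scheme sends back to the source.
        """
        paths = disjoint_paths(self.src, dst, self.n)
        for _ in range(len(paths)):
            i = self.choice.get(dst, 0)
            path = paths[i]
            if not any(node in faulty for node in path[1:-1]):
                return path
            # Blocked: switch to the next node-disjoint path.
            self.choice[dst] = (i + 1) % len(paths)
        raise RuntimeError("every node-disjoint path is blocked")
```

Because the n paths share no intermediate node, fewer than n faulty nodes can block at most n - 1 of them, so `route` always finds a path under the abstract's fault bound. For example, in the 4-cube with source 0 and destination 3, faults at nodes 1, 2 and 4 block the first three paths and the router settles on the detour `[0, 8, 9, 11, 3]`.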

Cite this article

Day, K., Harous, S. & Al‐Ayyoub, A. A fault tolerant routing scheme for hypercubes. Telecommunication Systems 13, 29–44 (2000). https://doi.org/10.1023/A:1019171418147

  • DOI: https://doi.org/10.1023/A:1019171418147
