Fault-Tolerant Routing Algorithm in Meshes with Solid Faults

The Journal of Supercomputing

Abstract

A fault-tolerant routing method that tolerates solid faults using only two virtual channels is presented. The proposed routing algorithm, called FT-Ecube, not only uses fewer virtual channels but also tolerates f-chains in meshes. Furthermore, the proposed scheme misroutes messages in both the clockwise and counterclockwise directions to reduce channel contention on f-rings. The proposed algorithm is shown to be deadlock-free and livelock-free in meshes with multiple nonoverlapping f-regions. We also conducted flit-level simulations to evaluate the performance of the proposed routing algorithm. As the simulation results show, FT-Ecube tolerates multiple faulty blocks using only two virtual channels per physical channel and performs well in terms of average latency.
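The terms used in the abstract can be illustrated with a minimal sketch. It is not the FT-Ecube algorithm itself: it assumes a 2D mesh, a single rectangular faulty block that does not touch the mesh boundary, and ignores virtual channels entirely. The names `ecube_next_hop`, `f_ring`, and `ring_walk` are hypothetical helpers introduced only for illustration. The sketch shows a dimension-order (e-cube) step, the ring of fault-free nodes around a faulty block (an f-ring), and the two possible misrouting directions along that ring.

```python
from typing import List, Optional, Tuple

Node = Tuple[int, int]  # (x, y) coordinates in a 2D mesh (assumption)


def ecube_next_hop(cur: Node, dst: Node) -> Optional[Node]:
    """One dimension-order (e-cube) step: correct x first, then y."""
    (x, y), (dx, dy) = cur, dst
    if x != dx:
        return (x + (1 if dx > x else -1), y)
    if y != dy:
        return (x, y + (1 if dy > y else -1))
    return None  # already at the destination


def f_ring(x0: int, y0: int, x1: int, y1: int) -> List[Node]:
    """Fault-free nodes around the faulty block [x0..x1] x [y0..y1].

    Nodes are listed clockwise when y grows upward. Assumes a
    rectangular block that does not touch the mesh boundary.
    """
    left, right, bottom, top = x0 - 1, x1 + 1, y0 - 1, y1 + 1
    ring: List[Node] = []
    ring += [(x, top) for x in range(left, right)]          # top edge, left to right
    ring += [(right, y) for y in range(top, bottom, -1)]    # right edge, downward
    ring += [(x, bottom) for x in range(right, left, -1)]   # bottom edge, right to left
    ring += [(left, y) for y in range(bottom, top)]         # left edge, upward
    return ring


def ring_walk(ring: List[Node], start: Node, end: Node,
              clockwise: bool = True) -> List[Node]:
    """Nodes visited when misrouting along the ring from start to end."""
    i = ring.index(start)
    step = 1 if clockwise else -1
    path = [start]
    while path[-1] != end:
        i = (i + step) % len(ring)
        path.append(ring[i])
    return path


# A message at (1, 2) heading to (5, 2) is blocked by the faulty block
# [2..3] x [2..3]: its e-cube next hop (2, 2) is faulty.
ring = f_ring(2, 2, 3, 3)
cw_path = ring_walk(ring, (1, 2), (4, 2), clockwise=True)
ccw_path = ring_walk(ring, (1, 2), (4, 2), clockwise=False)
```

In this example the two orientations reach the same exit node over different channels of the ring, which is the degree of freedom the abstract refers to. The rule FT-Ecube uses to select an orientation, and its assignment of the two virtual channels, are given in the full paper and are not reproduced here.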

Author information

Corresponding author

Correspondence to Jong-Hoon Youn.

Additional information

This work is supported by NSF grant MIP-9705738.

About this article

Cite this article

Youn, JH., Bose, B. & Park, S. Fault-Tolerant Routing Algorithm in Meshes with Solid Faults. J Supercomput 37, 161–177 (2006). https://doi.org/10.1007/s11227-006-5530-7

  • DOI: https://doi.org/10.1007/s11227-006-5530-7
