
On fault tolerance of 3-dimensional mesh networks

  • Computer Network and Internet
Journal of Computer Science and Technology

Abstract

In this paper, the concepts of the k-submesh and the k-submesh connectivity fault tolerance model are proposed, and the fault tolerance of 3-D mesh networks is studied under a more realistic model in which each network node has an independent failure probability. It is first observed that if the node failure probability is fixed, then the connectivity probability of 3-D mesh networks can be arbitrarily small when the network size is sufficiently large. Thus, it is practically important for multicomputer system manufacturers to determine an upper bound on the node failure probability when the required probability of network connectivity and the network size are given. A novel technique is developed to formally derive lower bounds on the connectivity probability for 3-D mesh networks. The study shows that 3-D mesh networks of practical size can tolerate a large number of faulty nodes and are thus reliable enough for multicomputer systems. A number of advantages of 3-D mesh networks over other popular network topologies are given. Compared to 2-D mesh networks, 3-D mesh networks are much stronger in tolerating faulty nodes; for practical network sizes, the fault tolerance of 3-D mesh networks is comparable with that of hypercube networks, while 3-D mesh networks have a much lower node degree.
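The independent-failure model described in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's analytical technique: it assumes an n × n × n mesh, fails each node independently with probability p, and tests plain graph connectivity of the surviving nodes (the k-submesh connectivity notion is not defined in the abstract, so it is not modelled here). The function names `mesh_connected` and `estimate_connectivity`, the trial count, and the convention that an all-faulty mesh counts as disconnected are illustrative assumptions.

```python
import random
from collections import deque

# The six axis-aligned neighbour offsets of a node in a 3-D mesh.
NEIGHBOUR_OFFSETS = (
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
)


def mesh_connected(n, p, rng):
    """Sample one failure pattern in an n x n x n mesh and report whether
    the non-faulty nodes form a single connected component.

    Each node fails independently with probability p (assumed model).
    """
    alive = {
        (x, y, z)
        for x in range(n)
        for y in range(n)
        for z in range(n)
        if rng.random() >= p
    }
    if not alive:
        # Assumed convention: a mesh with no surviving node is disconnected.
        return False

    # Breadth-first search from an arbitrary surviving node.
    start = next(iter(alive))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in NEIGHBOUR_OFFSETS:
            nb = (x + dx, y + dy, z + dz)
            if nb in alive and nb not in seen:
                seen.add(nb)
                queue.append(nb)

    # Connected exactly when the search reached every surviving node.
    return len(seen) == len(alive)


def estimate_connectivity(n, p, trials=200, seed=0):
    """Estimate the connectivity probability as the fraction of sampled
    failure patterns whose surviving nodes remain connected."""
    rng = random.Random(seed)
    hits = sum(mesh_connected(n, p, rng) for _ in range(trials))
    return hits / trials
```

Calling `estimate_connectivity(n, p)` for a fixed p and increasing n gives an empirical view of the abstract's first observation, although the estimate is only as good as the number of trials and says nothing about the formal lower bounds derived in the paper.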



Author information

Corresponding author

Correspondence to Gao-Cai Wang.

Additional information

This research is supported in part by the National Natural Science Foundation of China for Distinguished Young Scholars under Grant No. 69928201, by the Major Research Plan of the National Natural Science Foundation of China under Grant No. 90104028, and by the National Science Foundation of the USA under Grant No. CCR-0000206.

Gao-Cai Wang received the M.S. degree in geographic information systems from Central South University, China, in 2001. Currently, he is a Ph.D. candidate in the Department of Computer Science, College of Information Science and Engineering at Central South University. His research interests include computer networks, routing algorithms, and computer fault tolerance. He has published more than 15 papers in these areas.

Jian-Er Chen received the Ph.D. degree in computer science from the Courant Institute of Mathematical Sciences, New York University (NYU), in 1987. After graduating from NYU, he went to the Department of Mathematics at Columbia University, where he received the Ph.D. degree in mathematics in 1990. Since then, he has been with the Department of Computer Science at Texas A&M University, where he is currently a professor. He also holds a Chang Jiang Scholar Professorship at Central South University, China. His research interests include computational complexity and optimization, graph theory and algorithms, parallel processing and networks, and computer graphics. He has published more than 100 papers in these areas.

Guo-Jun Wang received the M.S. and Ph.D. degrees in computer science from Central South University, China, in 1996 and 2002, respectively. Currently, he is an associate professor in the Department of Computer Science, College of Information Science and Engineering at Central South University. His research interests include computer networks, routing algorithms, computer fault tolerance, and software engineering. He has published more than 40 papers in these areas.

About this article

Cite this article

Wang, GC., Chen, JE. & Wang, GJ. On fault tolerance of 3-dimensional mesh networks. J. Comput. Sci. & Technol. 19, 183–190 (2004). https://doi.org/10.1007/BF02944796

  • DOI: https://doi.org/10.1007/BF02944796
