Abstract
Besides universality and very low latency, Youssef's randomized self-routing algorithms [25] tolerate multiple faults well and, more strikingly, have the potential to tolerate faults without diagnosis. In this paper we study the performance of Youssef's routing algorithms on faulty Clos networks with multiple faults in multiple columns, both with and without fault detection. We show that, with fault detection and diagnosis, randomized routing algorithms provide scalable, very efficient, and fault-tolerant routing mechanisms. Without fault detection and diagnosis, randomized routing still provides good fault tolerance when the faulty switches lie in either the first or the second column, but the delays become large for faults in the third column or in more than one column. In conclusion, randomized routing enables the system to run without periodic fault detection and diagnosis; if and when performance degrades beyond a certain threshold, diagnosis can be performed to improve routing performance.
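The trade-off between routing with and without diagnosis can be illustrated with a toy model. The sketch below is not Youssef's algorithm [25]. It assumes a single message that picks a second-column switch uniformly at random and simply retries whenever it hits a faulty one, and it ignores contention and faults in the other columns. The names `attempts_until_routed`, `mean_attempts`, `m`, and `faulty` are hypothetical and chosen only for this illustration.

```python
import random


def attempts_until_routed(m, faulty, rng, diagnosed=False):
    """Count attempts for one message to cross the middle column.

    m         -- number of second-column switches (illustrative parameter)
    faulty    -- set of indices of faulty second-column switches
    rng       -- a random.Random instance
    diagnosed -- if True, the sender knows `faulty` and avoids those switches
    """
    if diagnosed:
        candidates = [s for s in range(m) if s not in faulty]
    else:
        candidates = list(range(m))
    if not any(s not in faulty for s in candidates):
        raise ValueError("no healthy middle switch available")
    attempts = 0
    while True:
        attempts += 1
        # Assumption: an attempt through a faulty switch fails and is retried.
        if rng.choice(candidates) not in faulty:
            return attempts


def mean_attempts(m, faulty, trials=10_000, diagnosed=False, seed=0):
    """Average attempt count over independent trials with a fixed seed."""
    rng = random.Random(seed)
    total = sum(
        attempts_until_routed(m, faulty, rng, diagnosed) for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    faulty = {0, 1}  # two of eight middle switches are faulty
    print("with diagnosis:   ", mean_attempts(8, faulty, diagnosed=True))
    print("without diagnosis:", mean_attempts(8, faulty, diagnosed=False))
```

Under these assumptions the undiagnosed case is a geometric process with mean m / (m - f) attempts for f faulty switches out of m, so the penalty for skipping diagnosis stays small while few second-column switches are faulty. The model says nothing about third-column or multi-column faults, which the abstract identifies as the costly cases.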
References
[1] G.B. Adams III, D.P. Agrawal and H.J. Siegel, A survey and comparison of fault-tolerant multistage interconnection networks, Computer 20(6) (1987) 14-27.
[2] G.B. Adams III and H.J. Siegel, The extra stage cube: A fault-tolerant interconnection network for supercomputer, IEEE Transactions on Computers 31(5) (1982) 443-454.
[3] D.P. Agrawal, Testing and fault tolerance of multistage interconnection networks, Computer (April 1982) 41-53.
[4] D.P. Agrawal and J.-S. Leu, Dynamic accessibility testing and path length optimization of multistage interconnection networks, IEEE Transactions on Computers 34 (1985) 255-266.
[5] V.E. Beneš, Mathematical Theory of Connecting Networks and Telephone Traffic (Academic Press, New York, 1965).
[6] M. Bhatia and A. Youssef, Efficient randomized fault-tolerant routing on Clos network, in: IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems (July 1992) pp. 217-224.
[7] M. Chen and K.G. Shin, Adaptive fault-tolerant routing in hypercube multicomputers, IEEE Transactions on Computers 39(12) (1990) 1406-1416.
[8] V. Cherkassky, E. Opper and M. Malek, Reliability and fault diagnosis analysis of fault-tolerant multistage interconnection networks, in: Proc. of the 14th Annual International Symposium on Fault-Tolerant Computing (1984) pp. 178-183.
[9] C. Clos, A study of non-blocking switching networks, Bell System Techn. Journal 32 (1953) 406-424.
[10] N.J. Davis, W.T.-Y. Hsu and H.J. Siegel, Fault location techniques for distributed control interconnection networks, IEEE Transactions on Computers 24 (October 1985) 902-910.
[11] T. Feng and W. Young, An O(log² N) control algorithm, in: Proc. of the International Conference on Parallel Processing (1985) pp. 334-340.
[12] S.-T. Huang and C.-H. Tung, On fault-tolerant routing of Benes networks, Journal of Information Science and Engineering 4 (July 1988) 1-13.
[13] V.P. Kumar and S.M. Reddy, Augmented shuffle-exchange multistage interconnection networks, IEEE Computer (June 1987) 30-40.
[14] K.Y. Lee, A new Benes network control algorithm, IEEE Transactions on Computers 36 (May 1987) 768-772.
[15] G.F. Lev, N. Pippenger and L.G. Valiant, A fast parallel algorithm in permutation networks, IEEE Transactions on Computers 30 (February 1981) 93-100.
[16] A. Mourad, B. Ozden and M. Malek, Comprehensive testing of multistage interconnection networks, IEEE Transactions on Computers 40(8) (1991) 935-951.
[17] D.K. Pradhan, Fault-tolerant multiprocessor link and bus network architectures, IEEE Transactions on Computers 34(1) (1985) 33-45.
[18] J.P. Shen and J.P. Hayes, Fault-tolerance of dynamic full-access interconnection networks, IEEE Transactions on Computers 34(1) (1984) 241-248.
[19] R.E. Tarjan, Depth first search and linear graph algorithms, SIAM Journal on Computing 1(2) (1972) 146-160.
[20] S. Thanawastien and V.P. Nelson, Optimal fault detection sequences for shuffle/exchange networks, in: Proc. of the 13th Annual International Symposium on Fault-Tolerant Computing (June 1983) pp. 442-445.
[21] N.-F. Tzeng, P.-C. Yew and C.-Q. Zhu, Fault-diagnosis in a multi-path interconnection network, in: Proc. of the 16th Annual International Symposium on Fault-Tolerant Computing (1986) pp. 98-103.
[22] A. Varma and C.S. Raghavendra, Fault-tolerant routing in multistage interconnection networks, IEEE Transactions on Computers 38(3) (1989) 385-393.
[23] C.L. Wu and T.Y. Feng, Fault-tolerant routing in multistage interconnection networks, IEEE Transactions on Computers 30 (October 1981) 743-758.
[24] Y.-M. Yeh and T.-Y. Feng, Fault-tolerant routing on a class of rearrangeable networks, in: International Conference on Parallel Processing, Vol. I (1991) pp. 305-312.
[25] A. Youssef, Randomized routing algorithms for Clos networks, Computers & Electrical Engineering 19(6) (1993) 419-429.
Cite this article
Bhatia, M., Youssef, A. Performance analysis and fault tolerance of randomized routing on Clos networks. Telecommunication Systems 10, 157–173 (1998). https://doi.org/10.1023/A:1019115016388