Abstract
This paper proposes an approach to fault-tolerant Boolean n-cube architectures (FTBns). We employ spares, including nodes, links, and switches, to reconfigure a failed system so that the system topology is retained at its original dimension. The FTBn is designed in two levels. In the first level, we use a Boolean m-cube of 2^m nodes with 2^p spare nodes, p ≤ m, and some switching elements to build a fault-tolerant module (FTM). In the second level, an FTBn, n ≥ m, is built by taking 2^(n−m) FTMs and adding several switching elements between each pair of adjacent FTMs. We show that each FTM can achieve full spare utilization and that the degree of each node remains a constant n. A two-phase reconfiguration algorithm is developed to allocate a suitable spare node to replace a faulty node. Finally, the reliability and costs of the FTBn are evaluated, and we show that the FTBn can achieve reliability equal to or higher than that of previous comparable systems at lower extra hardware cost.
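The two-level construction can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' method: it only counts modules, primary nodes, and spares from the parameters n, m, and p given in the abstract. The function names and the addressing convention (high n−m bits select the FTM, low m bits select the node inside it) are assumptions introduced for illustration; the abstract does not specify an addressing scheme.

```python
def ftbn_counts(n, m, p):
    """Count modules, primary nodes and spares for the parameters in the abstract."""
    if not (0 <= p <= m <= n):
        raise ValueError("require 0 <= p <= m <= n")
    modules = 2 ** (n - m)         # number of FTMs in the second level
    primaries_per_module = 2 ** m  # nodes of one Boolean m-cube
    spares_per_module = 2 ** p     # spare nodes inside one FTM
    return {
        "modules": modules,
        "primary_nodes": modules * primaries_per_module,  # always 2**n
        "spare_nodes": modules * spares_per_module,
        "spare_ratio": spares_per_module / primaries_per_module,
    }


def split_address(node, n, m):
    """Hypothetical addressing: high n-m bits pick the FTM, low m bits the local node."""
    if not 0 <= node < 2 ** n:
        raise ValueError("node address out of range")
    return node >> m, node & ((1 << m) - 1)


def neighbors(node, n):
    """Neighbors in a Boolean n-cube differ from the node in exactly one address bit."""
    return [node ^ (1 << d) for d in range(n)]


if __name__ == "__main__":
    print(ftbn_counts(6, 3, 1))
    print(split_address(0b101011, 6, 3))
    print(neighbors(0, 3))
```

For example, with n = 6, m = 3, and p = 1, the system has 8 FTMs, 64 primary nodes, and 16 spares, which is one spare for every four primary nodes. Every node has exactly n neighbors, matching the constant node degree stated in the abstract.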
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, CS., Wu, SY. (1994). Fault-tolerance on boolean n-cube architectures. In: Echtle, K., Hammer, D., Powell, D. (eds) Dependable Computing — EDCC-1. EDCC 1994. Lecture Notes in Computer Science, vol 852. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58426-9_157
DOI: https://doi.org/10.1007/3-540-58426-9_157
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58426-1
Online ISBN: 978-3-540-48785-2
eBook Packages: Springer Book Archive