Abstract
In the near future, large-scale parallel computers will deliver computing power of more than one TFlop/s. It is commonly agreed that these systems will be based on the model of asynchronous processors connected by a point-to-point network. A number of different network architectures have been proposed in the past.
In this paper we present an architectural principle that combines efficiency, realizability for very large systems, and the inherent reliability needed for such large parallel processing systems. The Fat Mesh of Clos network principle presented here can be scaled in many ways to meet the specific requirements of a system design.
Two realizations of this principle are presented. The first is based on static switches combined to form a fully reconfigurable system. This architecture has been realized for systems containing up to 320 processors.
The second realization uses dynamic routing switches. By combining wormhole routing with randomized and locally adaptive routing, this network provides large capacity and very short latency. The efficiency of our principle is demonstrated by simulations.
Both realizations presented here are built and commercialized by Parsytec Computer.
This work was partly supported by the German Federal Department of Science and Technology (BMFT), PARAWAN project 413-5839-ITR 9007 BO, and by the DFG-Forschergruppe “Effiziente Nutzung massiv paralleler Systeme”.
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Monien, B., Lüling, R., Langhammer, F. (1993). A realizable efficient parallel architecture. In: Meyer, F., Monien, B., Rosenberg, A.L. (eds) Parallel Architectures and Their Efficient Use. Nixdorf 1992. Lecture Notes in Computer Science, vol 678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56731-3_10
DOI: https://doi.org/10.1007/3-540-56731-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56731-8
Online ISBN: 978-3-540-47637-5
eBook Packages: Springer Book Archive