Abstract
A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. The method enables an efficient BLAS library on the Italian APE100/Quadrics SISAMD massively parallel computer, for which scalable parallel BLAS-3 routines were not previously available. The proposed approach is based on one-dimensional ring connectivity, and the flow of data is hyper-systolic. The communication overhead is competitive with that of established algorithms for SIMD and MIMD machines. The advantages are that (i) the layout of the matrices is preserved during the computation, (ii) BLAS-2 operations fit well into this layout, and (iii) indexed addressing is avoided, which makes the algorithm suitable for SISAMD machines and hence for all other types of parallel computers. On the APE100/Quadrics, nearly 25% of the peak performance is achieved for multiplications of complex matrices.
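To make the ring-based data flow concrete, the following is a minimal sketch of a conventional systolic matrix multiplication on a one-dimensional ring, simulated sequentially in plain Python. It is not the hyper-systolic algorithm of the paper, whose details the abstract does not give. The function names `matmul` and `ring_matmul`, the row-block distribution, and the assumption that the processor count `p` divides the matrix dimensions are all choices made for illustration. Note also that this simple scheme lets each processor select a different column block of its local part of A, which is the kind of processor-dependent addressing the abstract says the paper's method avoids.

```python
def matmul(a, b):
    """Plain triple-loop product of two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ring_matmul(a, b, p):
    """Simulate C = A * B on a ring of p processors (standard systolic scheme).

    Illustrative assumptions: processor i owns row block i of A, B and C,
    and p divides the number of rows of A and of B.
    """
    ra, rb = len(a) // p, len(b) // p      # rows of A and of B per processor
    a_loc = [a[i * ra:(i + 1) * ra] for i in range(p)]
    b_loc = [b[i * rb:(i + 1) * rb] for i in range(p)]   # blocks that travel
    c_loc = [[[0] * len(b[0]) for _ in range(ra)] for _ in range(p)]

    for step in range(p):
        for i in range(p):
            k = (i + step) % p             # B block currently held by processor i
            a_cols = [row[k * rb:(k + 1) * rb] for row in a_loc[i]]
            part = matmul(a_cols, b_loc[i])
            c_loc[i] = [[x + y for x, y in zip(r1, r2)]
                        for r1, r2 in zip(c_loc[i], part)]
        # One ring shift: every processor passes its B block to its left neighbour.
        b_loc = b_loc[1:] + b_loc[:1]

    # After p shifts the B blocks are back where they started.
    return [row for block in c_loc for row in block]


if __name__ == "__main__":
    A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    B = [[1, 0], [0, 1], [2, 3], [4, 5]]
    print(ring_matmul(A, B, p=2) == matmul(A, B))
```

In this sketch A and C never move, and B returns to its original distribution after `p` nearest-neighbour shifts, which illustrates in a simplified setting what it means for the matrix layout to be preserved during the computation.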
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lippert, T., Petkov, N., Schilling, K. (1997). BLAS-3 for the quadrics parallel computer. In: Hertzberger, B., Sloot, P. (eds) High-Performance Computing and Networking. HPCN-Europe 1997. Lecture Notes in Computer Science, vol 1225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0031605
DOI: https://doi.org/10.1007/BFb0031605
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62898-9
Online ISBN: 978-3-540-69041-2
eBook Packages: Springer Book Archive