Abstract
The parallelization of numerical algorithms is very important in scientific applications, but many aspects of this parallelization remain open. In particular, the overhead introduced by loading and unloading data degrades efficiency, and a realistic performance estimate should take it into account. This paper presents a way of overcoming the data loading and unloading bottleneck by overlapping computations and communications in a specific algorithm, matrix-vector multiplication. A hardware mapping of this algorithm is also presented to demonstrate the parallelization methodology.
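The general idea of overlapping data loading with computation can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes the matrix arrives in row blocks and that the transfer of the next block can be started before the current block is multiplied. The names `load_block`, `matvec_overlapped`, and `block_rows` are hypothetical and chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def load_block(matrix, start, size):
    """Hypothetical stand-in for a data transfer: copy rows [start, start + size)."""
    return [row[:] for row in matrix[start:start + size]]


def matvec_overlapped(matrix, vector, block_rows=2):
    """Compute matrix * vector, prefetching the next row block during compute."""
    result = []
    starts = list(range(0, len(matrix), block_rows))
    if not starts:
        return result
    with ThreadPoolExecutor(max_workers=1) as loader:
        # Start the first transfer before any computation begins.
        pending = loader.submit(load_block, matrix, starts[0], block_rows)
        for i in range(len(starts)):
            block = pending.result()  # wait for the block currently in flight
            if i + 1 < len(starts):
                # Issue the next load before computing on the current block.
                pending = loader.submit(
                    load_block, matrix, starts[i + 1], block_rows
                )
            # Dot product of each row in the block with the vector.
            for row in block:
                result.append(sum(a * x for a, x in zip(row, vector)))
    return result


if __name__ == "__main__":
    print(matvec_overlapped([[1, 2], [3, 4], [5, 6]], [1, 1]))
```

The sketch shows only the scheduling structure: each load is issued one step ahead of the computation that consumes it. How much overlap is actually achieved depends on the platform and on how the transfer is implemented.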
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ojeda-Guerra, C.N., Esper-Chaín, R., Estupiñán, M., Macías, E., Suárez, A. (1998). Hardware mapping of a parallel algorithm for matrix-vector multiplication overlapping communications and computations. In: Hartenstein, R.W., Keevallik, A. (eds) Field-Programmable Logic and Applications From FPGAs to Computing Paradigm. FPL 1998. Lecture Notes in Computer Science, vol 1482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0055268
DOI: https://doi.org/10.1007/BFb0055268
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64948-9
Online ISBN: 978-3-540-68066-6
eBook Packages: Springer Book Archive