Abstract
ℋ-matrices, as introduced in previous papers, allow the common matrix arithmetic to be carried out in an efficient, almost optimal way. This article is concerned with the parallelisation of this arithmetic, in particular matrix building, matrix-vector multiplication, matrix multiplication and matrix inversion.
Of special interest is the design of algorithms that reuse as much as possible of the corresponding sequential methods, thereby keeping the effort of updating an existing implementation to a minimum. This is achieved by exploiting the properties of shared memory systems, which are widely available in the form of workstations or compute servers. These systems provide a simple and commonly supported programming interface in the form of POSIX threads.
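To illustrate the idea of reusing a sequential method on a shared memory system, the following is a minimal sketch in C with POSIX threads. It is not the article's algorithm: it uses a plain dense, row-major matrix instead of an ℋ-matrix, and the names `matvec_rows`, `matvec_job`, `matvec_worker`, `parallel_matvec` and the limit `MAX_THREADS` are hypothetical, chosen only for this example. The sequential kernel is left unchanged; the parallel version merely splits the rows into contiguous blocks and runs the same kernel on each block in its own thread, with all threads reading the shared matrix and vector directly.

```c
#include <pthread.h>
#include <stddef.h>

/* Sequential kernel (unchanged by the parallelisation):
 * computes y[i] = sum_j A[i*n + j] * x[j] for rows row_begin <= i < row_end. */
static void matvec_rows(const double *A, const double *x, double *y,
                        size_t n, size_t row_begin, size_t row_end)
{
    for (size_t i = row_begin; i < row_end; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = sum;
    }
}

/* Hypothetical job description: one contiguous block of rows per thread. */
typedef struct {
    const double *A;
    const double *x;
    double *y;
    size_t n, row_begin, row_end;
} matvec_job;

/* Thread entry point: simply calls the sequential kernel on its block. */
static void *matvec_worker(void *arg)
{
    matvec_job *job = arg;
    matvec_rows(job->A, job->x, job->y, job->n, job->row_begin, job->row_end);
    return NULL;
}

#define MAX_THREADS 16 /* assumption made for this sketch only */

/* Computes y = A*x for an n-by-n row-major matrix using p threads.
 * Returns 0 on success, -1 if p is out of range or a thread could not be
 * created (in which case y may be only partially filled). */
int parallel_matvec(const double *A, const double *x, double *y,
                    size_t n, size_t p)
{
    pthread_t threads[MAX_THREADS];
    matvec_job jobs[MAX_THREADS];
    size_t started = 0;
    int status = 0;

    if (p == 0 || p > MAX_THREADS)
        return -1;

    for (size_t t = 0; t < p; ++t) {
        /* Rows [t*n/p, (t+1)*n/p) form a disjoint partition of [0, n),
         * so no two threads write to the same entry of y. */
        jobs[t] = (matvec_job){A, x, y, n, t * n / p, (t + 1) * n / p};
        if (pthread_create(&threads[t], NULL, matvec_worker, &jobs[t]) != 0) {
            status = -1;
            break;
        }
        ++started;
    }

    /* Wait for every thread that was actually started. */
    for (size_t t = 0; t < started; ++t)
        pthread_join(threads[t], NULL);

    return status;
}
```

Because each thread writes to a disjoint part of `y` and only reads `A` and `x`, this sketch needs no locks. Operations with dependencies between blocks, such as matrix multiplication or inversion, would require additional scheduling or synchronisation that is not shown here.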
The theoretical results for the parallel algorithms are confirmed with numerical examples from BEM and FEM applications.
Kriemann, R.: Parallel ℋ-Matrix Arithmetics on Shared Memory Systems. Computing 74, 273–297 (2005). https://doi.org/10.1007/s00607-004-0102-2