Information and Computation

Volume 161, Issue 2, 15 September 2000, Pages 172-210

On designing optimal parallel triangular solvers¹

https://doi.org/10.1006/inco.2000.2866

Abstract

This paper explores the problem of solving triangular linear systems on parallel distributed-memory machines. Working within the LogP model, tight asymptotic bounds for solving these systems using forward/backward substitution are presented. Specifically, lower bounds on execution time independent of the data layout, lower bounds for data layouts in which the number of data items per processor is bounded, and lower bounds for specific data layouts commonly used in designing parallel algorithms for this problem are presented in this paper. Furthermore, algorithms are provided which have running times within a constant factor of the lower bounds described. One interesting result is that the popular two-dimensional block matrix layout necessarily results in significantly longer running times than simpler one-dimensional schemes. Finally, a generalization of the lower bounds to banded triangular linear systems is presented.
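
For readers unfamiliar with the substitution step named above, a minimal serial sketch of forward substitution on a lower-triangular system follows. It is not the paper's parallel algorithm and does not model LogP costs; the function name `forward_substitution` and the sample matrix are illustrative assumptions. It only shows the dependency chain (each unknown needs all earlier unknowns) that makes data layout and communication matter in the parallel setting.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular matrix L given as a list of rows.

    Serial illustration only (hypothetical helper, not from the paper).
    Assumes every diagonal entry L[i][i] is nonzero.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x[i] depends on every previously computed x[j] with j < i.
        partial = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - partial) / L[i][i]
    return x


# Small illustrative system (values chosen for this example).
L = [[2.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [4.0, 2.0, 2.0]]
b = [2.0, 3.0, 12.0]
x = forward_substitution(L, b)
```

Backward substitution for an upper-triangular system is the same loop run from the last row to the first.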

Key words

triangular solvers
matrix computation
parallel algorithms and complexity
distributed-memory
numerical methods
LogP model

¹ A preliminary version [16] was presented at the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995.

² Research partially supported by an NSF Career grant.