Title: QR factorization of a dense matrix on a shared-memory multiprocessor

Technical Report
DOI: https://doi.org/10.2172/5928811 · OSTI ID: 5928811

A new algorithm for computing an orthogonal decomposition of a rectangular $m \times n$ matrix $A$ on a shared-memory parallel computer is described. The algorithm uses Givens rotations, and has the feature that its synchronization cost is low. In particular, for a multiprocessor having $p$ processors, an analysis of the algorithm shows that this cost is $O(n^2/p)$ if $m/p \ge n$, and $O(mn/p^2)$ if $m/p < n$. Note that in the latter case, the synchronization cost is smaller than $O(n^2/p)$. Therefore, the synchronization cost of the algorithm proposed in this article is bounded by $O(n^2/p)$ when $m \ge n$. This is important for machines where synchronization cost is high, and when $m \gg n$. Analysis and experiments show that the algorithm is effective in balancing the load and producing high efficiency (speed-up). 13 refs.
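
For readers unfamiliar with the building block named in the abstract, the following is a minimal sequential sketch of QR reduction by Givens rotations. It is a textbook formulation, not the parallel algorithm of the report: the record does not describe how rotations are ordered or assigned to processors, so none of that is reproduced here. The function names `givens` and `givens_qr` and the list-of-rows matrix layout are assumptions made for illustration.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r


def givens_qr(A):
    """Reduce an m x n matrix (list of rows, m >= n) to upper-triangular form.

    Hypothetical helper for illustration only. Returns (R, rotations), where
    each rotation is a tuple (upper_row, lower_row, c, s) in application order.
    """
    m, n = len(A), len(A[0])
    R = [[float(x) for x in row] for row in A]
    rotations = []
    for j in range(n):
        # Work upward from the last row, zeroing R[i][j] against row i-1.
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
            R[i][j] = 0.0  # clear rounding residue in the eliminated entry
            rotations.append((i - 1, i, c, s))
    return R, rotations


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    R, rotations = givens_qr(A)
    for row in R:
        print(row)
    print(len(rotations), "rotations applied")
```

Each rotation touches only two rows, so rotations acting on disjoint row pairs can be applied independently. That independence is the general property parallel Givens schemes rely on; how often processors must then synchronize is the cost the abstract analyzes.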

Research Organization:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
DOE Contract Number:
AC05-84OR21400
OSTI ID:
5928811
Report Number(s):
ORNL/TM-10581; ON: DE88001506
Resource Relation:
Other Information: Portions of this document are illegible in microfiche products. Original copy available until stock is exhausted.
Country of Publication:
United States
Language:
English