A locally optimized reordering algorithm and its application to a parallel sparse linear system solver

Gallivan, K.; Hansen, P. C.; Ostromsky, Tz.; Zlatev, Z.

doi:10.1007/BF02238079

Ein lokal optimierter Umordnungsalgorithmus und seine Anwendung auf einen parallelen Löser für dünnbesetzte lineare Systeme

Published: March 1995

Computing, Volume 54, pages 39–67 (1995)

K. Gallivan¹,
P. C. Hansen²,
Tz. Ostromsky² &
Z. Zlatev³

Abstract

A coarse-grain parallel solver that uses Gaussian elimination to solve systems of linear algebraic equations with general sparse matrices is discussed. Two other steps are performed before the factorization. In the first step, a reordering algorithm is used to obtain a permuted matrix with as many zero elements under the main diagonal as possible. In the second step, the reordered matrix is partitioned into blocks for asynchronous parallel processing (normally the number of blocks is equal to the number of processors). It is possible to obtain blocks with nearly the same number of rows, because there is no requirement to produce square diagonal blocks. The first step is much more important than the second one and has a significant influence on the performance of the solver. A straightforward implementation of the reordering algorithm requires O(n²) operations. By using binary trees this cost can be reduced to O(NZ log n), where NZ is the number of non-zero elements in the matrix and n is its order (normally NZ is much smaller than n²). Some experiments on shared-memory parallel computers have been performed. The results show that a solver based on the proposed reordering performs better than another solver based on a cheaper (but rather crude) reordering whose cost is only O(NZ) operations.
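The two preprocessing steps can be illustrated with a minimal Python sketch. It is not the authors' reordering algorithm, which the abstract does not spell out. It only shows the quantity the reordering tries to improve (with NZ fixed, maximizing the zeros under the main diagonal is the same as minimizing the non-zeros there) and one simple way to split the rows into blocks of nearly equal size. The function names, the coordinate-list input format, and the small example matrix are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def count_nonzeros_below_diagonal(
    entries: Sequence[Tuple[int, int]],
    row_pos: Sequence[int],
    col_pos: Sequence[int],
) -> int:
    """Count non-zeros strictly below the main diagonal after permutation.

    entries : (i, j) index pairs of the non-zero elements (0-based).
    row_pos : row_pos[i] is the new position of original row i.
    col_pos : col_pos[j] is the new position of original column j.
    """
    return sum(1 for i, j in entries if row_pos[i] > col_pos[j])


def partition_rows(n: int, p: int) -> List[range]:
    """Split rows 0..n-1 into p contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for k in range(p):
        # The first `extra` blocks receive one additional row.
        size = base + (1 if k < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


# Hypothetical 3x3 pattern: a dense first column plus the diagonal.
entries = [(0, 0), (1, 0), (2, 0), (1, 1), (2, 2)]
identity = [0, 1, 2]
# Moving row 0 and column 0 to the last position clears the lower triangle.
moved_last = [2, 0, 1]

before = count_nonzeros_below_diagonal(entries, identity, identity)
after = count_nonzeros_below_diagonal(entries, moved_last, moved_last)
blocks = partition_rows(10, 3)
```

In this example the identity ordering leaves two non-zeros under the diagonal and the permuted ordering leaves none. Because the blocks are not required to contain square diagonal blocks, the row split can ignore the matrix structure and simply balance the row counts, which gives block sizes 4, 3 and 3 for ten rows on three processors.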

Zusammenfassung

Ein coarse-grain paralleler Gleichungslöser für lineare algebraische Systeme mit dünnbesetzten Matrizen durch Gauß-Elimination wird untersucht. Vor der Faktorisierung werden zwei andere Schritte durchgeführt. Im ersten Schritt wird ein Umordnungsalgorithmus verwendet, um eine permutierte Matrix mit möglichst vielen Nullelementen unter der Hauptdiagonale zu erhalten. Im zweiten Schritt wird die umgeordnete Matrix zur asynchronen Parallelverarbeitung in Blöcke partitioniert (üblicherweise ist die Anzahl der Blöcke gleich der Anzahl der Prozessoren). Es ist möglich, Blöcke mit annähernd gleicher Zeilenanzahl zu erhalten, da keine Diagonalblöcke erzeugt werden müssen. Der erste Schritt ist viel wichtiger als der zweite und hat großen Einfluß auf die Performance des Gleichungslösers. Eine einfache Implementierung des Umordnungsalgorithmus ergibt eine Komplexität von O(n²) Operationen. Durch Verwendung binärer Bäume kann die Komplexität auf O(NZ log n) reduziert werden, wobei NZ die Anzahl der von Null verschiedenen Elemente der Matrix und n die Ordnung des Gleichungssystems bezeichnet (üblicherweise ist NZ viel kleiner als n²). Einige Experimente auf Parallelrechnern mit shared memory wurden durchgeführt. Die Ergebnisse zeigen, daß ein Gleichungslöser mit dem vorgeschlagenen Umordnungsalgorithmus eine bessere Performance zeigt als ein anderer Gleichungslöser mit einem Umordnungsalgorithmus der Komplexität von O(NZ).

Author information

Authors and Affiliations

Center for Supercomputing Research and Development, University of Illinois, 1308 W. Main Street, 61801, Urbana, Illinois, USA
K. Gallivan
UNI·C. Danish Computer Centre for Research and Education, Technical University of Denmark, Bldg 304, DK-2800, Lyngby, Denmark
P. C. Hansen & Tz. Ostromsky
National Environmental Research Institute, Frederiksborgvej 399, DK-4000, Roskilde, Denmark
Z. Zlatev

About this article

Cite this article

Gallivan, K., Hansen, P.C., Ostromsky, T. et al. A locally optimized reordering algorithm and its application to a parallel sparse linear system solver. Computing 54, 39–67 (1995). https://doi.org/10.1007/BF02238079

Received: 14 October 1993
Revised: 07 June 1994
Issue Date: March 1995
DOI: https://doi.org/10.1007/BF02238079
