A locally optimized reordering algorithm and its application to a parallel sparse linear system solver

Ein lokal optimierter Umordnungsalgorithmus und seine Anwendung auf einen parallelen Löser für dünnbesetzte lineare Systeme

Published in: Computing 54, 39–67 (1995)

Abstract

A coarse-grain parallel solver for systems of linear algebraic equations with general sparse matrices, based on Gaussian elimination, is discussed. Two steps are performed before the factorization. In the first step, a reordering algorithm produces a permuted matrix with as many zero elements under the main diagonal as possible. In the second step, the reordered matrix is partitioned into blocks for asynchronous parallel processing (normally the number of blocks equals the number of processors). Blocks with nearly the same number of rows can be obtained because there is no requirement to produce square diagonal blocks. The first step is much more important than the second and has a significant influence on the performance of the solver. A straightforward implementation of the reordering algorithm requires $O(n^2)$ operations. By using binary trees this cost can be reduced to $O(NZ \log n)$, where $NZ$ is the number of non-zero elements in the matrix and $n$ is its order (normally $NZ$ is much smaller than $n^2$). Experiments on shared-memory parallel computers show that a solver based on the proposed reordering performs better than a solver based on a cheaper, but rather crude, reordering that costs only $O(NZ)$ operations.
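The quantities named in the abstract can be made concrete with a small sketch. It is not the authors' reordering algorithm, which the abstract does not specify. It only illustrates the objective (few non-zeros below the main diagonal), a row partition into blocks of nearly equal size, and the three operation counts with constant factors ignored. The function names, the set-per-row matrix representation, and the base-2 logarithm are all assumptions made for illustration.

```python
from math import log2


def count_subdiagonal_nonzeros(rows, row_perm, col_perm):
    """Count non-zeros strictly below the main diagonal of the permuted matrix.

    rows[i] is the set of column indices holding non-zeros in original row i.
    row_perm[k] / col_perm[k] give the original row / column placed at position k.
    """
    col_pos = {c: k for k, c in enumerate(col_perm)}
    return sum(1 for k, r in enumerate(row_perm)
               for c in rows[r] if col_pos[c] < k)


def partition_rows(n, p):
    """Split n rows into p contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for b in range(p):
        size = base + (1 if b < extra else 0)
        blocks.append((start, start + size))  # half-open range [start, end)
        start += size
    return blocks


def operation_estimates(n, nz):
    """Leading-order operation counts, ignoring constants (log base 2 assumed)."""
    return {"straightforward": n * n, "binary_trees": nz * log2(n), "crude": nz}


# Hypothetical 3x3 sparsity pattern, used only as an example.
rows = [{0, 1, 2}, {0}, {0, 1}]
print(count_subdiagonal_nonzeros(rows, [0, 1, 2], [0, 1, 2]))  # 3
print(count_subdiagonal_nonzeros(rows, [0, 2, 1], [2, 1, 0]))  # 0
print(partition_rows(10, 3))
print(operation_estimates(1024, 8192))
```

In this toy pattern the second pair of permutations makes the matrix upper triangular, so the count drops from 3 to 0. The row blocks are rectangular and need not correspond to square diagonal blocks, which is what allows their row counts to be balanced.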

Summary

A coarse-grain parallel solver for linear algebraic systems with sparse matrices, based on Gaussian elimination, is examined. Two other steps are carried out before the factorization. In the first step, a reordering algorithm is used to obtain a permuted matrix with as many zero elements as possible under the main diagonal. In the second step, the reordered matrix is partitioned into blocks for asynchronous parallel processing (the number of blocks usually equals the number of processors). Blocks with approximately the same number of rows can be obtained because no diagonal blocks have to be produced. The first step is much more important than the second and has a large influence on the performance of the solver. A simple implementation of the reordering algorithm has a complexity of $O(n^2)$ operations. By using binary trees the complexity can be reduced to $O(NZ \log n)$, where $NZ$ denotes the number of non-zero elements of the matrix and $n$ the order of the system (usually $NZ$ is much smaller than $n^2$). Some experiments on shared-memory parallel computers were carried out. The results show that a solver using the proposed reordering algorithm performs better than another solver using a reordering algorithm of complexity $O(NZ)$.

Cite this article

Gallivan, K., Hansen, P.C., Ostromsky, T. et al. A locally optimized reordering algorithm and its application to a parallel sparse linear system solver. Computing 54, 39–67 (1995). https://doi.org/10.1007/BF02238079
