Isoefficiency analysis of CGLS algorithm for parallel least squares problems

  • Conference paper
High-Performance Computing and Networking (HPCN-Europe 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1225)


In this paper we study the parallelization of CGLS, a basic iterative method for large, sparse least squares problems whose main idea is to apply the conjugate gradient method to the normal equations. A performance model, the isoefficiency concept, is used to analyze the behavior of this method on massively parallel distributed-memory computers with a two-dimensional mesh communication scheme. Two mappings of data to processors, simple stripe and cyclic stripe partitioning, are compared by inserting their communication times into the isoefficiency model, which captures scalability. Theoretically, the cyclic stripe partitioning is shown to be asymptotically more scalable.
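As a rough illustration of the method the abstract describes, the following is a minimal sequential sketch of CGLS in its standard textbook form, that is, conjugate gradients applied to the normal equations without ever forming the product of the transposed matrix with the matrix. It is not taken from the paper: the function names (`cgls`, `matvec`, `rmatvec`, `dot`), the dense list-of-rows storage, and the stopping tolerance are all assumptions made for illustration, and no parallel partitioning is shown.

```python
def matvec(A, v):
    """Return A @ v for a dense matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rmatvec(A, v):
    """Return A^T @ v without forming the transpose explicitly."""
    out = [0.0] * len(A[0])
    for row, vi in zip(A, v):
        for j, a in enumerate(row):
            out[j] += a * vi
    return out


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def cgls(A, b, max_iter=50, tol=1e-12):
    """Approximately minimise ||b - A x||_2, starting from x = 0.

    Hypothetical helper: the tolerance test on ||A^T r||^2 is an
    illustrative choice, not a criterion stated in the source.
    """
    x = [0.0] * len(A[0])
    r = [float(bi) for bi in b]      # residual b - A x for x = 0
    s = rmatvec(A, r)                # normal-equations residual A^T r
    p = s[:]                         # first search direction
    gamma = dot(s, s)
    for _ in range(max_iter):
        if gamma <= tol:
            break
        q = matvec(A, p)
        alpha = gamma / dot(q, q)    # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        beta = gamma_new / gamma     # direction update coefficient
        p = [si + beta * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


# Small overdetermined example: fit a line through three points.
A_demo = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b_demo = [1.0, 2.0, 4.0]
x_demo = cgls(A_demo, b_demo)
```

Each iteration of this sketch performs two matrix-vector products and two inner products. On a distributed-memory machine the inner products typically require global communication, which is the kind of cost a scalability analysis has to account for.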


Editor information

Bob Hertzberger, Peter Sloot

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, TR., Lin, HX. (1997). Isoefficiency analysis of CGLS algorithm for parallel least squares problems. In: Hertzberger, B., Sloot, P. (eds) High-Performance Computing and Networking. HPCN-Europe 1997. Lecture Notes in Computer Science, vol 1225. Springer, Berlin, Heidelberg.

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62898-9

  • Online ISBN: 978-3-540-69041-2

  • eBook Packages: Springer Book Archive
