Analysis and Practical Use of Flexible BiCGStab

Chen, Jie; McInnes, Lois C.; Zhang, Hong

doi:10.1007/s10915-015-0159-4

Published: 11 January 2016

Volume 68, pages 803–825, (2016)

Journal of Scientific Computing

Jie Chen¹, Lois C. McInnes² & Hong Zhang²

Abstract

A flexible version of the BiCGStab algorithm for solving a linear system of equations is analyzed. We show that under variable preconditioning, the perturbation to the outer residual norm is of the same order as the perturbation to the application of the preconditioner. Hence, to maintain convergence behavior similar to that of BiCGStab while reducing the preconditioning cost, the flexible version can be used with a moderate tolerance in the preconditioning Krylov solves. We explore the use of flexible BiCGStab in a large-scale reacting flow application, PFLOTRAN, and show that a variable multigrid preconditioner significantly reduces simulation time on extreme-scale computers using $O(10^4)$–$O(10^5)$ processor cores.
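
The abstract does not list the algorithm, so the following is only a minimal sketch of the idea and not the authors' implementation. It assumes the usual right-preconditioned BiCGStab recurrence, with the preconditioner replaced by an inner iterative solve that stops at a moderate relative tolerance and therefore acts as a different operator on every call. The inner solver (a GMRES(1)-style minimal residual iteration), the tridiagonal test matrix, and all function names (`fbicgstab`, `inner_solve`, `make_tridiag_matvec`) are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def axpy(alpha, x, y):
    """Return alpha * x + y as a new list."""
    return [alpha * a + b for a, b in zip(x, y)]

def make_tridiag_matvec(n, lower, diag, upper):
    """Matrix-free product with a nonsymmetric tridiagonal Toeplitz matrix
    (a hypothetical test operator, not taken from the paper)."""
    def matvec(x):
        y = [diag * xi for xi in x]
        for i in range(n):
            if i > 0:
                y[i] += lower * x[i - 1]
            if i < n - 1:
                y[i] += upper * x[i + 1]
        return y
    return matvec

def inner_solve(matvec, v, rtol, max_it=50):
    """Approximate A^{-1} v by a minimal residual iteration, stopped once the
    relative residual drops below rtol. The result depends nonlinearly on v,
    which is what makes the preconditioner 'variable'."""
    z = [0.0] * len(v)
    r = list(v)
    target = rtol * norm(v)
    for _ in range(max_it):
        if norm(r) <= target:
            break
        q = matvec(r)
        a = dot(q, r) / dot(q, q)   # step length minimizing ||r - a*q||
        z = axpy(a, r, z)
        r = axpy(-a, q, r)
    return z

def fbicgstab(matvec, b, precond, tol=1e-8, max_it=200):
    """Sketch of BiCGStab with a possibly varying right preconditioner.
    Returns (x, outer_iterations, converged)."""
    n = len(b)
    x = [0.0] * n
    b_norm = norm(b)
    if b_norm == 0.0:
        return x, 0, True
    r = list(b)                     # residual for the zero initial guess
    r_hat = list(r)                 # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    for k in range(1, max_it + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, axpy(-omega, v, p), r)   # p = r + beta*(p - omega*v)
        p_hat = precond(p)          # first inexact preconditioner application
        v = matvec(p_hat)
        alpha = rho_new / dot(r_hat, v)
        s = axpy(-alpha, v, r)
        x = axpy(alpha, p_hat, x)
        if norm(s) <= tol * b_norm:
            return x, k, True
        s_hat = precond(s)          # second inexact preconditioner application
        t = matvec(s_hat)
        omega = dot(t, s) / dot(t, t)
        x = axpy(omega, s_hat, x)
        r = axpy(-omega, t, s)
        rho = rho_new
        if norm(r) <= tol * b_norm:
            return x, k, True
    return x, max_it, False

# Usage: a loose inner tolerance (1e-1) keeps each preconditioner call cheap.
n = 50
A = make_tridiag_matvec(n, -1.2, 4.0, -0.8)
b = [1.0] * n
x, outer_its, converged = fbicgstab(A, b, lambda w: inner_solve(A, w, rtol=1e-1))
print(outer_its, converged, norm(axpy(-1.0, A(x), b)) / norm(b))
```

Because the iterate is updated with the preconditioned vectors `p_hat` and `s_hat` themselves, and the residual with their exact images under the matrix, the recursively updated residual stays equal to the true residual `b - A x` in exact arithmetic, whatever the inner solve returns. In practice the inner tolerance is a tuning knob that trades cost per preconditioner call against the number of outer iterations.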

Acknowledgments

We thank Satish Balay, Jed Brown and Barry Smith for insightful discussions and assistance with experiments. The authors were supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357. Part of Jie Chen’s work was conducted when he was with Argonne National Laboratory.

Author information

Authors and Affiliations

IBM Thomas J. Watson Research Center, Yorktown Heights, NY, 10598, USA
Jie Chen
Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 60439, USA
Lois C. McInnes & Hong Zhang

Corresponding author

Correspondence to Jie Chen.

Additional information

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

About this article

Cite this article

Chen, J., McInnes, L.C. & Zhang, H. Analysis and Practical Use of Flexible BiCGStab. J Sci Comput 68, 803–825 (2016). https://doi.org/10.1007/s10915-015-0159-4

Received: 23 April 2015
Revised: 23 October 2015
Accepted: 28 December 2015
Published: 11 January 2016
Issue Date: August 2016
DOI: https://doi.org/10.1007/s10915-015-0159-4
