Abstract
A flexible version of the BiCGStab algorithm for solving a linear system of equations is analyzed. We show that under variable preconditioning, the perturbation to the outer residual norm is of the same order as the perturbation to the application of the preconditioner. Hence, to maintain convergence behavior similar to that of BiCGStab while reducing the preconditioning cost, the flexible version can be used with a moderate tolerance in the preconditioning Krylov solves. We explore the use of flexible BiCGStab in a large-scale reacting flow application, PFLOTRAN, and show that a variable multigrid preconditioner significantly reduces the simulation time on extreme-scale computers using \(O(10^4)\)–\(O(10^5)\) processor cores.
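The idea of an outer BiCGStab iteration with an inexact inner preconditioning solve can be illustrated with a minimal NumPy sketch. This is not the authors' implementation or the PETSc one. The function names (`inner_solve`, `flexible_bicgstab`), the small nonsymmetric tridiagonal test matrix, the inner tolerance of 1e-2, and the use of a minimal-residual inner iteration in place of a multigrid preconditioner are all assumptions made for illustration.

```python
import numpy as np


def inner_solve(A, v, rtol=1e-2, max_inner=20):
    """Approximate z ~ A^{-1} v with a minimal-residual iteration.

    The step lengths and iteration count depend on v, so this behaves as a
    variable (nonlinear) preconditioner. It is a stand-in chosen for
    illustration, not the multigrid preconditioner of the paper.
    """
    z = np.zeros_like(v)
    r = v.copy()
    target = rtol * np.linalg.norm(v)
    for _ in range(max_inner):
        if np.linalg.norm(r) <= target:
            break
        Ar = A @ r
        denom = Ar @ Ar
        if denom == 0.0:
            break
        step = (r @ Ar) / denom  # minimizes ||r - step * A r||
        z += step * r
        r -= step * Ar
    return z


def flexible_bicgstab(A, b, precond, rtol=1e-8, max_outer=200):
    """Right-preconditioned BiCGStab where precond(w) may change per call.

    The iterate is updated with the stored preconditioned directions, so the
    recursive residual r stays consistent with b - A x up to rounding.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    stop = rtol * np.linalg.norm(b)
    if np.linalg.norm(r) <= stop:
        return x, 0
    r_hat = r.copy()  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    for k in range(1, max_outer + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        p_tilde = precond(p)  # first inexact preconditioner application
        v = A @ p_tilde
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= stop:  # converged after the half step
            x += alpha * p_tilde
            return x, k
        s_tilde = precond(s)  # second inexact preconditioner application
        t = A @ s_tilde
        omega = (t @ s) / (t @ t)
        x += alpha * p_tilde + omega * s_tilde
        r = s - omega * t
        rho = rho_new
        if np.linalg.norm(r) <= stop:
            return x, k
    return x, max_outer


# Illustrative problem: a small, well-conditioned nonsymmetric tridiagonal system.
n = 50
A = 4.0 * np.eye(n) - 1.5 * np.eye(n, k=-1) - 0.5 * np.eye(n, k=1)
x_true = np.ones(n)
b = A @ x_true

x, iters = flexible_bicgstab(A, b, lambda w: inner_solve(A, w, rtol=1e-2))
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(f"outer iterations: {iters}, relative residual: {rel_res:.2e}")
```

Loosening or tightening the `rtol` passed to `inner_solve` is one way to observe, on a toy problem, the trade-off between preconditioning cost and outer convergence that the abstract describes.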
Acknowledgments
We thank Satish Balay, Jed Brown and Barry Smith for insightful discussions and assistance with experiments. The authors were supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357. Part of Jie Chen’s work was conducted when he was with Argonne National Laboratory.
Additional information
The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.
Cite this article
Chen, J., McInnes, L.C. & Zhang, H. Analysis and Practical Use of Flexible BiCGStab. J Sci Comput 68, 803–825 (2016). https://doi.org/10.1007/s10915-015-0159-4
DOI: https://doi.org/10.1007/s10915-015-0159-4