Analysis and Practical Use of Flexible BiCGStab

Journal of Scientific Computing

Abstract

A flexible version of the BiCGStab algorithm for solving a linear system of equations is analyzed. We show that under variable preconditioning, the perturbation to the outer residual norm is of the same order as the perturbation to the application of the preconditioner. Hence, to maintain convergence behavior similar to that of BiCGStab while reducing the preconditioning cost, the flexible version can be used with a moderate tolerance in the preconditioning Krylov solves. We explore the use of flexible BiCGStab in a large-scale reacting flow application, PFLOTRAN, and show that a variable multigrid preconditioner significantly reduces the simulation time on extreme-scale computers using \(O(10^4)\) to \(O(10^5)\) processor cores.
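To make the idea of variable preconditioning concrete, the following is a minimal sketch of a right-preconditioned BiCGStab iteration in which the preconditioner is itself an inexact inner iteration stopped at a moderate tolerance. It is not the authors' implementation or the PETSc one. The function names `inner_solve` and `flexible_bicgstab`, the minimal-residual inner iteration, the tolerances, and the small tridiagonal test matrix are all assumptions chosen for illustration.

```python
import numpy as np

def inner_solve(A, rhs, rtol=1e-2, maxit=50):
    """Approximate A^{-1} rhs with minimal-residual steps (illustrative choice).

    The step lengths depend on rhs, so the map rhs -> z is not a fixed
    linear operator: it acts as a variable preconditioner.
    """
    z = np.zeros_like(rhs)
    r = rhs.copy()
    target = rtol * np.linalg.norm(rhs)
    for _ in range(maxit):
        if np.linalg.norm(r) <= target:
            break
        Ar = A @ r
        step = (r @ Ar) / (Ar @ Ar)   # minimizes ||r - step * A r||_2
        z += step * r
        r -= step * Ar
    return z

def flexible_bicgstab(A, b, precond, rtol=1e-8, maxit=200):
    """Right-preconditioned BiCGStab; precond may change between calls."""
    x = np.zeros_like(b)
    r = b.copy()                      # residual for the zero initial guess
    r_hat = r.copy()                  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    bnorm = np.linalg.norm(b)
    for k in range(1, maxit + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        p_hat = precond(p)            # first inexact preconditioner application
        v = A @ p_hat
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= rtol * bnorm:
            return x + alpha * p_hat, k
        s_hat = precond(s)            # second inexact preconditioner application
        t = A @ s_hat
        omega = (t @ s) / (t @ t)
        # x is updated with the vectors the preconditioner actually returned,
        # so the recurrence for r stays consistent with b - A x.
        x = x + alpha * p_hat + omega * s_hat
        r = s - omega * t
        rho = rho_new
        if np.linalg.norm(r) <= rtol * bnorm:
            return x, k
    return x, maxit

# Small nonsymmetric, diagonally dominant test problem (assumed for the demo).
n = 200
A = 4.0 * np.eye(n) - 1.3 * np.eye(n, k=-1) - 0.7 * np.eye(n, k=1)
b = np.ones(n)
x, iters = flexible_bicgstab(A, b, lambda w: inner_solve(A, w, rtol=1e-2))
```

Loosening or tightening `rtol` in `inner_solve` is the knob the abstract refers to: a looser inner tolerance makes each preconditioner application cheaper, at the cost of a perturbation to the outer residual.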

Acknowledgments

We thank Satish Balay, Jed Brown and Barry Smith for insightful discussions and assistance with experiments. The authors were supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357. Part of Jie Chen’s work was conducted when he was with Argonne National Laboratory.

Author information

Corresponding author

Correspondence to Jie Chen.

Additional information

The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

About this article

Cite this article

Chen, J., McInnes, L.C. & Zhang, H. Analysis and Practical Use of Flexible BiCGStab. J Sci Comput 68, 803–825 (2016). https://doi.org/10.1007/s10915-015-0159-4

  • DOI: https://doi.org/10.1007/s10915-015-0159-4
