
Anderson Acceleration as a Krylov Method with Application to Convergence Analysis

Journal of Scientific Computing

Abstract

Anderson acceleration (AA) is widely used for accelerating the convergence of nonlinear fixed-point methods, but little is known about how to quantify the asymptotic convergence acceleration provided by AA. As a step towards gaining more understanding of convergence acceleration by AA, we study AA(m), i.e., Anderson acceleration with finite window size m, applied to the case of linear fixed-point iterations. We write AA(m) as a Krylov method with polynomial residual update formulas, and derive \((m+2)\)-term recurrence relations for the AA(m) polynomials. Based on these polynomial residual update formulas, we obtain several results, including orthogonality relations, bounds on the acceleration coefficients, nonlinear recursions, and residual convergence bounds. We apply these results to study AA(1) residual convergence patterns and the influence of the initial guess on the asymptotic convergence factor.
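To make the setting concrete, the following minimal sketch (our illustration, not the authors' implementation) applies AA(m) to a linear fixed-point iteration \(x_{k+1} = M x_k + b\) and compares residual norms with the basic iteration; the helper name aa_linear, the random test problem, and the unconstrained least-squares difference form of the AA step are all illustrative assumptions.

```python
import numpy as np

def aa_linear(M, b, x0, m, iters):
    """Anderson acceleration AA(m) for the linear fixed point g(x) = M x + b.

    Uses the unconstrained least-squares difference form of the AA step;
    m = 0 reduces to the basic fixed-point iteration. Returns ||r_k|| history.
    """
    g = lambda x: M @ x + b
    X, G = [x0], [g(x0)]
    R = [G[0] - X[0]]                        # residuals r_k = g(x_k) - x_k
    norms = [np.linalg.norm(R[0])]
    for k in range(iters):
        mk = min(m, k)                       # window size actually used
        if mk == 0:
            x_new = G[-1]                    # plain fixed-point step
        else:
            # minimize || r_k - dR @ gamma || over the last mk differences
            dR = np.column_stack([R[-i] - R[-i - 1] for i in range(1, mk + 1)])
            dG = np.column_stack([G[-i] - G[-i - 1] for i in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dR, R[-1], rcond=None)
            x_new = G[-1] - dG @ gamma
        X.append(x_new)
        G.append(g(x_new))
        R.append(G[-1] - X[-1])
        norms.append(np.linalg.norm(R[-1]))
    return norms

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
M *= 0.9 / np.abs(np.linalg.eigvals(M)).max()    # contraction with rho(M) = 0.9
b, x0 = rng.standard_normal(n), np.zeros(n)
for m in (0, 1, 2):                              # m = 0 is the basic iteration
    print(f"AA({m}): ||r_30|| = {aa_linear(M, b, x0, m, 30)[-1]:.2e}")
```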


Data Availability

Data sharing is not applicable to this article as no datasets were generated or analyzed.

Notes

  1. Note that \(\mathcal {B}(\phi _k, y_k)\) is indeterminate at the point \((\phi _k, y_k) = (0, 1)\), corresponding to the special case when \(r_k = r_{k-1}\).

  2. Note that these local FP bounds are the same as the AA(1) bounds (35) when \(r_{k} = r_{k-1}\), because, in this case, AA(1) just applies the basic FP iteration, \(r_{k+1} = M r_k\). For this reason, in the following comparison between AA(1) and the FP iteration, we suppose that AA(1) residuals satisfy \(r_k \ne r_{k-1}\) so that AA(1) is in fact distinct from the FP iteration.
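As a quick numerical check of this footnote (our own sketch, with an arbitrary matrix M and offset b), a plain fixed-point step on the linear iteration indeed propagates the residual as \(r_{k+1} = M r_k\):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 0.5 * rng.standard_normal((4, 4))
b = rng.standard_normal(4)
g = lambda x: M @ x + b            # linear fixed-point map

x = rng.standard_normal(4)
r = g(x) - x                       # current residual r_k
x_next = g(x)                      # the basic FP step AA(1) falls back to
r_next = g(x_next) - x_next        # next residual r_{k+1}
print(np.allclose(r_next, M @ r))  # True: r_{k+1} = M r_k
```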


Funding

Funding was provided by the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2019-04155).

Author information


Corresponding author

Correspondence to Hans De Sterck.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Some Results for AA(m) with General Initial Guess

In this appendix we present results that lead to the proof of Proposition 2 in Sect. 2. We first derive a result, following from expression (16), that writes the residual of the more general AA iteration (9) with general initial guess \(\{x_0,x_1,\ldots , x_m\}\) as a sum of \(m+1\) vectors that lie in the \(m+1\) Krylov spaces \(\{ \mathcal {K}_s(M,r_j)\}_{j=0}^m\) generated by the \(m+1\) initial residuals \(r_0,r_1,\ldots , r_m\). We also derive recurrence relations for the polynomials that arise in this expression. Proposition 15 can then easily be specialized to Proposition 2.

Proposition 15

The AA(m) iteration (9) with general initial guess \(\{x_j\}_{j=0}^{m}\), applied to the linear iteration (5), is a multi-Krylov method. That is, its residual can be expressed as

$$\begin{aligned} r_{k+1} = \sum _{j=0}^m p_{k-m+1,j}(M)\,r_j,\quad k\ge m, \end{aligned}$$
(55)

where the \(p_{k-m+1,j}(\lambda )\) are polynomials of degree at most \(k-m+1\) satisfying the following relations:

$$\begin{aligned} p_{1,j}(\lambda )&=-\beta ^{(m)}_{m-j}\lambda , \quad j=0,\ldots ,m-1; \qquad p_{1,m}(\lambda )=\Big (1+\sum _{i=1}^{m}\beta _i^{(m)}\Big )\lambda ; \end{aligned}$$
(56)
$$\begin{aligned} p_{k-m+1,j}(\lambda )&=\lambda \left( \Big (1+\sum _{i=1}^{m}\beta _i^{(k)}\Big )p_{k-m,j} - \sum _{i=1}^{m}\beta _i^{(k)} p_{k-m-i,j}\right) , \nonumber \\ {}&\qquad \qquad \qquad \qquad \qquad \qquad k-m+1>1, j=0,\ldots , m; \end{aligned}$$
(57)

where for \(i=1,\ldots ,m,\) and \(j=0,\ldots , m\),

$$\begin{aligned} p_{1-i,j}(\lambda ) = {\left\{ \begin{array}{ll} 1 &{} \text {if}\,\, i=m+1-j,\\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Proof

The relations in (56) follow directly from (16). For (57), when \(k\ge 2m+1\), expression (16) gives \(r_{k+1}\) as a linear combination of \(\{r_{k+1-i}\}_{i=1}^{m+1}\), where the smallest residual index is \(k-m\ge m+1\). Thus, every term \(r_{k+1-i}\) can itself be rewritten as a linear combination of \(\{r_j\}_{j=0}^m\), and (57) follows. For \(k< 2m+1\), some of the terms in \(\{r_{k+1-i}\}_{i=1}^{m+1}\) are initial residuals and are not combinations of all of \(\{r_j\}_{j=0}^m\); setting \(p_{1-i,j}(\lambda )=0\) or 1 as above makes (57) hold in this case as well. \(\square \)

Remark 3

Expression (55) indicates that the residual \(r_{k+1}\) with \(k\ge m\) of AA(m) can be decomposed into a sum of \(m+1\) vectors that lie in the \(m+1\) Krylov spaces \(\{ \mathcal {K}_s(M,r_j)\}_{j=0}^m\) or \(\{ \mathcal {K}_s(A,r_j)\}_{j=0}^m\), where \(s=k-m+2\). Therefore, we can refer to AA(m) with general initial guess as a multi-Krylov space method. Note that, in the case of the usual AA(m) iteration (1) with one initial guess \(x_0\), each \(r_j\in \{r_j\}_{j=1}^m\) can, by Proposition 15, be expressed as a polynomial in M applied to \(r_0\), so AA(m) is a Krylov method, that is, \(r_{k+1}\in \mathcal {K}_s (M,r_0)\), as formalized in Proposition 2 in Sect. 2.
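The single-initial-guess case can also be verified numerically. The sketch below (our illustration; aa_residuals is a hypothetical helper mirroring the AA(m) sketch given after the abstract) runs AA(m) on a random linear problem and checks that each residual \(r_k\) lies in the Krylov space \(\mathcal {K}_{k+1}(M,r_0) = \mathrm {span}\{r_0, Mr_0, \ldots , M^k r_0\}\):

```python
import numpy as np

def aa_residuals(M, b, x0, m, iters):
    """Run AA(m) on g(x) = M x + b; return the residual vectors r_0, ..., r_iters."""
    g = lambda x: M @ x + b
    X, G = [x0], [g(x0)]
    R = [G[0] - X[0]]
    for k in range(iters):
        mk = min(m, k)
        if mk == 0:
            x_new = G[-1]                    # plain fixed-point step
        else:
            dR = np.column_stack([R[-i] - R[-i - 1] for i in range(1, mk + 1)])
            dG = np.column_stack([G[-i] - G[-i - 1] for i in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dR, R[-1], rcond=None)
            x_new = G[-1] - dG @ gamma
        X.append(x_new); G.append(g(x_new)); R.append(G[-1] - X[-1])
    return R

rng = np.random.default_rng(2)
n, m = 30, 2
M = 0.8 * rng.standard_normal((n, n)) / np.sqrt(n)   # spectral radius roughly 0.8
b, x0 = rng.standard_normal(n), rng.standard_normal(n)
R = aa_residuals(M, b, x0, m, 10)
for k in range(1, len(R)):
    # orthonormal basis of K_{k+1}(M, r0) = span{r0, M r0, ..., M^k r0}
    K = np.column_stack([np.linalg.matrix_power(M, s) @ R[0] for s in range(k + 1)])
    Q, _ = np.linalg.qr(K)
    dist = np.linalg.norm(R[k] - Q @ (Q.T @ R[k]))
    print(f"k = {k:2d}: distance of r_k from K_{k + 1}(M, r0) = {dist:.1e}")
```

Up to rounding error, the printed distances should be zero, consistent with \(r_k \in \mathcal {K}_{k+1}(M,r_0)\).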

Remark 4

In Proposition 15, we can, if desired, also rewrite the residual in terms of polynomials in the matrix A:

$$\begin{aligned} r_{k+1} = \sum _{j=0}^m\widetilde{ p}_{k-m+1,j}(A)\,r_j,\quad k\ge m, \end{aligned}$$
(58)

where the \(\widetilde{p}_{k-m+1,j}(\lambda )\) satisfy the relations:

$$\begin{aligned} \widetilde{p}_{1,j}(\lambda )&=-\beta ^{(m)}_{m-j}(1-\lambda ), \,\, j=0,\ldots ,m-1; \\ \widetilde{p}_{1,m}(\lambda )&=\Big (1+\sum _{i=1}^{m}\beta _i^{(m)}\Big )(1-\lambda ); \\ \widetilde{p}_{k-m+1,j}(\lambda )&=(1-\lambda )\left( \Big (1+\sum _{i=1}^{m}\beta _i^{(k)}\Big ) \widetilde{p}_{k-m,j} - \sum _{i=1}^{m}\beta _i^{(k)} \widetilde{p}_{k-m-i,j}\right) , \\ {}&\qquad \qquad \qquad \qquad \qquad \qquad k-m+1>1, j=0,\ldots , m; \end{aligned}$$

where for \(i=1,\ldots ,m, j=0,\ldots , m\),

$$\begin{aligned} \widetilde{p}_{1-i,j}(\lambda ) = {\left\{ \begin{array}{ll} 1 &{} \text {if}\,\, i=m+1-j,\\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
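As a concrete check of this change of variables (our sketch; it assumes \(A = I - M\), consistent with the \(\lambda \mapsto 1-\lambda \) substitution in the relations above), composing a polynomial with \(1-\lambda \) and applying the result to \(A\) reproduces the original polynomial in \(M\):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def polymat(coeffs, X):
    """Evaluate sum_s coeffs[s] * X**s by Horner's rule."""
    out = np.zeros_like(X)
    I = np.eye(X.shape[0])
    for c in reversed(coeffs):
        out = out @ X + c * I
    return out

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = np.eye(5) - M                        # assumption: A = I - M
p = rng.standard_normal(4)               # coefficients of p, low to high degree

# tilde p(lambda) = p(1 - lambda): expand each monomial (1 - lambda)^s
pt = np.zeros_like(p)
for s, c in enumerate(p):
    term = P.polypow([1.0, -1.0], s)     # coefficients of (1 - lambda)^s
    pt[: len(term)] += c * term

r = rng.standard_normal(5)
print(np.allclose(polymat(p, M) @ r, polymat(pt, A) @ r))  # True
```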

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

De Sterck, H., He, Y. & Krzysik, O.A. Anderson Acceleration as a Krylov Method with Application to Convergence Analysis. J Sci Comput 99, 12 (2024). https://doi.org/10.1007/s10915-024-02464-x

