
Anderson Acceleration as a Krylov Method with Application to Convergence Analysis

Journal of Scientific Computing

Abstract

Anderson acceleration (AA) is widely used for accelerating the convergence of nonlinear fixed-point methods, but little is known about how to quantify the asymptotic convergence acceleration provided by AA. As a step towards gaining more understanding of convergence acceleration by AA, we study AA(m), i.e., Anderson acceleration with finite window size m, applied to the case of linear fixed-point iterations. We write AA(m) as a Krylov method with polynomial residual update formulas, and derive \((m+2)\)-term recurrence relations for the AA(m) polynomials. Based on these polynomial residual update formulas, we obtain several results, including orthogonality relations, bounds on the acceleration coefficients, nonlinear recursions, and residual convergence bounds. We apply these results to study AA(1) residual convergence patterns and the influence of the initial guess on the asymptotic convergence factor.
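To make the setting concrete, the following minimal sketch (our illustration, not the authors' implementation) applies AA(m) to a linear fixed-point iteration \(x_{k+1} = M x_k + b\) and compares residual norms with the basic iteration; the helper name aa_linear, the random test problem, and the unconstrained least-squares difference form of the AA step are all illustrative assumptions.

```python
import numpy as np

def aa_linear(M, b, x0, m, iters):
    """Anderson acceleration AA(m) for the linear fixed point g(x) = M x + b.

    Uses the unconstrained least-squares difference form of the AA step;
    m = 0 reduces to the basic fixed-point iteration. Returns ||r_k|| history.
    """
    g = lambda x: M @ x + b
    X, G = [x0], [g(x0)]
    R = [G[0] - X[0]]                        # residuals r_k = g(x_k) - x_k
    norms = [np.linalg.norm(R[0])]
    for k in range(iters):
        mk = min(m, k)                       # window size actually used
        if mk == 0:
            x_new = G[-1]                    # plain fixed-point step
        else:
            # minimize || r_k - dR @ gamma || over the last mk differences
            dR = np.column_stack([R[-i] - R[-i - 1] for i in range(1, mk + 1)])
            dG = np.column_stack([G[-i] - G[-i - 1] for i in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dR, R[-1], rcond=None)
            x_new = G[-1] - dG @ gamma
        X.append(x_new)
        G.append(g(x_new))
        R.append(G[-1] - X[-1])
        norms.append(np.linalg.norm(R[-1]))
    return norms

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
M *= 0.9 / np.abs(np.linalg.eigvals(M)).max()    # contraction with rho(M) = 0.9
b, x0 = rng.standard_normal(n), np.zeros(n)
for m in (0, 1, 2):                              # m = 0 is the basic iteration
    print(f"AA({m}): ||r_30|| = {aa_linear(M, b, x0, m, 30)[-1]:.2e}")
```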


Data Availability

Data sharing is not applicable to this article as no datasets were generated or analyzed.

Notes

  1. Note that \(\mathcal {B}(\phi _k, y_k)\) is indeterminate at the point \((\phi _k, y_k) = (0, 1)\), corresponding to the special case when \(r_k = r_{k-1}\).

  2. Note that these local FP bounds are the same as the AA(1) bounds (35) when \(r_{k} = r_{k-1}\), because, in this case, AA(1) just applies the basic FP iteration, \(r_{k+1} = M r_k\). For this reason, in the following comparison between AA(1) and the FP iteration, we suppose that AA(1) residuals satisfy \(r_k \ne r_{k-1}\) so that AA(1) is in fact distinct from the FP iteration.
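As a quick numerical check of this footnote (our own sketch, with an arbitrary matrix M and offset b), a plain fixed-point step on the linear iteration indeed propagates the residual as \(r_{k+1} = M r_k\):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 0.5 * rng.standard_normal((4, 4))
b = rng.standard_normal(4)
g = lambda x: M @ x + b            # linear fixed-point map

x = rng.standard_normal(4)
r = g(x) - x                       # current residual r_k
x_next = g(x)                      # the basic FP step AA(1) falls back to
r_next = g(x_next) - x_next        # next residual r_{k+1}
print(np.allclose(r_next, M @ r))  # True: r_{k+1} = M r_k
```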


Funding

Funding was provided by the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2019-04155).

Author information


Corresponding author

Correspondence to Hans De Sterck.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Some Results for AA(m) with General Initial Guess

In this appendix we present results that lead to the proof of Proposition 2 in Sect. 2. We first derive a result, following from expression (16), that writes the residual of the more general AA iteration (9) with general initial guess \(\{x_0,x_1,\ldots , x_m\}\) as a sum of \(m+1\) vectors that lie in the \(m+1\) Krylov spaces \(\{ \mathcal {K}_s(M,r_j)\}_{j=0}^m\) generated by the \(m+1\) initial residuals \(r_0,r_1,\ldots , r_m\). We also derive recurrence relations for the polynomials that arise in this expression. Proposition 15 can then easily be specialized to Proposition 2.

Proposition 15

The AA(m) iteration (9) with general initial guess \(\{x_j\}_{j=0}^{m}\), applied to the linear iteration (5), is a multi-Krylov method. That is, its residual can be expressed as

$$\begin{aligned} r_{k+1} = \sum _{j=0}^m p_{k-m+1,j}(M)\,r_j,\quad k\ge m, \end{aligned}$$
(55)

where the \(p_{k-m+1,j}(\lambda )\) are polynomials of degree at most \(k-m+1\) satisfying the following relations:

$$\begin{aligned} p_{1,j}(\lambda )&=-\beta ^{(m)}_{m-j}\lambda , \quad j=0,\ldots ,m-1; \qquad p_{1,m}(\lambda )=\Big (1+\sum _{i=1}^{m}\beta _i^{(m)}\Big )\lambda ; \end{aligned}$$
(56)
$$\begin{aligned} p_{k-m+1,j}(\lambda )&=\lambda \left( \Big (1+\sum _{i=1}^{m}\beta _i^{(k)}\Big )p_{k-m,j} - \sum _{i=1}^{m}\beta _i^{(k)} p_{k-m-i,j}\right) , \nonumber \\ {}&\qquad \qquad \qquad \qquad \qquad \qquad k-m+1>1, j=0,\ldots , m; \end{aligned}$$
(57)

where for \(i=1,\ldots ,m,\) and \(j=0,\ldots , m\),

$$\begin{aligned} p_{1-i,j}(\lambda ) = {\left\{ \begin{array}{ll} 1 &{} \text {if}\,\, i=m+1-j,\\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

Proof

The relations in (56) follow directly from (16). For (57), when \(k\ge 2m+1\), expression (16) gives \(r_{k+1}\) as a linear combination of \(\{r_{k+1-i}\}_{i=1}^{m+1}\), where the smallest residual index is \(k-m\ge m+1\). Thus, every term \(r_{k+1-i}\) can itself be rewritten as a linear combination of \(\{r_j\}_{j=0}^m\), and (57) follows. For \(k< 2m+1\), some of the terms in \(\{r_{k+1-i}\}_{i=1}^{m+1}\) are initial residuals and are not combinations of all of \(\{r_j\}_{j=0}^m\); setting \(p_{1-i,j}(\lambda )=0\) or 1 as above makes (57) hold in this case as well. \(\square \)

Remark 3

Expression (55) indicates that the residual \(r_{k+1}\) with \(k\ge m\) of AA(m) can be decomposed into a sum of \(m+1\) vectors that lie in the \(m+1\) Krylov spaces \(\{ \mathcal {K}_s(M,r_j)\}_{j=0}^m\) or \(\{ \mathcal {K}_s(A,r_j)\}_{j=0}^m\), where \(s=k-m+2\). Therefore, we can refer to AA(m) with general initial guess as a multi-Krylov space method. Note that, in the case of the usual AA(m) iteration (1) with one initial guess \(x_0\), each \(r_j\in \{r_j\}_{j=1}^m\) can, by Proposition 15, be expressed as a polynomial in M applied to \(r_0\), so AA(m) is a Krylov method, that is, \(r_{k+1}\in \mathcal {K}_s (M,r_0)\), as formalized in Proposition 2 in Sect. 2.
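The single-initial-guess case can also be verified numerically. The sketch below (our illustration; aa_residuals is a hypothetical helper mirroring the AA(m) sketch given after the abstract) runs AA(m) on a random linear problem and checks that each residual \(r_k\) lies in the Krylov space \(\mathcal {K}_{k+1}(M,r_0) = \mathrm {span}\{r_0, Mr_0, \ldots , M^k r_0\}\):

```python
import numpy as np

def aa_residuals(M, b, x0, m, iters):
    """Run AA(m) on g(x) = M x + b; return the residual vectors r_0, ..., r_iters."""
    g = lambda x: M @ x + b
    X, G = [x0], [g(x0)]
    R = [G[0] - X[0]]
    for k in range(iters):
        mk = min(m, k)
        if mk == 0:
            x_new = G[-1]                    # plain fixed-point step
        else:
            dR = np.column_stack([R[-i] - R[-i - 1] for i in range(1, mk + 1)])
            dG = np.column_stack([G[-i] - G[-i - 1] for i in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dR, R[-1], rcond=None)
            x_new = G[-1] - dG @ gamma
        X.append(x_new); G.append(g(x_new)); R.append(G[-1] - X[-1])
    return R

rng = np.random.default_rng(2)
n, m = 30, 2
M = 0.8 * rng.standard_normal((n, n)) / np.sqrt(n)   # spectral radius roughly 0.8
b, x0 = rng.standard_normal(n), rng.standard_normal(n)
R = aa_residuals(M, b, x0, m, 10)
for k in range(1, len(R)):
    # orthonormal basis of K_{k+1}(M, r0) = span{r0, M r0, ..., M^k r0}
    K = np.column_stack([np.linalg.matrix_power(M, s) @ R[0] for s in range(k + 1)])
    Q, _ = np.linalg.qr(K)
    dist = np.linalg.norm(R[k] - Q @ (Q.T @ R[k]))
    print(f"k = {k:2d}: distance of r_k from K_{k + 1}(M, r0) = {dist:.1e}")
```

Up to rounding error, the printed distances should be zero, consistent with \(r_k \in \mathcal {K}_{k+1}(M,r_0)\).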

Remark 4

In Proposition 15, we can, if desired, also rewrite the residual in terms of polynomials in the matrix A:

$$\begin{aligned} r_{k+1} = \sum _{j=0}^m\widetilde{ p}_{k-m+1,j}(A)\,r_j,\quad k\ge m, \end{aligned}$$
(58)

where the \(\widetilde{p}_{k-m+1,j}(\lambda )\) satisfy the relations:

$$\begin{aligned} \widetilde{p}_{1,j}(\lambda )&=-\beta ^{(m)}_{m-j}(1-\lambda ), \,\, j=0,\ldots ,m-1; \\ \widetilde{p}_{1,m}(\lambda )&=\Big (1+\sum _{i=1}^{m}\beta _i^{(m)}\Big )(1-\lambda ); \\ \widetilde{p}_{k-m+1,j}(\lambda )&=(1-\lambda )\left( \Big (1+\sum _{i=1}^{m}\beta _i^{(k)}\Big ) \widetilde{p}_{k-m,j} - \sum _{i=1}^{m}\beta _i^{(k)} \widetilde{p}_{k-m-i,j}\right) , \\ {}&\qquad \qquad \qquad \qquad \qquad \qquad k-m+1>1, j=0,\ldots , m; \end{aligned}$$

where for \(i=1,\ldots ,m, j=0,\ldots , m\),

$$\begin{aligned} \widetilde{p}_{1-i,j}(\lambda ) = {\left\{ \begin{array}{ll} 1 &{} \text {if}\,\, i=m+1-j,\\ 0 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
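As a concrete check of this change of variables (our sketch; it assumes \(A = I - M\), consistent with the \(\lambda \mapsto 1-\lambda \) substitution in the relations above), composing a polynomial with \(1-\lambda \) and applying the result to \(A\) reproduces the original polynomial in \(M\):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def polymat(coeffs, X):
    """Evaluate sum_s coeffs[s] * X**s by Horner's rule."""
    out = np.zeros_like(X)
    I = np.eye(X.shape[0])
    for c in reversed(coeffs):
        out = out @ X + c * I
    return out

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = np.eye(5) - M                        # assumption: A = I - M
p = rng.standard_normal(4)               # coefficients of p, low to high degree

# tilde p(lambda) = p(1 - lambda): expand each monomial (1 - lambda)^s
pt = np.zeros_like(p)
for s, c in enumerate(p):
    term = P.polypow([1.0, -1.0], s)     # coefficients of (1 - lambda)^s
    pt[: len(term)] += c * term

r = rng.standard_normal(5)
print(np.allclose(polymat(p, M) @ r, polymat(pt, A) @ r))  # True
```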

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

De Sterck, H., He, Y. & Krzysik, O.A. Anderson Acceleration as a Krylov Method with Application to Convergence Analysis. J Sci Comput 99, 12 (2024). https://doi.org/10.1007/s10915-024-02464-x

