Abstract
Anderson acceleration (AA) is widely used for accelerating the convergence of nonlinear fixed-point methods, but little is known about how to quantify the asymptotic convergence acceleration provided by AA. As a step towards gaining more understanding of convergence acceleration by AA, we study AA(m), i.e., Anderson acceleration with finite window size m, applied to the case of linear fixed-point iterations. We write AA(m) as a Krylov method with polynomial residual update formulas and derive \((m+2)\)-term recurrence relations for the AA(m) polynomials. Based on these polynomial residual update formulas, we derive several results, including orthogonality relations, bounds on the acceleration coefficients, nonlinear recursions, and residual convergence bounds. We apply these results to study AA(1) residual convergence patterns and the influence of the initial guess on the asymptotic convergence factor.
Data Availability
Data sharing is not applicable to this article as no datasets were generated or analyzed.
Notes
Note that \({{\mathcal {B}}}(\phi _k, y_k)\) is indeterminate at the point \((\phi _k, y_k) = (0, 1)\) corresponding to the special case when \(r_k = r_{k-1}\).
Note that these local FP bounds are the same as the AA(1) bounds (35) when \(r_{k} = r_{k-1}\), because, in this case, AA(1) just applies the basic FP iteration, \(r_{k+1} = M r_k\). For this reason, in the following comparison between AA(1) and the FP iteration, we suppose that AA(1) residuals satisfy \(r_k \ne r_{k-1}\) so that AA(1) is in fact distinct from the FP iteration.
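To make the degenerate case \(r_k = r_{k-1}\) concrete, the following minimal Python sketch (purely illustrative, and not part of the paper) applies AA(1) to a linear fixed-point map \(g(x) = Mx + b\). The mixing coefficient \(\alpha \) minimizes \(\Vert r_k + \alpha (r_{k-1} - r_k)\Vert _2\); when \(r_k = r_{k-1}\) this least-squares problem is degenerate and the sketch falls back to the basic FP step, consistent with the note above. The test matrix, tolerance, and iteration count are assumed example values.

\begin{verbatim}
import numpy as np

def aa1_linear(M, b, x0, num_iters, tol=1e-14):
    # Illustrative AA(1) sketch for the linear fixed-point map g(x) = M x + b.
    g = lambda x: M @ x + b
    x_prev = np.array(x0, dtype=float)
    x = g(x_prev)                        # first step: basic FP iteration
    r_prev, r = g(x_prev) - x_prev, g(x) - x
    res_norms = [np.linalg.norm(r_prev), np.linalg.norm(r)]
    for _ in range(num_iters):
        d = r_prev - r                   # r_{k-1} - r_k
        if np.linalg.norm(d) <= tol * max(np.linalg.norm(r), 1.0):
            alpha = 0.0                  # r_k = r_{k-1}: fall back to the FP step
        else:
            alpha = -np.dot(r, d) / np.dot(d, d)   # minimizes ||r_k + alpha*d||_2
        x_next = g(x) + alpha * (g(x_prev) - g(x))
        x_prev, x = x, x_next
        r_prev, r = r, g(x) - x
        res_norms.append(np.linalg.norm(r))
    return x, res_norms

# Illustrative usage (assumed example data)
rng = np.random.default_rng(0)
n = 10
M = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # generically a contraction
b = rng.standard_normal(n)
x_final, hist = aa1_linear(M, b, np.zeros(n), num_iters=20)
print(hist[-1])                          # final AA(1) residual norm
\end{verbatim}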
Funding
The funding was provided by Natural Sciences and Engineering Research Council of Canada (Grant no. RGPIN-2019-04155).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Some Results for AA(m) with General Initial Guess
In this appendix we present results that lead to the proof of Proposition 2 in Sect. 2. We first derive a result, following from expression (16), that expresses the residual of the more general AA iteration (9) with general initial guess \(\{x_0,x_1,\ldots ,x_m\}\) as a sum of \(m+1\) vectors lying in the \(m+1\) Krylov spaces \(\{\mathcal {K}_s(M,r_j)\}_{j=0}^m\) generated by the \(m+1\) initial residuals \(r_0,r_1,\ldots ,r_m\). We also derive recurrence relations for the polynomials that arise in this expression. Proposition 15 can then easily be specialized to Proposition 2.
Proposition 15
AA(m) iteration (9) with general initial guess \(\{x_j\}_{j=0}^{m}\) applied to linear iteration (5) is a multi-Krylov method. That is, for \(k \ge m\) the residual can be expressed as
\[ r_{k+1} \;=\; \sum _{j=0}^{m} p_{k-m+1,j}(M)\, r_j, \qquad (55) \]
where the \(p_{k-m+1,j}(\lambda )\) are polynomials of degree at most \(k-m+1\) satisfying the following relations:
where for \(i=1,\ldots ,m,\) and \(j=0,\ldots , m\),
Proof
The results of (56) are obvious from (16). For (57), when \(k\ge 2m+1\), it follows from (16) that \(r_{k+1}\) is a linear combination of \(\{r_{k+1-i}\}_{i=1}^{m+1}\), where the smallest residual index is \(k-m\ge m+1\). Thus, each term \(r_{k+1-i}\) can be rewritten in terms of \(\{r_j\}_{j=0}^m\) with polynomial coefficients in \(M\), and (57) can then be verified easily. For \(k< 2m+1\), since some of the terms in \(\{r_{k+1-i}\}_{i=1}^{m+1}\) do not involve all of \(\{r_j\}_{j=0}^m\), we require that \(p_{1-i,j}(\lambda )=0\) or 1 for (57) to hold. \(\square \)
Remark 3
Expression (55) indicates that the residual \(r_{k+1}\) of AA(m), for \(k\ge m\), can be decomposed as a sum of \(m+1\) vectors lying in the \(m+1\) Krylov spaces \(\{ \mathcal {K}_s(M,r_j)\}_{j=0}^m\) or \(\{ \mathcal {K}_s(A,r_j)\}_{j=0}^m\), where \(s=k-m+2\). We can therefore refer to AA(m) with general initial guess as a multi-Krylov space method. Note that, in the case of the usual AA(m) iteration (1) with a single initial guess \(x_0\), each \(r_j\in \{r_j\}_{j=1}^m\) can, by Proposition 15, be expressed as a polynomial in \(M\) applied to \(r_0\), so AA(m) is a Krylov method, that is, \(r_{k+1}\in \mathcal {K}_s (M,r_0)\), as formalized in Proposition 2 in Sect. 2.
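As a concrete, purely illustrative check of this multi-Krylov structure, the following Python sketch runs a generic AA(m) implementation (standard least-squares difference formulation with damping \(\beta =1\), which may differ in details from iteration (9)) from \(m+1\) random initial iterates on a small linear fixed-point problem \(g(x)=Mx+b\), and verifies numerically that \(r_{k+1}\) lies in the sum of the Krylov spaces \(\mathcal {K}_{k-m+2}(M,r_j)\), \(j=0,\ldots ,m\). The problem data, window size \(m\), and index \(k\) are assumed example values.

\begin{verbatim}
import numpy as np

def aa_m_general(M, b, x_init, m, num_steps):
    # AA(m), least-squares difference form, for g(x) = M x + b,
    # started from the m+1 general initial iterates in x_init.
    g = lambda x: M @ x + b
    xs = [np.array(x, dtype=float) for x in x_init]   # x_0, ..., x_m
    gs = [g(x) for x in xs]
    rs = [gi - xi for gi, xi in zip(gs, xs)]          # r_j = g(x_j) - x_j
    for k in range(m, m + num_steps):
        dR = np.column_stack([rs[i + 1] - rs[i] for i in range(k - m, k)])
        dG = np.column_stack([gs[i + 1] - gs[i] for i in range(k - m, k)])
        gamma, *_ = np.linalg.lstsq(dR, rs[k], rcond=None)
        x_next = gs[k] - dG @ gamma                   # AA(m) update, beta = 1
        xs.append(x_next)
        gs.append(g(x_next))
        rs.append(gs[-1] - x_next)
    return rs

# Illustrative data (assumed, not from the paper)
rng = np.random.default_rng(1)
n, m, k = 30, 2, 6
M = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)    # generically a contraction
b = rng.standard_normal(n)
x_init = [rng.standard_normal(n) for _ in range(m + 1)]

rs = aa_m_general(M, b, x_init, m, num_steps=k - m + 1)   # yields r_0, ..., r_{k+1}
s = k - m + 2                                             # Krylov dimension in Proposition 15
basis = np.column_stack([np.linalg.matrix_power(M, i) @ rs[j]
                         for j in range(m + 1) for i in range(s)])
Q, _ = np.linalg.qr(basis)        # orthonormal basis containing sum_j K_s(M, r_j)
r = rs[k + 1]
print(np.linalg.norm(r - Q @ (Q.T @ r)) / np.linalg.norm(r))   # ~1e-15 expected
\end{verbatim}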
Remark 4
In Proposition 15, we can, if desired, also rewrite the residual in terms of polynomials in the matrix \(A\):
\[ r_{k+1} \;=\; \sum _{j=0}^{m} \widetilde{p}_{k-m+1,j}(A)\, r_j, \]
where the \(\widetilde{p}_{k-m+1,j}(\lambda )\) satisfy the relations:
where for \(i=1,\ldots ,m, j=0,\ldots , m\),
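For concreteness, and under the standard convention (an assumption made here, since the splitting is not restated in this appendix) that the linear FP iteration (5) corresponds to solving \(Ax=b\) with \(M = I - A\), the two polynomial families are related by a simple change of variable:
\[ \widetilde{p}_{k-m+1,j}(\lambda ) = p_{k-m+1,j}(1-\lambda ), \qquad \text {so that}\qquad \widetilde{p}_{k-m+1,j}(A) = p_{k-m+1,j}(I-A) = p_{k-m+1,j}(M). \]
In particular, since \(\textrm{span}\{r_j, Mr_j, M^2r_j, \ldots \} = \textrm{span}\{r_j, Ar_j, A^2r_j, \ldots \}\) when \(M = I - A\), the Krylov spaces \(\mathcal {K}_s(M,r_j)\) and \(\mathcal {K}_s(A,r_j)\) in Remark 3 coincide.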
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
De Sterck, H., He, Y. & Krzysik, O.A. Anderson Acceleration as a Krylov Method with Application to Convergence Analysis. J Sci Comput 99, 12 (2024). https://doi.org/10.1007/s10915-024-02464-x