
Preconditioning and globalizing conjugate gradients in dual space for quadratically penalized nonlinear-least squares problems

Computational Optimization and Applications

Abstract

When solving nonlinear least-squares problems, it is often useful to regularize the problem with a quadratic term, a practice that is especially common in inverse calculations. A solution method derived from a trust-region Gauss-Newton algorithm is analyzed for such applications, where, contrary to the standard algorithm, the least-squares subproblem solved at each iteration is rewritten as a quadratic minimization subject to linear equality constraints. This reformulation allows the duality properties of the associated linearized problems to be exploited. This paper considers a recent conjugate-gradient-like method which performs the quadratic minimization in the dual space and produces, in exact arithmetic, the same iterates as a standard conjugate-gradient method in the primal space. This dual algorithm is computationally attractive whenever the dimension of the dual space is significantly smaller than that of the primal space, yielding gains in both memory usage and computational cost. The relation between this dual-space solver and PSAS (Physical-space Statistical Analysis System), another well-known dual-space technique used in data assimilation problems, is explained. An effective preconditioning technique is proposed and refined convergence bounds are derived, which results in a practical solution method. Finally, stopping rules suitable for a trust-region solver are proposed in the dual space, providing iterates equivalent to those obtained with a Steihaug-Toint truncated conjugate-gradient method in the primal space.
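
To make the primal-dual correspondence described above concrete, here is a minimal numerical sketch contrasting the two linear systems that arise at a single (inner) Gauss-Newton iteration. It illustrates only the underlying duality identity, not the paper's Algorithms 2.1-3.2; the names B, R, H and d follow the usual data-assimilation notation, and the dense solves stand in for the CG iterations used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 15                      # primal (state) dim n, dual (observation) dim m << n

H = rng.standard_normal((m, n))     # linearized observation operator
B = np.eye(n)                       # background-error covariance (identity for simplicity)
R = 0.1 * np.eye(m)                 # observation-error covariance
d = rng.standard_normal(m)          # innovation vector

# Primal normal equations, an n x n system:
#   (B^{-1} + H^T R^{-1} H) x = H^T R^{-1} d
x_primal = np.linalg.solve(np.linalg.inv(B) + H.T @ np.linalg.solve(R, H),
                           H.T @ np.linalg.solve(R, d))

# Dual (PSAS-like) system, only m x m:
#   (R + H B H^T) lam = d,   then   x = B H^T lam
lam = np.linalg.solve(R + H @ B @ H.T, d)
x_dual = B @ H.T @ lam

# Both routes give the same minimizer (Sherman-Morrison-Woodbury identity).
print(np.allclose(x_primal, x_dual))   # True
```

When n is much larger than m, the dual system is only m-by-m, which is exactly the memory and cost advantage exploited by the dual-space conjugate-gradient method analyzed in the paper.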


[Algorithms 2.1–2.4, 3.1 and 3.2 are displayed in the full text.]



Acknowledgements

The third author gratefully acknowledges partial support from CERFACS.

Author information


Correspondence to Philippe L. Toint.

Appendix: Proof of Lemma 2.4

Proof

If the singular value decomposition of \(H\) is given by

$$ H = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^T \\ V_2^T \end{bmatrix}, $$
(A.1)

then a possible theoretical choice for \(\check{H}\) is

$$ \check{H} = \Sigma_r V_1^T. $$
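
With this choice, \(\check{H}\) reproduces \(H\) through \(U_1\), an identity that underlies the computations below:

$$ U_1\check{H} = U_1\Sigma_r V_1^T = H. $$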

Denoting \(\check{R}= ( U_{1}^{T}R^{-1}U_{1} )^{-1}\), direct computations show that

$$B^{-1}+H^TR^{-1}H=B^{-1}+\check{H}^T\check{R}^{-1}\check{H}.$$
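
Indeed, substituting \(H = U_1\Sigma_r V_1^T\) and \(\check{R}^{-1} = U_1^TR^{-1}U_1\) gives

$$ \check{H}^T\check{R}^{-1}\check{H} = V_1\Sigma_r\bigl(U_1^TR^{-1}U_1\bigr)\Sigma_r V_1^T = H^TR^{-1}H. $$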

Using the assumption (2.23) and denoting \(\check{G}=U_{1}^{T} G U_{1}\), we also obtain that \(F\check{H}^{T} = B \check{H}^{T} \check{G}\).

The matrix \(\check{H}\) is now an \(r\times n\) matrix of rank \(r\), and Lemma 2.3 can then be applied using \(r\), \(\check{R}\), \(\check{H}\) and \(\check{G}\) instead of \(m\), \(R\), \(H\) and \(G\), yielding the desired result where \(\nu_1,\dots,\nu_r\), the eigenvalues of \(\check{G}(I_r+\check{R}^{-1}\check{H}B\check{H}^T)\), replace those of \(G(I_m+R^{-1}HBH^T)\) in (2.48). We next investigate the relations between these two sets of eigenvalues.

Using the relations on \(\check{H}\), \(\check{R}\) and \(\check{G}\), and writing \(\check{A}=I_r+\check{R}^{-1}\check{H}B\check{H}^T\) for the dual-space analogue of \(\widehat{A}\), it can be shown that

$$ \bigl[U_1U_1^TG\widehat{A}\bigr] U_1 = U_1[\check{G}\check{A}]$$
(A.2)

where \(\widehat{A}\) is defined in (2.21). This says that the range of \(U_1\) is an invariant subspace of \(U_1U_1^TG\widehat{A}\) and that every eigenvalue of \(\check{G}\check{A}\) is an eigenvalue of \(U_1U_1^TG\widehat{A}\). Since \(U_1U_1^TG\widehat{A}\) has \(m-r\) null eigenvalues, its nonzero eigenvalues are therefore equal to eigenvalues of \(\check{G}\check{A}\). We now consider the relations between the eigenvalues of \(U_1U_1^TG\widehat{A}\) and \(G\widehat{A}\), and start by rewriting these matrices blockwise.
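
To see the containment explicitly: if \(\check{G}\check{A}w=\nu w\) with \(w\neq0\), then multiplying (A.2) on the right by \(w\) gives

$$ \bigl[U_1U_1^TG\widehat{A}\bigr](U_1w) = U_1\check{G}\check{A}w = \nu\,(U_1w), $$

and \(U_1w\neq0\) because \(U_1\) has orthonormal columns, so \(\nu\) is indeed an eigenvalue of \(U_1U_1^TG\widehat{A}\).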

Using the relation (A.1), it can be shown that

$$HBH^T = U_1\Sigma_rV_1^TBV_1\Sigma_rU_1^T.$$
(A.3)

Defining

$$ U_1 = \begin{bmatrix} U_r \\ 0 \end{bmatrix}, $$
(A.4)
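
Note in passing that, since \(U_1\) has orthonormal columns, (A.4) forces \(U_r\) to be an orthogonal \(r\times r\) matrix, so that

$$ U_1U_1^T = \begin{bmatrix} U_rU_r^T & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}, $$

an identity used for the second matrix in (A.12) below.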

\(HBH^T\) can thus be rewritten in block matrix form as

$$ HBH^T = \begin{bmatrix} M_r & 0 \\ 0 & 0 \end{bmatrix}, $$
(A.5)

where \(M_r = U_r\Sigma_rV_1^TBV_1\Sigma_rU_r^T\) has full rank \(r\). Using the equality (2.23), we can write that

$$HPH^T = HBH^TG.$$
(A.6)

Hence, \(HBH^TG\) is symmetric due to the symmetry of \(HPH^T\). Using the relation (A.5) and defining

$$ G = \begin{bmatrix} G_r & G_2 \\ G_3 & G_{m-r} \end{bmatrix}, $$
(A.7)

where \(G_r\) is an \(r\times r\) matrix and \(G_{m-r}\) is an \((m-r)\times(m-r)\) matrix, we can write \(HBH^TG\) as

$$ HBH^TG = \begin{bmatrix} M_rG_r & M_rG_2 \\ 0 & 0 \end{bmatrix}. $$
(A.8)

From the symmetry of \(HBH^TG\) given by (A.8), the block \(M_rG_2\) must equal the transpose of the zero block below the diagonal, so \(M_rG_2=0\), which implies that \(G_2=0\) since \(M_r\) has full rank. Thus, \(G\) has the form

$$ G = \begin{bmatrix} G_r & 0 \\ G_3 & G_{m-r} \end{bmatrix}. $$
(A.9)

We next derive a block matrix form of \(\widehat{A}\). Defining \(R^{-1}\) as

$$ R^{-1} = \begin{bmatrix} R^{-1}_r & R^{-1}_2 \\ R^{-1}_3 & R^{-1}_{m-r} \end{bmatrix}, $$
(A.10)

and using (A.5) and the definition of \(\widehat{A}\), we can write \(\widehat{A}\) as

$$ \widehat{A} = I_m + R^{-1}HBH^T = \begin{bmatrix} \widehat{A}_r & 0 \\ R^{-1}_3M_r & I_{m-r} \end{bmatrix}, $$
(A.11)

where \(\widehat{A}_r = I_r + R^{-1}_rM_r\). From (A.4), (A.9) and (A.11), we deduce that

$$ G\widehat{A} = \begin{bmatrix} G_r\widehat{A}_r & 0 \\ G_3\widehat{A}_r + G_{m-r}R^{-1}_3M_r & G_{m-r} \end{bmatrix} \quad \mbox{and} \quad U_1U_1^TG\widehat{A} = \begin{bmatrix} G_r\widehat{A}_r & 0 \\ 0 & 0 \end{bmatrix}. $$
(A.12)
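
Since both matrices in (A.12) are block lower triangular, their spectra are read off from the diagonal blocks; for instance,

$$ \det\bigl(G\widehat{A}-\lambda I_m\bigr) = \det\bigl(G_r\widehat{A}_r-\lambda I_r\bigr)\,\det\bigl(G_{m-r}-\lambda I_{m-r}\bigr). $$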

From (A.12), the eigenvalues of \(G\widehat{A}\) are the eigenvalues of \(G_r\widehat{A}_r\) together with the eigenvalues of \(G_{m-r}\). Also, the eigenvalues of \(U_1U_1^TG\widehat{A}\) are the eigenvalues of \(G_r\widehat{A}_r\) together with \(m-r\) null eigenvalues. Therefore, the nonzero eigenvalues of \(U_1U_1^TG\widehat{A}\), which are equal to the eigenvalues of \(\check{G}\check{A}\), form a subset of the eigenvalues of \(G\widehat{A}\). As a result, the eigenvalues of \(\check{G}(I_r+\check{R}^{-1}\check{H}B\check{H}^T)\) can be used in (2.47) instead of those of \(G(I_m+R^{-1}HBH^T)\), which completes the proof. □



Cite this article

Gratton, S., Gürol, S. & Toint, P.L. Preconditioning and globalizing conjugate gradients in dual space for quadratically penalized nonlinear-least squares problems. Comput Optim Appl 54, 1–25 (2013). https://doi.org/10.1007/s10589-012-9478-7

