Convergence of Gauss–Newton’s method and uniqueness of the solution

https://doi.org/10.1016/j.amc.2004.12.055

Abstract

In this paper, we study the convergence of Gauss–Newton’s method for nonlinear least squares problems. Under the hypothesis that the derivative satisfies certain weak Lipschitz conditions, we obtain exact estimates of the radius of the convergence ball of Gauss–Newton’s method and of the radius of the uniqueness ball of the solution. The new results can be used to determine an approximation zero of Gauss–Newton’s method.

Introduction

We consider the nonlinear least squares problem

$$\min F(x) := \tfrac{1}{2} f(x)^T f(x), \tag{1.1}$$

where $f:\mathbb{R}^n \to \mathbb{R}^m$ is Fréchet differentiable and $m \ge n$. In all cases $\|\cdot\|$ refers to the 2-norm.

Gauss–Newton’s method, defined by

$$x_{n+1} = x_n - [f'(x_n)^T f'(x_n)]^{-1} f'(x_n)^T f(x_n), \quad n \ge 0, \tag{1.2}$$

is one of the best-known methods for problem (1.1). For Gauss–Newton’s method, analyses of local convergence and of the rate of convergence are mostly qualitative [1], [6], [15], [16]: they establish that a neighborhood of convergence exists, but they do not show how large the radius of the convergence ball is. Let $x^*$ denote the solution of (1.1), let $B(x, r)$ denote the open ball with center $x$ and radius $r$, and let $\overline{B(x, r)}$ denote its closure.
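The iteration above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes NumPy is available, and the names `gauss_newton`, `f`, `jac`, the stopping rule, and the small test problem (three residuals, two unknowns, zero residual at (1, 2)) are all hypothetical choices made for the example.

```python
import numpy as np

def gauss_newton(f, jac, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - [J^T J]^{-1} J^T f(x_n), with J = f'(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        J = jac(x)
        r = f(x)
        # Solve the normal equations (J^T J) step = J^T r.
        # This requires J to have full column rank.
        step = np.linalg.solve(J.T @ J, J.T @ r)
        x = x - step
        # Stopping rule (an assumption of this sketch): small step length.
        if np.linalg.norm(step) < tol:
            break
    return x

# Hypothetical test problem: m = 3 residuals, n = 2 unknowns,
# with a zero-residual solution at x* = (1, 2).
def f(x):
    return np.array([x[0] - 1.0, x[1] - 2.0, x[0] * x[1] - 2.0])

def jac(x):
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [x[1], x[0]]])

x_star = gauss_newton(f, jac, x0=[1.3, 1.7])
```

Forming $J^T J$ explicitly mirrors the formula (1.2); in practice a least squares solve such as `np.linalg.lstsq(J, r, rcond=None)` computes the same step with better numerical conditioning.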

Traub and Wozniakowski [5] and Wang [12] independently gave an exact estimate for the convergence ball of Newton’s method. Wang [13], [14] studied the convergence of Newton’s method under the hypothesis that $f'$ satisfies a Lipschitz-type condition of the form

$$\|f'(x^*)^{-1}(f'(x) - f'(x^\tau))\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du, \quad \forall x \in B(x^*, r),$$

where $\rho(x) = \|x - x^*\|$, $x^\tau = x^* + \tau(x - x^*)$, and $L$ is a monotonic function.

In this paper, we consider the convergence of Gauss–Newton’s method. Under more general conditions, we obtain the convergence domain of Gauss–Newton’s method and the uniqueness domain of the solution. Further, we prove that the estimates of the radii are optimal. The new results can be used to determine an approximation zero of Gauss–Newton’s method.

Special and generalized Lipschitz condition

The condition on the function $f$

$$\|f(x) - f(x^\tau)\| \le L\|x - x^\tau\|, \quad \forall x \in B(x^*, r), \tag{2.1}$$

where $x^\tau = x^* + \tau(x - x^*)$, $0 \le \tau \le 1$, is usually called the radius Lipschitz condition in the ball $B(x^*, r)$ with constant $L$. Sometimes, if it is only required to satisfy

$$\|f(x) - f(x^*)\| \le L\|x - x^*\|, \quad \forall x \in B(x^*, r), \tag{2.2}$$

we call it the center Lipschitz condition in the ball $B(x^*, r)$ with constant $L$. Furthermore, the $L$ in the Lipschitz condition need not be a constant; it may be a positive integrable function. If this is the case, then (2.1) or (2.2) is replaced by $\|f(x) - f(x^\tau)\| \ldots$

Convergence ball of Gauss–Newton’s method

Theorem 3.1

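Suppose $x^*$ satisfies (1.1), $f$ has a continuous derivative in $B(x^*, r)$, $f'(x^*)$ has full rank, and $f'$ satisfies the radius Lipschitz condition with $L$ average:

$$\|f'(x) - f'(x^\tau)\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du, \quad \forall x \in B(x^*, r),\ 0 \le \tau \le 1, \tag{3.1}$$

where $x^\tau = x^* + \tau(x - x^*)$, $\rho(x) = \|x - x^*\|$, and $L$ is nondecreasing. Let $r > 0$ satisfy

$$\frac{\beta\int_0^r L(u)u\,du}{r\left(1 - \beta\int_0^r L(u)\,du\right)} + \frac{\sqrt{2}\,c\beta^2\int_0^r L(u)\,du}{r\left(1 - \beta\int_0^r L(u)\,du\right)} \le 1. \tag{3.2}$$

Then Gauss–Newton’s method is convergent for all $x_0 \in B(x^*, r)$ and

$$\|x_{n+1} - x^*\| \le \frac{\beta\int_0^{\rho(x_0)} L(u)u\,du}{\rho(x_0)^2\left(1 - \beta\int_0^{\rho(x_0)} L(u)\,du\right)}\|x_n - x^*\|^2 + \frac{\sqrt{2}\,c\beta^2\int_0^{\rho(x_0)} L(u)\,du}{\rho(x_0)\left(1 - \beta\int_0^{\rho(x_0)} L(u)\,du\right)}\|x_n - x^*\|$$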
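The radius condition of Theorem 3.1 can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it reads the condition as β∫₀ʳL(u)u du / (r(1 − β∫₀ʳL(u)du)) + √2·cβ²∫₀ʳL(u)du / (r(1 − β∫₀ʳL(u)du)) ≤ 1 (the √2 coefficient is a reading of the garbled source), treats `beta` and `c` as given positive numbers, and finds the largest admissible `r` by bisection. The function names, the midpoint quadrature, and the sample data are hypothetical.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def excess(r, L, beta, c):
    """beta*I1/r + sqrt(2)*c*beta^2*I0/r - (1 - beta*I0), where
    I0 = int_0^r L(u) du and I1 = int_0^r L(u) u du.
    For r > 0 the radius condition holds, with a positive denominator,
    exactly when this value is <= 0."""
    I0 = integrate(L, 0.0, r)
    I1 = integrate(lambda u: L(u) * u, 0.0, r)
    return (beta * I1 + math.sqrt(2.0) * c * beta**2 * I0) / r - (1.0 - beta * I0)

def convergence_radius(L, beta, c, r_max=1e3, iters=100):
    """Largest r in (0, r_max] satisfying the condition, by bisection.
    Relies on L being positive and nondecreasing, which makes `excess`
    nondecreasing in r. Returns 0.0 if no positive radius qualifies."""
    if excess(r_max, L, beta, c) <= 0.0:
        return r_max
    lo, hi = 0.0, r_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid, L, beta, c) <= 0.0:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical data: constant L(u) = 2, beta = 0.5, c = 0.1.
beta, c, L_const = 0.5, 0.1, 2.0
r_num = convergence_radius(lambda u: L_const, beta, c)

# Closed form for constant L, obtained by substituting into the condition.
r_closed = (2.0 - 2.0 * math.sqrt(2.0) * c * beta**2 * L_const) / (3.0 * beta * L_const)
```

For constant `L` the bisection result can be compared with the closed form `r_closed`; for `c = 0` that closed form reduces to 2/(3βL).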

The uniqueness ball for the optimal solution

Theorem 4.1

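Suppose $x^*$ satisfies (1.1), $f$ has a continuous derivative in $B(x^*, r)$, $f'(x^*)$ has full rank, and $f'$ satisfies the center Lipschitz condition with $L$ average:

$$\|f'(x) - f'(x^*)\| \le \int_0^{\rho(x)} L(u)\,du, \quad \forall x \in B(x^*, r),$$

where $\rho(x) = \|x - x^*\|$ and $L$ is nondecreasing. Let $r > 0$ satisfy

$$\frac{\beta}{r}\int_0^r L(u)(r - u)\,du + \frac{c\beta_0}{r}\int_0^r L(u)\,du \le 1,$$

where $c$ and $\beta$ are as in (3.4), and

$$\beta_0 = \left\|[f'(x^*)^T f'(x^*)]^{-1}\right\|.$$

Then Eq. (1.1) has a unique solution $x^*$ in $B(x^*, r)$.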

Proof

Suppose $x_0 \in B(x^*, r)$, $x_0 \ne x^*$, is also a solution of (1.1). Then we have

$$[f'(x^*)^T f'(x^*)]^{-1} f'(x_0)^T f(x_0) = 0.$$

Hence $x_0 - x^* = x_0 - x^* - \ldots$

The optimality of the estimation of the radius

Theorem 5.1

Suppose that equality holds in the inequality (3.2) of Theorem 3.1. Then the given value $r$ of the radius of the convergence ball is the best possible.

Proof

We notice that when $r$ is determined by the equality

$$\frac{\beta\int_0^r L(u)u\,du}{r\left(1 - \beta\int_0^r L(u)\,du\right)} + \frac{\sqrt{2}\,c\beta^2\int_0^r L(u)\,du}{r\left(1 - \beta\int_0^r L(u)\,du\right)} = 1,$$

there exists $f$ satisfying (3.1) in $B(x^*, r)$ and $x_0$ on the boundary of the closed ball such that Gauss–Newton’s method fails. In fact, the following is an example in the scalar case:f(x)=x-x+β0x-x(x-x-u)L(u)du,xx<x+r;x-x+β0x-x(x-x+u)L(u

Corollaries of the main results

In the study of Gauss–Newton’s method (or Newton’s method), the assumption that the derivative is Lipschitz continuous is the traditional one. Combining Theorems 3.1 and 4.1 with Theorems 5.1 and 5.2, and taking $L$ to be a constant, we obtain the following two corollaries directly.
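As a worked illustration of what taking $L$ constant does to the radius conditions, the substitution can be carried out by hand. This is only a sketch: it assumes the conditions of Theorems 3.1 and 4.1 read as the two inequalities below (with the coefficient $\sqrt{2}\,c\beta^2$ in the first), and it does not claim to reproduce the exact statements of the corollaries, which are truncated in this text.

```latex
% Constant L:  \int_0^r L u\,du = L r^2/2,  \int_0^r L\,du = L r,
%              \int_0^r L (r-u)\,du = L r^2/2.
\begin{align*}
% Convergence ball (condition of Theorem 3.1), assuming \beta L r < 1:
\frac{\beta L r}{2(1-\beta L r)} + \frac{\sqrt{2}\,c\beta^2 L}{1-\beta L r} \le 1
  &\iff \tfrac{3}{2}\beta L r \le 1 - \sqrt{2}\,c\beta^2 L \\
  &\iff r \le \frac{2 - 2\sqrt{2}\,c\beta^2 L}{3\beta L}. \\[1ex]
% Uniqueness ball (condition of Theorem 4.1):
\frac{\beta L r}{2} + c\beta_0 L \le 1
  &\iff r \le \frac{2\,(1 - c\beta_0 L)}{\beta L}.
\end{align*}
```

At the first bound $\beta L r \le 2/3 < 1$, so the assumption on the denominator is consistent. For $c = 0$ the bounds reduce to $2/(3\beta L)$ and $2/(\beta L)$.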

Corollary 6.1

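Suppose $x^*$ satisfies (1.1), $f$ has a continuous derivative in $B(x^*, r)$, $f'(x^*)$ has full rank, and $f'$ satisfies the radius Lipschitz condition:

$$\|f'(x) - f'(x^\tau)\| \le (1 - \tau)L\|x - x^*\|, \quad \forall x \in B(x^*, r),\ 0 \le \tau \le 1,$$

where $x^\tau = x^* + \tau(x - x^*)$ …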

Convergence under weaker Lipschitz condition

In this section, we consider the method (1.2) under a weaker Lipschitz condition.

Theorem 7.1

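Suppose $x^*$ satisfies (1.1), $f(x^*) = 0$, $f$ has a continuous derivative in $B(x^*, r)$, $f'(x^*)$ has full rank, and $f'$ satisfies the radius Lipschitz condition with $L$ average:

$$\|f'(x)^{\dagger}(f'(x) - f'(x^\tau))\| \le \int_{\tau\rho(x)}^{\rho(x)} L(u)\,du, \quad \forall x \in B(x^*, r),\ 0 \le \tau \le 1,$$

$$f'(x)\ \text{has full rank}, \quad \forall x \in B(x^*, r),$$

where $x^\tau = x^* + \tau(x - x^*)$, $\rho(x) = \|x - x^*\|$, $f'(x)^{\dagger} = [f'(x)^T f'(x)]^{-1} f'(x)^T$, and $L$ is nondecreasing. Let $r > 0$ satisfy

$$\frac{\int_0^r L(u)u\,du}{r} \le 1.$$

Then Gauss–Newton’s method is convergent for all $x_0 \in B(x^*, r)$ …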

Applications to determination of an approximation zero

To study the quadratic convergence of Newton’s method and the computational complexity of finding zeros, Smale [8], [9], [10] proposed the definition of approximation zeros of Newton’s method. Following Smale’s work, we propose a new definition of approximation zeros for Gauss–Newton’s method, as follows.

Definition 8.1

If $x_0 \in \mathbb{R}^n$ is such that Gauss–Newton’s method (1.2) for $f:\mathbb{R}^n \to \mathbb{R}^m$ is well defined and (3.6) is satisfied with $q = \tfrac{1}{2}$, then $x_0$ is called an approximation zero of the …

Remark

In fact, the results in this paper can be generalized to real or complex infinite-dimensional Hilbert spaces. We will continue to study the convergence of Gauss–Newton’s method in that setting.

Acknowledgement

Supported by the Financially-Aiding Program for the Backbone Teachers of the Ministry of Education of China and the Natural Science Foundation of the University of Petroleum.
