Continuous Optimization
Convergence of Liu–Storey conjugate gradient method

https://doi.org/10.1016/j.ejor.2006.09.066

Abstract

The conjugate gradient method is a useful and powerful approach for solving large-scale minimization problems. Liu and Storey developed a conjugate gradient method which has good numerical performance but no global convergence result under traditional line searches such as the Armijo, Wolfe and Goldstein line searches. In this paper a convergent version of the Liu–Storey conjugate gradient method (LS for short) is proposed for minimizing functions that have Lipschitz continuous partial derivatives. By estimating the Lipschitz constant of the derivative of the objective function, we can find an adequate step size at each iteration, which guarantees global convergence and improves the efficiency of the LS method in practical computation.

Introduction

The conjugate gradient method is an effective approach for solving large-scale minimization problems because of its low storage requirements and simple computation (e.g. [3], [5], [9], [10], [22], [23]). The method originated with Hestenes and Stiefel for solving symmetric positive definite linear systems [16] and was developed by Fletcher and Reeves (e.g. [6], [7], [10]) for solving unconstrained minimization problems. Conjugate gradient methods have wide applications in many fields, such as control science, engineering, management science and operations research [9], [29].

Consider the unconstrained minimization problem
$$\min f(x), \quad x \in R^n, \qquad (1)$$
where $R^n$ denotes the $n$-dimensional Euclidean space and $f: R^n \to R^1$ is a continuously differentiable function.

Many approaches to solving (1) are iterative, such as line search and trust region methods. For the sake of convenience, if $x_k$ is the current iterate we denote $f(x_k)$ by $f_k$, $\nabla f(x_k)$ by $g_k$, $\nabla^2 f(x_k)$ by $G_k$, and $\nabla f(x)$ by $g(x)$, respectively. If $G_k$ is available and invertible, then $d_k = -G_k^{-1} g_k$ leads to Newton's method, while $d_k = -g_k$ gives the steepest descent method (e.g. [10], [16], [19]). The conjugate gradient method has the form
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots,$$
where $\alpha_k$ is a step size and $d_k$ is a search direction of $f(x)$ at the current iterate $x_k$ of the form
$$d_k = \begin{cases} -g_k, & \text{if } k = 0; \\ -g_k + \beta_k d_{k-1}, & \text{if } k \ge 1, \end{cases}$$
where $\beta_k$ can be defined by
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T (g_k - g_{k-1})}{d_{k-1}^T (g_k - g_{k-1})},$$
$$\beta_k^{LS} = -\frac{g_k^T (g_k - g_{k-1})}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{CD} = -\frac{g_k^T g_k}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{g_k^T g_k}{(g_k - g_{k-1})^T d_{k-1}},$$
or by other formulae (e.g. see [9], [24]). The corresponding methods are respectively called the FR (Fletcher–Reeves [10]), PRP (Polak–Ribière–Polyak [20], [21]), HS (Hestenes–Stiefel [16]), LS (Liu–Storey [18]), CD (conjugate descent [9]) and DY (Dai–Yuan [7]) conjugate gradient methods.
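
To fix ideas, the following minimal Python sketch implements the CG iteration above with the LS choice of βk. The function names (ls_direction, cg_minimize, alpha_rule) are illustrative and not from the paper, and the step-size rule is deliberately left abstract so that any of the line searches discussed below can be plugged in.

```python
import numpy as np

def ls_direction(g_new, g_old, d_old):
    """Liu-Storey direction: d = -g_new + beta_LS * d_old,
    with beta_LS = -g_new^T (g_new - g_old) / (d_old^T g_old)."""
    beta_ls = -g_new.dot(g_new - g_old) / d_old.dot(g_old)
    return -g_new + beta_ls * d_old

def cg_minimize(f, grad, x0, alpha_rule, tol=1e-6, max_iter=1000):
    """Generic nonlinear CG loop x_{k+1} = x_k + alpha_k d_k.
    `alpha_rule(f, grad, x, d, g)` supplies the step size; a plain Armijo
    rule or an Armijo-type rule like the one in this paper could be used."""
    x, g = x0, grad(x0)
    d = -g                                  # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = alpha_rule(f, grad, x, d, g)
        x = x + alpha * d
        g_new = grad(x)
        d = ls_direction(g_new, g, d)       # d_{k+1} = -g_{k+1} + beta^LS_{k+1} d_k
        g = g_new
    return x
```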

Although the above conjugate gradient methods are equivalent for minimizing strongly convex quadratic functions under exact line search, they perform differently when minimizing non-quadratic functions or when inexact line searches are used.

For non-quadratic objective functions, the FR method has global convergence when the exact line search or the strong Wolfe line search [2], [6] is used. The PRP method may fail to converge globally under some traditional line searches; convergent versions were proposed by using new, more complicated line searches or by restricting the parameter βk to be non-negative [13], [14], [15], [25], [26]. The CD method was proved to be globally convergent under the strong Wolfe line search with a strong restriction on the parameters [5], and the DY method has global convergence under the weak Wolfe line search [7]. An extensive literature on conjugate gradient methods can be found in [3], [4], [5], [8], [11], [12], [17], [18], [22], [23], [30].

However, to the best of our knowledge, the global convergence of the original LS and HS methods has not been proved under any of the above-mentioned line searches. In this paper, we propose a new line search procedure and investigate the global convergence of the original LS method.

In line search methods, the search direction $d_k$ is generally required to satisfy
$$g_k^T d_k < 0,$$
which guarantees that $d_k$ is a descent direction of $f(x)$ at $x_k$ [10], [16], [29]. In order to guarantee global convergence, we sometimes require $d_k$ to satisfy the sufficient descent condition
$$g_k^T d_k \le -c \|g_k\|^2,$$
where $c > 0$ is a constant. Moreover, the angle property is often used in proving the global convergence of line search methods, that is,
$$\cos\langle -g_k, d_k\rangle = \frac{-g_k^T d_k}{\|g_k\| \cdot \|d_k\|} \ge \tau,$$
where $0 < \tau \le 1$.
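
For illustration, these conditions are straightforward to verify numerically; the helper below is a small sketch with illustrative names and default constants (c, τ) chosen only for the example.

```python
import numpy as np

def descent_diagnostics(g, d, c=1e-4, tau=1e-2):
    """Check the descent condition, the sufficient descent condition with
    constant c > 0, and the angle property with 0 < tau <= 1."""
    gd = g.dot(d)
    return {
        "descent": gd < 0,                                           # g_k^T d_k < 0
        "sufficient_descent": gd <= -c * g.dot(g),                   # g_k^T d_k <= -c ||g_k||^2
        "angle": -gd >= tau * np.linalg.norm(g) * np.linalg.norm(d)  # cos<-g_k, d_k> >= tau
    }
```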

Once a descent direction dk has been determined at the kth iteration, we seek a step size along it to complete the iteration.

There are many approaches to finding a suitable step size. The exact line search is often difficult or even impossible to carry out, whereas inexact line search rules, such as the Armijo line search [1] and the Goldstein and Wolfe line searches [9], [19], [29], are useful and powerful in practical computation. The Armijo line search is particularly easy to implement.

Armijo line search. Let $s > 0$ be a constant, $\rho \in (0, 1)$ and $\mu \in (0, 1)$. Choose $\alpha_k$ to be the largest $\alpha$ in $\{s, \rho s, \rho^2 s, \ldots\}$ such that
$$f_k - f(x_k + \alpha d_k) \ge -\alpha \mu g_k^T d_k.$$
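
For concreteness, a minimal sketch of this backtracking rule is given below (names and default values are illustrative); note the fixed initial trial step s, which is exactly the drawback discussed next.

```python
def armijo_step(f, x, d, g, s=1.0, rho=0.5, mu=1e-4, max_backtracks=50):
    """Classical Armijo backtracking: return the largest alpha in
    {s, rho*s, rho^2*s, ...} with f(x) - f(x + alpha*d) >= -mu*alpha*g^T d."""
    fx, gd = f(x), g.dot(d)
    alpha = s
    for _ in range(max_backtracks):
        if fx - f(x + alpha * d) >= -mu * alpha * gd:
            return alpha
        alpha *= rho
    return alpha
```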

A drawback of the Armijo line search is the choice of the initial step size s. If s is too large, the procedure requires many more function evaluations; if s is too small, the efficiency of the algorithm decreases. We should therefore choose an adequate initial step size s at each iteration so that the step size αk can be found easily.

In this paper we propose a new Armijo-type line search in which an appropriate initial step size s is defined and varies at each iteration. The new Armijo-type line search enables us to find the step size αk easily at each iteration and guarantees the global convergence of the original LS conjugate gradient method under some mild conditions. The global convergence and linear convergence rate are analyzed, and numerical results show that the LS method with the new Armijo-type line search is more effective than other similar methods for solving large-scale minimization problems.

The rest of this paper is organized as follows. In the next section we introduce a new Armijo-type line search and establish a convergent version of the LS method. In Sections 3 and 4 the global convergence and the linear convergence rate are analyzed, respectively. Some numerical results are reported in Section 5.

Section snippets

New Armijo-type line search

We first assume that

(H1). The objective function f(x) is continuously differentiable and has a lower bound on Rn.

(H2). The gradient $g(x) = \nabla f(x)$ of $f(x)$ is Lipschitz continuous on an open convex set $B$ that contains the level set $L(x_0) = \{x \in R^n \mid f(x) \le f(x_0)\}$ with $x_0$ given, i.e., there exists an $L > 0$ such that
$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in B.$$

Since L is usually not known a priori in practical computation, it needs to be estimated. In the sequel, we shall discuss this problem and present some approaches for estimating L.
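
The specific estimations (10)–(12) referred to later are not reproduced in this snippet; as an illustration only, a common way to estimate a local Lipschitz constant uses a finite difference of consecutive gradients, safeguarded away from zero (the name and safeguard below are assumptions, not the paper's formulas).

```python
import numpy as np

def estimate_lipschitz(x, x_prev, g, g_prev, L_min=1e-8):
    """Local Lipschitz estimate L_k ~ ||g_k - g_{k-1}|| / ||x_k - x_{k-1}||,
    bounded below by L_min; an illustration, not the paper's estimations (10)-(12)."""
    step = np.linalg.norm(x - x_prev)
    if step == 0.0:
        return L_min
    return max(np.linalg.norm(g - g_prev) / step, L_min)
```

Such an estimate can then be used to scale the initial trial step of an Armijo-type rule, for example an initial step proportional to -gk^T dk / (Lk ||dk||^2), in the spirit of the line search introduced in this section.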

Global convergence

Lemma 3.1

Assume that (H1) and (H2) hold and that the LS method with the new Armijo-type line search generates an infinite sequence $\{x_k\}$. Then
$$\|d_k\| \le \left(1 + \frac{L(1-c)}{m_0}\right) \|g_k\|, \quad \forall k,$$
where $m_0$ is defined in Lemma 2.1.

Proof

For $k = 0$, we have $\|d_k\| = \|g_k\| \le \left(1 + \frac{L(1-c)}{m_0}\right)\|g_k\|$. For $k > 0$, by Lemma 2.1, we have
$$\alpha_k \le -\frac{(1-c)\, g_k^T d_k}{L_k \|d_k\|^2} \le -\frac{(1-c)\, g_k^T d_k}{m_0 \|d_k\|^2}.$$
By the Cauchy–Schwarz inequality and the above inequality, noting the LS formula, we have
$$\|d_{k+1}\| = \|-g_{k+1} + \beta_{k+1}^{LS} d_k\| \le \|g_{k+1}\| + \frac{|g_{k+1}^T (g_{k+1} - g_k)|}{-g_k^T d_k}\,\|d_k\| \le \|g_{k+1}\| \left(1 - \frac{\alpha_k L \|d_k\|^2}{g_k^T d_k}\right) \le \left(1 + \frac{L(1-c)}{m_0}\right) \|g_{k+1}\|.$$
The proof is completed. □

Theorem 3.1

Linear convergence rate

In this section we shall prove that the LS method with the new Armijo-type line search has a linear convergence rate under some mild conditions.

We further assume that

(H3). The sequence $\{x_k\}$ generated by the LS method with the new Armijo-type line search converges to $x^*$, $\nabla^2 f(x^*)$ is positive definite, and $f(x)$ is twice continuously differentiable on $N(x^*, \epsilon_0) = \{x \mid \|x - x^*\| < \epsilon_0\}$.

Lemma 4.1

Assume that (H3) holds. Then there exist $m$, $M$ and $\epsilon$ with $0 < m \le M$ and $\epsilon \le \epsilon_0$ such that
$$m \|y\|^2 \le y^T \nabla^2 f(x)\, y \le M \|y\|^2, \quad \forall x, y \in N(x^*, \epsilon);$$
$$\frac{1}{2} m \|x - x^*\|^2 \le f(x) - f(x^*) \le \frac{1}{2} M \|x - x^*\|^2, \quad \forall x \in N(x^*, \epsilon).$$

Numerical reports

Some numerical experiments were conducted to show the efficiency of the new Armijo-type line search used in the LS method. LS1, LS2 and LS3 denote the LS method with the new Armijo-type line search corresponding to the estimations (10), (11) and (12), respectively. LS refers to the original LS method with the strong Wolfe line search. LS+ represents the LS method with $\beta_k = \max(0, \beta_k^{LS})$ and the strong Wolfe line search [5], [14], [19].

Strong Wolfe line search. Choose $\alpha_k$ to satisfy
$$f_k - f(x_k + \alpha d_k) \ge -\mu \alpha g_k^T d_k \quad \text{and} \quad |g(x_k + \alpha d_k)^T d_k| \le -\sigma g_k^T d_k,$$
where $0 < \mu < \sigma < 1$.
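
For reference, the two conditions can be checked directly as in the sketch below (names and parameter defaults are illustrative; μ and σ are the usual strong Wolfe parameters with 0 < μ < σ < 1).

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, mu=1e-4, sigma=0.1):
    """Check the strong Wolfe conditions at a trial step alpha:
    sufficient decrease plus the curvature bound on |g(x + alpha*d)^T d|."""
    g = grad(x)
    gd = g.dot(d)
    sufficient_decrease = f(x) - f(x + alpha * d) >= -mu * alpha * gd
    curvature = abs(grad(x + alpha * d).dot(d)) <= -sigma * gd
    return sufficient_decrease and curvature
```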

Conclusion

In this paper, a new form of Armijo-type line search has been proposed to guarantee the global convergence of the LS conjugate gradient method for minimizing functions that have Lipschitz continuous partial derivatives. It requires estimating the local Lipschitz constant of the derivative of the objective function in practical computation. The global convergence and linear convergence rate of the LS method with the new Armijo-type line search were analyzed under some mild conditions. Numerical results show that the LS method with the new Armijo-type line search is effective for solving large-scale minimization problems.

Acknowledgement

The authors would like to thank the editor and the referee for valuable comments and suggestions that greatly improved the presentation of this paper.


References (30)

  • Y.H. Dai et al.

    Convergence properties of nonlinear conjugate gradient methods

    SIAM Journal on Optimization

    (1999)
  • R. Fletcher

    Practical Methods of Optimization

    (1987)
  • R. Fletcher et al.

    Function minimization by conjugate gradients

    Computer Journal

    (1964)
  • G. Fasano

    Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 1: Theory

    Journal of Optimization Theory and Applications

    (2005)
  • G. Fasano

    Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 2: Applications

    Journal of Optimization Theory and Applications

    (2005)

    Zhen-Jun Shi completed his Master Degree in Computational Mathematics from Nanjing University in 1991; Ph.D. in Operations Research from Dalian University of Technology in 2002; obtained postdoctoral training in Chinese Academy of Science from 2003 to 2005; and he is currently a research fellow in Department of Computer and Information Science, University of Michigan, Dearborn, USA. He has been a professor of Qufu Normal University since 1995. His research area includes numerical optimization, non-linear programming, numerical linear algebra, and computer applications of operations research, etc.

    Jie Shen completed his Master Degree in Computer Science in 1997 and his Ph.D. in Computer Science in 2000 from University of Saskatchewan, Canada. He is currently an Assistant Professor in Department of Computer and Information Science, University of Michigan, Dearborn, USA. He has published over 70 technical papers and one book. His research areas include computational geometry, optimization, scientific computation, etc.

    The work was supported in part by NSF DMI-0514900, USA.
