Convergence of Liu–Storey conjugate gradient method☆
Introduction
The conjugate gradient method is well suited to solving large-scale minimization problems because of its low storage requirements and simple computation (e.g. [3], [5], [9], [10], [22], [23]). The method originated with Hestenes and Stiefel for solving symmetric positive definite linear systems [16] and was developed by Fletcher and Reeves (e.g. [6], [7], [10]) for solving unconstrained minimization problems. Conjugate gradient methods have wide applications in many fields, such as control science, engineering, management science and operations research [9], [29].
Consider the unconstrained minimization problem

min f(x), x ∈ Rn, (1)

where Rn denotes the n-dimensional Euclidean space and f : Rn → R1 is a continuously differentiable function.
Many approaches to solving (1) are iterative, such as line search and trust region methods. For the sake of convenience, if xk is the current iterate then we denote f(xk) by fk, ∇f(xk) by gk, ∇²f(xk) by Gk, and f(x∗) by f∗. If Gk is available and invertible, then dk = −Gk⁻¹gk leads to the Newton method, and dk = −gk results in the steepest descent method (e.g. [10], [16], [19]). The conjugate gradient method has the form

xk+1 = xk + αkdk,

where αk is a step size and dk is a search direction of f(x) at the current iterate xk that takes the form

dk = −gk + βkdk−1, with d0 = −g0,

where βk can be defined by

βkFR = ‖gk‖²/‖gk−1‖², βkPRP = gkᵀ(gk − gk−1)/‖gk−1‖², βkHS = gkᵀ(gk − gk−1)/dk−1ᵀ(gk − gk−1),

βkLS = gkᵀ(gk − gk−1)/(−dk−1ᵀgk−1), βkCD = ‖gk‖²/(−dk−1ᵀgk−1), βkDY = ‖gk‖²/dk−1ᵀ(gk − gk−1),

or by other formulae (e.g. see [9], [24]). The corresponding methods are called the FR (Fletcher–Reeves [10]), PRP (Polak–Ribière–Polyak [20], [21]), HS (Hestenes–Stiefel [16]), LS (Liu–Storey [18]), CD (conjugate descent [9]) and DY (Dai–Yuan [7]) conjugate gradient methods, respectively.
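For illustration only, the direction update and three of the βk formulae above can be sketched in Python/NumPy; the function names and the NumPy interface are our choices, not the paper's:

```python
import numpy as np

def beta_fr(g_new, g_old, d_old):
    """Fletcher-Reeves: beta_k = ||g_k||^2 / ||g_{k-1}||^2."""
    return (g_new @ g_new) / (g_old @ g_old)

def beta_prp(g_new, g_old, d_old):
    """Polak-Ribiere-Polyak: beta_k = g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2."""
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_ls(g_new, g_old, d_old):
    """Liu-Storey: beta_k = g_k^T (g_k - g_{k-1}) / (-d_{k-1}^T g_{k-1})."""
    return (g_new @ (g_new - g_old)) / (-(d_old @ g_old))

def cg_direction(g_new, g_old=None, d_old=None, beta_rule=beta_ls):
    """Conjugate gradient direction d_k = -g_k + beta_k * d_{k-1}, with d_0 = -g_0."""
    if d_old is None:  # first iteration: steepest descent direction
        return -g_new
    return -g_new + beta_rule(g_new, g_old, d_old) * d_old
```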
Although the conjugate gradient algorithms above are equivalent for minimizing strongly convex quadratic functions under exact line search, they perform differently when minimizing non-quadratic functions or when inexact line searches are used.
For non-quadratic objective functions, the FR method is globally convergent when an exact line search or the strong Wolfe line search [2], [6] is used. The PRP method can fail to converge globally under some traditional line searches; convergent versions were obtained by designing new, more complicated line searches or by restricting the parameter βk to be non-negative [13], [14], [15], [25], [26]. The CD method was proved globally convergent under the strong Wolfe line search with a strong restriction on the parameters [5], and the DY method is globally convergent under the weak Wolfe line search [7]. Further literature on conjugate gradient methods can be found in [3], [4], [5], [8], [11], [12], [17], [18], [22], [23], [30].
However, to the best of our knowledge, the global convergence of the original LS and HS methods has not been established under any of the line searches mentioned above. In this paper, we propose a new line search procedure and investigate the global convergence of the original LS method.
In line search methods, the search direction dk is generally required to satisfy

gkᵀdk < 0,

which guarantees that dk is a descent direction of f(x) at xk [10], [16], [29]. In order to guarantee global convergence, we sometimes require dk to satisfy the sufficient descent condition

gkᵀdk ⩽ −c‖gk‖²,

where c > 0 is a constant. Moreover, the angle property

−gkᵀdk/(‖gk‖ ‖dk‖) ⩾ τ, where 1 ⩾ τ > 0,

is often used in proving the global convergence of related line search methods.
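These three conditions are straightforward to verify numerically. The following minimal checks assume NumPy arrays; the constants c and τ are illustrative placeholders, not values prescribed by the paper:

```python
import numpy as np

def is_descent(g, d):
    """Descent condition: g_k^T d_k < 0."""
    return g @ d < 0.0

def is_sufficient_descent(g, d, c=1e-2):
    """Sufficient descent condition: g_k^T d_k <= -c * ||g_k||^2, with c > 0."""
    return g @ d <= -c * (g @ g)

def satisfies_angle_property(g, d, tau=1e-4):
    """Angle property: -g_k^T d_k / (||g_k|| * ||d_k||) >= tau, 0 < tau <= 1."""
    return -(g @ d) >= tau * np.linalg.norm(g) * np.linalg.norm(d)
```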
Once the descent direction dk is determined at the kth iteration, we seek a step size along this direction to complete the iteration.
There are many approaches to finding a suitable step size. The exact line search is difficult or even impossible to carry out in practice, so inexact line search rules, such as the Armijo line search [1] and the Goldstein and Wolfe line searches [9], [19], [29], are often useful and powerful in practical computation. The Armijo line search in particular is easy to implement.
Armijo line search. Let s > 0 be a constant, ρ ∈ (0, 1) and μ ∈ (0, 1). Choose αk to be the largest α in {s, sρ, sρ², …} such that

f(xk + αdk) − fk ⩽ μαgkᵀdk.
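A minimal backtracking sketch of this rule, assuming NumPy arrays and a callable f; the cap max_backtracks is a safeguard we add for illustration and is not part of the definition:

```python
def armijo_step(f, x, g, d, s=1.0, rho=0.5, mu=1e-4, max_backtracks=50):
    """Return the largest alpha in {s, s*rho, s*rho^2, ...} satisfying
    f(x + alpha*d) <= f(x) + mu * alpha * g^T d."""
    fx, gTd = f(x), g @ d
    alpha = s
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + mu * alpha * gTd:
            return alpha
        alpha *= rho  # backtrack geometrically
    return alpha  # fallback after max_backtracks reductions
```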
A drawback of the Armijo line search is the choice of the initial step size s. If s is too large, the procedure requires many more function evaluations; if s is too small, the efficiency of the algorithm decreases. We should therefore choose an adequate initial step size s at each iteration so that the step size αk can be found easily.
In this paper we propose a new Armijo-type line search in which an appropriate initial step size s is defined and allowed to vary at each iteration. The new Armijo-type line search enables us to find the step size αk easily at each iteration and guarantees the global convergence of the original LS conjugate gradient method under some mild conditions. The global convergence and linear convergence rate are analyzed, and numerical results show that the LS method with the new Armijo-type line search is more effective than similar methods in solving large-scale minimization problems.
The rest of this paper is organized as follows. In the next section we introduce a new Armijo-type line search and establish a convergent version of the LS method. In Sections 3 and 4, the global convergence and linear convergence rate are analyzed, respectively. Some numerical results are reported in Section 5.
New Armijo-type line search
We first assume that
(H1). The objective function f(x) is continuously differentiable and has a lower bound on Rn.
(H2). The gradient g(x) = ∇f(x) of f(x) is Lipschitz continuous on an open convex set B that contains the level set L(x0) = {x ∈ Rn ∣ f(x) ⩽ f(x0)} with x0 given, i.e., there exists an L > 0 such that

‖g(x) − g(y)‖ ⩽ L‖x − y‖, ∀x, y ∈ B.
Since L is usually not known a priori in practical computation, it needs to be estimated. In the sequel, we discuss this problem and present some approaches for estimating it.
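The paper's estimations (10)–(12) are not reproduced in this excerpt. Purely to illustrate the idea, a common secant-type estimate of L and an induced initial step size might look as follows; both formulas are our assumption, not necessarily the paper's:

```python
import numpy as np

def lipschitz_estimate(x_new, x_old, g_new, g_old, floor=1e-8):
    """Secant-type local Lipschitz estimate:
    L_k = ||g_k - g_{k-1}|| / ||x_k - x_{k-1}|| (floored away from zero)."""
    step = np.linalg.norm(x_new - x_old)
    return max(np.linalg.norm(g_new - g_old) / max(step, floor), floor)

def initial_step(g, d, L):
    """Illustrative initial step size s_k = -g_k^T d_k / (L_k * ||d_k||^2),
    positive whenever d is a descent direction."""
    return -(g @ d) / (L * (d @ d))
```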
Global convergence
Lemma 3.1
Assume that (H1) and (H2) hold and that the LS method with the new Armijo-type line search generates an infinite sequence {xk}. Then ‖dk‖ admits an upper bound of the form c‖gk‖ with c depending only on m0, where m0 is defined in Lemma 2.1.

Proof
For k = 0 the bound holds directly, since d0 = −g0. For k > 0, the bound follows from Lemma 2.1, the Cauchy–Schwarz inequality, and the LS formula. The proof is completed. □

Theorem 3.1
Linear convergence rate
In this section we shall prove that LS method with the new Armijo-type line search has linear convergence rate under some mild conditions.
We further assume that
(H3). The sequence {xk} generated by the LS method with the new Armijo-type line search converges to x∗, ∇²f(x∗) ≻ 0, and f(x) is twice continuously differentiable on N(x∗, ϵ0) = {x ∣ ‖x − x∗‖ < ϵ0}.

Lemma 4.1
Assume that (H3) holds. Then there exist m′, M′ and ϵ with 0 < m′ ⩽ M′ and 0 < ϵ ⩽ ϵ0 such that

m′‖y‖² ⩽ yᵀ∇²f(x)y ⩽ M′‖y‖², ∀x ∈ N(x∗, ϵ), ∀y ∈ Rn.
Numerical reports
Some numerical experiments were conducted to show the efficiency of the new Armijo-type line search used in the LS method. LS1, LS2, and LS3 denote the LS method with the new Armijo-type line search corresponding to the estimations (10), (11), (12), respectively. LS refers to the original LS method with the strong Wolfe line search. LS+ represents the LS method with βk = max{βkLS, 0} and the strong Wolfe line search [5], [14], [19].
Strong Wolfe line search. Choose αk to satisfy

f(xk + αkdk) ⩽ fk + δαkgkᵀdk

and

|g(xk + αkdk)ᵀdk| ⩽ σ|gkᵀdk|,

where 0 < δ < σ < 1.
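The two inequalities can be checked directly for a candidate step; the parameter values δ = 1e-4 and σ = 0.1 below are illustrative choices satisfying 0 < δ < σ < 1, not values taken from the paper:

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check the strong Wolfe conditions at step alpha:
      f(x + alpha*d) <= f(x) + delta * alpha * g^T d   (sufficient decrease)
      |grad(x + alpha*d)^T d| <= sigma * |g^T d|       (curvature)
    Assumes x, d are NumPy arrays; f and grad are callables."""
    g0Td = grad(x) @ d
    x_new = x + alpha * d
    decrease = f(x_new) <= f(x) + delta * alpha * g0Td
    curvature = abs(grad(x_new) @ d) <= sigma * abs(g0Td)
    return decrease and curvature
```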
Conclusion
In this paper, a new form of Armijo-type line search has been proposed that guarantees the global convergence of the LS conjugate gradient method for minimizing functions with Lipschitz continuous partial derivatives. The line search requires an estimate of the local Lipschitz constant of the derivative of the objective function in practical computation. The global convergence and linear convergence rate of the LS method with the new Armijo-type line search were analyzed under some mild conditions. Numerical results show that the LS method with the new Armijo-type line search is effective in solving large-scale minimization problems.
Acknowledgement
The authors would like to thank the editor and the referee for valuable comments and suggestions that greatly improved the presentation of this paper.
References (30)
- Global convergence of a two-parameter family of conjugate gradient methods without line search, Journal of Computational and Applied Mathematics (2002)
- The conjugate gradient method in extreme problems, USSR Computational Mathematics and Mathematical Physics (1969)
- Quadratic cost flow and the conjugate gradient method, European Journal of Operational Research (2005)
- Convergence of descent method without line search, Applied Mathematics and Computation (2005)
- Minimization of functions having Lipschitz continuous partial derivatives, Pacific Journal of Mathematics (1966)
- New property and global convergence of the Fletcher–Reeves method with inexact line searches, IMA Journal of Numerical Analysis (1985)
- New properties of a nonlinear conjugate gradient method, Numerische Mathematik (2001)
- Convergence properties of the conjugate descent method, Advances in Mathematics (1996)
- Convergence properties of the Fletcher–Reeves method, IMA Journal of Numerical Analysis (1996)
- A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization (1999)
- Convergence properties of nonlinear conjugate gradient methods, SIAM Journal on Optimization
- Practical Methods of Optimization
- Function minimization by conjugate gradients, Computer Journal
- Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 1: Theory, Journal of Optimization Theory and Applications
- Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 2: Applications, Journal of Optimization Theory and Applications
Zhen-Jun Shi completed his master's degree in Computational Mathematics at Nanjing University in 1991 and his Ph.D. in Operations Research at Dalian University of Technology in 2002, and received postdoctoral training at the Chinese Academy of Sciences from 2003 to 2005. He is currently a research fellow in the Department of Computer and Information Science, University of Michigan, Dearborn, USA, and has been a professor at Qufu Normal University since 1995. His research areas include numerical optimization, non-linear programming, numerical linear algebra, and computer applications of operations research.
Jie Shen completed his master's degree in Computer Science in 1997 and his Ph.D. in Computer Science in 2000 at the University of Saskatchewan, Canada. He is currently an Assistant Professor in the Department of Computer and Information Science, University of Michigan, Dearborn, USA. He has published over 70 technical papers and one book. His research areas include computational geometry, optimization, and scientific computation.
☆ This work was supported in part by NSF DMI-0514900, USA.