Convergence of Liu–Storey conjugate gradient method☆
Introduction
The conjugate gradient method is well suited to solving large-scale minimization problems because of its low storage requirements and simple computation (e.g. [3], [5], [9], [10], [22], [23]). The method originated with Hestenes and Stiefel for solving symmetric positive definite linear systems [16] and was developed by Fletcher and Reeves (e.g. [6], [7], [10]) for solving unconstrained minimization problems. Conjugate gradient methods have wide applications in many fields, such as control science, engineering, management science and operations research [9], [29].
Consider the unconstrained minimization problem

min f(x), x ∈ Rn, (1)

where Rn denotes the n-dimensional Euclidean space and f : Rn → R1 is a continuously differentiable function.
Many approaches to solving (1) are iterative, such as line search and trust region methods. For the sake of convenience, if xk is the current iterate then we denote f(xk) by fk, ∇f(xk) by gk, ∇²f(xk) by Gk, and f(x∗) by f∗. If Gk is available and invertible, then dk = −Gk⁻¹gk leads to the Newton method, and dk = −gk results in the steepest descent method (e.g. [10], [16], [19]). The conjugate gradient method has the form

xk+1 = xk + αkdk,

where αk is a step size and dk is a search direction of f(x) at the current iterate xk that takes the form

dk = −gk + βkdk−1, with d0 = −g0,

where βk can be defined by

βkFR = ‖gk‖²/‖gk−1‖², βkPRP = gkᵀ(gk − gk−1)/‖gk−1‖², βkHS = gkᵀ(gk − gk−1)/dk−1ᵀ(gk − gk−1),

βkLS = gkᵀ(gk − gk−1)/(−dk−1ᵀgk−1), βkCD = ‖gk‖²/(−dk−1ᵀgk−1), βkDY = ‖gk‖²/dk−1ᵀ(gk − gk−1),

or by other formulae (e.g. see [9], [24]). The corresponding methods are called the FR (Fletcher–Reeves [10]), PRP (Polak–Ribière–Polyak [20], [21]), HS (Hestenes–Stiefel [16]), LS (Liu–Storey [18]), CD (conjugate descent [9]) and DY (Dai–Yuan [7]) conjugate gradient methods, respectively.
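For illustration only, the direction update and three of the βk formulae above can be sketched in Python/NumPy; the function names and the NumPy interface are our choices, not the paper's:

```python
import numpy as np

def beta_fr(g_new, g_old, d_old):
    """Fletcher-Reeves: beta_k = ||g_k||^2 / ||g_{k-1}||^2."""
    return (g_new @ g_new) / (g_old @ g_old)

def beta_prp(g_new, g_old, d_old):
    """Polak-Ribiere-Polyak: beta_k = g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2."""
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_ls(g_new, g_old, d_old):
    """Liu-Storey: beta_k = g_k^T (g_k - g_{k-1}) / (-d_{k-1}^T g_{k-1})."""
    return (g_new @ (g_new - g_old)) / (-(d_old @ g_old))

def cg_direction(g_new, g_old=None, d_old=None, beta_rule=beta_ls):
    """Conjugate gradient direction d_k = -g_k + beta_k * d_{k-1}, with d_0 = -g_0."""
    if d_old is None:  # first iteration: steepest descent direction
        return -g_new
    return -g_new + beta_rule(g_new, g_old, d_old) * d_old
```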
Although the conjugate gradient algorithms above are equivalent for minimizing strongly convex quadratic functions under exact line search, they perform differently when minimizing non-quadratic functions or when inexact line searches are used.
For non-quadratic objective functions, the FR method is globally convergent when an exact line search or the strong Wolfe line search [2], [6] is used. The PRP method can fail to converge globally under some traditional line searches; convergent versions were obtained by designing new, more complicated line searches or by restricting the parameter βk to be non-negative [13], [14], [15], [25], [26]. The CD method was proved globally convergent under the strong Wolfe line search with a strong restriction on the parameters [5], and the DY method is globally convergent under the weak Wolfe line search [7]. Further literature on conjugate gradient methods can be found in [3], [4], [5], [8], [11], [12], [17], [18], [22], [23], [30].
However, to the best of our knowledge, the global convergence of the original LS and HS methods has not been established under any of the line searches mentioned above. In this paper, we propose a new line search procedure and investigate the global convergence of the original LS method.
In line search methods, the search direction dk is generally required to satisfy

gkᵀdk < 0,

which guarantees that dk is a descent direction of f(x) at xk [10], [16], [29]. In order to guarantee global convergence, we sometimes require dk to satisfy the sufficient descent condition

gkᵀdk ⩽ −c‖gk‖²,

where c > 0 is a constant. Moreover, the angle property

−gkᵀdk/(‖gk‖ ‖dk‖) ⩾ τ, where 1 ⩾ τ > 0,

is often used in proving the global convergence of related line search methods.
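These three conditions are straightforward to verify numerically. The following minimal checks assume NumPy arrays; the constants c and τ are illustrative placeholders, not values prescribed by the paper:

```python
import numpy as np

def is_descent(g, d):
    """Descent condition: g_k^T d_k < 0."""
    return g @ d < 0.0

def is_sufficient_descent(g, d, c=1e-2):
    """Sufficient descent condition: g_k^T d_k <= -c * ||g_k||^2, with c > 0."""
    return g @ d <= -c * (g @ g)

def satisfies_angle_property(g, d, tau=1e-4):
    """Angle property: -g_k^T d_k / (||g_k|| * ||d_k||) >= tau, 0 < tau <= 1."""
    return -(g @ d) >= tau * np.linalg.norm(g) * np.linalg.norm(d)
```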
Once the descent direction dk is determined at the kth iteration, we seek a step size along this direction to complete the iteration.
There are many approaches to finding a suitable step size. The exact line search is difficult or even impossible to carry out in practice, so inexact line search rules, such as the Armijo line search [1] and the Goldstein and Wolfe line searches [9], [19], [29], are often useful and powerful in practical computation. The Armijo line search in particular is easy to implement.
Armijo line search. Let s > 0 be a constant, ρ ∈ (0, 1) and μ ∈ (0, 1). Choose αk to be the largest α in {s, sρ, sρ², …} such that

f(xk + αdk) − fk ⩽ μαgkᵀdk.
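A minimal backtracking sketch of this rule, assuming NumPy arrays and a callable f; the cap max_backtracks is a safeguard we add for illustration and is not part of the definition:

```python
def armijo_step(f, x, g, d, s=1.0, rho=0.5, mu=1e-4, max_backtracks=50):
    """Return the largest alpha in {s, s*rho, s*rho^2, ...} satisfying
    f(x + alpha*d) <= f(x) + mu * alpha * g^T d."""
    fx, gTd = f(x), g @ d
    alpha = s
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + mu * alpha * gTd:
            return alpha
        alpha *= rho  # backtrack geometrically
    return alpha  # fallback after max_backtracks reductions
```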
A drawback of the Armijo line search is the choice of the initial step size s. If s is too large, the procedure requires many more function evaluations; if s is too small, the efficiency of the algorithm decreases. We should therefore choose an adequate initial step size s at each iteration so that the step size αk can be found easily.
In this paper we propose a new Armijo-type line search in which an appropriate initial step size s is defined and allowed to vary at each iteration. The new Armijo-type line search enables us to find the step size αk easily at each iteration and guarantees the global convergence of the original LS conjugate gradient method under some mild conditions. The global convergence and linear convergence rate are analyzed, and numerical results show that the LS method with the new Armijo-type line search is more effective than similar methods in solving large-scale minimization problems.
The rest of this paper is organized as follows. In the next section we introduce a new Armijo-type line search and establish a convergent version of the LS method. In Sections 3 and 4, the global convergence and linear convergence rate are analyzed, respectively. Some numerical results are reported in Section 5.
New Armijo-type line search
We first assume that
(H1). The objective function f(x) is continuously differentiable and has a lower bound on Rn.
(H2). The gradient g(x) = ∇f(x) of f(x) is Lipschitz continuous on an open convex set B that contains the level set L(x0) = {x ∈ Rn ∣ f(x) ⩽ f(x0)} with x0 given, i.e., there exists an L > 0 such that

‖g(x) − g(y)‖ ⩽ L‖x − y‖, ∀x, y ∈ B.
Since L is usually not known a priori in practical computation, it needs to be estimated. In the sequel, we discuss this problem and present some approaches for estimating it.
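The paper's estimations (10)–(12) are not reproduced in this excerpt. Purely to illustrate the idea, a common secant-type estimate of L and an induced initial step size might look as follows; both formulas are our assumption, not necessarily the paper's:

```python
import numpy as np

def lipschitz_estimate(x_new, x_old, g_new, g_old, floor=1e-8):
    """Secant-type local Lipschitz estimate:
    L_k = ||g_k - g_{k-1}|| / ||x_k - x_{k-1}|| (floored away from zero)."""
    step = np.linalg.norm(x_new - x_old)
    return max(np.linalg.norm(g_new - g_old) / max(step, floor), floor)

def initial_step(g, d, L):
    """Illustrative initial step size s_k = -g_k^T d_k / (L_k * ||d_k||^2),
    positive whenever d is a descent direction."""
    return -(g @ d) / (L * (d @ d))
```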
Global convergence
Lemma 3.1
Assume that (H1) and (H2) hold and that the LS method with the new Armijo-type line search generates an infinite sequence {xk}. Then ‖dk‖ admits an upper bound of the form c‖gk‖ with c depending only on m0, where m0 is defined in Lemma 2.1.

Proof
For k = 0 the bound holds directly, since d0 = −g0. For k > 0, the bound follows from Lemma 2.1, the Cauchy–Schwarz inequality, and the LS formula. The proof is completed. □

Theorem 3.1
Linear convergence rate
In this section we shall prove that LS method with the new Armijo-type line search has linear convergence rate under some mild conditions.
We further assume that
(H3). The sequence {xk} generated by the LS method with the new Armijo-type line search converges to x∗, ∇²f(x∗) ≻ 0, and f(x) is twice continuously differentiable on N(x∗, ϵ0) = {x ∣ ‖x − x∗‖ < ϵ0}.

Lemma 4.1
Assume that (H3) holds. Then there exist m′, M′ and ϵ with 0 < m′ ⩽ M′ and 0 < ϵ ⩽ ϵ0 such that

m′‖y‖² ⩽ yᵀ∇²f(x)y ⩽ M′‖y‖², ∀x ∈ N(x∗, ϵ), ∀y ∈ Rn.
Numerical reports
Some numerical experiments were conducted to show the efficiency of the new Armijo-type line search used in the LS method. LS1, LS2, and LS3 denote the LS method with the new Armijo-type line search corresponding to the estimations (10), (11), (12), respectively. LS refers to the original LS method with the strong Wolfe line search. LS+ represents the LS method with βk = max{βkLS, 0} and the strong Wolfe line search [5], [14], [19].
Strong Wolfe line search. Choose αk to satisfy

f(xk + αkdk) ⩽ fk + δαkgkᵀdk

and

|g(xk + αkdk)ᵀdk| ⩽ σ|gkᵀdk|,

where 0 < δ < σ < 1.
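The two inequalities can be checked directly for a candidate step; the parameter values δ = 1e-4 and σ = 0.1 below are illustrative choices satisfying 0 < δ < σ < 1, not values taken from the paper:

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check the strong Wolfe conditions at step alpha:
      f(x + alpha*d) <= f(x) + delta * alpha * g^T d   (sufficient decrease)
      |grad(x + alpha*d)^T d| <= sigma * |g^T d|       (curvature)
    Assumes x, d are NumPy arrays; f and grad are callables."""
    g0Td = grad(x) @ d
    x_new = x + alpha * d
    decrease = f(x_new) <= f(x) + delta * alpha * g0Td
    curvature = abs(grad(x_new) @ d) <= sigma * abs(g0Td)
    return decrease and curvature
```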
Conclusion
In this paper, a new form of Armijo-type line search has been proposed that guarantees the global convergence of the LS conjugate gradient method for minimizing functions with Lipschitz continuous partial derivatives. The line search requires an estimate of the local Lipschitz constant of the derivative of the objective function in practical computation. The global convergence and linear convergence rate of the LS method with the new Armijo-type line search were analyzed under some mild conditions. Numerical results show that the LS method with the new Armijo-type line search is effective in solving large-scale minimization problems.
Acknowledgement
The authors would like to thank the editor and the referee for valuable comments and suggestions that greatly improved the presentation of this paper.
References (30)
- Global convergence of a two-parameter family of conjugate gradient methods without line search, Journal of Computational and Applied Mathematics (2002)
- The conjugate gradient method in extreme problems, USSR Computational Mathematics and Mathematical Physics (1969)
- Quadratic cost flow and the conjugate gradient method, European Journal of Operational Research (2005)
- Convergence of descent method without line search, Applied Mathematics and Computation (2005)
- Minimization of functions having Lipschitz continuous partial derivatives, Pacific Journal of Mathematics (1966)
- New property and global convergence of the Fletcher–Reeves method with inexact line searches, IMA Journal of Numerical Analysis (1985)
- New properties of a nonlinear conjugate gradient method, Numerische Mathematik (2001)
- Convergence properties of the conjugate descent method, Advances in Mathematics (1996)
- Convergence properties of the Fletcher–Reeves method, IMA Journal of Numerical Analysis (1996)
- A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization (1999)
- Convergence properties of nonlinear conjugate gradient methods, SIAM Journal on Optimization
- Practical Methods of Optimization
- Function minimization by conjugate gradients, Computer Journal
- Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 1: Theory, Journal of Optimization Theory and Applications
- Planar conjugate gradient algorithm for large-scale unconstrained optimization. Part 2: Applications, Journal of Optimization Theory and Applications
Zhen-Jun Shi completed his master's degree in Computational Mathematics at Nanjing University in 1991 and his Ph.D. in Operations Research at Dalian University of Technology in 2002, and received postdoctoral training at the Chinese Academy of Sciences from 2003 to 2005. He is currently a research fellow in the Department of Computer and Information Science, University of Michigan, Dearborn, USA, and has been a professor at Qufu Normal University since 1995. His research areas include numerical optimization, non-linear programming, numerical linear algebra, and computer applications of operations research.
Jie Shen completed his master's degree in Computer Science in 1997 and his Ph.D. in Computer Science in 2000 at the University of Saskatchewan, Canada. He is currently an Assistant Professor in the Department of Computer and Information Science, University of Michigan, Dearborn, USA. He has published over 70 technical papers and one book. His research areas include computational geometry, optimization, and scientific computation.
☆ This work was supported in part by NSF DMI-0514900, USA.