Continuous Optimization
The Dai–Liao nonlinear conjugate gradient method with optimal parameter choices

https://doi.org/10.1016/j.ejor.2013.11.012

Highlights

  • Solutions for an open problem in conjugate gradient methods are discussed.

  • A singular value study of the Dai–Liao conjugate gradient method is made.

  • Two modified Dai–Liao conjugate gradient methods are suggested.

  • Convergence analyses and numerical comparisons are made.

Abstract

Minimizing two different upper bounds for the condition number of the matrix which generates the search directions of the nonlinear conjugate gradient method proposed by Dai and Liao, two modified conjugate gradient methods are proposed. Under proper conditions, the methods are shown to be globally convergent when the line search fulfills the strong Wolfe conditions. Numerical comparisons are made between the implementations of the proposed methods and the conjugate gradient methods proposed by Hager and Zhang, and by Dai and Kou, on a set of unconstrained optimization test problems from the CUTEr collection. The results show the efficiency of the proposed methods in the sense of the performance profile introduced by Dolan and Moré.

Introduction

Conjugate gradient (CG) methods comprise a class of unconstrained optimization algorithms characterized by low memory requirements and strong global convergence properties (Dai et al., 1999), which has made them popular among engineers and mathematicians engaged in solving large-scale problems of the form
$$\min_{x\in\mathbb{R}^n} f(x),$$
where $f:\mathbb{R}^n\to\mathbb{R}$ is a smooth nonlinear function whose gradient is available. The iterative formula of a CG method is given by
$$x_0\in\mathbb{R}^n,\qquad x_{k+1}=x_k+s_k,\qquad s_k=\alpha_k d_k,\qquad k=0,1,\ldots,\tag{1.1}$$
in which $\alpha_k$ is a steplength to be computed by a line search procedure and $d_k$ is the search direction defined by
$$d_0=-g_0,\qquad d_{k+1}=-g_{k+1}+\beta_k d_k,\qquad k=0,1,\ldots,\tag{1.2}$$
where $g_k=\nabla f(x_k)$ and $\beta_k$ is a scalar called the CG (update) parameter.
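For illustration, the iteration (1.1)–(1.2) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation; the names cg_minimize, beta_fn and line_search are ours, and the beta rule and line search are supplied as callables, matching the way (1.2) leaves $\beta_k$ open.

```python
import numpy as np

def cg_minimize(f, grad, x0, beta_fn, line_search, max_iter=1000, tol=1e-6):
    """Generic nonlinear CG iteration, Eqs. (1.1)-(1.2) (illustrative sketch).

    beta_fn(g_new, g, d, s) returns the update parameter beta_k;
    line_search(f, grad, x, d) returns a steplength alpha_k.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:         # gradient-norm stopping rule
            break
        alpha = line_search(f, grad, x, d)   # steplength alpha_k
        s = alpha * d                        # s_k = alpha_k d_k
        x = x + s                            # x_{k+1} = x_k + s_k      (1.1)
        g_new = grad(x)
        d = -g_new + beta_fn(g_new, g, d, s) * d   # d_{k+1}            (1.2)
        g = g_new
    return x
```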

The steplength $\alpha_k$ is usually chosen to satisfy certain line search conditions (Sun & Yuan, 2006). Among them, the so-called strong Wolfe conditions (Wolfe, 1969) have attracted special attention in the convergence analyses and the implementations of CG methods, requiring that
$$f(x_k+\alpha_k d_k)-f(x_k)\le\delta\alpha_k\nabla f(x_k)^T d_k,\tag{1.3}$$
$$|\nabla f(x_k+\alpha_k d_k)^T d_k|\le-\sigma\nabla f(x_k)^T d_k,\tag{1.4}$$
where $0<\delta<\sigma<1$.
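A direct transcription of (1.3)–(1.4) as a feasibility check may help fix the notation. This is a hedged sketch: the defaults for delta and sigma are values commonly used in practice, not values prescribed by the paper.

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Check the strong Wolfe conditions (1.3)-(1.4) for a trial steplength.

    delta and sigma are common default values, not taken from the paper.
    """
    gTd = grad(x) @ d   # directional derivative; negative for a descent direction
    sufficient_decrease = f(x + alpha * d) - f(x) <= delta * alpha * gTd   # (1.3)
    curvature = abs(grad(x + alpha * d) @ d) <= -sigma * gTd               # (1.4)
    return sufficient_decrease and curvature
```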

Different choices for the CG parameter lead to different CG methods (Hager & Zhang, 2006b). Based on an extended conjugacy condition, one of the essential CG methods has been proposed by Dai and Liao (2001) (DL), with the following CG parameter:
$$\beta_k^{DL}=\frac{g_{k+1}^T y_k}{d_k^T y_k}-t\,\frac{g_{k+1}^T s_k}{d_k^T y_k},\tag{1.5}$$
where $t$ is a nonnegative parameter and $y_k=g_{k+1}-g_k$. Note that if $t=0$, then $\beta_k^{DL}$ reduces to the CG parameter proposed by Hestenes and Stiefel (1952). Also, the CG parameter proposed by Hager and Zhang (2005) (HZ), i.e.
$$\beta_k^{HZ}=\frac{g_{k+1}^T y_k}{d_k^T y_k}-2\,\frac{\|y_k\|^2}{d_k^T y_k}\,\frac{g_{k+1}^T d_k}{d_k^T y_k},\tag{1.6}$$
can be viewed as an adaptive version of (1.5) corresponding to $t=2\frac{\|y_k\|^2}{s_k^T y_k}$, where $\|\cdot\|$ denotes the Euclidean norm. Similarly, the CG parameter suggested by Dai and Kou (2013) (DK), i.e.
$$\beta_k(\tau_k)=\frac{g_{k+1}^T y_k}{d_k^T y_k}-\left(\tau_k+\frac{\|y_k\|^2}{s_k^T y_k}-\frac{s_k^T y_k}{\|s_k\|^2}\right)\frac{g_{k+1}^T s_k}{d_k^T y_k},\tag{1.7}$$
in which $\tau_k$ is a parameter corresponding to the scaling factor in the scaled memoryless BFGS method (Sun & Yuan, 2006), can be considered as another adaptive version of (1.5) corresponding to $t=\tau_k+\frac{\|y_k\|^2}{s_k^T y_k}-\frac{s_k^T y_k}{\|s_k\|^2}$. In Dai and Liao (2001), it has been shown that a CG method in the form of (1.1), (1.2) with $\beta_k=\beta_k^{DL}$ is globally convergent for uniformly convex functions.
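Since (1.6) and (1.7) are both instances of (1.5) with particular choices of $t$ (using $s_k=\alpha_k d_k$, so that $s_k^T y_k=\alpha_k d_k^T y_k$), all three parameters can be computed through a single DL routine. The following sketch assumes NumPy arrays; the function names are illustrative, not from the paper.

```python
def beta_dl(g_new, d, s, y, t):
    """Dai-Liao parameter (1.5)."""
    dTy = d @ y
    return (g_new @ y) / dTy - t * (g_new @ s) / dTy

def beta_hz(g_new, d, s, y):
    """Hager-Zhang parameter (1.6): DL form with t = 2 ||y||^2 / (s^T y)."""
    return beta_dl(g_new, d, s, y, t=2.0 * (y @ y) / (s @ y))

def beta_dk(g_new, d, s, y, tau):
    """Dai-Kou parameter (1.7): DL form with
    t = tau + ||y||^2/(s^T y) - (s^T y)/||s||^2."""
    t = tau + (y @ y) / (s @ y) - (s @ y) / (s @ s)
    return beta_dl(g_new, d, s, y, t)
```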

The approach of Dai and Liao has received special attention from many researchers. In several efforts, modified secant equations have been applied to modify the DL method. For example, Yabe and Takano (2004) used the modified secant equation proposed by Zhang, Deng, and Chen (1999). Also, Zhou and Zhang (2006) applied the modified secant equation proposed by Li and Fukushima (2001). Li, Tang, and Wei (2007) used the modified secant equation proposed by Wei, Li, and Qi (2006). Ford, Narushima, and Yabe (2008) employed the multi-step quasi-Newton equations proposed by Ford and Moghrabi (1994). Babaie-Kafaki, Ghanbari, and Mahdavi-Amiri (2010) applied a revised form of the modified secant equation proposed by Zhang et al. (1999) and the modified secant equation proposed by Yuan (1991). Furthermore, in several other attempts, the modified versions of $\beta_k^{DL}$ suggested in (Babaie-Kafaki et al., 2010; Ford et al., 2008; Li et al., 2007; Yabe & Takano, 2004; Zhou & Zhang, 2006) have been used to achieve descent CG methods. Examples include the studies by Narushima and Yabe (2012), Sugiki et al. (2012), and Livieris and Pintelas (2012).

Here, based on a singular value study of the DL method, two nonlinear CG methods are proposed. The remainder of this work is organized as follows. In Section 2, the methods are suggested and their global convergence analysis is discussed. In Section 3, they are numerically compared with the CG methods proposed by Hager and Zhang, and by Dai and Kou, and comparative testing results are reported. Finally, conclusions are drawn in Section 4.

Section snippets

Two modified nonlinear conjugate gradient methods

Based on Perry’s point of view (Perry, 1976), it is notable that from (1.2) and (1.5), the search directions of the DL method can be written as
$$d_{k+1}=-Q_{k+1}g_{k+1},\qquad k=0,1,\ldots,$$
where
$$Q_{k+1}=I-\frac{s_k y_k^T}{s_k^T y_k}+t\,\frac{s_k s_k^T}{s_k^T y_k}.$$
So, the DL method can be considered as a quasi-Newton method in which the inverse Hessian is approximated by the nonsymmetric matrix $Q_{k+1}$. Since $Q_{k+1}$ represents a rank-two update, its determinant can be computed by (Sun & Yuan, 2006, chap. 1)
$$\det(Q_{k+1})=t\,\frac{\|s_k\|^2}{s_k^T y_k}.$$
Hence, if $t>0$ and the line search …
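The determinant identity above is easy to verify numerically. A minimal sketch, with random $s_k$, $y_k$ satisfying $s_k^T y_k>0$ and an arbitrary $t>0$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 6, 1.5                  # illustrative dimension and DL parameter
s = rng.standard_normal(n)
y = rng.standard_normal(n)
if s @ y < 0:                  # enforce s^T y > 0, as the Wolfe conditions guarantee
    y = -y

# Q_{k+1} = I - s y^T / (s^T y) + t s s^T / (s^T y)
Q = np.eye(n) - np.outer(s, y) / (s @ y) + t * np.outer(s, s) / (s @ y)

print(np.linalg.det(Q))        # agrees with the closed form below
print(t * (s @ s) / (s @ y))   # det(Q_{k+1}) = t ||s||^2 / (s^T y)
```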

Numerical experiments

Here, we present some numerical results obtained by applying C++ implementations of the CG methods in the form of (1.1), (1.2) in which $\beta_k=\beta_k^{DL}$ defined by (1.5) with the two suggested optimal choices $t=t_k^1$ defined by (2.17) and $t=t_k^2$ defined by (2.20), here respectively called M1 and M2, the HZ method with the CG parameter (1.6), and the DK method with the following optimal choice for $\tau_k$ in (1.7), suggested by Dai and Kou (2013):
$$\tau_k=\frac{s_k^T y_k}{\|s_k\|^2}.$$
The results are extended by further study on the …
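The performance profile of Dolan and Moré (2002), used for the comparisons reported here, is straightforward to compute from a table of per-problem costs. The sketch below is our own minimal implementation, not the code used in the paper.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile (minimal sketch).

    T    : (n_problems, n_solvers) array of costs (e.g. CPU time or
           function evaluations), with np.inf marking a failure.
    taus : 1-D array of ratio thresholds tau >= 1.
    Returns rho of shape (len(taus), n_solvers): rho[i, s] is the fraction
    of problems solver s solves within a factor taus[i] of the best solver.
    """
    ratios = T / T.min(axis=1, keepdims=True)              # r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Illustrative example: two solvers on three problems
T = np.array([[1.0, 2.0],
              [3.0, 3.0],
              [np.inf, 5.0]])       # the first solver fails on problem 3
print(performance_profile(T, np.array([1.0, 2.0])))
```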

Conclusions

Based on a singular value study of the matrix which generates the search directions of the Dai–Liao nonlinear conjugate gradient method, two modified conjugate gradient methods have been suggested. Global convergence of the methods has been briefly discussed. Numerical comparisons have been made between the implementations of the proposed methods and the conjugate gradient methods proposed by Hager and Zhang, and by Dai and Kou, on a set of 145 unconstrained optimization test problems of the CUTEr collection.

Acknowledgements

This research was supported by the Research Councils of Semnan University and Ferdowsi University of Mashhad. The authors are grateful to Professor William W. Hager for freely providing the test problems and the CG_Descent code. They also thank the two anonymous reviewers for their valuable comments and suggestions, which helped to improve the quality of this work.

References (27)

  • E.D. Dolan and J.J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming (2002)

  • J.A. Ford et al., Multi-step nonlinear conjugate gradient methods for unconstrained minimization, Computational Optimization and Applications (2008)

  • N.I.M. Gould et al., CUTEr: A constrained and unconstrained testing environment, revisited, ACM Transactions on Mathematical Software (2003)