Continuous Optimization
The Dai–Liao nonlinear conjugate gradient method with optimal parameter choices

https://doi.org/10.1016/j.ejor.2013.11.012

Highlights

  • Solutions for an open problem in conjugate gradient methods are discussed.

  • A singular value study of the Dai–Liao conjugate gradient method is made.

  • Two modified Dai–Liao conjugate gradient methods are suggested.

  • Convergence analyses and numerical comparisons are made.

Abstract

Minimizing two different upper bounds for the condition number of the matrix which generates the search directions of the nonlinear conjugate gradient method proposed by Dai and Liao, two modified conjugate gradient methods are proposed. Under proper conditions, the methods are shown to be globally convergent when the line search fulfills the strong Wolfe conditions. Numerical comparisons are made between the implementations of the proposed methods and the conjugate gradient methods proposed by Hager and Zhang, and by Dai and Kou, on a set of unconstrained optimization test problems from the CUTEr collection. The results show the efficiency of the proposed methods in the sense of the performance profile introduced by Dolan and Moré.

Introduction

Conjugate gradient (CG) methods comprise a class of unconstrained optimization algorithms characterized by low memory requirements and strong global convergence properties (Dai et al., 1999), which has made them popular among engineers and mathematicians engaged in solving large-scale problems of the form
$$\min_{x\in\mathbb{R}^n} f(x),$$
where $f:\mathbb{R}^n\to\mathbb{R}$ is a smooth nonlinear function whose gradient is available. The iterative formula of a CG method is given by
$$x_0\in\mathbb{R}^n,\qquad x_{k+1}=x_k+s_k,\qquad s_k=\alpha_k d_k,\qquad k=0,1,\ldots,\tag{1.1}$$
in which $\alpha_k$ is a steplength to be computed by a line search procedure and $d_k$ is the search direction defined by
$$d_0=-g_0,\qquad d_{k+1}=-g_{k+1}+\beta_k d_k,\qquad k=0,1,\ldots,\tag{1.2}$$
where $g_k=\nabla f(x_k)$ and $\beta_k$ is a scalar called the CG (update) parameter.
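For illustration, the iteration (1.1)–(1.2) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation; the names cg_minimize, beta_fn and line_search are ours, and the beta rule and line search are supplied as callables, matching the way (1.2) leaves $\beta_k$ open.

```python
import numpy as np

def cg_minimize(f, grad, x0, beta_fn, line_search, max_iter=1000, tol=1e-6):
    """Generic nonlinear CG iteration, Eqs. (1.1)-(1.2) (illustrative sketch).

    beta_fn(g_new, g, d, s) returns the update parameter beta_k;
    line_search(f, grad, x, d) returns a steplength alpha_k.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                   # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:         # gradient-norm stopping rule
            break
        alpha = line_search(f, grad, x, d)   # steplength alpha_k
        s = alpha * d                        # s_k = alpha_k d_k
        x = x + s                            # x_{k+1} = x_k + s_k      (1.1)
        g_new = grad(x)
        d = -g_new + beta_fn(g_new, g, d, s) * d   # d_{k+1}            (1.2)
        g = g_new
    return x
```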

The steplength $\alpha_k$ is usually chosen to satisfy certain line search conditions (Sun & Yuan, 2006). Among them, the so-called strong Wolfe conditions (Wolfe, 1969) have attracted special attention in the convergence analyses and the implementations of CG methods, requiring that
$$f(x_k+\alpha_k d_k)-f(x_k)\le\delta\alpha_k\nabla f(x_k)^T d_k,\tag{1.3}$$
$$|\nabla f(x_k+\alpha_k d_k)^T d_k|\le-\sigma\nabla f(x_k)^T d_k,\tag{1.4}$$
where $0<\delta<\sigma<1$.
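A direct transcription of (1.3)–(1.4) as a feasibility check may help fix the notation. This is a hedged sketch: the defaults for delta and sigma are values commonly used in practice, not values prescribed by the paper.

```python
def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Check the strong Wolfe conditions (1.3)-(1.4) for a trial steplength.

    delta and sigma are common default values, not taken from the paper.
    """
    gTd = grad(x) @ d   # directional derivative; negative for a descent direction
    sufficient_decrease = f(x + alpha * d) - f(x) <= delta * alpha * gTd   # (1.3)
    curvature = abs(grad(x + alpha * d) @ d) <= -sigma * gTd               # (1.4)
    return sufficient_decrease and curvature
```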

Different choices for the CG parameter lead to different CG methods (Hager & Zhang, 2006b). Based on an extended conjugacy condition, one of the essential CG methods has been proposed by Dai and Liao (2001) (DL), with the following CG parameter:
$$\beta_k^{DL}=\frac{g_{k+1}^T y_k}{d_k^T y_k}-t\,\frac{g_{k+1}^T s_k}{d_k^T y_k},\tag{1.5}$$
where $t$ is a nonnegative parameter and $y_k=g_{k+1}-g_k$. Note that if $t=0$, then $\beta_k^{DL}$ reduces to the CG parameter proposed by Hestenes and Stiefel (1952). Also, the CG parameter proposed by Hager and Zhang (2005) (HZ), i.e.
$$\beta_k^{HZ}=\frac{g_{k+1}^T y_k}{d_k^T y_k}-2\,\frac{\|y_k\|^2}{d_k^T y_k}\,\frac{g_{k+1}^T d_k}{d_k^T y_k},\tag{1.6}$$
can be viewed as an adaptive version of (1.5) corresponding to $t=2\frac{\|y_k\|^2}{s_k^T y_k}$, where $\|\cdot\|$ denotes the Euclidean norm. Similarly, the CG parameter suggested by Dai and Kou (2013) (DK), i.e.
$$\beta_k(\tau_k)=\frac{g_{k+1}^T y_k}{d_k^T y_k}-\left(\tau_k+\frac{\|y_k\|^2}{s_k^T y_k}-\frac{s_k^T y_k}{\|s_k\|^2}\right)\frac{g_{k+1}^T s_k}{d_k^T y_k},\tag{1.7}$$
in which $\tau_k$ is a parameter corresponding to the scaling factor in the scaled memoryless BFGS method (Sun & Yuan, 2006), can be considered as another adaptive version of (1.5) corresponding to $t=\tau_k+\frac{\|y_k\|^2}{s_k^T y_k}-\frac{s_k^T y_k}{\|s_k\|^2}$. In Dai and Liao (2001), it has been shown that a CG method in the form of (1.1), (1.2) with $\beta_k=\beta_k^{DL}$ is globally convergent for uniformly convex functions.
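Since (1.6) and (1.7) are both instances of (1.5) with particular choices of $t$ (using $s_k=\alpha_k d_k$, so that $s_k^T y_k=\alpha_k d_k^T y_k$), all three parameters can be computed through a single DL routine. The following sketch assumes NumPy arrays; the function names are illustrative, not from the paper.

```python
def beta_dl(g_new, d, s, y, t):
    """Dai-Liao parameter (1.5)."""
    dTy = d @ y
    return (g_new @ y) / dTy - t * (g_new @ s) / dTy

def beta_hz(g_new, d, s, y):
    """Hager-Zhang parameter (1.6): DL form with t = 2 ||y||^2 / (s^T y)."""
    return beta_dl(g_new, d, s, y, t=2.0 * (y @ y) / (s @ y))

def beta_dk(g_new, d, s, y, tau):
    """Dai-Kou parameter (1.7): DL form with
    t = tau + ||y||^2/(s^T y) - (s^T y)/||s||^2."""
    t = tau + (y @ y) / (s @ y) - (s @ y) / (s @ s)
    return beta_dl(g_new, d, s, y, t)
```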

The approach of Dai and Liao has received special attention from many researchers. In several efforts, modified secant equations have been applied to modify the DL method. For example, Yabe and Takano (2004) used the modified secant equation proposed by Zhang, Deng, and Chen (1999). Also, Zhou and Zhang (2006) applied the modified secant equation proposed by Li and Fukushima (2001). Li, Tang, and Wei (2007) used the modified secant equation proposed by Wei, Li, and Qi (2006). Ford, Narushima, and Yabe (2008) employed the multi-step quasi-Newton equations proposed by Ford and Moghrabi (1994). Babaie-Kafaki, Ghanbari, and Mahdavi-Amiri (2010) applied a revised form of the modified secant equation proposed by Zhang et al. (1999) and the modified secant equation proposed by Yuan (1991). Furthermore, in several other attempts, the modified versions of $\beta_k^{DL}$ suggested in (Babaie-Kafaki et al., 2010; Ford et al., 2008; Li et al., 2007; Yabe & Takano, 2004; Zhou & Zhang, 2006) have been used to achieve descent CG methods. Examples include the studies by Narushima and Yabe (2012), Sugiki et al. (2012), and Livieris and Pintelas (2012).

Here, based on a singular value study of the DL method, two nonlinear CG methods are proposed. The remainder of this work is organized as follows. In Section 2, the methods are suggested and their global convergence analysis is discussed. In Section 3, they are numerically compared with the CG methods proposed by Hager and Zhang, and by Dai and Kou, and comparative testing results are reported. Finally, conclusions are drawn in Section 4.

Section snippets

Two modified nonlinear conjugate gradient methods

Based on Perry’s point of view (Perry, 1976), it is notable that from (1.2) and (1.5), the search directions of the DL method can be written as
$$d_{k+1}=-Q_{k+1}g_{k+1},\qquad k=0,1,\ldots,$$
where
$$Q_{k+1}=I-\frac{s_k y_k^T}{s_k^T y_k}+t\,\frac{s_k s_k^T}{s_k^T y_k}.$$
So, the DL method can be considered as a quasi-Newton method in which the inverse Hessian is approximated by the nonsymmetric matrix $Q_{k+1}$. Since $Q_{k+1}$ represents a rank-two update, its determinant can be computed by (Sun & Yuan, 2006, chap. 1)
$$\det(Q_{k+1})=t\,\frac{\|s_k\|^2}{s_k^T y_k}.$$
Hence, if $t>0$ and the line search …
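The determinant identity above is easy to verify numerically. A minimal sketch, with random $s_k$, $y_k$ satisfying $s_k^T y_k>0$ and an arbitrary $t>0$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 6, 1.5                  # illustrative dimension and DL parameter
s = rng.standard_normal(n)
y = rng.standard_normal(n)
if s @ y < 0:                  # enforce s^T y > 0, as the Wolfe conditions guarantee
    y = -y

# Q_{k+1} = I - s y^T / (s^T y) + t s s^T / (s^T y)
Q = np.eye(n) - np.outer(s, y) / (s @ y) + t * np.outer(s, s) / (s @ y)

print(np.linalg.det(Q))        # agrees with the closed form below
print(t * (s @ s) / (s @ y))   # det(Q_{k+1}) = t ||s||^2 / (s^T y)
```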

Numerical experiments

Here, we present some numerical results obtained by applying C++ implementations of the CG methods in the form of (1.1), (1.2) in which $\beta_k=\beta_k^{DL}$ defined by (1.5) with the two suggested optimal choices $t=t_k^1$ defined by (2.17) and $t=t_k^2$ defined by (2.20), here respectively called M1 and M2, the HZ method with the CG parameter (1.6), and the DK method with the following optimal choice for $\tau_k$ in (1.7), suggested by Dai and Kou (2013):
$$\tau_k=\frac{s_k^T y_k}{\|s_k\|^2}.$$
The results are extended by further study on the …
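The performance profile of Dolan and Moré (2002), used for the comparisons reported here, is straightforward to compute from a table of per-problem costs. The sketch below is our own minimal implementation, not the code used in the paper.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile (minimal sketch).

    T    : (n_problems, n_solvers) array of costs (e.g. CPU time or
           function evaluations), with np.inf marking a failure.
    taus : 1-D array of ratio thresholds tau >= 1.
    Returns rho of shape (len(taus), n_solvers): rho[i, s] is the fraction
    of problems solver s solves within a factor taus[i] of the best solver.
    """
    ratios = T / T.min(axis=1, keepdims=True)              # r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Illustrative example: two solvers on three problems
T = np.array([[1.0, 2.0],
              [3.0, 3.0],
              [np.inf, 5.0]])       # the first solver fails on problem 3
print(performance_profile(T, np.array([1.0, 2.0])))
```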

Conclusions

Based on a singular value study of the matrix which generates the search directions of the Dai–Liao nonlinear conjugate gradient method, two modified conjugate gradient methods have been suggested. Global convergence of the methods has been briefly discussed. Numerical comparisons have been made between the implementations of the proposed methods and the conjugate gradient methods proposed by Hager and Zhang, and by Dai and Kou, on a set of 145 unconstrained optimization test problems of the CUTEr collection.

Acknowledgements

This research was supported by the Research Councils of Semnan University and Ferdowsi University of Mashhad. The authors are grateful to Professor William W. Hager for freely providing the test problems and the CG_Descent code. They also thank the two anonymous reviewers for their valuable comments and suggestions, which helped to improve the quality of this work.

References (27)

  • E.D. Dolan and J.J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming (2002)

  • J.A. Ford et al., Multi-step nonlinear conjugate gradient methods for unconstrained minimization, Computational Optimization and Applications (2008)

  • N.I.M. Gould et al., CUTEr: A constrained and unconstrained testing environment, revisited, ACM Transactions on Mathematical Software (2003)