A new class of polynomial primal–dual methods for linear and semidefinite optimization

https://doi.org/10.1016/S0377-2217(02)00275-8

Abstract

We propose a new class of primal–dual methods for linear optimization (LO). By using some new analysis tools, we prove that the large-update method for LO based on the new search direction has a polynomial complexity of O(n4/(4+ρ)log(n/ε)) iterations, where ρ∈[0,2] is a parameter used in the system defining the search direction. If ρ=0, our results reproduce the well-known complexity of the standard primal–dual Newton method for LO. At each iteration, our algorithm only needs to solve a system of linear equations. An extension of the algorithms to semidefinite optimization is also presented.

Introduction

Interior-point methods (IPMs) are among the most effective methods for solving wide classes of optimization problems. Since the seminal work of Karmarkar [7], many researchers have proposed and analyzed various IPMs for Linear and Semidefinite Optimization (LO and SDO), and a large number of results has been reported. For a survey, we refer to recent books on the subject [16], [21], [23]. An interesting fact is that almost all known polynomial-time variants of IPMs use the so-called central path [17] as a guideline to the optimal set, and some variant of Newton's method to follow the central path approximately. Therefore, the theoretical analysis of IPMs consists to a large extent of analyzing Newton's method. At present there is still a gap between the practical behavior of the algorithms and the theoretical performance results, in favor of the practical behavior. This is especially true for so-called primal–dual large-update methods, which are the most efficient methods in practice (see, e.g., [1]).

The aim of this paper is to present a new class of primal–dual Newton-type algorithms for LO and SDO. To be more concrete, we need to go into more detail at this stage. We first consider the following linear optimization problem:(P)min{cTx:Ax=b,x⩾0},where A∈Rm×n satisfies rank(A)=m, b∈Rm, c∈Rn, and its dual problem(D)max{bTy:ATy+s=c,s⩾0}.We assume that both (P) and (D) satisfy the interior point condition (IPC), i.e., there exists (x0,s0,y0) such that Ax0=b, x0>0, ATy0+s0=c, s0>0.

It is well known that the IPC can be assumed without loss of generality. For this and some other properties mentioned below, see, e.g., [16]. Finding optimal solutions of (P) and (D) is equivalent to solving the following system:(1)Ax=b,x⩾0,ATy+s=c,s⩾0,xs=0.Here xs denotes the coordinatewise product of the vectors x and s. The basic idea of primal–dual IPMs is to replace the third equation in (1), the so-called complementarity condition for (P) and (D), by the parameterized equation xs=μe, where e denotes the all-one vector and μ>0. Thus we consider the system(2)Ax=b,x⩾0,ATy+s=c,s⩾0,xs=μe.
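As a quick illustration (not from the paper), the residuals of the parameterized system can be checked numerically on a hypothetical two-variable LP whose μ-center is available in closed form by symmetry; all data below are illustrative assumptions:

```python
import numpy as np

# Hypothetical toy LP: min x1 + x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 1.0])

def residuals(x, y, s, mu):
    """Residuals of the parameterized system: Ax=b, A^T y + s = c, xs = mu*e."""
    return (A @ x - b, A.T @ y + s - c, x * s - mu * np.ones_like(x))

# By symmetry the mu-center of this toy problem is
# x = (1/2, 1/2), y = 1 - 2*mu, s = (2*mu, 2*mu).
mu = 0.1
x = np.array([0.5, 0.5]); y = np.array([1 - 2 * mu]); s = np.array([2 * mu, 2 * mu])
rp, rd, rc = residuals(x, y, s, mu)
assert np.allclose(rp, 0) and np.allclose(rd, 0) and np.allclose(rc, 0)
```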

If the IPC holds, then for each μ>0, the parameterized system (2) has a unique solution. This solution is denoted as (x(μ),y(μ),s(μ)) and we call x(μ) the μ-center of (P) and (y(μ),s(μ)) the μ-center of (D). The set of μ-centers (with μ running through all positive real numbers) gives a homotopy path, which is called the central path of (P) and (D). The relevance of the central path for LO was recognized first by Sonnevend [17] and Megiddo [9]. If μ→0, then the limit of the central path exists and since the limit points satisfy the complementarity condition xs=0, the limit yields optimal solutions for (P) and (D).

IPMs follow the central path approximately. Let us briefly indicate how this goes. Without loss of generality, we assume that (x(μ),y(μ),s(μ)) is known for some positive μ. We first update μ to μ≔(1−θ)μ for some θ∈(0,1). Then we solve the following well-defined Newton system:(3)AΔx=0,ATΔy+Δs=0,sΔx+xΔs=μe−xs,and obtain a unique search direction (Δx,Δy,Δs). By taking a step along this direction, with the step size defined by some line search rule, one constructs a new triple (x,y,s) that is `close' to (x(μ),y(μ),s(μ)). This process is repeated until the point (x,y,s) is in a certain neighborhood of the central path. Then μ is again reduced by the factor 1−θ and we apply Newton's method targeting the new μ-centers, and so on. This process continues until μ is small enough. Most practical algorithms then construct a basic solution and produce an optimal basic solution by crossing over to the Simplex method. An alternative is to apply a rounding procedure as described by Ye [22] (see also [11], [16]).
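The Newton system above is simply a structured linear system. A minimal NumPy sketch, assuming illustrative toy data (A, b, c and the interior iterate are assumptions, not the authors' code), assembles and solves it directly:

```python
import numpy as np

# Illustrative toy data: min x1 + x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 1.0])
x = np.array([0.7, 0.3])                  # primal feasible interior point
y = np.array([0.5]); s = c - A.T @ y      # dual feasible interior point
mu = 0.9 * np.mean(x * s)                 # update mu := (1 - theta)*mu, theta = 0.1

m, n = A.shape
# Unknowns ordered as (dx, dy, ds); equations as in the Newton system:
# A dx = 0, A^T dy + ds = 0, s dx + x ds = mu e - x s.
K = np.zeros((m + 2 * n, m + 2 * n))
K[:m, :n] = A
K[m:m + n, n:n + m] = A.T
K[m:m + n, n + m:] = np.eye(n)
K[m + n:, :n] = np.diag(s)
K[m + n:, n + m:] = np.diag(x)
rhs = np.concatenate([np.zeros(m), np.zeros(n), mu * np.ones(n) - x * s])
sol = np.linalg.solve(K, rhs)
dx, dy, ds = sol[:n], sol[n:n + m], sol[n + m:]
assert np.allclose(A @ dx, 0)
assert np.allclose(s * dx + x * ds, mu * np.ones(n) - x * s)
```

In practice one eliminates ds and dx to form the much smaller normal-equations system, but the dense solve above suffices to illustrate the structure.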

The choice of the parameter θ plays an important role both in the theory and practice of IPMs. Usually, if θ is a constant independent of n, for instance θ=1/2, then we call the algorithm a large-update (or long-step) method. If θ depends on n, such as θ=1/√n, then the algorithm is called a small-update (or short-step) method. It is now known that small-update methods have an O(√n log(n/ε)) iteration bound, whereas large-update methods have the worse worst-case iteration bound O(n log(n/ε)) [16], [21], [23]. The reason for the worse bound for large-update methods is that, in both the analysis and implementation of large-update IPMs, one usually uses some proximity (or potential) function to control the iterates, and up to now one can only prove that the proximity (or potential) function decreases by at least a constant after one step. For instance, for the primal–dual Newton method, after one step the proximity δ used in this paper satisfies δ+2−δ2⩽−β for some constant β, where δ+ denotes the proximity after the step [15]. On the other hand, contrary to the theoretical results, large-update IPMs work much more efficiently in practice than small-update methods [1]. Several authors have suggested using so-called higher-order methods to improve the complexity of large-update IPMs [4], [6], [13], [24], [25]; then, at each iteration, one solves some additional equations based on higher-order approximations to the system (2).

The motivation of this work is to improve the complexity of large-update IPMs. Different from the higher-order approach, we reconsider the Newton system for (2), keeping in mind that our target is to get closer to the μ-center. Now let us focus on the Newton step. For notational convenience, we introduce the scaled vector v≔√(xs/μ), so that v−1=√(μ/(xs)), and the scaled directions dx≔vΔx/x, ds≔vΔs/s.

Using the above notation, one can state the centrality condition in (2) as v=v−1=e. Denoting dv≔dx+ds, the last equation in (3) is equivalent todv=v−1−v.
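This scaling identity can be verified numerically: whenever Δx and Δs satisfy the third Newton equation, the scaled directions satisfy dv=v−1−v. A small sketch with randomly generated (assumed) data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random strictly positive iterate; dx is chosen freely and ds is then fixed
# by the third Newton equation s*dx + x*ds = mu*e - x*s (illustrative check).
x = rng.uniform(0.5, 2.0, 4); s = rng.uniform(0.5, 2.0, 4); mu = 0.7
dx_full = rng.standard_normal(4)
ds_full = (mu - x * s - s * dx_full) / x

v = np.sqrt(x * s / mu)
dx = v * dx_full / x      # scaled directions d_x = v*dx/x, d_s = v*ds/s
ds = v * ds_full / s
# Scaled third equation: d_x + d_s = v^{-1} - v
assert np.allclose(dx + ds, 1 / v - v)
```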

Observe that we can also decompose the right-hand side of this equation into two parts: a predictor direction obtained by solving(dv)Pred=−v,and a corrector direction obtained from(dv)Corr=v−1.

The corrector direction serves the purpose of centering: it points towards the analytic center of the feasible set, while the predictor direction aims to decrease the duality gap. It is straightforward to verify that (dv)i⩽0 for all components with vi⩾1 and (dv)i>0 for all components with vi<1. This means that the Newton step increases vi whenever vi<1 and decreases vi whenever vi>1, thus moving the iterate closer to the μ-center. Hence, if we could find a new search direction that increases the small components and decreases the large components of v more strongly, then we could move faster locally towards our target, the μ-center. It is also reasonable to expect that such a direction might also reach the μ-center faster globally. Motivated by this observation, we reconsider the right-hand side of the equation defining the corrector direction, adapting it to the current point v. For this, we introduce a new class of corrector directions defined by(dv)Corr=v^(−1−ρ),where ρ⩾0 is a parameter. Note that when this new corrector direction is employed, the corresponding Newton-type system for LO becomes(6)Ādx=0,ĀTΔy+ds=0,dx+ds=v^(−1−ρ)−v,where Ā=AV−1X, V=diag(v), X=diag(x). In this work, we consider only the case where the parameter ρ is restricted to the interval [0,2]. It is worthwhile to mention that when ρ=0, the new system reduces to the standard Newton system.
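A small numerical sketch (illustrative, not the paper's implementation) shows how the modified right-hand side v^(−1−ρ)−v pushes small components of v harder as ρ grows, while preserving the sign pattern for components with vi⩾1:

```python
import numpy as np

# Illustrative values of v; the modified right-hand side is
# dv = v^{-(1+rho)} - v for rho in [0, 2] (rho = 0 gives the standard v^{-1} - v).
v = np.array([0.25, 0.5, 1.0, 2.0])
dv_standard = v ** (-1.0) - v      # rho = 0
dv_new = v ** (-3.0) - v           # rho = 2

# Larger rho increases the push on small components and keeps the sign
# pattern: positive for v_i < 1, nonpositive for v_i >= 1.
assert np.all(dv_new[v < 1] >= dv_standard[v < 1])
assert np.all(dv_new[v >= 1] <= 0) and np.all(dv_standard[v >= 1] <= 0)
```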

It may be clear from the above description that in the analysis of IPMs, we need to keep control on the `distance' from the current iterates to the current μ-centers. In other words, we need to quantify the `distance' from the vector xs to the vector μe in some proximity measure. In fact, the choice of the proximity measure is crucial for both the quality and the elegance of the analysis. The proximity measure we use here is defined as follows:δ(xs,μ)≔∥v−v−1∥.

Note that the measure vanishes if xs=μe and is positive otherwise. An interesting observation is that in the special case ρ=2, the right-hand side of the third equation in the system (6) is the negative gradient of the proximity measure (1/2)δ2 in the v-space. When solving this system with ρ=2, we thus get the steepest descent direction for the proximity measure, along which the proximity can be driven to zero. As we will see later, after one step along the new search direction, the proximity decreases by at least βδ^(2/3), where β is a constant. Consequently, we obtain an improvement of the complexity of the algorithm. We also mention that in [15], the authors have shown that δ+2−δ2⩽−βδ after one feasible standard Newton step if vmin⩾1 (with δ+ the proximity after the step), which is exactly the same as we will state in Lemma 3.7 of this work. However, we failed to prove a similar inequality for the case vmin<1 and hence could not improve the complexity of the large-update IPM in [15]. The measure δ, up to a factor 1/2, was introduced by Jansen et al. [5], and used thoroughly in [16], by Zhao [25], and more recently in [15]. Its SDO analogue was also used in the analysis of interior-point methods for semidefinite optimization [3]. We note that variants of the proximity δ(xs,μ) had been used by Kojima et al. [8] and Mizuno et al. [12].
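The claim that for ρ=2 the right-hand side v^(−3)−v is the negative gradient of (1/2)δ2 in the v-space can be spot-checked by finite differences; the vector v below is an arbitrary illustrative choice:

```python
import numpy as np

def delta(v):
    """Proximity measure delta = ||v - v^{-1}|| in the v-space."""
    return np.linalg.norm(v - 1 / v)

assert delta(np.ones(3)) == 0.0   # vanishes exactly at the mu-center

# Finite-difference check that d/dv_i (1/2)*delta(v)^2 = v_i - v_i^{-3},
# so v^{-3} - v is the negative gradient (the rho = 2 corrector).
v = np.array([0.6, 1.0, 1.7])
eps = 1e-6
grad_fd = np.array([
    (0.5 * delta(v + eps * e) ** 2 - 0.5 * delta(v - eps * e) ** 2) / (2 * eps)
    for e in np.eye(3)
])
assert np.allclose(grad_fd, v - v ** (-3.0), atol=1e-5)
```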

The paper is organized as follows. First, in Section 2, we present some technical results which will be used in our later analysis. In Section 3, we analyze the method with damped steps and show that it has a polynomial iteration bound of O(n4/(4+ρ)log(n/ε)). In Section 4, we discuss an extension of the new primal–dual algorithm to SDO and study its complexity. Finally, we close the paper with some concluding remarks in Section 5.

A few words about the notation. Throughout the paper, ∥·∥2 denotes the two-norm for vectors and matrices, ∥·∥F the Frobenius norm for matrices, and ∥·∥∞ the infinity norm for vectors. For any x=(x1,x2,…,xn)T∈Rn, xmin (resp. xmax) denotes the minimal (resp. maximal) component of x. For any symmetric matrix G, λmin(G) (resp. λmax(G)) denotes the minimal (resp. maximal) eigenvalue of G. Furthermore, we assume that the eigenvalues of G are listed in order of decreasing absolute value, so that |λ1(G)|⩾|λ2(G)|⩾⋯⩾|λn(G)|. If G is positive semidefinite, then 0⩽λmin(G)=λn(G) and λmax(G)=λ1(G). For any symmetric matrix G, we also denote |G|=(G2)1/2. For two symmetric matrices G and H, the relation G⪯H means that H−G is positive semidefinite, i.e., H−G⪰0.
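As an illustrative check of this notation (toy data, not from the paper): for symmetric G, the matrix |G|=(G2)1/2 can be formed from an eigendecomposition, and its eigenvalues are the absolute eigenvalues of G:

```python
import numpy as np

# Hypothetical symmetric 3x3 matrix for illustration.
G = np.array([[2.0, 1.0, 0.0],
              [1.0, -3.0, 0.5],
              [0.0, 0.5, 1.0]])
w, Q = np.linalg.eigh(G)                 # G = Q diag(w) Q^T
absG = Q @ np.diag(np.abs(w)) @ Q.T      # |G| = (G^2)^{1/2}

assert np.allclose(absG @ absG, G @ G)   # |G|^2 = G^2
assert np.allclose(np.sort(np.linalg.eigvalsh(absG)), np.sort(np.abs(w)))
```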

Section snippets

Technical results

As stated in Section 1, a key issue in the analysis of an interior-point method, particularly of a large-update IPM, is how fast the (positive) sequence of proximity measure values decreases; this is crucial for the complexity of the algorithm. In this section, we consider a general positive decreasing sequence. First we give a technical lemma which will be used in our later analysis.

Lemma 2.1

There holds (1−t)^α⩽1−αt for all t,α∈[0,1]. Moreover, if α1⩾α2>0, then |t−t^(−α1)|⩾|t−t^(−α2)| for all t>0.

Proof

The inequality (8) follows
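The inequalities of Lemma 2.1, read as (1−t)^α⩽1−αt for t,α∈[0,1] and |t−t^(−α1)|⩾|t−t^(−α2)| for α1⩾α2>0 and t>0, can be spot-checked numerically; the random sampling below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

# First inequality: (1 - t)^alpha <= 1 - alpha*t for t, alpha in [0, 1].
t = rng.uniform(0, 1, 1000); alpha = rng.uniform(0, 1, 1000)
assert np.all((1 - t) ** alpha <= 1 - alpha * t + 1e-12)

# Second inequality: |t - t^{-a}| is monotone nondecreasing in a > 0.
tt = rng.uniform(1e-3, 10, 1000)
a2 = rng.uniform(0.1, 1.0, 1000); a1 = a2 + rng.uniform(0, 1, 1000)
assert np.all(np.abs(tt - tt ** (-a1)) >= np.abs(tt - tt ** (-a2)) - 1e-12)
```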

New primal–dual methods for LO and their complexity

In the present section, we discuss the new primal–dual Newton methods for LO and study the complexity of the large-update algorithm. This section consists of four parts. In Section 3.1, we describe the new algorithm. In Section 3.2, we estimate the magnitude of the search direction and the maximum value of a feasible step size. Section 3.3 is devoted to the estimation of the proximity measure after one step. The complexity of the algorithm is given in Section 3.4.

New primal–dual algorithms for semidefinite optimization

In this section, we consider an extension of the algorithms proposed in the previous section to the case of SDO. We consider the SDO problem in the following standard form:(SDO)min Tr(CX), s.t. Tr(AiX)=bi (1⩽i⩽m), X⪰0,and its dual problem(SDD)max bTy, s.t. ∑i=1m yiAi+S=C, S⪰0.Here C and the Ai (1⩽i⩽m) are symmetric n×n matrices, and b,y∈Rm. Furthermore, `X⪰0' means that X is symmetric positive semidefinite. The matrices Ai are assumed to be linearly independent. SDO is a generalization of LO where all the matrices Ai and

Concluding remarks

A new class of search directions was proposed for solving LO and SDO problems. The new directions are a slight modification of the classical Newton direction. By using some new analysis tools, we proved that the large-update method based on the new direction has a complexity of order O(n4/(4+ρ)log(n/ε)). It is worthwhile to note that a simple idea, slightly changing the corrector direction, improves the complexity of the algorithm. This gives rise to some interesting issues: the first issue is

References (25)

  • J.F. Sturm et al.

    Symmetric primal–dual path following algorithms for semidefinite programming

    Applied Numerical Mathematics

    (1999)
  • E.D. Andersen et al.

    Implementation of interior-point methods for large scale linear programming

  • R.A. Horn et al.

    Topics in Matrix Analysis

    (1991)
  • E. de Klerk, Interior Point Methods for Semidefinite Programming, Ph.D. Thesis, Faculty of ITS/TWI, Delft University of...
  • P. Hung et al.

    An asymptotically O(nL)-iteration path-following linear programming algorithm that uses long steps

    SIAM Journal on Optimization

    (1996)
  • B. Jansen et al.

    Primal–dual algorithms for linear programming based on the logarithmic barrier method

    Journal of Optimization Theory and Applications

    (1994)
  • B. Jansen et al.

    Improved complexity using higher order correctors for primal–dual Dikin affine scaling

    Mathematical Programming, Series B

    (1997)
  • N.K. Karmarkar

    A new polynomial-time algorithm for linear programming

    Combinatorica

    (1984)
  • M. Kojima et al.
  • N. Megiddo

    Pathways to the optimal set in linear programming

  • S. Mehrotra

    On the implementation of a (primal–dual) interior point method

    SIAM Journal on Optimization

    (1992)
  • S. Mehrotra et al.

    On finding the optimal facet of linear programs

    Mathematical Programming

    (1993)

1. Research was done while the first author was employed by T.U. Delft, The Netherlands.

2. Partially supported by the Hungarian National Research Fund, OTKA T029775, an FPP grant from IBM Watson Research Lab and a grant by the Natural Sciences and Engineering Research Council of Canada.
