Journal of Optimization Theory and Applications: Vol. 125, No. 3, pp. 497-521, June 2005
DOI: 10.1007/s10957-005-2086-2

Globally Convergent Interior-Point Algorithm for Nonlinear Programming^1

I. Akrotirianakis^2 and B. Rustem^3

Communicated by L. C. W. Dixon

^1 The research reported in this paper was done while the first author was at Imperial College. The authors gratefully acknowledge constructive comments from Professor L. C. W. Dixon and an anonymous referee. They are also grateful to Dr. Stanislav Zakovic for helpful suggestions and comments. Financial support was provided by EPSRC Grants M16016 and GR/G51377/01.
^2 Postdoctoral Research Associate, Department of Chemical Engineering, Princeton University, Princeton, New Jersey.
^3 Professor, Department of Computing, Imperial College, London, UK.

Abstract. This paper presents a primal-dual interior-point algorithm for solving general constrained nonlinear programming problems. The inequality constraints are incorporated into the objective function by means of a logarithmic barrier function. Also, satisfaction of the equality constraints is enforced through the use of an adaptive quadratic penalty function. The penalty parameter is determined using a strategy that ensures a descent property for a merit function. Global convergence of the algorithm is achieved through the monotonic decrease of a merit function. Finally, extensive computational results show that the algorithm can solve large and difficult problems in an efficient and robust way.

Key Words. Primal-dual interior-point algorithms, merit functions, convergence theory.

1. Introduction

In this paper, we discuss a primal-dual interior-point algorithm for solving general (nonconvex) nonlinear programming problems. The algorithm is based on two different approaches. The first is the augmented Lagrangian SQP framework for general constrained optimization problems, discussed in Rustem (Ref. 1); the second is the primal-dual interior-point method, where a barrier function and a damped Newton
framework are used in order to solve NLP problems. The algorithm is
motivated by the fact that the solution of the first-order optimality conditions of any NLP problem, which is the core of primal-dual interior-point algorithms, is not sufficient to guarantee the convergence to a local
minimum, unless the problem is convex. An algorithm that merely solves
the first-order optimality conditions may converge to a saddle point or even to a local maximum, since these conditions are also satisfied at those
points. To avoid such undesirable behavior, we use a merit function, whose
aim is to guide the iterates of the algorithm toward a local minimum.
The merit function incorporates the inequality constraints by means of
the logarithmic barrier function and the equality constraints by means of
the quadratic penalty function. Furthermore, the subproblem that is used
to compute the search direction involves the augmented Lagrangian of
the equality-constrained barrier problem. The search direction is shown
to be a descent direction for the merit function. It is also shown that the penalty
parameter in the merit function does not increase indefinitely, if the iterates of the algorithm are not near a feasible point of the barrier problem. This is a particularly important point, as it involves the use of the equality-constrained problem and the corresponding augmented Lagrangian to establish the finiteness of the penalty parameter. If the iterates of
the algorithm are near a feasible point of the barrier problem, then a
switch in the merit function is activated. In this case, we use the Euclidean
norm of the first-order optimality conditions as the merit function. The
second merit function, on its own, ensures only convergence to a point
satisfying the first-order perturbed optimality conditions, without distinguishing between a minimum and a maximum. However, the second merit
function is expected to be activated when the iterates have been placed,
by the primary penalty-barrier merit function, within a neighborhood of
a local minimum. As will be discussed later, it is this switch of the merit
functions that enables the global convergence of the algorithm to a local
minimum.

Although our algorithm is related to the approaches proposed by El-Bakry et al. (Ref. 2) and Yamashita (Ref. 3), it differs in significant
aspects, such as the merit functions, the adaptive penalty selection rule,
the stepsize rules, and the technique that switches between the different
merit functions. Other algorithms which use an adaptive penalty have been
developed recently by Gay et al. (Ref. 4) and Vanderbei and Shanno
(Ref. 5).

Recently, general (nonconvex) NLP problems have been the subject of intensive research in the optimization community and several primal-dual interior-point algorithms have emerged (e.g., Refs. 2-4 and 6-7). The common feature of these algorithms is that they use a merit function within a line-search or trust-region framework to achieve global convergence.

This paper is organized as follows. Section 2 describes the primal-dual
interior-point algorithm. In Section 3, we establish the global convergence
of the algorithm. In Section 4, we report our numerical experience. We
provide also an example where we demonstrate how the mechanism that
switches between the two merit functions enables the algorithm to converge to a local minimum and avoid saddle points or local maxima.

2. Description of the Algorithm

In this paper, we consider the following constrained optimization problem:

$\min f(x)$,  (1a)
s.t. $g(x) = 0$, $x \ge 0$,  (1b)

where $x \in R^n$, $f: R^n \to R$, and $g: R^n \to R^q$. The formulation in (1) is quite general, because every equality-constrained and inequality-constrained optimization problem can be reduced to that form, by adding for example slack variables to the constraints.

The following assumptions are used throughout the paper:

(A1) The second-order derivatives of $f$ and $g$ are continuous.
(A2) The columns of the matrix $[\nabla g(x^*), e_i : i \in I^0_x]$ are linearly independent, where $I^0_x = \{i : x^*_i = 0, i = 1, 2, \dots, n\}$ and $e_i$ represents the $i$th column of the $n \times n$ identity matrix. Also, the sequence $\{x_k\}$ is such that $\liminf_k x^i_k$ is bounded.
(A3) Strict complementarity is satisfied at the solution $w^* = (x^*, y^*, z^*)$.
(A4) The second-order sufficiency condition for optimality is satisfied at the solution point; i.e., for all vectors $0 \ne v \in R^n$ such that $\nabla g_i(x^*)^T v = 0$, $i = 1, 2, \dots, q$, and such that $e_i^T v = 0$, for $i \in I^0_x$, we have $v^T \nabla^2_{xx} L(x^*, y^*, z^*) v > 0$.

The original equality and inequality constrained optimization problem (1) is approximated by

$\min f(x) + (c/2)\|g(x)\|^2_2 - \mu \sum_{i=1}^n \log(x_i)$,  (2a)
s.t. $g(x) = 0$,  (2b)

for $c, \mu \ge 0$. The objective in (1) is augmented by the penalty and logarithmic barrier functions. The penalty is used to enforce satisfaction of
the equality constraints by adding a high cost to the objective function
for infeasible points. The barrier is needed to introduce an interior-point
method to solve the initial problem (1), since it creates a positive singularity at the boundary of the feasible region. Thus, strict feasibility is
enforced, while approaching the optimum solution.

The Lagrangian associated with the optimization problem (2) is given by

$L(x, y; c, \mu) = f(x) + (c/2)\|g(x)\|^2_2 - \mu \sum_{i=1}^n \log x_i - g(x)^T y$,

and the perturbed optimality conditions are

$F(x, y, z; c, \mu) = \begin{pmatrix} \nabla f(x) - z + c\nabla g(x)^T g(x) - \nabla g(x)^T y \\ g(x) \\ XZe - \mu e \end{pmatrix} = 0$,  (3)

where

$z = \mu X^{-1} e$, $X = {\rm diag}\{x_1, \dots, x_n\}$, $Z = {\rm diag}\{z_1, \dots, z_n\}$.

For $\mu$ fixed, the system (3) is solved by using the quasi-Newton method.
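To make the structure of (3) concrete, the following is a minimal sketch that evaluates the perturbed KKT residual for a hypothetical toy instance with $f(x) = x_1 + x_2$ and a single linear equality constraint; the toy data and the function names are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Hypothetical toy instance: f(x) = x1 + x2, g(x) = x1 + x2 - 2 (n = 2, q = 1).
def f_grad(x):
    return np.array([1.0, 1.0])

def g_val(x):
    return np.array([x[0] + x[1] - 2.0])

def g_jac(x):
    # Rows are the gradients of the q equality constraints.
    return np.array([[1.0, 1.0]])

def kkt_residual(x, y, z, c, mu):
    """Stack the three blocks of (3): stationarity of the augmented
    Lagrangian, primal feasibility, and perturbed complementarity XZe - mu*e."""
    G = g_jac(x)
    r_stat = f_grad(x) - z + c * (G.T @ g_val(x)) - G.T @ y
    r_feas = g_val(x)
    r_comp = x * z - mu
    return np.concatenate([r_stat, r_feas, r_comp])

# Residual at a strictly interior point (x, z > 0).
w = (np.array([1.5, 0.8]), np.array([0.3]), np.array([0.2, 0.4]))
print(kkt_residual(*w, c=1.0, mu=0.1))
```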
At the $k$th iteration, the Newton system is

$\begin{pmatrix} H_k & -\nabla g_k^T & -I \\ \nabla g_k & 0 & 0 \\ Z_k & 0 & X_k \end{pmatrix} \begin{pmatrix} \Delta x_k \\ \Delta y_k \\ \Delta z_k \end{pmatrix} = -\begin{pmatrix} \nabla f_k - z_k + c_k \nabla g_k^T g_k - \nabla g_k^T y_k \\ g_k \\ X_k Z_k e - \mu e \end{pmatrix}$,  (4)

where $H_k$ is a positive-definite approximation of the Hessian of the augmented Lagrangian. In matrix-vector form, equation (4) can be written as

$\nabla F(w_k; c_k)\,\Delta w_k = -F(w_k; c_k, \mu_k)$,  (5)

where $\nabla F(w_k; c_k)$ is the Jacobian matrix of the vector function $F(w_k; c_k, \mu_k)$.
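The sketch below assembles the block Jacobian of (4) and solves for the step with a dense factorization, continuing the toy instance above; taking $H_k = I$ is an illustrative simplification of the positive-definite Hessian approximation.

```python
import numpy as np

n, q = 2, 1
x, y, z = np.array([1.5, 0.8]), np.array([0.3]), np.array([0.2, 0.4])
c, mu = 1.0, 0.1

H = np.eye(n)                      # stands in for the Hessian approximation H_k
G = np.array([[1.0, 1.0]])         # Jacobian of g at x
g = np.array([x.sum() - 2.0])
grad_f = np.array([1.0, 1.0])
X, Z, e = np.diag(x), np.diag(z), np.ones(n)

# Coefficient matrix and right-hand side of (4): J * dw = -F.
J = np.block([[H,  -G.T,              -np.eye(n)],
              [G,   np.zeros((q, q)),  np.zeros((q, n))],
              [Z,   np.zeros((n, q)),  X]])
rhs = -np.concatenate([grad_f - z + c * (G.T @ g) - G.T @ y,
                       g,
                       X @ Z @ e - mu * e])

dw = np.linalg.solve(J, rhs)
dx, dy, dz = dw[:n], dw[n:n + q], dw[n + q:]
print(dx, dy, dz)
```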
that is, a pointw0 = x0,y0,z0, with x0,z0 > 0.By controlling the steplength xk of the primal variables xk and the steplength zk of the dual variables yk and zk, the algorithm ensures thatJOTA: VOL. 125, NO. 3, JUNE 2005 501the generated iterates remain strictly in the interior of the feasible region.
Moreover, the algorithm moves from one inner iteration to another inner
iteration (i.e., with $\mu$ fixed) by seeking to minimize the merit function

$\phi(x; c, \mu) = f(x) + (c/2)\|g(x)\|^2_2 - \mu \sum_{i=1}^n \log(x_i)$,  (6)

which is basically the objective function of the barrier problem (2). This
is achieved by properly selecting the values of the penalty parameter c
at each inner iteration. In order to avoid situations where the penalty
parameter may grow to large values, we introduce a second merit function, defined by the $\ell_2$ norm of the KKT residuals of the barrier problem (2). The potential difficulties of the penalty parameter in the merit function $\phi(x; c, \mu)$ have been considered by other authors. For example,
Bartholomew-Biggs (Ref. 8) avoids this undesirable situation by constructing search directions based directly on the augmented Lagrangian of the
barrier problem. As shown later, the monotonic decrease of both merit
functions and the rules for determining the primal and dual stepsizes guarantee that the inner iterates converge to the solution of (2) for a fixed value of $\mu$. Subsequently, by reducing $\mu$, such that $\{\mu_l\} \to 0$, the optimum
of the initial problem (1) is reached.

A detailed description of the algorithm follows.

Algorithm 2.1.

Step 0. Initialization. Choose $\bar{x}_0, \bar{z}_0 \in R^n$ and $\bar{y}_0 \in R^q$, such that $\bar{x}_0, \bar{z}_0 > 0$; penalty and barrier parameters $c_0 > 0$, $\mu_0 > 0$; and parameters $\epsilon, \epsilon_0, \epsilon_g > 0$, $\tau, \beta, \sigma \in (0, 1)$, $\delta, m, M > 0$. Set $l = 0$ and $k = 0$.

Step 1. Test for Convergence of Outer Iterations. If

$\|F(\bar{x}_l, \bar{y}_l, \bar{z}_l; c, \mu_l)\|_2 / (1 + \|(\bar{x}_l, \bar{y}_l, \bar{z}_l)\|_2) \le \epsilon_0$,

then stop.

Step 2. Start of Inner Iterations ($\mu$ is fixed to $\mu_l$ throughout this step). Set $(x_k, y_k, z_k) = (\bar{x}_l, \bar{y}_l, \bar{z}_l)$.

Step 2.1. Test for Convergence of Inner Iterations. If $\|F(x_k, y_k, z_k; c_k, \mu_l)\| \le \epsilon \mu_l$ and $\|g(x_k)\|_2 \le \epsilon_g$, then set $(\bar{x}_{l+1}, \bar{y}_{l+1}, \bar{z}_{l+1}) = (x_k, y_k, z_k)$ and go to Step 3.

Step 2.2. Solve the Newton system (4) to obtain $\Delta w_k = (\Delta x_k, \Delta y_k, \Delta z_k)$.

Step 2.3. Penalty Parameter Selection. If

$\Delta x_k^T \nabla f_k - c_k \|g_k\|^2_2 - \mu_l \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k} > 0$ and $\|g_k\|_2 > \epsilon_g$,

then set

$c_{k+1} = \max\{[\Delta x_k^T \nabla f_k - \mu_l \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k}]/\|g_k\|^2_2, \; c_k + \delta\}$.

Otherwise, set $c_{k+1} = c_k$.

Step 2.4. Steplength Selection Rules. Set

$\alpha^{x,\max}_k = \min_{1 \le i \le n}\{-x^i_k/\Delta x^i_k : \Delta x^i_k < 0\}$.

If $\Delta x_k^T \nabla f_k - c_k \|g_k\|^2_2 - \mu_l \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k} > 0$ and $0 < \|g_k\|_2 \le \epsilon_g$, then set

$\alpha^{z,\max}_k = \min_{1 \le i \le n}\{-z^i_k/\Delta z^i_k : \Delta z^i_k < 0\}$,
$\bar{\alpha}_k = \min\{1, \alpha^{x,\max}_k, \alpha^{z,\max}_k\}$.

Set $\alpha_k = \beta^t \bar{\alpha}_k$, where $t$ is the smallest nonnegative integer such that

$\|F(w_k + \beta^t \bar{\alpha}_k \Delta w_k)\|_2 \le \|F(w_k)\|_2 + \sigma \beta^t \bar{\alpha}_k (\nabla F(w_k)^T F(w_k), \Delta w_k)$.

Set $w_{k+1} = w_k + \alpha_k \Delta w_k$.

Otherwise, set $\bar{\alpha}^x_k = \min\{\tau \alpha^{x,\max}_k, 1\}$ and $\alpha^x_k = \beta^t \bar{\alpha}^x_k$, where $t$ is the smallest nonnegative integer such that

$\phi(x_{k+1}; c_{k+1}, \mu_l) - \phi(x_k; c_{k+1}, \mu_l) \le \sigma \alpha^x_k \nabla\phi(x_k; c_{k+1}, \mu_l)^T \Delta x_k$.

Set

$LB^i_k = \min\{(1/2)\mu m, \; x^i_{k+1} z^i_k\}$ and $UB^i_k = \max\{2\mu M, \; x^i_{k+1} z^i_k\}$.

For $i = 1, 2, \dots, n$, find

$\alpha^{z,i}_k = \max\{\alpha : LB^i_k \le x^i_{k+1}(z^i_k + \alpha \Delta z^i_k) \le UB^i_k\}$.

Set $\alpha^z_k = \min\{1, \min_{1 \le i \le n}\{\alpha^{z,i}_k\}\}$.

Set $x_{k+1} = x_k + \alpha^x_k \Delta x_k$, $y_{k+1} = y_k + \alpha^z_k \Delta y_k$, $z_{k+1} = z_k + \alpha^z_k \Delta z_k$.

Step 2.5. Set $k = k + 1$ and go to Step 2.1.

Step 3. Reduce the barrier parameter $\mu$ as described in Section 2.4.

Step 4. Set $l = l + 1$ and go to Step 1.
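The control flow of Algorithm 2.1 can be summarized by the following driver, a minimal sketch in which the five rule functions (Newton step, penalty update, primal and dual stepsizes, barrier reduction) are supplied by the caller. The function names and signatures are hypothetical; the tolerance values follow those reported in Section 4.

```python
import numpy as np

def interior_point_driver(F, g, newton, penalty, primal, dual, barrier,
                          x, y, z, c=0.0, mu=1.0,
                          eps0=1e-8, eps=0.1, eps_g=1e-8, max_outer=50):
    """Outer/inner loop skeleton of Algorithm 2.1. F is the residual of (3),
    g the equality constraints; the remaining callables implement the rules
    of Steps 2.2-2.4 and Section 2.4 (see the sketches in those sections)."""
    for _ in range(max_outer):                           # Steps 1-4
        wn = 1.0 + np.linalg.norm(np.concatenate([x, y, z]))
        if np.linalg.norm(F(x, y, z, c, mu)) / wn <= eps0:
            return x, y, z                               # Step 1: converged
        while True:                                      # Step 2: mu fixed
            if (np.linalg.norm(F(x, y, z, c, mu)) <= eps * mu
                    and np.linalg.norm(g(x)) <= eps_g):
                break                                    # Step 2.1
            dx, dy, dz = newton(x, y, z, c, mu)          # Step 2.2
            c = penalty(x, dx, c, mu)                    # Step 2.3
            ax = primal(x, dx, c, mu)                    # Step 2.4: primal step
            az = dual(x, ax, dx, z, dz, mu)              # Step 2.4: dual step
            x, y, z = x + ax * dx, y + az * dy, z + az * dz
        mu = barrier(mu, x, y, z, c)                     # Step 3
    return x, y, z
```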
2.1. Penalty Parameter Selection Rule. At every iteration, the value of the penalty parameter is determined such that a descent property is ensured for the merit function $\phi(x; c, \mu)$. The direction $\Delta x_k$ is a descent direction for $\phi$ at the current point $x_k$ if

$\Delta x_k^T \nabla\phi(x_k; c_k, \mu) < 0$.  (7)

By considering the second equation of the Newton system (4), we have

$\Delta x_k^T \nabla\phi(x_k; c_k, \mu) = \Delta x_k^T \nabla f(x_k) - c_k \|g_k\|^2_2 - \mu \Delta x_k^T X_k^{-1} e$,  (8)

where $c_k$ is the value of the penalty parameter at the beginning of the $k$th iteration. Since the barrier parameter $\mu$ is fixed, we can deduce from (8) that, if $c_k$ is not large enough, then the descent property (7) may not be satisfied. Thus, a new value $c_{k+1} > c_k$ must be determined to guarantee that (7) holds. The next lemmas show that Algorithm 2.1 chooses the value of the penalty parameter in such a way that $\Delta x_k$ is a descent direction for the merit function.

In Lemmas 2.1 and 2.4, we show that the descent is guaranteed always if $\|g(x)\|_2 > \epsilon_g$ or $g(x) = 0$, where $\epsilon_g$ denotes a finite precision. As a result, the penalty parameter $c_k = c_k(\epsilon_g)$ remains finite. On the other hand, if at some inner iteration $k$ we have $0 < \|g(x_k)\|_2 \le \epsilon_g$, the algorithm switches to the second merit function.

Lemma 2.1. Let $f$ and $g$ be differentiable functions, let $\|g_k\|_2 > \epsilon_g$, and let the penalty parameter $c_{k+1}$ be chosen by Step 2.3 of Algorithm 2.1. Then, $\Delta x_k$ is a descent direction for the merit function at the current point $x_k$. Furthermore,

$\Delta x_k^T \nabla\phi(x_k; c_{k+1}, \mu) \le -\|\Delta x_k\|^2_{H_k} \le 0$.  (10)

Proof. In Step 2.3, Algorithm 2.1 initially checks the inequality

$\Delta x_k^T \nabla f_k - c_k \|g_k\|^2_2 - \mu \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k} \le 0$.  (11)

If (11) is satisfied, then by setting $c_{k+1} = c_k$ and rearranging (11), we obtain (10). On the other hand, if (11) is not satisfied, by setting

$c_{k+1} = \max\{[\Delta x_k^T \nabla f_k - \mu \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k}]/\|g_k\|^2_2, \; c_k + \delta\}$, $\delta > 0$,

and substituting it into (8), it can be verified that (10) holds also.

Remark 2.1. The role of the parameter $\delta$ in the definition of $c_{k+1}$ is to guarantee that the penalty parameter increases by at least a certain amount every time it needs to be updated.
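A direct transcription of the update in Step 2.3 follows; it is a sketch under the assumption that the gradient, constraint values, and Hessian approximation are passed in explicitly, with $\delta = 10^{-10}$ and $\epsilon_g = 10^{-8}$ as reported in Section 4.

```python
import numpy as np

def penalty_update(grad_f, g, x, dx, H, c, mu, eps_g=1e-8, delta=1e-10):
    """Step 2.3: raise c only when the descent test fails while the
    constraints are still clearly infeasible (||g|| > eps_g)."""
    barrier_term = mu * dx @ (1.0 / x)          # mu * dx^T X^{-1} e
    quad = dx @ H @ dx                          # ||dx||_H^2
    lhs = dx @ grad_f - c * (g @ g) - barrier_term + quad
    if lhs > 0 and np.linalg.norm(g) > eps_g:
        c_min = (dx @ grad_f - barrier_term + quad) / (g @ g)
        return max(c_min, c + delta)            # increase by at least delta
    return c
```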
In the previous lemma, it is assumed that $\|g_k\|_2 > \epsilon_g$. The next lemma demonstrates that $\Delta x_k$ remains a descent direction for the merit function when $g_k = 0$.

Lemma 2.2. Let $f$ and $g$ be differentiable functions and let $\Delta w_k = (\Delta x_k, \Delta y_k, \Delta z_k)$ be the Newton direction taken by solving the system (4). If $g_k = 0$, for some or all iterations $k$, then the descent property (10) is satisfied for any choice of the penalty parameter $c_k \in [0, \infty)$.

Proof. If $g_k = 0$, then (8) yields

$\Delta x_k^T \nabla\phi(x_k; c_k, \mu) = \Delta x_k^T \nabla f_k - \mu \Delta x_k^T X_k^{-1} e$  (12)

and the second equation of the Newton system (4) becomes

$\nabla g_k \Delta x_k = 0$.  (13)

Furthermore, solving the third equation of (4) for $\Delta z_k$, we have

$\Delta z_k = -X_k^{-1} Z_k \Delta x_k - z_k + \mu X_k^{-1} e$.  (14)

Substituting $\Delta z_k$ into the first equation of (4) yields

$\nabla f_k + c_k \nabla g_k^T g_k - \mu X_k^{-1} e = -(H_k + X_k^{-1} Z_k)\Delta x_k + \nabla g_k^T (y_k + \Delta y_k)$.  (15)

Premultiplying (15) by $\Delta x_k^T$ and using (12) and (13) yields

$\Delta x_k^T \nabla\phi(x_k; c_k, \mu) = -\Delta x_k^T (H_k + X_k^{-1} Z_k)\Delta x_k < -\Delta x_k^T H_k \Delta x_k$.  (16)

From (16), it is derived that (10) holds for every $c_k \in [0, \infty)$.

Lemma 2.3. Let the assumptions of the previous lemma hold and let $g_k = 0$ for some $k$. Then, the algorithm chooses $c_{k+1} = c_k$ in Step 2.3. Also, $\Delta x_k$ is still a descent direction for the merit function at $x_k$.

Proof. In the previous lemma, it was proved that the descent property (10) is satisfied for $g_k = 0$. Basically, this means that the condition in Step 2.3 of Algorithm 2.1 is satisfied always. Consequently, the algorithm does not need to increase the value of the penalty parameter and simply sets $c_{k+1} = c_k$. For this choice of the penalty parameter, it can be verified that the descent property (10) still holds.

Corollary 2.1. If $\Delta x_k = 0$, then the algorithm chooses $c_{k+1} = c_k$.

Lemma 2.4. Let $f$ and $g$ be continuously differentiable functions and $M < \infty$. Then, for $\mu$ fixed, the following properties hold:

(i) There exists always a constant $c_{k+1} \ge 0$ satisfying Step 2.3 of Algorithm 2.1.
(ii) Assuming that the sequence $\{\Delta x_k\}$ is bounded, $c_k$ is increased finitely often; that is, there exists an integer $\bar{k} \ge 0$ such that, for all $k \ge \bar{k}$, we have $c_k = \bar{c} \in [0, \infty)$.

Proof. Part (i) is a direct consequence of Lemmas 2.1-2.3 and Corollary 2.1, since a finite value $c_{k+1}$ is always generated in Step 2.3. Part (ii) will be shown by contradiction. Assume that $c_k \to \infty$ as $k \to \infty$. From the way $c_{k+1}$ is defined in Step 2.3, we can deduce that, if $c_k \to \infty$, then $g_k \to 0$. Hence, there exists an integer $k_1$ such that, for all $k \ge k_1$, we have

$0 < \|g_k\|_2 \le \epsilon_g$.

As can be seen in Step 2.4, however, in the case where $0 < \|g_k\|_2 \le \epsilon_g$, the algorithm stops increasing the penalty parameter, since it switches to the second merit function. Therefore, the maximum value that $c_k$ can take is

$\bar{c} = c_{k_1} = M/\epsilon_g$,

where $M$ and $\bar{c}$ are finite values. Hence, we have $\bar{c} < \infty$. This contradicts our assumption that $c_k \to \infty$ as $k \to \infty$. Hence, the penalty parameter does not increase indefinitely; that is, there exists an integer $\bar{k} \ge 0$ such that, for all $k \ge \bar{k}$, we have $c_k < \infty$.

2.2. Primal Stepsize Rule. In Step 2.4 of the algorithm, we adopt
the Armijo rule to determine the new iterate xk+1. The maximum allowable stepsize is determined by the boundary of the feasible region and is
given by

$\alpha^{x,\max}_k = \min_{1 \le i \le n}\{-x^i_k/\Delta x^i_k : \Delta x^i_k < 0\}$.

We take as initial step $\bar{\alpha}^x_k$ a number very close to $\alpha^{x,\max}_k$ and we ensure that it is never greater than one, i.e.,

$\bar{\alpha}^x_k = \min\{\tau \alpha^{x,\max}_k, 1\}$, with $\tau \in (0, 1)$.

The final step is

$\alpha^x_k = \beta^t \bar{\alpha}^x_k$,

where $t$ is the first nonnegative integer for which the Armijo rule is satisfied and the factor $\beta$ is usually chosen in the interval $[0.1, 0.5]$, depending on the confidence we have in the initial step $\bar{\alpha}^x_k$.
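A sketch of this rule follows, assuming the merit function $\phi(\cdot; c_{k+1}, \mu)$ and the directional derivative $\nabla\phi(x_k)^T \Delta x_k$ are available as inputs; $\tau = 0.995$, $\beta = 0.5$, and $\sigma = 10^{-4}$ follow the values used in Section 4.

```python
import numpy as np

def primal_stepsize(x, dx, phi, dphi_dx, tau=0.995, beta=0.5,
                    sigma=1e-4, max_backtracks=60):
    """Fraction-to-the-boundary cap followed by Armijo backtracking on phi.
    phi is a callable; dphi_dx is the scalar grad(phi)^T dx (negative for a
    descent direction)."""
    neg = dx < 0
    a_max = np.min(-x[neg] / dx[neg]) if np.any(neg) else np.inf
    a = min(tau * a_max, 1.0)                 # initial trial step, <= 1
    phi0 = phi(x)
    for _ in range(max_backtracks):
        if phi(x + a * dx) <= phi0 + sigma * a * dphi_dx:   # Armijo rule
            return a
        a *= beta
    return a
```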
2.3. Dual Stepsize Rule. In this section, we discuss the calculation of
the stepsize of the dual variables $z$. The strategy uses the information provided by the new primal iterate $x_{k+1}$ in order to find the new iterate $z_{k+1}$. It is a modification of the strategy suggested by Yamashita (Ref. 3) and Yamashita and Yabe (Ref. 11).

While the barrier parameter $\mu$ is fixed, we determine a step $\alpha^{z,i}_k$ along the direction $\Delta z^i_k$, for each dual variable $z^i_k$, $i = 1, 2, \dots, n$, such that the box constraints are satisfied,

$\alpha^{z,i}_k = \max\{\alpha > 0 : LB^i_k \le (x^i_k + \alpha^x_k \Delta x^i_k)(z^i_k + \alpha \Delta z^i_k) \le UB^i_k\}$.

The lower bounds $LB^i_k$ and upper bounds $UB^i_k$, $i = 1, 2, \dots, n$, are defined as

$LB^i_k = \min\{(1/2)\mu m, \; (x^i_k + \alpha^x_k \Delta x^i_k) z^i_k\}$,
$UB^i_k = \max\{2\mu M, \; (x^i_k + \alpha^x_k \Delta x^i_k) z^i_k\}$,

where the parameters $m$ and $M$ are chosen such that

$0 < m \le \min\{1, [(1-\tau)(1-\sigma/M_0)\min_i\{x^i_k z^i_k\}]/\mu\}$,
$M \ge \max\{1, \max_i\{x^i_k z^i_k\}/\mu\} > 0$,

with $\sigma \in (0, 1)$ and $M_0$ a positive large number. The common dual steplength $\alpha^z_k$ is defined as

$\alpha^z_k = \min\{1, \min_{1 \le i \le n}\{\alpha^{z,i}_k\}\}$.

Also, the stepsize for the dual variables $y$ can be either $\alpha^y_k = 1$ or $\alpha^y_k = \alpha^z_k$.
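The rule can be sketched as below. The closed-form step per component exploits the monotonicity of the product $(x^i_k + \alpha^x_k \Delta x^i_k)(z^i_k + \alpha \Delta z^i_k)$ in $\alpha$; the $\mu$-scaling of the bounds and the default value for $M$ are assumptions consistent with the definitions of $m$ and $M$ above.

```python
import numpy as np

def dual_stepsize(x, ax, dx, z, dz, mu, m=1.0, M=1e8):
    """Largest common alpha keeping each product x_new_i * (z_i + a*dz_i)
    inside [LB_i, UB_i], capped at one."""
    x_new = x + ax * dx
    lb = np.minimum(0.5 * m * mu, x_new * z)    # LB_i
    ub = np.maximum(2.0 * M * mu, x_new * z)    # UB_i
    a = np.inf
    for i in range(len(z)):
        if dz[i] > 0:       # product increases: binding at the upper bound
            a = min(a, (ub[i] / x_new[i] - z[i]) / dz[i])
        elif dz[i] < 0:     # product decreases: binding at the lower bound
            a = min(a, (lb[i] / x_new[i] - z[i]) / dz[i])
    return min(1.0, a)
```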
2.4. Barrier Parameter Selection Rule. The basic characteristic of our barrier reduction strategy is that it determines the new value of $\mu$ by taking into consideration the distance of the current point $(x_k, y_k, z_k)$ from the central path and the optimum solution of the initial problem. The barrier reduction strategy is defined as follows:

$\mu_{l+1} = \min\{0.95\mu_l, \; 0.01(0.95)^k \|F(x_k, y_k, z_k)\|_2\}$.

If $\|F(x_k, y_k, z_k; \mu_l)\|_2 \le 0.1\mu_l$, then:
if $\mu_l < 10^{-4}$, then $\mu_{l+1} = \min\{0.85\mu_l, \; 0.01(0.85)^{k+2}\|F(x_k, y_k, z_k)\|_2\}$;
else, $\mu_{l+1} = 0.95\min\{0.85\mu_l, \; 0.01(0.85)^{k+2}\|F(x_k, y_k, z_k)\|_2\}$.
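A sketch of the reduction rule as reconstructed above; `res` stands for $\|F(x_k, y_k, z_k)\|_2$ and `res_mu` for $\|F(x_k, y_k, z_k; \mu_l)\|_2$, and the exact constants should be read as assumptions where the source is not fully legible.

```python
def reduce_barrier(mu, res, res_mu, k):
    """Default geometric decrease of mu; accelerated once the perturbed
    residual is small relative to mu (constants as reconstructed above)."""
    if res_mu <= 0.1 * mu:
        mu_new = min(0.85 * mu, 0.01 * 0.85 ** (k + 2) * res)
        return mu_new if mu < 1e-4 else 0.95 * mu_new
    return min(0.95 * mu, 0.01 * 0.95 ** k * res)
```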
3. Global Convergence

In this section, we show that the algorithm is globally convergent, because it guarantees always progress toward a solution from any starting point.

We show that, while the barrier parameter is fixed to a value $\mu_l$, the algorithm produces iterates

$w_k(\mu_l) = (x_k(\mu_l), y_k(\mu_l), z_k(\mu_l))$, for $k \ge 0$,

which are bounded and converge to a point

$w(\mu_l) = (x(\mu_l), y(\mu_l), z(\mu_l))$

such that

$F(x(\mu_l), y(\mu_l), z(\mu_l); c, \mu_l) = 0$,

where $F(x, y, z; c, \mu)$ is the vector of the perturbed KKT conditions,
defined in (3). In other words, we show that the inner Steps 2.1-2.5 of
Algorithm 2.1 converge to a solution of the perturbed KKT conditions.
For simplicity, we suppress the index $l$ and we use $w_k$ instead of $w_k(\mu_l)$ to denote the iterates produced while $\mu = \mu_l$.

The basic result of Lemmas 2.1 to 2.4 is that the direction $\Delta x_k$, taken
from the solution of the Newton system (4), is a descent direction for the
merit function at the current point $x_k$; that is, the inequality (10) holds. In the next theorem, we show that the sequence $\{\phi(x_k; \bar{c}, \mu)\}$ is monotonically decreasing if the barrier parameter $\mu$ is fixed. We show also that the step $\alpha^x_k$ chosen in Step 2.4 is always positive.

Theorem 3.1. Assume that the following conditions hold:

(i) The objective function $f$ and the constraints $g$ are twice continuously differentiable.
(ii) For every iteration $k$ and every vector $v \in R^n$, there exist constants $\bar{M} > \bar{m} > 0$, such that $\bar{m}\|v\|^2_2 \le v^T H_k v \le \bar{M}\|v\|^2_2$.
(iii) For every $k$, there exists a vector $(\Delta x_k, \Delta y_k, \Delta z_k)$ as a solution of (4).
(iv) There exist an iteration $\bar{k}$, a small $\epsilon_g > 0$ with $\|g_k\|_2 \in (0, \epsilon_g)$, and a scalar $\bar{c} \ge 0$, with $\bar{c} = c(\epsilon_g)$, such that the penalty parameter restriction in Step 2.3,

$\Delta x_k^T \nabla f_k - c_k(\epsilon_g)\|g_k\|^2_2 - \mu \Delta x_k^T X_k^{-1} e + \|\Delta x_k\|^2_{H_k} \le 0$,

is satisfied for all $k \ge \bar{k}$, with $c_{k+1}(\epsilon_g) = c_k(\epsilon_g) = \bar{c}(\epsilon_g)$.

Then, the stepsize computed in Step 2.4 is such that $\alpha^x_k \in (0, 1]$; hence, the sequence $\{\phi(x_k; \bar{c}, \mu)\}$ is monotonically decreasing for $k \ge \bar{k}$ and $\mu$ fixed.

Proof. Consider the case $\|g_k\|_2 \in (0, \epsilon_g)$ and the first-order approximation with remainder of the function $\phi(x; \bar{c}, \mu)$ around the point $x_{k+1} = x_k + \alpha^x_k \Delta x_k$,

$\phi(x_{k+1}; \bar{c}, \mu) - \phi(x_k; \bar{c}, \mu) \le \alpha^x_k \Delta x_k^T \nabla\phi(x_k; \bar{c}, \mu) + ((\alpha^x_k)^2/2)\Delta x_k^T H_k \Delta x_k + (\alpha^x_k)^2 \gamma_k \|\Delta x_k\|^2_2$,  (17)

where

$\gamma_k = \int_0^1 (1 - t)\|\nabla^2_x \phi(x_k + t\alpha^x_k \Delta x_k; \bar{c}, \mu) - H_k\|_2 \, dt$.

Furthermore, from Assumption (ii) and Lemmas 2.1 and 2.2, we have

$\|\Delta x_k\|^2_2 \le (1/\bar{m})\Delta x_k^T H_k \Delta x_k \le -(1/\bar{m})\Delta x_k^T \nabla\phi(x_k; \bar{c}, \mu)$.  (18)

Using (18) in (17), we obtain

$\phi(x_{k+1}; \bar{c}, \mu) - \phi(x_k; \bar{c}, \mu) \le \alpha^x_k \Delta x_k^T \nabla\phi(x_k; \bar{c}, \mu)(1 - \alpha^x_k(1/2 + \gamma_k/\bar{m}))$.  (19)

The scalar $\sigma$ in Step 2.4 determines a steplength $\alpha^x_k$ such that

$1 - \alpha^x_k(1/2 + \gamma_k/\bar{m}) \ge 1/2$.

From Lemmas 2.1 to 2.4, since we have always

$\Delta x_k^T \nabla\phi(x_k; \bar{c}, \mu) \le 0$,

there must exist $\alpha^x_k \in (0, 1]$ to ensure (19) and the Armijo rule in Step 2.4. Assume that $\alpha^0 \in (0, 1]$ is the largest step in the interval $(0, 1]$ satisfying both (19) and the Armijo rule. Consequently, for every $\alpha \le \alpha^0$, inequality (19) and the Armijo rule are also satisfied. Hence, the strategy in Step 2.4 selects always a steplength $\alpha^x_k \in (0, \alpha^0]$, where $0 < \alpha^0 \le 1$. From the above analysis, it follows that the sequence $\{\phi(x_k; \bar{c}, \mu)\}$ is monotonically decreasing.

Remark 3.1. The results of the above theorem can be proved to hold before the penalty parameter $c_k$ achieves a constant value $\bar{c}$. This can be done by considering the difference $\phi(x_{k+1}; c_{k+1}, \mu) - \phi(x_k; c_{k+1}, \mu)$ and the Taylor expansion of the function $\phi(x; c_{k+1}, \mu)$ instead of $\phi(x; \bar{c}, \mu)$. In the above theorem, we chose to prove the case where $c_k = \bar{c}$ has been achieved, in order to show that, asymptotically, $\phi$ is monotonically decreasing and the strategy in Step 2.4 selects a steplength $\alpha^x_k \in (0, 1]$.

Corollary 3.1. The sequence $\{x_k\}$ of primal variables generated by Algorithm 2.1, with $\mu$ fixed, is bounded away from zero.

The following lemma, proved by Yamashita in Ref. 3, shows that the dual stepsize rule, used by Algorithm 2.1, generates iterates $z_k$ which are also bounded above and away from zero.

Lemma 3.1. While $\mu$ is fixed, the lower bounds $LB^i_k$ and the upper bounds $UB^i_k$, $i = 1, 2, \dots, n$, of the box constraints in the dual stepsize rule are bounded away from zero and bounded from above, respectively, if the corresponding components $x^i_k$ of the iterates $x_k$ are also bounded above and away from zero.

Proof. The proof can be found in Ref. 3.

Having established that the sequences of iterates $\{x_k\}$ and $\{z_k\}$ are bounded above and away from zero, we show that the iterates $\{y_k\}$, $k \ge 0$, are also bounded. In particular, Lemma 3.3 shows that, if at each iteration of the algorithm we take a unit step along the direction $\Delta y_k$, then the resulting sequence $\{y_k + \Delta y_k\}$ is bounded. In addition to this, Lemma 3.3 shows also that the Newton direction $\Delta w_k = (\Delta x_k, \Delta y_k, \Delta z_k)$ is bounded, while $\mu$ is fixed. We first establish the following technical result.

Lemma 3.2. Let $\{w_k\}$ be a sequence of vectors generated by Algorithm 2.1 for $\mu$ fixed. Then, the matrix sequence $\{\Gamma_k\}$ is bounded, where

$\Gamma_k = [\nabla g_k (H_k + X_k^{-1} Z_k)^{-1} \nabla g_k^T]^{-1}$.

Proof. The proof can be found in Ref. 12.

Lemma 3.3. Let $\{w_k\}$ be a sequence of vectors generated by Algorithm 2.1 for $\mu$ fixed. Then, the sequence of vectors $\{(\Delta x_k, y_k + \Delta y_k, \Delta z_k)\}$ is bounded.

Proof. The proof can be found in Ref. 12.

Lemmas 3.4 and 3.5 provide the necessary results for Theorem 3.2, which shows that the sequence $\{w_k\}$ converges to a point $w^* = (x^*, y^*, z^*)$ satisfying the KKT conditions of problem (2).

Lemma 3.4. Let the assumptions of Theorem 3.1 be satisfied and let
the barrier parameter $\mu$ be fixed. Suppose also that, for some iteration $k_0 \ge 0$, the level set

$S_1 = \{x \in R^n_+ : \phi(x; \bar{c}, \mu) \le \phi(x_{k_0}; \bar{c}, \mu)\}$  (20)

is compact. Then, for all $k \ge k_0$, we have

$\lim_{k \to \infty} \alpha^x_k \nabla\phi(x_k; \bar{c}, \mu)^T \Delta x_k = 0$.  (21)

Proof. The scalar $\sigma \in (0, 1/2)$ in the Armijo stepsize strategy at Step 2.4 determines a stepsize $\alpha^x_k$ such that

$1 - \alpha^x_k(1/2 + \gamma_k/\bar{m}) \ge 1/2$.

By solving for $\alpha^x_k$, we obtain

$(1/2)/(1/2 + \gamma_k/\bar{m}) \le \alpha^x_k \le (1 - \sigma)/(1/2 + \gamma_k/\bar{m})$.

Hence, the largest value that $\alpha^x_k$ can take and still satisfy the Armijo rule in Step 2.4 is

$\alpha^{x,0}_k = \min\{1, (1 - \sigma)/(1/2 + \gamma_k/\bar{m})\}$.

Recall that the steplength $\alpha^x_k$ is chosen by reducing the maximum allowable steplength $\bar{\alpha}^x_k$ until the Armijo rule is satisfied. Therefore, $\alpha^x_k \in [\beta\alpha^{x,0}_k, \alpha^{x,0}_k]$ and thereby satisfies also the Armijo rule.

As the merit function is twice continuously differentiable and the level set $S_1$ is bounded, there exists a scalar $\tilde{M} < \infty$ such that

$0 \le \gamma_k = \int_0^1 (1 - t)\|\nabla^2_x \phi(x_k + t\alpha^x_k \Delta x_k; \bar{c}, \mu) - H_k\|_2 \, dt \le \tilde{M} < \infty$.

Thus, we have always

$\alpha^x_k \ge \underline{\alpha}^x > 0$, where $\underline{\alpha}^x = \min\{1, (1 - \sigma)/(1/2 + \tilde{M}/\bar{m})\}$.

Hence, the stepsize $\alpha^x_k$ is always bounded away from zero. Furthermore, from the Armijo rule and Lemmas 2.1 and 2.2, we have

$\phi(x_{k+1}; \bar{c}, \mu) - \phi(x_k; \bar{c}, \mu) \le \sigma\alpha^x_k \nabla\phi(x_k; \bar{c}, \mu)^T \Delta x_k < 0$.  (22)

From our assumption that the level set $S_1$ is bounded, it can be deduced that the sequence

$\{|\phi(x_{k+1}; \bar{c}, \mu) - \phi(x_k; \bar{c}, \mu)|\} \to 0$.

Consequently, from (22), we have

$\lim_{k \to \infty} \sigma\alpha^x_k \nabla\phi(x_k; \bar{c}, \mu)^T \Delta x_k = 0$.

Finally, since $\sigma, \alpha^x_k > 0$, it can be deduced that (21) holds.

Lemma 3.5. Let the assumptions of the previous lemma hold. Then,

$\lim_{k \to \infty} \alpha^x_k \|\Delta x_k\|^2_{H_k} = 0$.  (23)

Proof. From (10), we have

$-\nabla\phi(x_k; \bar{c}, \mu)^T \Delta x_k \ge \|\Delta x_k\|^2_{H_k}$.

Using (21), we have that (23) holds.

Theorem 3.2. Let the assumptions of the previous lemma hold and
let $\epsilon_g$ be a sufficiently small positive scalar.^4 Then, the algorithm converges in the limit to a point satisfying $F(x, y, z; c, \mu) = 0$, for $\mu$ fixed.

^4 In our numerical experiments, we used $\epsilon_g = 10^{-8}$.

Proof. Consider the case where $\|g_k\|_2 \in (0, \epsilon_g)$ and let $x(\mu), z(\mu) \in R^n$, $y(\mu) \in R^q$ be such that

$\lim_{k \in K} x_k = x(\mu)$, $\lim_{k \in K} z_k = z(\mu)$, $\lim_{k \in K} y_k = y(\mu)$, for some subsequence $K \subseteq \{1, 2, \dots\}$.

The existence of such points is ensured since, by Assumption (A2) and Lemmas 3.1 and 3.3, the sequence $\{(x_k(\mu), y_k(\mu), z_k(\mu))\}$ is bounded for $\mu$ fixed and, by Theorem 3.1, the algorithm decreases always the merit function sufficiently at each iteration, thereby ensuring that $x_k \in S_1$, with $S_1$ compact.

We first prove that, for $k$ sufficiently large, the dual step $\alpha^z_k$ becomes the unit one, by showing that

$\lim_{k \to \infty} \|z_k + \Delta z_k - \mu X^{-1}_{k+1} e\| = 0$.  (24)

Adding $-\mu X^{-1}_{k+1} e$ to both sides of (14), we have

$z_k + \Delta z_k - \mu X^{-1}_{k+1} e = -X_k^{-1} Z_k \Delta x_k + \mu(X_k^{-1} - X_{k+1}^{-1})e$.  (25)

Moreover,

$\|X_k^{-1} - X_{k+1}^{-1}\|^2 \le n \max_{1 \le i \le n}\{(\alpha^x_k \Delta x^i_k)^2/(x^i_k x^i_{k+1})^2\}$.

Since we have always $\alpha^x_k \in (0, 1]$ and $(\Delta x^i_k)^2 \le \|\Delta x_k\|^2_2$, and since the sequence $\{x_k\}$ is bounded away from zero, from the above inequality and (23) we can derive that

$\lim_{k \to \infty} \|X_k^{-1} - X_{k+1}^{-1}\| = 0$.  (26)

Hence, letting $k \to \infty$ in (25) and using (23) and (26), we can deduce that (24) holds. Consequently,

$z_{k+1} = z_k + \Delta z_k$, for $k$ sufficiently large.

Furthermore, using (14) for $k$ sufficiently large, the complementarity condition becomes

$X_{k+1} z_{k+1} = X_{k+1}(z_k + \Delta z_k) = X_{k+1} X_k^{-1}(-Z_k \Delta x_k + \mu e)$.  (27)

From (23) and the fact that the elements of the diagonal matrix $X_{k+1} X_k^{-1}$ can be written as

$x^i_{k+1}/x^i_k = 1 + \alpha^x_k \Delta x^i_k/x^i_k$, for all $i = 1, 2, \dots, n$,

we can derive that

$\lim_{k \to \infty} X_{k+1} X_k^{-1} = I_n$,  (28)

where $I_n$ is the $n \times n$ identity matrix. Letting $k \to \infty$ in (27) and using (23) and (28) yields

$\lim_{k \to \infty} X_{k+1} z_{k+1} = X(\mu) z(\mu) = \mu e$.  (29)

Also, for $k \to \infty$, the second equation of the Newton system (4) and (23) yield

$\lim_{k \to \infty} (g_k + \nabla g_k \Delta x_k) = g(x(\mu)) = 0$.  (30)

The first equation of the Newton system (4) can be written as

$\nabla f_k - \nabla g_k^T y_{k+1} + c\nabla g_k^T g_k - \mu X_k^{-1} e = -(H_k + X_k^{-1} Z_k)\Delta x_k$,

where $y_{k+1} = y_k + \Delta y_k$. Letting $k \to \infty$ and using (23), the above equation yields

$\lim_{k \to \infty} [\nabla f_k - \nabla g_k^T y_{k+1} + c\nabla g_k^T g_k - \mu X_k^{-1} e] = 0$.  (31)

From the assumptions that the functions $f$ and $g$ have continuous gradients and $\nabla g_k^T$ has full column rank and using (26), equation (31) yields

$\lim_{k \to \infty} [\nabla f_{k+1} - \nabla g_{k+1}^T y_{k+1} + c\nabla g_{k+1}^T g_{k+1} - \mu X_{k+1}^{-1} e] = 0$,

or equivalently,

$\nabla f(x(\mu)) - \nabla g(x(\mu))^T y(\mu) + c\nabla g(x(\mu))^T g(x(\mu)) - \mu X(\mu)^{-1} e = 0$.  (32)

From (32), (30), and (29), we can conclude that the vector $(x(\mu), y(\mu), z(\mu))$ is a solution of the perturbed KKT conditions (3).

The convergence result for $\|g_k\|_2 \notin (0, \epsilon_g)$ is a consequence of El-Bakry et al. (Ref. 2) and Zakovic et al. (Ref. 10).

An immediate consequence of Theorem 3.2 is that, for any convergent
subsequence produced by the algorithm for $\mu = \mu_l$, there is an iteration $\bar{k}$ such that

$\|F(x_{\bar{k}}, y_{\bar{k}}, z_{\bar{k}}; c, \mu)\| \le \epsilon\mu$,  (33)

for all $k \ge \bar{k}$, where $\epsilon \ge 0$ and $F(x, y, z; c, \mu)$ is given by (3). At this point, we record the value of the current iterate

$(\bar{x}_l, \bar{y}_l, \bar{z}_l) = (x_{\bar{k}}, y_{\bar{k}}, z_{\bar{k}})$,

and set $\mu$ to a smaller value $\mu_{l+1} < \mu_l$. Therefore, a sequence of approximate central points $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ is generated.

In the remaining part of this section, we show that the sequence of
approximate central points converges to a KKT point $(\bar{x}, \bar{y}, \bar{z})$ of the initial constrained optimization problem (1). For a given $\epsilon \ge 0$ sufficiently small, consider the set of all the approximate central points generated by Algorithm 2.1,

$S_2(\epsilon) = \{(\bar{x}_l, \bar{y}_l, \bar{z}_l) : \|F(\bar{x}_l, \bar{y}_l, \bar{z}_l; c, \mu_l)\| \le \|F(\bar{x}_0, \bar{y}_0, \bar{z}_0; c, \mu_0)\|, \; \mu_l < \mu_0\}$.

If $\epsilon > 0$, then the stepsize rules described in Section 2 guarantee that $\bar{x}_l, \bar{z}_l \in S_2(\epsilon)$ are bounded away from zero for $l \ge 0$. Consequently, $(\bar{x}_l)^T \bar{z}_l$ is also bounded away from zero in $S_2(\epsilon)$. The following lemma shows that the sequence $\{\bar{y}_l\}$ is bounded if the sequence $\{\bar{z}_l\}$ is also bounded.

Lemma 3.6. Assuming that the columns of $\nabla g(\bar{x}_l)$ are linearly independent and the iterates $\bar{x}_l$ are in a compact set for $l \ge 0$, then there exists a constant $M_1 > 0$ such that

$\|\bar{y}_l\| \le M_1(1 + \|\bar{z}_l\|)$.

Proof. The proof can be found in Ref. 12.

Lemma 3.7. If $(\bar{x}_l, \bar{y}_l, \bar{z}_l) \in S_2(\epsilon)$ for all $l \ge 0$, then the sequence $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ is bounded above.

Proof. The proof can be found in Ref. 12.

The following theorem shows that the sequence $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ converges to $(\bar{x}, \bar{y}, \bar{z})$, which is a KKT point of the initial constrained problem (1).

Theorem 3.3. Let $\{\mu_l\}$ be a positive monotonically decreasing sequence of barrier parameters with $\{\mu_l\} \to 0$ and let $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ be a sequence of approximate central points satisfying (33) for $\mu = \mu_l$, $l \ge 0$. Then, the sequence $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ is bounded and its limit point $(\bar{x}, \bar{y}, \bar{z})$ satisfies $F(\bar{x}, \bar{y}, \bar{z}; c, \mu) = 0$, for $\mu = 0$.

Proof. From Lemma 3.6, the sequence $\{(\bar{x}_l, \bar{y}_l, \bar{z}_l)\}$ is bounded and remains in the compact set $S_2(\epsilon)$. Thus, it has a limit point in $S_2(\epsilon)$, denoted as $(\bar{x}, \bar{y}, \bar{z})$. From (33) and the fact that $\mu_l \to 0$, we obtain easily that $\lim_l F(\bar{x}_l, \bar{y}_l, \bar{z}_l) = 0$. Therefore,

$\nabla f(\bar{x}) - \bar{z} + c\nabla g(\bar{x})^T g(\bar{x}) - \nabla g(\bar{x})^T \bar{y} = 0$,
$g(\bar{x}) = 0$,
$\bar{X}\bar{Z}e = 0$.

Clearly, from the above equations, we may derive that $(\bar{x}, \bar{y}, \bar{z})$ is a KKT point of the initial constrained optimization problem (1).

4. Numerical Results

The test problems solved by the algorithm fall into two categories. The first category consists of small-size problems drawn mainly from the Hock and Schittkowski collection (Ref. 13). The second category consists of real-world problems of larger size, which were taken from the Vanderbei website (Ref. 14). The implementation of the algorithm has been done using standard C and is interfaced with the powerful mathematical programming language AMPL.

The various parameters used in the algorithm are as follows. In Step 1, the accuracy of the stopping criterion is $\epsilon_0 = 10^{-8}$. In Step 2.3, $\delta = 10^{-10}$ and $\epsilon_g = 10^{-8}$. In Step 2.4, we set $\tau = 0.995$, $\beta = 0.5$, $\sigma = 10^{-4}$, $m = 1$, and a large value for $M$. In the barrier reduction rule, described in Section 2.4, the corresponding parameter is set to 6.

Tables 1 and 2 summarize the numerical results for the test problems taken from the Hock and Schittkowski collection. The following terminology is used:

Problem: problem number given in the Hock and Schittkowski collection (Ref. 13);
Iterations: total number of inner iterations required to find the optimum solution of the original problem (1);
$\bar{c}$: final value of the penalty parameter.

All the numerical results have been obtained by using the exact Hessian, provided by AMPL, except those marked with an asterisk (i.e., tests 74, 75, 93, and 97), which were solved using the BFGS updating formula (Ref. 15). In those iterations where the exact Hessian $H_k$ is not positive semidefinite, it is replaced by

$\bar{H}_k = H_k + E_k$,

where $E_k$ is a nonnegative diagonal matrix that is zero if $H_k$ is positive definite. The matrix $\bar{H}_k$ is generated by the modified Cholesky factorization as described in Gill et al. (Ref. 16). This technique has
worked very well in practice, enabling the algorithm to solve large problems.

Table 1. Numerical results for the Hock-Schittkowski problems, listing for each problem the number of inner iterations and the final penalty value $\bar{c}$.

The algorithm solved all the problems to the given accuracy. For all
of the problems, the initial value c0 of the penalty parameter c is set to
zero. Its final value $\bar{c}$ is usually kept at low levels. However, for some tests, its final value needed to become large in order to achieve convergence,
which was achieved in all tests. We have observed also that usually the
penalty parameter becomes large for the problems whose starting point,
provided by Hock and Schittkowski (Ref. 13), is close to the boundary
of the feasible region. In such cases, Vanderbei and Shanno (Ref. 5) suggest that the starting point should be set to a 90%-10% mixture of the two
bounds of the box constraints, with the higher value placed on the nearer
bound. If the algorithm starts from such a point, the final value of the penalty parameter can be kept low, with similar convergence.

Table 2. Numerical results for the Hock-Schittkowski problems (continued).
Moreover, the algorithm was tested on many large optimization
problems. All of these problems were limited to at most 300 variables and constraints, since we used the student version of AMPL. In order to overcome this limitation and be able to test the algorithm on a diverse collection of problems, in some problems we have reduced the number of variables or constraints to 300, by modifying some parameters in their definition. The numerical results are summarized in Table 3.

At this point, we present an example in order to show the importance of the mechanism that switches between the two merit functions.
Consider the following box-constrained optimization problem:

min $f(x) = (x_1 - 1)(x_1 - 2)(x_1 - 3) + (x_1 - 2)(x_1 - 3)(x_2 - 1) - (x_1 - 3)(x_2 - 1)(x_2 - 2) - (x_2 - 1)(x_2 - 2)(x_2 - 3)$,  (34a)

s.t. $-5 \le x_i \le 5$, $i = 1, 2$.  (34b)

Table 3. Numerical results for the large problems, listing for each problem the number of inner iterations and the final penalty value $\bar{c}$.

Problem (34) has three local minima,

$x^1_{\min} = (-5, -0.698)$, $x^2_{\min} = (3.395, 5)$, $x^3_{\min} = (2.5, 1.5)$,

and two saddle points,

$x^1_{\rm sad} = (1.293, 1.293)$, $x^2_{\rm sad} = (2.707, 2.707)$.

We solved (34) with the original Algorithm 2.1 and with a variant
that uses only the norm of the KKT conditions as the merit function in all
iterations. We observed that, if we use the KKT residual norm as the only
merit function, there is no guarantee that Algorithm 2.1 will converge to
a local minimum. In fact, the algorithm tends to converge to saddle points.
On the other hand, the strategy of switching merit functions ensures that
Algorithm 2.1 converges to local minima.
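This behavior can be checked numerically. The sketch below evaluates a central-difference gradient and Hessian of (34a) at the quoted stationary points; the sign pattern of $f$ is the reconstruction given in (34a), under which all quoted points are indeed stationary, the Hessian at the saddle points has eigenvalues of mixed sign, and $x^3_{\min}$ has a positive-definite Hessian.

```python
import numpy as np

def f(v):
    x1, x2 = v
    return ((x1 - 1)*(x1 - 2)*(x1 - 3) + (x1 - 2)*(x1 - 3)*(x2 - 1)
            - (x1 - 3)*(x2 - 1)*(x2 - 2) - (x2 - 1)*(x2 - 2)*(x2 - 3))

def grad(v, h=1e-6):
    # Central-difference gradient.
    e = np.eye(2)
    return np.array([(f(v + h*e[i]) - f(v - h*e[i])) / (2*h) for i in range(2)])

def hess_eigs(v, h=1e-4):
    # Central-difference Hessian, then its eigenvalues.
    e = np.eye(2)
    H = np.array([[(f(v + h*e[i] + h*e[j]) - f(v + h*e[i] - h*e[j])
                    - f(v - h*e[i] + h*e[j]) + f(v - h*e[i] - h*e[j])) / (4*h*h)
                   for j in range(2)] for i in range(2)])
    return np.linalg.eigvalsh(H)

for p in [(1.293, 1.293), (2.707, 2.707), (2.5, 1.5)]:
    v = np.array(p)
    print(p, "grad:", np.round(grad(v), 3), "eigs:", np.round(hess_eigs(v), 2))
```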
Finally, we mention that the performance of the algorithm is very good and comparable with that of other primal-dual interior-point algorithms.
For example, our algorithm seems to have superior performance, in terms of the number of iterations, compared to the interior-point algorithms described in Refs. 4 and 5.

References

1. Rustem, B., Equality and Inequality Constrained Optimization Algorithms with
Convergent Stepsizes, Journal of Optimization Theory and Applications, Vol.
76, pp. 429-453, 1993.

2. El-Bakry, A. S., Tapia, R. A., Tsuchiya, T., and Zhang, Y., On the Formulation and Theory of the Newton Interior-Point Method for Nonlinear Programming, Journal of Optimization Theory and Applications, Vol. 89, pp. 507-541, 1996.

3. Yamashita, H., Globally Convergent Primal-Dual Interior-Point Method for Constrained Optimization, Technical Report, Mathematical Systems Institute, Shinjuku-Ku, Tokyo, Japan, 1995.

4. Gay, D. M., Overton, M. L., and Wright, M. H., A Primal-Dual Interior Method for Nonconvex Nonlinear Programming, Technical Report 97-4-08, Bell Laboratories, Murray Hill, New Jersey, 1997.

5. Vanderbei, R. J., and Shanno, D. F., An Interior-Point Algorithm for Nonconvex Nonlinear Programming, Technical Report SOR-97-21, Department of Civil Engineering and Operations Research, Princeton University, Princeton, New Jersey, 1997.

6. Akrotirianakis, I., and Rustem, B., A Primal-Dual Interior-Point Algorithm with an Exact and Differentiable Merit Function for Nonlinear Programming, Optimization Methods and Software, Vol. 14, pp. 1-36, 2000.

7. Forsgren, A., and Gill, P. E., Primal-Dual Interior-Point Methods for Nonconvex Nonlinear Programming, SIAM Journal on Optimization, Vol. 8, pp. 1132-1152, 1998.

8. Bartholomew-Biggs, M. C., IP from an SQP Point of View, Optimization Methods and Software, Vol. 16, pp. 69-84, 2001.

9. Chamberlain, R. M., Lemarechal, C., Pedersen, H. C., and Powell, M. J. D., The Watchdog Technique for Forcing Convergence in Algorithms for Constrained Optimization, Mathematical Programming Study, Vol. 16, pp. 1-17, 1982.

10. Zakovic, S., Rustem, B., and Pantelides, C., An Interior-Point Algorithm for Computing Saddle Points of Constrained Continuous Minimax, Annals of Operations Research, Vol. 99, pp. 59-78, 2000.

11. Yamashita, H., and Yabe, H., Superlinear and Quadratic Convergence of Some Primal-Dual Interior Point Methods for Constrained Optimization, Mathematical Programming, Vol. 75, pp. 377-397, 1996.

12. Akrotirianakis, I., and Rustem, B., A Globally Convergent Interior-Point Algorithm for Nonlinear Programming Problems, Technical Report 97-14, Department of Computing, Imperial College, London, UK, 1997.

13. Hock, W., and Schittkowski, K., Test Examples for Nonlinear Programming Codes, Springer Verlag, New York, NY, 1981.

14. Vanderbei, R. J., Large-Scale Nonlinear AMPL Models, Princeton University, Princeton, New Jersey, 1997; see http://www.princeton.edu/~rvdb/ampl/nlmodels.

15. Powell, M. J. D., A Fast Algorithm for Nonlinearly Constrained Optimization Calculations, Numerical Analysis Proceedings, Biennial Conference, Dundee, Scotland, Edited by G. A. Watson, Lecture Notes in Mathematics, Springer Verlag, Berlin, Germany, Vol. 630, pp. 144-157, 1978.

16. Gill, P. E., Murray, W., and Wright, M. H., Practical Optimization, Academic Press, New York, NY, 1981.

17. Rustem, B., Convergent Stepsizes for Constrained Optimization Algorithms, Journal of Optimization Theory and Applications, Vol. 49, pp. 135-160, 1986.

18. Rustem, B., Algorithms for Nonlinear Programming and Multiple-Objective Decisions, J. Wiley, New York, NY, 1998.