

Numerical Optimisation
Constraint optimisation:
Interior point methods

Marta M. Betcke
m.betcke@ucl.ac.uk,

Kiko Rullan
f.rullan@cs.ucl.ac.uk

Department of Computer Science,
Centre for Medical Image Computing,

Centre for Inverse Problems
University College London

Lecture 15


Convex constraint optimisation problem


    min_{x ∈ D ⊂ R^n}  f(x)                                      (COP)
    subject to  f_i(x) ≤ 0,  i = 1, . . . , m,
                Ax = b,

where

f : D → R is a convex, twice continuously differentiable function, D ⊂ R^n is convex,

f_i : R^n → R, i = 1, . . . , m, are convex, twice continuously differentiable functions,
A ∈ R^{p×n} with rank A = p < n.

We assume that

(COP) is solvable, i.e. an optimal x⋆ exists, and we denote the optimal value as p⋆ = f(x⋆);

(COP) is strictly feasible, i.e. there exists x ∈ D that satisfies Ax = b and f_i(x) < 0 for i = 1, . . . , m.

This means that Slater’s constraint qualification holds, thus there exist dual optimal λ⋆ ∈ R^m, ν⋆ ∈ R^p, which together with x⋆ satisfy the KKT conditions

    ∇f(x⋆) + ∑_{i=1}^m λ⋆_i ∇f_i(x⋆) + A^T ν⋆ = 0,               (KKT)
    Ax⋆ = b,
    f_i(x⋆) ≤ 0,  i = 1, . . . , m,
    λ⋆ ≥ 0,
    λ⋆_i f_i(x⋆) = 0,  i = 1, . . . , m.

Interior point methods

Interior point methods solve either

the problem (COP), by applying Newton’s method to a sequence of equality constrained problems, or

the conditions (KKT), by applying Newton’s method to a sequence of modified versions of the KKT conditions.

We will consider the barrier method and the primal-dual interior-point method.

Logarithmic barrier function

Rewrite (COP) making the inequality constraints implicit

    min_{x ∈ D ⊂ R^n}  f(x) + ∑_{i=1}^m I−(f_i(x))
    subject to  Ax = b,

where I− : R → R is the indicator function of the nonpositive reals

    I−(u) = 0 for u ≤ 0,    I−(u) = ∞ for u > 0.

I− is non-differentiable, thus we need a smooth approximation before Newton’s method can be applied.


Approximate I− with a smooth logarithmic barrier

Î−(u) = −(1/t) log(−u),    dom Î− = (−∞, 0),
where t > 0 is a parameter that sets the accuracy of the
approximation.

Like I−, Î− is convex, nondecreasing and by convention ∞ for
u > 0.
Unlike I−, Î− is differentiable and closed i.e. it increases to ∞
as u increases to 0.

Figure: the indicator function I− and the smooth barrier approximations Î−(u) = −(1/t) log(−u) for t = 1/2, 1, 2.

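As an illustration (my own sketch, not part of the original slides), a few lines of matplotlib reproduce such a figure, plotting Î− for several values of t:

import numpy as np
import matplotlib.pyplot as plt

u = np.linspace(-3, -1e-3, 500)          # barrier only defined for u < 0
for t in [0.5, 1.0, 2.0]:
    plt.plot(u, -np.log(-u) / t, label=f"t = {t}")
plt.axhline(0.0, color="k", lw=0.8)      # I_-(u) = 0 for u <= 0
plt.ylim(-2, 10)
plt.xlabel("u"); plt.ylabel("value"); plt.legend()
plt.title("Smooth logarithmic barrier approximations of I_-")
plt.show()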

Substituting Î− for I− yields the approximation

    min_{x ∈ D ⊂ R^n}  f(x) + ∑_{i=1}^m −(1/t) log(−f_i(x))
    subject to  Ax = b.

The objective function is convex, since −(1/t) log(−u) is convex and increasing in u and the f_i are convex, and it is differentiable, thus Newton’s method can be applied.

Logarithmic barrier

    φ(x) = −∑_{i=1}^m log(−f_i(x)),    dom φ = {x ∈ R^n : f_i(x) < 0, i = 1, . . . , m}

Gradient and Hessian of φ:

    ∇φ(x) = ∑_{i=1}^m 1/(−f_i(x)) ∇f_i(x),

    ∇²φ(x) = ∑_{i=1}^m 1/f_i(x)² ∇f_i(x) ∇f_i(x)^T + ∑_{i=1}^m 1/(−f_i(x)) ∇²f_i(x)
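As a sketch (my own illustration, not from the slides), the barrier, its gradient and its Hessian can be assembled from callables returning f_i(x), ∇f_i(x) and ∇²f_i(x); the function and argument names below are hypothetical.

import numpy as np

def log_barrier(x, fs, grads, hessians):
    """Value, gradient and Hessian of phi(x) = -sum_i log(-f_i(x)).

    fs, grads, hessians are lists of callables returning f_i(x) (scalar),
    grad f_i(x) (n-vector) and Hess f_i(x) (n-by-n matrix).
    Assumes x is strictly feasible, i.e. f_i(x) < 0 for all i.
    """
    n = x.size
    phi, g, H = 0.0, np.zeros(n), np.zeros((n, n))
    for f, gf, Hf in zip(fs, grads, hessians):
        fi, gi = f(x), gf(x)
        phi += -np.log(-fi)
        g += gi / (-fi)
        H += np.outer(gi, gi) / fi**2 + Hf(x) / (-fi)
    return phi, g, H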
Central path

Consider the equivalent problem

    min_{x ∈ D ⊂ R^n}  t f(x) + φ(x)                              (CENT)
    subject to  Ax = b.

We assume that (CENT) has a unique solution for each t > 0, and denote this solution by x⋆(t).

The set of points x⋆(t), t > 0, is called the central path. The points on the central path are characterised by the following necessary and sufficient centrality conditions: x⋆(t) is strictly feasible, i.e. satisfies

    Ax⋆(t) = b,    f_i(x⋆(t)) < 0,  i = 1, . . . , m,

and there exists a ν̂ ∈ R^p such that

    0 = t∇f(x⋆(t)) + ∇φ(x⋆(t)) + A^T ν̂                            (CENT:COND)
      = t∇f(x⋆(t)) + ∑_{i=1}^m 1/(−f_i(x⋆(t))) ∇f_i(x⋆(t)) + A^T ν̂.

Example: LP with inequality constraints

    min_{x ∈ R^n}  c^T x
    subject to  Ax ≤ b.

The logarithmic barrier:

    φ(x) = −∑_{i=1}^m log(b_i − a_i^T x),    dom φ = {x : Ax < b},

where a_i^T =: A_{i,:} is the i-th row of A. The gradient and Hessian:

    ∇φ(x) = ∑_{i=1}^m 1/(b_i − a_i^T x) a_i,    ∇²φ(x) = ∑_{i=1}^m 1/(b_i − a_i^T x)² a_i a_i^T.

Since x is strictly feasible, we have b_i − a_i^T x > 0 and the Hessian is nonsingular iff A has rank n.
The centrality condition (CENT:COND), i.e. ∇φ(x⋆(t)) ‖ −c, becomes

    tc + ∑_{i=1}^m 1/(b_i − a_i^T x) a_i = 0.


Example: central path

Figure: central path for an LP with inequality constraints (Boyd & Vandenberghe, Fig. 11.2).


Dual points from central path

Claim: Every point on the central path yields a dual feasible point and hence a lower bound on p⋆. More precisely, the pair

    λ⋆_i(t) = 1/(−t f_i(x⋆(t))),  i = 1, . . . , m,    ν⋆(t) = ν̂/t

is dual feasible.

Proof: λ⋆(t) > 0 because x⋆(t) is strictly feasible, i.e. f_i(x⋆(t)) < 0,
and from the optimality conditions (CENT:COND) we read off that x⋆(t) minimises the Lagrangian

    L(x, λ, ν) = f(x) + ∑_{i=1}^m λ_i f_i(x) + ν^T (Ax − b)

for λ = λ⋆(t), ν = ν⋆(t).

This means that λ⋆(t), ν⋆(t) are dual feasible, the dual function is finite and

    g(λ⋆(t), ν⋆(t)) = f(x⋆(t)) + ∑_{i=1}^m λ⋆_i(t) f_i(x⋆(t)) + ν⋆(t)^T (Ax⋆(t) − b)
                    = f(x⋆(t)) − m/t,

using λ⋆_i(t) = −1/(t f_i(x⋆(t))) and Ax⋆(t) − b = 0. The duality gap is thus m/t, so x⋆(t) is not more than m/t suboptimal,

    f(x⋆(t)) − p⋆ ≤ m/t,

and x⋆(t) converges to an optimal point as t → ∞.

Interpretation via KKT conditions

We can interpret the central path conditions as a continuous deformation of (KKT). A point x is equal to x⋆(t) iff there exist λ, ν such that

    ∇f(x) + ∑_{i=1}^m λ_i ∇f_i(x) + A^T ν = 0,                   (KKT:CENT)
    Ax = b,
    f_i(x) ≤ 0,  i = 1, . . . , m,
    λ ≥ 0,
    −λ_i f_i(x) = 1/t,  i = 1, . . . , m.

The only difference to (KKT) is the complementarity condition −λ_i f_i(x) = 1/t. For large t, the point x⋆(t), λ⋆(t), ν⋆(t) almost satisfies the KKT conditions.

Newton for centering problem (CENT)

The Newton step for the centering problem (CENT) (a linear equality constrained problem) reads

    [ t∇²f(x) + ∇²φ(x)   A^T ] [ Δx_n ]       [ t∇f(x) + ∇φ(x) ]
    [ A                   0  ] [ ν_n  ]  = −  [ 0              ].

Here we assumed feasibility. We can interpret this Newton step for (CENT) as Newton applied directly to solving the modified system (KKT:CENT) in a particular way.

Newton for modified KKT (KKT:CENT)

First, eliminate λ using λ_i = −1/(t f_i(x)) from the (KKT:CENT) system:

    ∇f(x) + ∑_{i=1}^m 1/(−t f_i(x)) ∇f_i(x) + A^T ν = ∇f(x) + (1/t) ∇φ(x) + A^T ν = 0,
    Ax = b.

To find the Newton step for the solution of the nonlinear equations above, we form the Taylor expansion of the nonlinear term

    ∇f(x + v) + (1/t) ∇φ(x + v) ≈ ∇f(x) + (1/t) ∇φ(x) + (∇²f(x) + (1/t) ∇²φ(x)) v =: g + H v.

Replace the nonlinear term with this linear approximation,

    H v + A^T ν = −g,    A v = 0,

and observe that the Newton step Δx_n, ν_n for (CENT) satisfies

    t H Δx_n + A^T ν_n = −t g,    A Δx_n = 0.

Comparing with the Newton step for (KKT:CENT) yields

    v = Δx_n,    ν = (1/t) ν_n.

This shows that the Newton step for the centering problem (CENT) can be interpreted (after scaling of the dual variable) as the Newton step for solving the modified (KKT:CENT) system.
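To make the centering step concrete, here is a minimal sketch (my own illustration, not from the slides) of one feasible Newton step for (CENT), assembling and solving the block KKT system above; the callable names are hypothetical.

import numpy as np

def newton_centering_step(x, t, grad_f, hess_f, grad_phi, hess_phi, A):
    """One feasible Newton step for min_x t*f(x) + phi(x) s.t. Ax = b.

    grad_f, hess_f, grad_phi, hess_phi are callables evaluated at x;
    A is the p-by-n equality constraint matrix.
    Returns the primal step dx_n and the dual variable nu_n.
    """
    n, p = x.size, A.shape[0]
    H = t * hess_f(x) + hess_phi(x)
    g = t * grad_f(x) + grad_phi(x)
    # Block KKT system: [H, A^T; A, 0] [dx; nu] = -[g; 0]
    K = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = -np.concatenate([g, np.zeros(p)])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]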
The barrier method

Require: strictly feasible x_s := x^(0), t := t^(0) > 0, µ > 1
Require: tolerance ε > 0
1: loop
2:   Centering step:
       obtain x⋆(t) by minimising (CENT), starting from x_s
3:   Update x_s := x⋆(t)
4:   if m/t < ε then
5:     break  {stopping criterion: ε-suboptimal point}
6:   end if
7:   Increase t := µt
8: end loop
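As an illustration (my own sketch with a hypothetical toy LP, not code from the slides), the barrier method for min c^T x s.t. Ax ≤ b, reusing the LP barrier expressions from the earlier example and a damped Newton centering step:

import numpy as np

def centering_newton(x, t, c, A, b, iters=50, alpha=0.01, beta=0.5):
    """Minimise psi(x) = t*c^T x - sum_i log(b_i - a_i^T x) by damped Newton."""
    def psi(z):
        d = b - A @ z
        return t * c @ z - np.sum(np.log(d)) if np.all(d > 0) else np.inf
    for _ in range(iters):
        d = b - A @ x                        # slacks, strictly positive
        g = t * c + A.T @ (1.0 / d)          # gradient of psi
        H = A.T @ np.diag(1.0 / d**2) @ A    # Hessian of psi
        dx = -np.linalg.solve(H, g)
        lam2 = -g @ dx                       # Newton decrement squared
        if lam2 / 2.0 < 1e-10:
            break
        s = 1.0
        while s > 1e-10 and psi(x + s * dx) > psi(x) - alpha * s * lam2:
            s *= beta                        # backtracking (Armijo) line search
        x = x + s * dx
    return x

def barrier_method(x0, c, A, b, t0=1.0, mu=10.0, eps=1e-6):
    x, t, m = x0, t0, A.shape[0]
    while m / t >= eps:
        x = centering_newton(x, t, c, A, b)  # centering step
        t *= mu                              # increase t
    return x

# Hypothetical toy problem: minimise x1 + x2 over the box 0 <= x <= 1
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
c = np.array([1.0, 1.0])
print(barrier_method(np.array([0.5, 0.5]), c, A, b))   # approx. [0, 0]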
Barrier method: remarks

Centering step: can be solved by any method for linearly constrained minimisation, in particular Newton’s method.

Exact centering is not necessary, since the central path has no significance beyond that it leads to the solution of the original problem (COP) as t → ∞. Inexact centering will still produce a convergent sequence, however λ⋆(t), ν⋆(t) are then not exactly dual feasible (this can be corrected).

The difference in cost between exact and good approximate centering is marginal (a few Newton steps), so centering is usually assumed exact.

Choice of µ: trade-off between the number of outer (centering) and inner (Newton) iterations.

Small µ ≈ 1: good initial guess for Newton, i.e. a small number of inner iterations closely following the central path, but a large number of outer iterations to reach the desired accuracy ε.

Larger µ: only a few outer iterations, but with a large number of inner iterations, straying from the central path.

In practice, for a large range of µ ∈ (3, 100) these effects balance each other, yielding approximately the same total number of Newton iterations. Values around 10–20 seem to work well. For the best worst-case bound on the total number of Newton steps, set µ close to 1.

Choice of t^(0): trade-off between the number of inner iterations in the first step and the number of outer iterations.

Choose it so that m/t^(0) ≈ f(x^(0)) − p⋆. For instance, if a dual feasible point λ, ν is known with duality gap η = f(x^(0)) − g(λ, ν), then we can set t^(0) = m/η (the first centering step will compute a pair with the same duality gap as the initial primal and dual feasible points).

Choose t^(0) as a minimiser of inf_{t,ν} ‖t∇f(x^(0)) + ∇φ(x^(0)) + A^T ν‖, a measure of the deviation of x^(0) from x⋆(t) (a least squares problem in t, ν).

Infeasible Newton method for the centering step: choose x^(0) ∈ D, f_i(x^(0)) < 0, i = 1, . . . , m, but not necessarily Ax^(0) = b. Assuming the centering problem is strictly feasible, a full Newton step is taken at some point during the first centering step, and thereafter the iterates are primal feasible and the algorithm coincides with the standard barrier method.

Computing a strictly feasible point

The barrier method requires a strictly feasible point x^(0). When such a point is not known, the barrier method is preceded by a preliminary stage, called phase I, to compute a strictly feasible point (or to find that the constraints are infeasible).

Consider the set of inequalities and equalities

    f_i(x) ≤ 0,  i = 1, . . . , m,    Ax = b.                     (FEAS)

Assume we have a point x^(0) ∈ ∏_{i=1}^m dom f_i with Ax^(0) = b, i.e. the inequalities are possibly not satisfied at x^(0).

Phase I: max

Goal: find a strictly feasible solution of the equalities and inequalities:

    min  s                                                        (PH1:MAX)
    subject to  f_i(x) ≤ s,  i = 1, . . . , m,
                Ax = b.

s is a bound on the maximum infeasibility of the inequalities; the goal is to drive this maximum below 0.

The problem (PH1:MAX) is always strictly feasible: we can initialise with x = x^(0) and any s larger than max_{i=1,...,m} f_i(x^(0)), and apply the barrier method (see the sketch below for the affine case).

Let p⋆_I denote the optimal value of (PH1:MAX).

p⋆_I < 0: (FEAS) has a strictly feasible solution. If (x, s) is feasible for (PH1:MAX) with s < 0, then x satisfies f_i(x) < 0. We do not need to solve (PH1:MAX) to high accuracy; we can terminate when s < 0.

p⋆_I > 0: (FEAS) is infeasible. We do not need to solve (PH1:MAX) to high accuracy; we can terminate when a dual feasible point is found with positive objective, which proves that p⋆_I > 0.

p⋆_I = 0: if the minimum is attained at x⋆ and s⋆ = 0, then the set of inequalities is feasible, but not strictly feasible. If p⋆_I = 0 and the minimum is not attained, the inequalities are infeasible.

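As a sketch (my own illustration with hypothetical data, not from the slides), for affine inequalities f_i(x) = a_i^T x − b_i the phase I problem (PH1:MAX) is itself an LP in (x, s); here it is set up with scipy purely for illustration:

import numpy as np
from scipy.optimize import linprog

def phase1_max_affine(A_ineq, b_ineq, C=None, d=None):
    """Phase I (max infeasibility) for a_i^T x <= b_i and optionally Cx = d.

    Solves  min s  s.t.  a_i^T x - b_i <= s,  Cx = d  over z = (x, s).
    The inequalities are strictly feasible iff the optimal s is < 0.
    """
    m, n = A_ineq.shape
    c = np.zeros(n + 1); c[-1] = 1.0                      # minimise s
    A_ub = np.hstack([A_ineq, -np.ones((m, 1))])          # a_i^T x - s <= b_i
    A_eq = np.hstack([C, np.zeros((C.shape[0], 1))]) if C is not None else None
    res = linprog(c, A_ub=A_ub, b_ub=b_ineq, A_eq=A_eq, b_eq=d,
                  bounds=[(None, None)] * (n + 1))        # x and s are free
    return res.x[:-1], res.x[-1]

# Hypothetical example: find a point with x1 + x2 <= 1, x1 >= 0, x2 >= 0
x, s = phase1_max_affine(np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]),
                         np.array([1.0, 0.0, 0.0]))
print(x, s)   # s < 0, so x is strictly feasible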

Phase I: sum

    min  1^T s                                                    (PH1:SUM)
    subject to  f_i(x) ≤ s_i,  i = 1, . . . , m,
                Ax = b,
                s ≥ 0.

For fixed x, the optimal value of s_i is max{f_i(x), 0}. Thus we are minimising the sum of infeasibilities.

The optimal value is 0, and it is achieved iff the original set of equalities and inequalities is feasible.

When the system of equalities and inequalities is infeasible, the solution often violates only a small number of constraints, i.e. we have identified a large feasible subset. This is more informative than merely finding that the m inequalities together are mutually infeasible.

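A parallel sketch for the sum formulation (again my own, for affine inequalities, equality constraints omitted for brevity); the per-constraint slacks s_i read off which constraints cannot be satisfied:

import numpy as np
from scipy.optimize import linprog

def phase1_sum_affine(A_ineq, b_ineq):
    """Phase I (sum of infeasibilities) for a_i^T x <= b_i.

    Solves  min 1^T s  s.t.  a_i^T x - b_i <= s_i,  s >= 0  over z = (x, s).
    """
    m, n = A_ineq.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])         # minimise sum(s)
    A_ub = np.hstack([A_ineq, -np.eye(m)])                 # a_i^T x - s_i <= b_i
    bounds = [(None, None)] * n + [(0, None)] * m          # x free, s >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ineq, bounds=bounds)
    return res.x[:n], res.x[n:]

# Hypothetical infeasible system: x <= 0 and x >= 1
x, s = phase1_sum_affine(np.array([[1.0], [-1.0]]), np.array([0.0, -1.0]))
print(x, s)   # sum(s) > 0 => infeasible; nonzero s_i point to violated constraints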

Termination near phase II central path

Assume x^(0) ∈ D ∩ ∏_{i=1}^m dom f_i with Ax^(0) = b.

Modified phase I optimisation problem

    min  s
    subject to  f_i(x) ≤ s,  i = 1, . . . , m,
                f(x) ≤ M,
                Ax = b,

with M > max{f(x^(0)), p⋆}.

The central path of this modified problem, (x⋆(t̄), s⋆(t̄)), t̄ > 0, is characterised by

    ∑_{i=1}^m 1/(s − f_i(x)) = t̄,

    1/(M − f(x)) ∇f(x) + ∑_{i=1}^m 1/(s − f_i(x)) ∇f_i(x) + A^T ν = 0.

If (x, s) with s = 0 is on this central path, then x is also on the central path for (COP), the latter being strictly feasible (s = 0):

    t∇f(x) + ∑_{i=1}^m 1/(−f_i(x)) ∇f_i(x) + A^T ν = 0

with t = 1/(M − f(x)).

Phase I via infeasible Newton

We express (COP) in the equivalent form

    min  f(x)
    subject to  f_i(x) ≤ s,  i = 1, . . . , m,
                Ax = b,  s = 0.

Start the barrier method using the infeasible Newton method to solve

    min  t f(x) − ∑_{i=1}^m log(s − f_i(x))
    subject to  Ax = b,  s = 0,

which can be initialised with any x ∈ D and any s > max_i f_i(x) (the maximum infeasibility).

Provided the problem is strictly feasible, the infeasible Newton method will eventually take an undamped step, and thereafter we will have s = 0, i.e. x strictly feasible.


Finding a point in the domain D

The same trick can be applied if a point in D ∩ ∏_{i=1}^m dom f_i (the domain of the objective and of the inequality constraint functions) is not known.

Apply the infeasible Newton method to

    min  t f(x + z_0) − ∑_{i=1}^m log(s − f_i(x + z_i))
    subject to  Ax = b,  s = 0,  z_0 = 0,  z_1 = 0, . . . , z_m = 0,

initialised with z_0, z_i such that x + z_0 ∈ dom f and x + z_i ∈ dom f_i.

Disadvantage: no good stopping criterion for infeasible problems;
the residual simply fails to converge to 0.


Characteristic performance

Typically the cost of solving a set of convex inequalities and
linear equalities using the barrier method is modest, and
approximately constant, as long as the problem is not very
close to the boundary between feasibility and infeasibility.

When the problem is very close to the boundary, the number
of Newton steps required to find a strictly feasible point or
produce a certificate of infeasibility grows.

When the problem is exactly on the boundary between strictly
feasible and infeasible, for example, feasible but not strictly
feasible, the cost becomes infinite.

Typically the infeasible start Newton method works very well
provided the inequalities are feasible, and not very close to the
boundary between feasible and infeasible.

When the feasible set is just barely nonempty, a phase I method is a far better choice. The phase I method gracefully handles the infeasible case; the infeasible start Newton method, in contrast, simply fails to converge.


Primal-dual interior point method

The primal-dual interior point method is similar to the barrier method, with the following key differences:

There is only one loop or iteration, i.e., there is no distinction
between inner and outer iterations as in the barrier method.
At each iteration, both the primal and dual variables are
updated.

The search directions in a primal-dual interior-point method
are obtained from Newton’s method, applied to modified KKT
equations (i.e., the optimality conditions for the logarithmic
barrier centering problem). The primal-dual search directions
are similar to, but not quite the same as, the search directions
that arise in the barrier method.

In a primal-dual interior-point method, the primal and dual
iterates are not necessarily feasible.

Primal-dual methods are usually more efficient than barrier methods, and do not require strict feasibility.


Primal-dual search direction

As in the barrier method, we start from (KKT:CENT), which we rewrite in the form

    0 = r_t(x, λ, ν) := (r_dual, r_cent, r_prim),  where

    r_dual = ∇f(x) + J(x)^T λ + A^T ν,
    r_cent = −diag(λ) F(x) − (1/t) 1,
    r_prim = Ax − b,

with t > 0, F : R^n → R^m and its Jacobian

    F(x) = (f_1(x), . . . , f_m(x))^T,    J(x) = DF(x) = [∇f_1(x)^T; . . . ; ∇f_m(x)^T].

If x, λ, ν satisfy r_t(x, λ, ν) = 0 (and f_i(x) < 0), then x = x⋆(t), λ = λ⋆(t), ν = ν⋆(t). In particular, x is primal feasible, and λ, ν are dual feasible, with duality gap m/t.
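A minimal sketch (my own notation; grad_f, F and J are assumed callables) evaluating the three residual blocks:

import numpy as np

def residual(x, lam, nu, t, grad_f, F, J, A, b):
    """r_t = (r_dual, r_cent, r_prim) for the modified KKT system.

    F(x) returns the m-vector (f_1(x), ..., f_m(x)); J(x) its m-by-n Jacobian.
    """
    r_dual = grad_f(x) + J(x).T @ lam + A.T @ nu
    r_cent = -lam * F(x) - 1.0 / t           # = -diag(lam) F(x) - (1/t) 1
    r_prim = A @ x - b
    return r_dual, r_cent, r_prim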
Newton step for the solution of r_t(x, λ, ν) = 0 at y = (x, λ, ν), a primal-dual strictly feasible point, i.e. F(x) < 0, λ > 0.

Difference to the barrier method: we do not eliminate λ before taking the Newton step

    r_t(y + Δy) ≈ r_t(y) + Dr_t(y) Δy = 0,

where Δy = (Δx, Δλ, Δν) is the primal-dual search direction.

Written in terms of x, λ, ν:

    [ ∇²f(x) + ∑_{i=1}^m λ_i ∇²f_i(x)   J(x)^T        A^T ] [ Δx ]       [ r_dual ]
    [ −diag(λ) J(x)                     −diag(F(x))   0   ] [ Δλ ]  = −  [ r_cent ]        (PD:N)
    [ A                                 0             0   ] [ Δν ]       [ r_prim ]


Comparison of primal-dual and barrier search directions

Eliminate Δλ_pd from (PD:N). From the second block,

    Δλ_pd = −diag(F(x))^{−1} diag(λ) J(x) Δx_pd + diag(F(x))^{−1} r_cent,

and substituting into the first block gives

    [ H_pd   A^T ] [ Δx_pd ]       [ r_dual + J(x)^T diag(F(x))^{−1} r_cent ]
    [ A      0   ] [ Δν_pd ]  = −  [ r_prim                                 ]

                                   [ ∇f(x) + (1/t) ∑_{i=1}^m 1/(−f_i(x)) ∇f_i(x) + A^T ν ]
                              = −  [ r_prim                                              ],

where

    H_pd = ∇²f(x) + ∑_{i=1}^m λ_i ∇²f_i(x) + ∑_{i=1}^m λ_i/(−f_i(x)) ∇f_i(x) ∇f_i(x)^T.


Compare to the Newton step in the barrier method (in the infeasible form)

    [ H_bar   A^T ] [ Δx_bar ]       [ t∇f(x) + ∑_{i=1}^m 1/(−f_i(x)) ∇f_i(x) ]
    [ A       0   ] [ ν_bar  ]  = −  [ r_prim                                 ],

where

    H_bar = t∇²f(x) + ∑_{i=1}^m 1/(−f_i(x)) ∇²f_i(x) + ∑_{i=1}^m 1/f_i(x)² ∇f_i(x) ∇f_i(x)^T.

Multiplying the first block by 1/t and changing variables Δν_bar = (1/t) ν_bar − ν:

    [ (1/t) H_bar   A^T ] [ Δx_bar ]       [ ∇f(x) + (1/t) ∑_{i=1}^m 1/(−f_i(x)) ∇f_i(x) + A^T ν ]
    [ A             0   ] [ Δν_bar ]  = −  [ r_prim                                              ].


The right hand sides are identical.

The only difference is in the coefficient matrices:

    H_pd = ∇²f(x) + ∑_{i=1}^m λ_i ∇²f_i(x) + ∑_{i=1}^m λ_i/(−f_i(x)) ∇f_i(x) ∇f_i(x)^T,

    (1/t) H_bar = ∇²f(x) + ∑_{i=1}^m 1/(−t f_i(x)) ∇²f_i(x) + ∑_{i=1}^m 1/(t f_i(x)²) ∇f_i(x) ∇f_i(x)^T.

When x, λ, ν satisfy −λ_i f_i(x) = 1/t, the coefficient matrices (and hence the directions) coincide.

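A quick numeric sanity check of this statement (my own sketch, with a hypothetical quadratic objective and affine inequality constraints, so the ∇²f_i terms vanish): setting λ_i = −1/(t f_i(x)) makes the two matrices agree up to rounding error.

import numpy as np

rng = np.random.default_rng(0)
n, m, t = 3, 4, 5.0
P = rng.standard_normal((n, n)); P = P.T @ P + np.eye(n)   # f(x) = 0.5 x^T P x
G = rng.standard_normal((m, n))                            # f_i(x) = g_i^T x - h_i
h = np.abs(rng.standard_normal(m)) + 1.0
x = np.zeros(n)                                            # f_i(x) = -h_i < 0 here

fi = G @ x - h                       # constraint values (all negative)
lam = -1.0 / (t * fi)                # enforce -lam_i * f_i(x) = 1/t

H_pd  = P + G.T @ np.diag(lam / (-fi)) @ G
H_bar = t * P + G.T @ np.diag(1.0 / fi**2) @ G
print(np.allclose(H_pd, H_bar / t))  # True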

The surrogate duality gap

In primal-dual interior point methods, the iterates x^(k), λ^(k), ν^(k) are not necessarily feasible, except in the limit as the algorithm converges.

Hence, we cannot easily evaluate the duality gap η^(k) in the k-th step, as we do in the outer loop of the barrier method.

Instead, we define the surrogate duality gap, for any x that satisfies F(x) < 0 and λ ≥ 0, as

    η(x, λ) = −F(x)^T λ.

The surrogate gap would be the duality gap if x were primal feasible and λ, ν were dual feasible, i.e. if r_prim = 0, r_dual = 0. Note that the value of t corresponding to the surrogate duality gap η is obtained from η ≈ m/t, i.e. t = m/η.

Primal-dual interior point

Require: x that satisfies F(x) < 0, λ > 0
Require: µ > 1
Require: Tolerances ε_feas > 0, ε > 0
1: repeat
2: Determine t: t := µm/η
3: Compute primal-dual search direction ∆ypd
4: Line search: determine step length s > 0 and set

y := y + s∆ypd
5: until ‖r_prim‖ ≤ ε_feas, ‖r_dual‖ ≤ ε_feas and η ≤ ε


Remarks

The parameter t is set to µ·(m/η), i.e. µ times the value of t associated with the current surrogate duality gap η. If x, λ, ν were central with parameter t (and therefore with duality gap m/t), then we would increase t by the factor µ (as in the barrier method).

Values of the parameter µ on the order of 10 appear to work
well.

The primal-dual interior-point algorithm terminates when x is
primal feasible and λ, ν are dual feasible (within the tolerance
ε_feas) and the surrogate gap is smaller than the tolerance ε.
Since the primal-dual interior-point method often has faster
than linear convergence, it is common to choose ε_feas, ε small.


The line search in the primal-dual interior point method is a standard backtracking line search, based on the norm of the residual, and modified to ensure that λ > 0 and F(x) < 0.

Start with s_max = sup{s ∈ [0, 1] : λ + sΔλ ≥ 0} and multiply by ρ ∈ (0, 1) until F(x + sΔx_pd) < 0. Continue multiplying until we have

    ‖r_t(x + sΔx_pd, λ + sΔλ_pd, ν + sΔν_pd)‖ ≤ (1 − αs) ‖r_t(x, λ, ν)‖.

Common choices for the backtracking parameters are the same as for Newton’s method: α in the range 0.01 to 0.1 and ρ in the range 0.3 to 0.8.

One iteration of the primal-dual interior-point algorithm is the same as one step of the infeasible Newton method applied to solving r_t(x, λ, ν) = 0, but modified to ensure λ > 0 and F(x) < 0 (or, equivalently, with dom r_t restricted to λ > 0 and F(x) < 0). The same arguments used in the proof of convergence of the infeasible start Newton method show that the line search for the primal-dual method always terminates in a finite number of steps.
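Putting the pieces together, a compact sketch (my own, for affine inequality constraints f_i(x) = g_i^T x − h_i and equalities Ax = b; all names are illustrative assumptions) of one primal-dual iteration: surrogate gap, (PD:N) solve, and the modified backtracking line search.

import numpy as np

def pd_step(x, lam, nu, grad_f, hess_f, G, h, A, b, mu=10.0, alpha=0.05, rho=0.5):
    """One primal-dual interior point iteration for f_i(x) = g_i^T x - h_i."""
    m, n, p = G.shape[0], x.size, A.shape[0]
    F = G @ x - h                                   # constraint values, F < 0
    eta = -F @ lam                                  # surrogate duality gap
    t = mu * m / eta                                # t := mu * m / eta

    def resid(x, lam, nu):
        F = G @ x - h
        return np.concatenate([grad_f(x) + G.T @ lam + A.T @ nu,
                               -lam * F - 1.0 / t,
                               A @ x - b])

    # Assemble and solve the (PD:N) system (affine f_i => Hess f_i = 0)
    K = np.block([[hess_f(x),           G.T,              A.T],
                  [-np.diag(lam) @ G,   -np.diag(F),      np.zeros((m, p))],
                  [A,                   np.zeros((p, m)), np.zeros((p, p))]])
    dy = np.linalg.solve(K, -resid(x, lam, nu))
    dx, dlam, dnu = dy[:n], dy[n:n + m], dy[n + m:]

    # Backtracking: keep lam > 0, F(x) < 0, and decrease the residual norm
    s = 1.0 if np.all(dlam >= 0) else min(1.0, 0.99 * np.min(-lam[dlam < 0] / dlam[dlam < 0]))
    while np.any(G @ (x + s * dx) - h >= 0):
        s *= rho
    r0 = np.linalg.norm(resid(x, lam, nu))
    while np.linalg.norm(resid(x + s * dx, lam + s * dlam, nu + s * dnu)) > (1 - alpha * s) * r0:
        s *= rho
    return x + s * dx, lam + s * dlam, nu + s * dnu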