
Numerical Optimisation:
Constraint Optimisation

Marta M. Betcke
m.betcke@ucl.ac.uk,

Kiko Rullan
f.rullan@cs.ucl.ac.uk

Department of Computer Science,
Centre for Medical Image Computing,

Centre for Inverse Problems
University College London

Lecture 12


Constraint optimisation problem

minx∈Rn f(x)   subject to   ci(x) = 0, i ∈ E;   ci(x) ≥ 0, i ∈ I.      (COP)

f : D ⊂ Rn → R: objective function, assumed smooth
ci : Rn → R: constraint functions, assumed smooth
i ∈ E : equality constraints
i ∈ I : inequality constraints
x ∈ Rn : optimisation variable

The feasible set Ω is the set of all points satisfying the constraints

Ω = {x ∈ D : ci(x) = 0, i ∈ E; ci(x) ≥ 0, i ∈ I}.

Optimal value: p? = infx∈Ω f(x)

p? = ∞ if (COP) is infeasible, i.e. Ω = ∅
p? = −∞ if (COP) is unbounded below on Ω
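As a concrete illustration of the form (COP), the minimal sketch below sets up the toy problem min x1 + x2 subject to x1² + x2² − 2 = 0 (revisited later in this lecture) and hands it to a general-purpose solver. The use of SciPy's SLSQP method is an assumption of this sketch, not part of the lecture.

```python
# Minimal sketch (assumes SciPy is available): the toy problem
#   min x1 + x2  s.t.  x1^2 + x2^2 - 2 = 0,
# written in the generic (COP) form with one equality constraint.
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0] + x[1]                      # objective
c_eq = lambda x: x[0]**2 + x[1]**2 - 2.0       # equality constraint c(x) = 0

res = minimize(f, x0=np.array([1.0, 0.0]),
               constraints=[{"type": "eq", "fun": c_eq}],
               method="SLSQP")
print(res.x)   # expected to be close to (-1, -1)
```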


Examples: smooth constraints

Smooth constraints can describe regions with kinks.

Example: 1-norm:

‖x‖1 = |x1|+ |x2| ≤ 1

can be described as

x1 + x2 ≤ 1, x1 − x2 ≤ 1, −x1 + x2 ≤ 1, −x1 − x2 ≤ 1.

Example: pointwise max

min f(x) = max(x², x)

can be reformulated as

min t   s.t.   t ≥ x, t ≥ x².
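A sketch of this epigraph reformulation in code, under the same illustrative assumption of SciPy's SLSQP solver as above: the extra variable t is minimised subject to the two smooth inequality constraints t ≥ x and t ≥ x².

```python
# Minimal sketch (assumes SciPy): min max(x^2, x) via the epigraph form
#   min t  s.t.  t - x >= 0,  t - x^2 >= 0,  with variables z = (x, t).
import numpy as np
from scipy.optimize import minimize

obj = lambda z: z[1]                                           # minimise t
cons = [{"type": "ineq", "fun": lambda z: z[1] - z[0]},        # t - x >= 0
        {"type": "ineq", "fun": lambda z: z[1] - z[0]**2}]     # t - x^2 >= 0

res = minimize(obj, x0=np.array([1.0, 2.0]), constraints=cons, method="SLSQP")
print(res.x)   # expected near (0, 0), where max(x^2, x) attains its minimum
```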


Types of minimisers of constraint problems

A point x? ∈ Ω is a global minimiser if

f (x?) ≤ f (x), ∀x ∈ Ω.

A point x? ∈ Ω is a local minimiser if

∃N (x?) : f (x?) ≤ f (x), ∀x ∈ N (x?) ∩ Ω.

A point x? ∈ Ω is a strict (or strong) local minimiser if

∃N (x?) : f (x?) < f (x), ∀x ∈ N (x?) ∩ Ω, x ≠ x?.

A point x? ∈ Ω is an isolated local minimiser if

∃N (x?) : x? is the only local minimiser in N (x?) ∩ Ω.

Active set

Feasibility problem: Find x such that all constraints are satisfied at x.

The active set A(x) at any feasible x ∈ Ω consists of the equality constraint index set E together with the indices of the inequality constraints i ∈ I for which ci(x) = 0:

A(x) = E ∪ {i ∈ I : ci(x) = 0}.

At a feasible point x ∈ Ω, the inequality constraint i ∈ I is said to be active if ci(x) = 0 and inactive if the strict inequality ci(x) > 0 holds.
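A small sketch of how the active set could be computed numerically for a problem stored as a list of constraint functions; the dictionary format and the tolerance tol for deciding ci(x) = 0 are illustrative choices, not something prescribed in the lecture.

```python
# Minimal sketch: numerical active set A(x) = E ∪ {i ∈ I : c_i(x) = 0}.
# Constraints are given as dicts {"type": "eq"/"ineq", "fun": c_i}; a small
# tolerance decides when an inequality constraint is treated as active.
def active_set(x, constraints, tol=1e-8):
    active = []
    for i, c in enumerate(constraints):
        if c["type"] == "eq" or abs(c["fun"](x)) <= tol:
            active.append(i)
    return active

# Example: unit disc constraint 1 - x1^2 - x2^2 >= 0 is active on the boundary.
cons = [{"type": "ineq", "fun": lambda x: 1.0 - x[0]**2 - x[1]**2}]
print(active_set([1.0, 0.0], cons))   # [0]  boundary point: constraint active
print(active_set([0.0, 0.0], cons))   # []   interior point: constraint inactive
```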


Single equality constraint

min x1 + x2   s.t.   x1² + x2² − 2 = 0.

Feasibility (Taylor expansion of c1, using c1(x) = 0):

0 = c1(x+s) ≈ c1(x) + ∇c1(x)Ts = ∇c1(x)Ts

Decrease direction (Taylor expansion of f):

0 > f(x+s) − f(x) ≈ ∇f(x)Ts

The only situation in which no such s exists is when, for some scalar λ1,

∇f(x) = λ1∇c1(x).
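At the minimiser x? = (−1,−1) of this example, ∇f(x?) = (1, 1) and ∇c1(x?) = (−2,−2), so λ1 = −1/2. The snippet below is a small numerical check of this relation; the least-squares solve is just one convenient way to recover λ1.

```python
# Minimal sketch: check grad f(x) = lambda1 * grad c1(x) at x = (-1, -1) for
#   f(x) = x1 + x2,  c1(x) = x1^2 + x2^2 - 2.
import numpy as np

x = np.array([-1.0, -1.0])
grad_f = np.array([1.0, 1.0])
grad_c = np.array([2 * x[0], 2 * x[1]])          # (-2, -2)

lam, *_ = np.linalg.lstsq(grad_c.reshape(-1, 1), grad_f, rcond=None)
print(lam)                                        # approx [-0.5]
print(np.allclose(grad_f, lam[0] * grad_c))       # True
```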


Single inequality constraint

min x1 + x2   s.t.   2 − x1² − x2² ≥ 0.

Feasibility (Taylor expansion of c1):

0 ≤ c1(x+s) ≈ c1(x) + ∇c1(x)Ts

Decrease direction (Taylor expansion of f):

0 > f(x+s) − f(x) ≈ ∇f(x)Ts

Case: x inside the circle, i.e. c1(x) > 0. Any sufficiently short step in the direction

s = −α∇f(x), α > 0,

remains feasible and decreases f (provided ∇f(x) ≠ 0).


Single inequality constraint

min x1 + x2   s.t.   2 − x1² − x2² ≥ 0.

Feasibility (Taylor expansion of c1, using c1(x) = 0):

0 ≤ c1(x+s) ≈ c1(x) + ∇c1(x)Ts = ∇c1(x)Ts

Decrease direction (Taylor expansion of f):

0 > f(x+s) − f(x) ≈ ∇f(x)Ts

Case: x on the boundary of the circle, i.e. c1(x) = 0

The set of directions s satisfying both

∇f(x)Ts < 0   and   ∇c1(x)Ts ≥ 0

is empty only if ∇f(x) = λ1∇c1(x) for some λ1 ≥ 0.

Linear independence constraint qualification (LICQ)

Given a point x and its active set A(x), the linear independence constraint qualification (LICQ) holds if the set of active constraint gradients {∇ci(x), i ∈ A(x)} is linearly independent.

Note that for LICQ to be satisfied, none of the active constraint gradients can be 0.

Example: LICQ is not satisfied if we define the equality constraint as c1(x) = (x1² + x2² − 2)² = 0 (same feasible region, different constraint): ∇c1 vanishes at every feasible point.

There are other constraint qualifications, e.g. Slater's condition for convex problems.

Theorem: 1st order necessary conditions

Lagrangian function: L(x, λ) = f(x) − ∑i∈E∪I λi ci(x).

Let x? be a local solution of (COP), let f and ci be continuously differentiable, and let LICQ hold at x?. Then there exists a Lagrange multiplier vector λ? with components λ?i, i ∈ E ∪ I, such that the following Karush-Kuhn-Tucker (KKT) conditions are satisfied at (x?, λ?):

∇xL(x?, λ?) = 0,                  (1a)
ci(x?) = 0,  ∀i ∈ E,              (1b)
ci(x?) ≥ 0,  ∀i ∈ I,              (1c)
λ?i ≥ 0,  ∀i ∈ I,                 (1d)
λ?i ci(x?) = 0,  ∀i ∈ E ∪ I.      (1e)

Strict complementarity condition

The complementarity condition (1e) can be made stronger. Given a local solution x? of (COP) and a vector λ? satisfying the KKT conditions (1), the strict complementarity condition holds if exactly one of λ?i and ci(x?) is zero for each i ∈ I. In other words, λ?i > 0 for each i ∈ I ∩ A(x?).

Strict complementarity makes it easier for the algorithms to
identify the active set and converge quickly to the solution.

For a given solution x? of (COP) there may be many vectors λ? which satisfy the KKT conditions (1). However, if LICQ holds, the optimal λ? is unique.
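As an illustration, the sketch below checks conditions (1a)–(1e) numerically for the single-inequality example min x1 + x2 s.t. 2 − x1² − x2² ≥ 0: at x? = (−1,−1) we have ∇c1(x?) = (2, 2), so λ?1 = 1/2 ≥ 0, consistent with the sign condition (1d). The candidate point and multiplier are taken as given here.

```python
# Minimal sketch: numerical KKT residuals (1a)-(1e) for
#   min x1 + x2  s.t.  c1(x) = 2 - x1^2 - x2^2 >= 0,
# at the candidate x* = (-1, -1) with multiplier lambda* = 1/2.
# (1b) is vacuous here: there are no equality constraints.
import numpy as np

x = np.array([-1.0, -1.0])
lam = 0.5

grad_f = np.array([1.0, 1.0])
c1 = 2.0 - x[0]**2 - x[1]**2
grad_c1 = np.array([-2 * x[0], -2 * x[1]])

stationarity = grad_f - lam * grad_c1        # (1a): gradient of the Lagrangian
print(np.allclose(stationarity, 0.0))        # True
print(c1 >= -1e-12)                          # (1c): primal feasibility, True
print(lam >= 0.0)                            # (1d): dual feasibility, True
print(np.isclose(lam * c1, 0.0))             # (1e): complementarity, True
```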


Lagrangian: primal problem

For convenience we change (and refine) our notation for the
constraint optimisation problem. The following slides are based on
Boyd (Convex Optimization I).

Let p? be the optimal value of the primal problem

minx∈D⊂Rn f(x)   subject to   fi(x) ≤ 0, i = 1, . . . ,m;   hi(x) = 0, i = 1, . . . , p      (COP:P)

The Lagrangian L : D × Rm × Rp → R

L(x, λ, ν) = f(x) + ∑i=1,...,m λi fi(x) + ∑i=1,...,p νi hi(x)

λi are Lagrange multipliers associated with fi (x) ≤ 0
νi are Lagrange multipliers associated with hi (x) = 0


Lagrange dual function

Lagrange dual function: g : Rm × Rp → R

g(λ, ν) = infx∈D L(x, λ, ν) = infx∈D ( f(x) + ∑i=1,...,m λi fi(x) + ∑i=1,...,p νi hi(x) )      (LD)

g is concave and can be −∞ for some (λ, ν).

Lower bound property: If λ ≥ 0, then g(λ, ν) ≤ p?.
Proof: For any feasible x̃ and λ ≥ 0 we have

f(x̃) ≥ L(x̃, λ, ν) ≥ infx∈D L(x, λ, ν) = g(λ, ν).

Minimising over all feasible x̃ gives p? ≥ g(λ, ν).
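A tiny numerical illustration of the lower bound property (illustrative only, not from the slides): for min x² subject to 1 − x ≤ 0 the optimal value is p? = 1, and the dual function is g(λ) = infx (x² + λ(1 − x)) = λ − λ²/4, which never exceeds 1.

```python
# Minimal sketch: lower bound property for min x^2 s.t. 1 - x <= 0 (p* = 1).
# Dual function in closed form: g(lambda) = lambda - lambda^2 / 4.
import numpy as np

p_star = 1.0
lambdas = np.linspace(0.0, 10.0, 1001)          # any lambda >= 0
g = lambdas - lambdas**2 / 4.0
print(np.all(g <= p_star + 1e-12))              # True: g(lambda) <= p* everywhere
print(lambdas[np.argmax(g)], g.max())           # maximised at lambda = 2, g = 1 = p*
```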


Convex problem

Convex problem in standard form

minx∈D⊂Rn f(x)   subject to   fi(x) ≤ 0, i = 1, . . . ,m;   hi(x) = 0, i = 1, . . . , p

f is convex and D is convex
fi are convex

hi are affine, i.e. the constraints take the form aiTx = bi

The feasible set Ω of a convex problem is a convex set.


Example: least norm solution of linear equations

minx∈Rn xTx   subject to   Ax = b

Lagrangian: L(x , ν) = xTx + νT(Ax − b)

Dual function:

g(ν) = infx∈Rn L(x, ν) = infx∈Rn ( xTx + νT(Ax − b) )
L(x, ν) is strictly convex in x; to minimise over x, set the gradient to zero:

∇xL(x, ν) = 2x + ATν = 0   ⇒   xmin = −(1/2)ATν

Plug xmin into g

g(ν) = L(xmin, ν) = −(1/4)νTAATν − bTν.

g is a concave function of ν.

Lower bound property: p? ≥ −(1/4)νTAATν − bTν for all ν.
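A quick numerical check of this example (illustrative only): for a random A and b, the primal optimum is the least-norm solution x? = AT(AAT)−1b with p? = x?Tx?, and maximising g over ν (stationarity gives ν = −2(AAT)−1b) reproduces the same value.

```python
# Minimal sketch: least-norm example, checking p* = g(nu*) numerically.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 6))        # fat matrix => Ax = b is underdetermined
b = rng.standard_normal(3)

AAt_inv = np.linalg.inv(A @ A.T)
x_star = A.T @ AAt_inv @ b             # least-norm solution of Ax = b
p_star = x_star @ x_star

nu_star = -2.0 * AAt_inv @ b           # maximiser of g(nu)
g = lambda nu: -0.25 * nu @ A @ A.T @ nu - b @ nu

print(np.isclose(p_star, g(nu_star)))       # True: the bound is tight here
print(g(rng.standard_normal(3)) <= p_star)  # weak duality for an arbitrary nu
```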

Example: standard form LP

minx∈Rn cTx   subject to   Ax = b, x ≥ 0

Lagrangian:
L(x, λ, ν) = cTx + νT(Ax − b) − λTx = −bTν + (c + ATν − λ)Tx

Dual function:

g(λ, ν) = infx∈Rn L(x, λ, ν) = infx∈Rn ( −bTν + (c + ATν − λ)Tx )

L(x, λ, ν) is affine in x, hence

g(λ, ν) = −bTν   if ATν − λ + c = 0,   and −∞ otherwise.

g is linear on the affine domain {(λ, ν) : ATν − λ + c = 0}, hence concave.

Lower bound property: p? ≥ −bTν if ATν + c ≥ 0.
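A numerical sanity check (illustrative; it assumes SciPy's linprog): solve a small standard-form LP and its dual max −bTν s.t. ATν + c ≥ 0, written as a minimisation for the solver, and compare the two optimal values.

```python
# Minimal sketch (assumes SciPy): standard form LP and its dual.
#   primal:  min c^T x  s.t.  Ax = b, x >= 0
#   dual:    max -b^T nu  s.t.  A^T nu + c >= 0
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 0.0])
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])

primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)

# Dual as a minimisation:  min b^T nu  s.t.  -A^T nu <= c,  nu free.
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(None, None)])

print(np.isclose(primal.fun, -dual.fun))   # True: p* = d* (strong duality for LP)
```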

Example: equality constraint norm minimisation

minx∈Rn ‖x‖   subject to   Ax = b

Lagrangian:
L(x , ν) = ‖x‖+ νT(b − Ax) = ‖x‖+ bTν − νTAx

infx∈Rn (‖x‖ − yTx) = 0 if ‖y‖? ≤ 1, and −∞ otherwise, where ‖v‖? = sup‖u‖≤1 uTv is the dual norm of ‖ · ‖:
If ‖y‖? ≤ 1, then ‖x‖ − yTx ≥ 0 for all x, with equality for x = 0.
If ‖y‖? > 1, choose x = tu with ‖u‖ ≤ 1 and uTy = ‖y‖? > 1; then

‖x‖ − yTx = t(‖u‖ − ‖y‖?) → −∞ as t → ∞

Dual function:

g(ν) = infx∈Rn L(x, ν) = bTν if ‖ATν‖? ≤ 1, and −∞ otherwise

Lower bound property: p? ≥ bTν if ‖ATν‖? ≤ 1.

Conjugate function

The conjugate of a function f is

f?(y) = supx∈D ( yTx − f(x) )

The conjugate f ? is convex (even if f is not)

Figure: Boyd, Convex Optimization I
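A small numerical illustration (not from the slides): approximating the conjugate of f(x) = x² on a grid; the result should match the known conjugate f?(y) = y²/4.

```python
# Minimal sketch: grid approximation of the conjugate f*(y) = sup_x (y*x - f(x))
# for f(x) = x^2, whose conjugate is known in closed form: f*(y) = y^2 / 4.
import numpy as np

xs = np.linspace(-10.0, 10.0, 20001)           # grid over the domain of f
f = xs**2

ys = np.linspace(-3.0, 3.0, 7)
f_conj = np.array([np.max(y * xs - f) for y in ys])

print(np.allclose(f_conj, ys**2 / 4.0, atol=1e-4))   # True
```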


Lagrange dual and conjugate function

minx∈Rn f(x)   subject to   Ax ≤ b, Cx = d

Lagrangian:

L(x , λ, ν) = f (x) + λT(Ax − b) + νT(Cx − d)
= f (x) + (ATλ+ CTν)Tx − bTλ− dTν.

Dual function:

g(λ, ν) = infx∈D ( f(x) + (ATλ + CTν)Tx − bTλ − dTν )
        = − supx∈D ( (−ATλ − CTν)Tx − f(x) ) − bTλ − dTν
        = −f?(−ATλ − CTν) − bTλ − dTν
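For instance, applying this identity to the earlier least-norm problem (no inequality constraints, C = A, d = b, f(x) = xTx with conjugate f?(y) = (1/4)yTy) gives g(ν) = −f?(−ATν) − bTν = −(1/4)νTAATν − bTν, in agreement with the dual function derived directly on the least-norm slide.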


Lagrangian: dual problem

Lagrange dual problem

max g(λ, ν)   subject to   λ ≥ 0

finds the best lower bound on p? obtainable from the Lagrange dual function

is a convex optimisation problem; we denote its optimal value by d?

(λ, ν) are dual feasible if λ ≥ 0 and (λ, ν) ∈ dom g

often simplified by making the implicit constraint (λ, ν) ∈ dom g explicit


Weak and strong duality

Weak duality: d? ≤ p?

always holds (for convex and nonconvex problems)

can be used to find nontrivial lower bounds for difficult
problems

Strong duality: d? = p?

does not hold in general

holds for convex problems under constraint qualifications


Slater’s constraint qualification

Strong duality holds for a convex problem

minx∈D f(x)   subject to   fi(x) ≤ 0, i = 1, . . . ,m;   Ax = b,

if it is strictly feasible, i.e.

∃x ∈ intD : fi(x) < 0, i = 1, . . . ,m,  Ax = b.

Slater's condition also guarantees that the dual optimum is attained (if p? > −∞)
can be sharpened: e.g. intD can be replaced by relintD (the interior relative to the affine hull of D); linear inequalities do not need to hold with strict inequality, …

other constraint qualifications exist e.g. LICQ


Example: inequality form LP

Primal problem

minx∈Rn cTx   subject to   Ax ≤ b

Dual function:

g(λ) = infx∈Rn ( (c + ATλ)Tx − bTλ ) = −bTλ if ATλ + c = 0, and −∞ otherwise

Dual problem

max −bTλ   subject to   ATλ + c = 0,  λ ≥ 0

From Slater's condition: p? = d? if ∃x̃ : Ax̃ < b.
In fact, p? = d? holds except when both the primal and the dual are infeasible.

Example: quadratic program

Primal problem (assume P symmetric positive definite):

minx∈Rn xTPx   subject to   Ax ≤ b

Dual function:

g(λ) = infx∈Rn ( xTPx + λT(Ax − b) ) = −(1/4)λTAP−1ATλ − bTλ

Dual problem:

max −(1/4)λTAP−1ATλ − bTλ   subject to   λ ≥ 0

From Slater's condition: p? = d? if ∃x̃ : Ax̃ < b. In fact, p? = d? always.

Example: nonconvex problem with strong duality

Primal problem:

minx∈Rn xTAx + 2bTx   subject to   xTx ≤ 1

where A ⋡ 0, i.e. A is not positive semidefinite, so the problem is nonconvex.

Dual function:

g(λ) = infx∈Rn ( xT(A + λI)x + 2bTx − λ )

unbounded below if A + λI ⋡ 0, or if A + λI ⪰ 0 and b ∉ R(A + λI);
otherwise the infimum is attained at x = −(A + λI)†b, giving g(λ) = −bT(A + λI)†b − λ.

Dual problem:

max −bT(A + λI)†b − λ   subject to   A + λI ⪰ 0,  b ∈ R(A + λI)

and the equivalent semidefinite program:

max −t − λ   subject to
[ A + λI   b
  bT       t ] ⪰ 0

Strong duality holds although the primal problem is not convex (not easy to show).

KKT conditions revisited

The Karush-Kuhn-Tucker conditions are satisfied at (x?, λ?, ν?) if

∇f(x?) + ∑i=1,...,m λ?i ∇fi(x?) + ∑i=1,...,p ν?i ∇hi(x?) = 0,      (2a)
hi(x?) = 0,  i = 1, . . . , p   [primal constraints]               (2b)
fi(x?) ≤ 0,  i = 1, . . . ,m   [primal constraints]                (2c)
λ?i ≥ 0,  i = 1, . . . ,m   [dual constraints]                     (2d)
λ?i fi(x?) = 0,  i = 1, . . . ,m   [complementary slackness]       (2e)

Necessary condition: If strong duality holds and x?, λ?, ν? are optimal, then they must satisfy the KKT conditions (2). For any problem for which strong duality holds, the KKT conditions are necessary conditions for optimality.

KKT conditions for convex problem

Sufficient condition: If x?, ν?, λ? satisfy the KKT conditions and the problem is convex, then x?, ν?, λ? are primal and dual optimal:

from complementary slackness and primal feasibility: f(x?) = f(x?) + ∑i=1,...,m λ?i fi(x?) + ∑i=1,...,p ν?i hi(x?) = L(x?, λ?, ν?), since each λ?i fi(x?) = 0 and each hi(x?) = 0

g(λ?, ν?) = infx L(x, λ?, ν?), and from the 1st order condition (2a) and convexity of L(·, λ?, ν?) in x (f, fi convex, λ?i ≥ 0, hi affine) the infimum is attained at x?, hence g(λ?, ν?) = L(x?, λ?, ν?)

Thus f(x?) = g(λ?, ν?); together with weak duality this shows that x? and (λ?, ν?) are primal and dual optimal.

If Slater's condition is satisfied: x? is optimal if and only if there exist λ?, ν? that satisfy the KKT conditions

recall that Slater implies strong duality, and that the dual optimum is attained

generalises the optimality condition ∇f(x) = 0 for an unconstrained problem
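To close, a small numerical illustration of the sufficient condition (again illustrative, not part of the slides): for the convex problem min x1² + x2² subject to 2 − x1 − x2 ≤ 0, the point x? = (1, 1) with λ? = 2 satisfies the KKT conditions (2a)–(2e), and indeed f(x?) equals the dual value g(λ?).

```python
# Minimal sketch: KKT and strong duality check for the convex problem
#   min x1^2 + x2^2   s.t.  f1(x) = 2 - x1 - x2 <= 0,
# with candidate primal/dual optimum x* = (1, 1), lambda* = 2.
import numpy as np

x = np.array([1.0, 1.0])
lam = 2.0

grad_f0 = 2.0 * x                       # gradient of the objective
grad_f1 = np.array([-1.0, -1.0])        # gradient of f1
f1 = 2.0 - x[0] - x[1]

print(np.allclose(grad_f0 + lam * grad_f1, 0.0))   # (2a) stationarity
print(f1 <= 1e-12)                                 # (2c) primal feasibility
print(lam >= 0.0)                                  # (2d) dual feasibility
print(np.isclose(lam * f1, 0.0))                   # (2e) complementary slackness

p_star = x @ x                                     # f(x*) = 2
g = lambda l: 2.0 * l - l**2 / 2.0                 # dual function in closed form
print(np.isclose(p_star, g(lam)))                  # True: f(x*) = g(lambda*)
```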