
Numerical Optimisation: Background


Marta M. Betcke
m.betcke@ucl.ac.uk,

Kiko Rullan
f.rullan@cs.ucl.ac.uk

Department of Computer Science,
Centre for Medical Image Computing,

Centre for Inverse Problems
University College London

Lecture 1

M.M. Betcke Numerical Optimisation

Mathematical optimisation problem

min_{x ∈ Rn} f(x)  subject to  ci(x) = 0, i ∈ E,
                               ci(x) ≥ 0, i ∈ I.

f : Rn → R: objective function
ci : Rn → R: constraint function,
i ∈ E equality constraints,
i ∈ I inequality constraints.
x ∈ Rn: optimisation variable

The optimal solution x⋆ has the smallest value of f among all x
which satisfy the constraints.
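To make the abstract formulation concrete, here is a minimal sketch (not from the slides) of a small instance with one equality and one inequality constraint, solved with scipy.optimize.minimize; the objective and constraints are hypothetical choices whose solution can be checked by hand (the minimiser is (0.5, 1.5)).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective f(x) = (x1 - 1)^2 + (x2 - 2)^2
f = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2

# One equality constraint (i in E) and one inequality constraint (i in I),
# in the form c_i(x) = 0 and c_i(x) >= 0 used on the slide.
constraints = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - 2},  # c(x) = 0
    {"type": "ineq", "fun": lambda x: x[0]},             # c(x) >= 0
]

res = minimize(f, x0=np.zeros(2), constraints=constraints)
print(res.x)  # approximately [0.5, 1.5]
```

On the equality constraint x1 + x2 = 2 the gradient condition 2(x1 − 1) = 2(x2 − 2) gives x = (0.5, 1.5); the inequality is inactive there.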


Example: geodesics

Geodesics are the shortest surface paths between two points.

Figure: https://academo.org/demos/geodesics/


A very short and incomplete early history

Source http://www.mitrikitti.fi/opthist.html

Antiquity: geometrical optimisation problems
300 BC Euclid considers the minimal distance between a point
and a line, and proves that a square has the greatest area among
the rectangles with a given total length of edges

before Calculus of Variations: isolated optimisation
problems
1615 J. Kepler: optimal dimensions of a wine barrel (with
smallest variation of volume w.r.t. barrel parameters).1

He also considered an early version of the secretary problem
(an optimal stopping problem) when he started to look for a new wife
1636 P. Fermat shows that at an extreme point the derivative
of a function vanishes. In 1657 Fermat shows that light
travels between two points in minimal time.

1http://www.maa.org/press/periodicals/convergence/kepler-the-volume-of-a-wine-barrel-solving-the-problem-of-maxima-wine-barrel-design


A very short and incomplete early history cont.

Source http://www.mitrikitti.fi/opthist.html

Calculus of Variations
I. Newton (1660s) and G.W. von Leibniz (1670s) create
mathematical analysis that forms the basis of calculus of
variations (CoV). Some separate finite optimisation problems
are also considered
1696 Johann and Jacob Bernoulli study the brachistochrone
problem; calculus of variations is born
1740 L. Euler’s publication begins the research on general
theory of calculus of variations


A very short and incomplete early history cont.

Source http://www.mitrikitti.fi/opthist.html

Least squares
1806 A.-M. Legendre presents the least squares method, which
J.C.F. Gauss also claimed to have invented. Legendre made
contributions in the field of CoV, too

Linear Programming
1826 J.B.J. Fourier formulates LP-problem for solving
problems arising in mechanics and probability theory
1939 L.V. Kantorovich presents the LP-model and an algorithm
for solving it. In 1975 Kantorovich and T.C. Koopmans receive
the Nobel memorial prize in economics for their contributions
on the LP-problem
1947 G. Dantzig, who worked for the US Air Force, presents the
Simplex method for solving LP-problems; von Neumann
establishes the theory of duality for LP-problems


Example: transportation problem

2 factories, Fi

12 retail outlets, Rj

each factory Fi can produce up to ai tonnes of a certain
compound per week

each retail outlet Rj has a weekly demand of bj tonnes of the
compound

the cost of shipping one tonne of the compound from Fi to
Rj is cij

Goal: find the optimal amounts to ship from each factory to
each outlet which satisfy demand at minimal cost.


Example: transportation problem cont.

min ∑_{ij} cij xij

subject to

∑_{j=1}^{12} xij ≤ ai ,  i = 1, 2

∑_{i=1}^{2} xij ≥ bj ,  j = 1 . . . 12

xij ≥ 0,  i = 1, 2, j = 1 . . . 12.

This is a linear programming problem because the objective function
and all constraints are linear.
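The problem above can be solved directly with an LP solver. The following sketch uses scipy.optimize.linprog on hypothetical data, downsized to 2 factories and 3 outlets for brevity; the capacities, demands, and costs are invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 2 factories, 3 retail outlets
a = np.array([30, 25])          # weekly capacities a_i (tonnes)
b = np.array([15, 20, 10])      # weekly demands  b_j (tonnes)
c = np.array([[2.0, 3.0, 1.0],  # shipping costs c_ij per tonne
              [5.0, 4.0, 8.0]])
m, n = c.shape

# Variables x_ij flattened row-major; build A_ub x <= b_ub from
#   sum_j x_ij <= a_i  (supply)  and  -sum_i x_ij <= -b_j  (demand).
A_supply = np.kron(np.eye(m), np.ones(n))
A_demand = -np.kron(np.ones(m), np.eye(n))
A_ub = np.vstack([A_supply, A_demand])
b_ub = np.concatenate([a, -b])

res = linprog(c.ravel(), A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
x = res.x.reshape(m, n)
print(res.fun, x)
```

For this data the optimal cost is 115: factory 1 serves outlets 1 and 3 fully and ships its remaining 5 tonnes to outlet 2, factory 2 covers the rest of outlet 2's demand.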


Example: nonlinear optimisation

min f(x, y)

subject to  −y + 2x − 1/2 ≥ 0,

            y − (1/4)x^2 + 1/2 ≥ 0.

Figure: the feasible region of the constraints in the (x, y)-plane

Convexity

A set S ⊂ Rn is convex if for any two points x , y ∈ S the line
segment connecting them lies entirely in S

αx + (1− α)y ∈ S, ∀α ∈ [0, 1].

Examples:

Figure: (a) unit balls {x ∈ Rn : ‖x‖p ≤ 1} for p = 1, 2, 3, ∞; (b) polyhedron
{x ∈ Rn : Ax = b, Cx ≤ d}
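As a numerical spot check (not a proof), the sketch below samples points in the p-norm unit ball for the values of p shown in the figure and verifies that convex combinations stay inside; it also shows that for p < 1 the analogous set fails to be convex.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_ball(x, p):
    """Membership test for the unit ball {x : ||x||_p <= 1}."""
    return np.linalg.norm(x, ord=p) <= 1 + 1e-12

# For p >= 1 the ball is convex: every sampled convex combination
# of two members stays inside.
for p in [1, 2, 3, np.inf]:
    for _ in range(100):
        x, y = rng.uniform(-1, 1, 2), rng.uniform(-1, 1, 2)
        x = x / max(1, np.linalg.norm(x, ord=p))  # project into the ball
        y = y / max(1, np.linalg.norm(y, ord=p))
        a = rng.uniform()
        assert in_ball(a*x + (1 - a)*y, p)

# For p < 1 the "ball" is NOT convex, e.g. p = 1/2 in R^2:
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
mid = 0.5*x + 0.5*y
halfnorm = np.sum(np.abs(mid)**0.5)**2
print(halfnorm)  # approximately 2 > 1, so the midpoint is outside
```

Both endpoints have "p = 1/2 norm" 1, yet their midpoint has value 2, which is exactly why the figure restricts to p ≥ 1.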


Convexity

A function f is convex if

its domain S is a convex set,
for any two points x , y ∈ S,

f (αx + (1− α)y) ≤ αf (x) + (1− α)f (y), ∀α ∈ [0, 1].

A function f is strictly convex if for x ≠ y

f(αx + (1 − α)y) < αf(x) + (1 − α)f(y), ∀α ∈ (0, 1).

A function f is concave if −f is convex.

Examples:

linear function f(x) = c^T x + α, where c ∈ Rn, α ∈ R

convex quadratic function f(x) = x^T Hx, where H ∈ Rn×n is
symmetric positive (semi)definite

Classification of optimisation problems

convex vs non-convex
smooth vs non-smooth
constrained vs unconstrained
linear vs quadratic vs nonlinear
small vs large scale
local vs global
stochastic vs deterministic
discrete vs continuous

Unconstrained minimisation

Let f : Rn → R be a smooth function for which we can evaluate f
and its derivatives at any given point x ∈ Ω ⊆ Rn.

Unconstrained optimisation problem

min_{x ∈ Ω ⊆ Rn} f(x).   (1)

A point x⋆ is a global minimiser if f(x⋆) ≤ f(x), ∀x ∈ Ω ⊆ Rn.

A point x⋆ is a local minimiser if ∃N(x⋆) : f(x⋆) ≤ f(x),
∀x ∈ N(x⋆), where N(y) is a neighbourhood of y (an open set which
contains y).

Unconstrained minimisation

A point x⋆ is a strict (or strong) local minimiser if
∃N(x⋆) : f(x⋆) < f(x), ∀x ∈ N(x⋆), x ≠ x⋆.

Examples:

f(x) = 2 : every point is a (weak) local minimiser

f(x) = (x − 2)^4 : x⋆ = 2 is a strict local minimiser (also a
global one)

f(x) = cos(x) : x⋆ = π + 2kπ, k ∈ Z, are all strict local
minimisers (but not strict global minimisers on R)

Unconstrained minimisation

A point x⋆ is an isolated local minimiser if ∃N(x⋆) : x⋆ is the
only local minimiser in N(x⋆).

Some strict local minimisers are not isolated, e.g.
f(x) = x^4 cos(1/x) + 2x^4, f(0) = 0, has a strict local minimiser
at x⋆ = 0, but there are strict local minimisers (albeit
f(xj) ≥ xj^4 > 0 = f(x⋆)) at nearby points
xj → 0, j → ∞.

However, all isolated local minimisers are strict.
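The strict-but-not-isolated example can be explored numerically. The sketch below (an illustration, with bracket choices derived from the analysis above) locates one local minimiser of f(x) = x^4 cos(1/x) + 2x^4 in each interval between consecutive even multiples of π in 1/x; the minimisers march towards 0 while staying strictly positive in value.

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: x**4 * np.cos(1.0/x) + 2*x**4 if x != 0 else 0.0

# One strict local minimiser lies near x_j = 1/((2j+1)*pi), where
# cos(1/x) = -1; bracket it between the neighbouring even multiples.
mins = []
for j in range(2, 6):
    left, right = 1/((2*j + 2)*np.pi), 1/(2*j*np.pi)
    res = minimize_scalar(f, bounds=(left, right), method="bounded")
    mins.append(res.x)

print(mins)  # a decreasing sequence of local minimisers approaching 0
```

Each minimiser satisfies f(xj) ≥ xj^4 > 0 = f(0) since cos(1/x) + 2 ≥ 1, matching the bound on the slide.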


Unconstraint minimisation

Difficulties with global minimisation:

f(x) = (cos(20πx) + 2)|x|
has a unique global minimiser x⋆ = 0, but algorithms usually
get trapped in one of the many local minima.

For convex functions, every local minimiser is also a global
minimiser.
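The trapping behaviour is easy to reproduce. In the sketch below (an illustration, with an arbitrarily chosen search interval away from the origin), a bounded local search on f(x) = (cos(20πx) + 2)|x| settles in a nearby local minimum whose value is far above the global minimum f(0) = 0.

```python
import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x: (np.cos(20*np.pi*x) + 2.0) * np.abs(x)

# A local search started away from 0 stalls in a nearby local
# minimum instead of reaching the global minimiser x* = 0:
# on [0.4, 0.6] we have f(x) >= |x| >= 0.4.
trapped = minimize_scalar(f, bounds=(0.4, 0.6), method="bounded")
print(trapped.x, trapped.fun)  # a local minimum with f well above 0
```

Since f(x) ≥ |x|, any minimiser found inside [0.4, 0.6] has value at least 0.4, while the global minimum is 0.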

Figure: (a) f(x) = (cos(20πx) + 2)|x|; (b) convex functions |x|, exp(|x|) − 1, x^2

Taylor theorem

Let f : Rn → R be continuously differentiable. Then for p ∈ Rn we
have

f(x + p) = f(x) + ∇f(x + tp)^T p

for some t ∈ (0, 1).
If moreover f is twice continuously differentiable, we also have

∇f(x + p) = ∇f(x) + ∫_0^1 ∇²f(x + tp) p dt

and

f(x + p) = f(x) + ∇f(x)^T p + (1/2) p^T ∇²f(x + tp) p,

for some t ∈ (0, 1).
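The integral form can be verified numerically. The sketch below (an illustration on an assumed test function, f(x) = Σ_i exp(x_i), whose gradient and Hessian are known in closed form) approximates ∫_0^1 ∇²f(x + tp) p dt with a composite trapezoidal rule and compares against ∇f(x + p).

```python
import numpy as np

# f(x) = sum_i exp(x_i):  grad f(x) = exp(x),  hess f(x) = diag(exp(x)).
x = np.array([0.1, -0.3, 0.5])
p = np.array([0.2, 0.4, -0.1])

# Approximate the integral in
#   grad f(x + p) = grad f(x) + int_0^1 hess f(x + t p) p dt
# with a composite trapezoidal rule in t (diagonal Hessian, so the
# integrand is simply exp(x + t p) * p componentwise).
ts = np.linspace(0.0, 1.0, 2001)
integrand = np.exp(x[None, :] + ts[:, None] * p[None, :]) * p[None, :]
dt = ts[1] - ts[0]
integral = dt * (integrand[0]/2 + integrand[1:-1].sum(axis=0) + integrand[-1]/2)

lhs = np.exp(x + p)            # grad f(x + p), known exactly here
rhs = np.exp(x) + integral     # right-hand side of Taylor's theorem
print(lhs, rhs)
```

With 2000 trapezoidal panels the two sides agree to roughly single-precision accuracy, as the quadrature error is O(dt^2).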


Theorem [1st order necessary condition]

Let f : Rn → R be continuously differentiable in an open
neighbourhood of a local minimiser x⋆, then ∇f(x⋆) = 0.

Proof: [by contradiction]
Suppose that ∇f(x⋆) ≠ 0 and define p = −∇f(x⋆). Note that
p^T ∇f(x⋆) = −‖∇f(x⋆)‖₂² < 0. Furthermore, as ∇f is continuous
near x⋆, there exists T > 0 such that

p^T ∇f(x⋆ + tp) < 0,  t ∈ [0, T].

By Taylor's theorem, for any t̄ ∈ (0, T] we have

f(x⋆ + t̄p) = f(x⋆) + t̄ p^T ∇f(x⋆ + tp),  for some t ∈ (0, t̄).

Hence f(x⋆ + t̄p) < f(x⋆) for all t̄ ∈ (0, T], and we have found a
direction leading away from x⋆ along which f decreases, which
contradicts x⋆ being a local minimiser.

1st order necessary condition

We call x⋆ a stationary point if ∇f(x⋆) = 0.

By Theorem [1st order necessary condition] any local minimiser is
a stationary point. The converse is in general not true.

Theorem [2nd order necessary condition]

If x⋆ is a local minimiser of f and ∇²f exists and is continuous
in an open neighbourhood of x⋆, then ∇f(x⋆) = 0 and ∇²f(x⋆) is
positive semidefinite.

Proof: [by contradiction]
By Theorem [1st order necessary condition] we have ∇f(x⋆) = 0.
Assume ∇²f(x⋆) is not positive semidefinite. Then there exists a
vector p such that p^T ∇²f(x⋆) p < 0, and because ∇²f is
continuous near x⋆, there exists T > 0 such that
p^T ∇²f(x⋆ + tp) p < 0 for all t ∈ [0, T].

By Taylor's theorem, we have for any t̄ ∈ (0, T] and some t ∈ (0, t̄)

f(x⋆ + t̄p) = f(x⋆) + t̄ p^T ∇f(x⋆) + (1/2) t̄² p^T ∇²f(x⋆ + tp) p < f(x⋆).

We have found a descent direction for f away from x⋆, which
contradicts x⋆ being a local minimiser.

Theorem [2nd order sufficient condition]

Let f : Rn → R with ∇²f continuous in an open neighbourhood of x⋆.
If ∇f(x⋆) = 0 and ∇²f(x⋆) is positive definite, then x⋆ is a
strict local minimiser of f.

Proof:
Because the Hessian ∇²f is continuous and positive definite at x⋆,
we can choose a radius r > 0 so that ∇²f(x) remains positive
definite for all x in the open ball B₂(x⋆, r) = {y : ‖y − x⋆‖₂ < r}.
For any nonzero vector p with ‖p‖₂ < r, we have x⋆ + p ∈ B₂(x⋆, r) and

f(x⋆ + p) = f(x⋆) + p^T ∇f(x⋆) + (1/2) p^T ∇²f(x⋆ + tp) p   (2)
          = f(x⋆) + (1/2) p^T ∇²f(x⋆ + tp) p,                (3)

for some t ∈ (0, 1). Furthermore, x⋆ + tp ∈ B₂(x⋆, r), thus
p^T ∇²f(x⋆ + tp) p > 0 and
therefore f(x⋆ + p) > f(x⋆).
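In practice the two optimality conditions are checked with the gradient and the Hessian eigenvalues. The sketch below does this for a hypothetical quadratic, f(x1, x2) = x1^2 + x1 x2 + 2 x2^2, whose stationary point and (constant) Hessian are known exactly.

```python
import numpy as np

# Hypothetical example: f(x1, x2) = x1^2 + x1*x2 + 2*x2^2.
grad = lambda x: np.array([2*x[0] + x[1], x[0] + 4*x[1]])
H = np.array([[2.0, 1.0],
              [1.0, 4.0]])   # constant Hessian of f

x_star = np.zeros(2)

# 1st order necessary condition: the candidate is a stationary point.
print(grad(x_star))          # the zero vector

# 2nd order sufficient condition: all Hessian eigenvalues are
# strictly positive, so x_star is a strict local (here even global)
# minimiser.
eigvals = np.linalg.eigvalsh(H)
print(eigvals)               # 3 - sqrt(2) and 3 + sqrt(2), both > 0
```

Checking positive definiteness via `eigvalsh` (or a Cholesky factorisation) is the standard numerical substitute for the symbolic condition in the theorem.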


Necessary vs sufficient conditions

The 2nd order sufficient condition guarantees a stronger conclusion
than the necessary conditions: a strict local minimiser.

A strict local minimiser may fail to satisfy the sufficient conditions:

f(x) = x^4,  f′(x) = 4x^3,  f″(x) = 12x^2

x⋆ = 0 is a strict local minimiser while f″(x⋆) = 0, thus it satisfies
the necessary but not the sufficient conditions.


Implications of convexity

If f : Rn → R is convex, any local minimiser x⋆ is also a global
minimiser of f. If f is also differentiable, then any stationary
point x⋆ is a global minimiser.

Proof:
Suppose x⋆ is a local but not a global minimiser. Then
∃z ∈ Rn : f(z) < f(x⋆). For all x on the line segment joining x⋆
and z, i.e.

L(x⋆, z) = {x : x = λz + (1 − λ)x⋆, λ ∈ (0, 1]},

by convexity of f we have

f(x) ≤ λf(z) + (1 − λ)f(x⋆) < f(x⋆).

Any neighbourhood satisfies N(x⋆) ∩ L(x⋆, z) ≠ ∅, hence
∃x ∈ N(x⋆) : f(x) < f(x⋆) and x⋆ is not a local minimiser.

Implications of convexity

Proof: cont.
For the second part, we suppose that x⋆ is not a global minimiser.
For z chosen as before, by convexity of f it follows

∇f(x⋆)^T (z − x⋆) = d/dλ f(x⋆ + λ(z − x⋆)) |_{λ=0}
                  = lim_{λ→0} [f(x⋆ + λ(z − x⋆)) − f(x⋆)] / λ
                  ≤ lim_{λ→0} [λf(z) + (1 − λ)f(x⋆) − f(x⋆)] / λ
                  = f(z) − f(x⋆) < 0.

Hence ∇f(x⋆) ≠ 0 and x⋆ is not a stationary point.