Unconstrained Optimization
Objectives
• Review of necessary and sufficient optimality conditions.
• Newton’s method and its application to solving the minimization problem.
• Search techniques for numerical solutions.
Problem Statement
Find optimality conditions, and algorithms, for the minimization problem
$$\min_{x \in \mathbb{R}^n} f(x).$$
Taylor Series
For a (sufficiently smooth) function $f$ around a point $x_0$:
1-variable:
$$f(x_0 + p) = f(x_0) + p f'(x_0) + \frac{1}{2} p^2 f''(x_0) + \cdots + \frac{p^n}{n!} f^{(n)}(x_0) + \cdots$$
n-variables:
$$f(x_0 + p) = f(x_0) + p^T \nabla f(x_0) + \frac{1}{2}\, p^T \nabla^2 f(x_0)\, p + \cdots$$
Optimality Conditions
The Taylor formula is used to derive:
• First- and second-order necessary optimality conditions.
• The second-order sufficient optimality condition.
First‐Order Necessary Condition
A first‐order necessary condition for optimality is:
$$\nabla f(x_0) = 0.$$
$x_0$ is called a stationary point.
Second‐Order Necessary Condition
A second‐order necessary condition for minimization is:
$$\nabla^2 f(x_0) \text{ is positive semi-definite}.$$
Indeed, since $\nabla f(x_0) = 0$ at a minimizer, we have
$$f(x) = f(x_0 + p) = f(x_0) + \tfrac{1}{2}\, p^T \nabla^2 f(x_0)\, p + \text{H.O.T.},$$
so local minimality forces $p^T \nabla^2 f(x_0)\, p \geq 0$ for all $p$.
Second‐Order Sufficient Condition
If
$$\nabla f(x^*) = 0 \quad \text{and} \quad \nabla^2 f(x^*) \text{ is positive definite},$$
then $x^*$ is a strict local minimizer.
An Example
Find all stationary points and discuss the nature of each:
$$f(x_1, x_2) = \tfrac{1}{3} x_1^3 + \tfrac{1}{2} x_1^2 + 2 x_1 x_2 + \tfrac{1}{2} x_2^2 - x_2 + 9.$$
Answer
Stationary points:
$x_a = (1, -1)^T$: neither a minimizer nor a maximizer (the Hessian there is indefinite);
$x_b = (2, -3)^T$: a local minimizer,
but not a global minimizer or maximizer, as $f(x)$ is unbounded below and above (the cubic term dominates as $x_1 \to \pm\infty$).
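A quick Matlab check of these points (a verification sketch for the reconstructed $f$ above):

% Gradient and Hessian of f; confirm stationarity and definiteness.
gradf = @(x) [x(1)^2 + x(1) + 2*x(2); 2*x(1) + x(2) - 1];
hessf = @(x) [2*x(1) + 1, 2; 2, 1];
for xc = {[1; -1], [2; -3]}
    xs = xc{1};
    fprintf('x = (%g,%g): ||grad|| = %g, Hessian eigs = %s\n', ...
        xs(1), xs(2), norm(gradf(xs)), mat2str(eig(hessf(xs))', 3));
end
% (1,-1): eigenvalues of mixed sign -> saddle point
% (2,-3): both eigenvalues positive -> local minimizer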
Review of Newton’s Method
Goal: search for solutions of $g(x) = 0$.
Think of $g(x) = \nabla f(x)$!
Formula for Newton’s method (scalar case):
$$x_{k+1} = x_k + p, \quad \text{with } p = -g(x_k)/g'(x_k),$$
or,
$$x_{k+1} = x_k - g(x_k)/g'(x_k), \quad \text{if } g'(x_k) \neq 0.$$
A Simulation Example
Apply Newton’s method with Matlab simulations to the one‐dimensional problem:
$$g(x) = 7x^4 + 3x^3 + 2x^2 + 9x + 4 = 0.$$
Convergence Analysis
For a $C^2$ function $g: \mathbb{R} \to \mathbb{R}$, let $x^*$ be a zero of $g$ with $g'(x^*) \neq 0$. If $|x_0 - x^*|$ is sufficiently small, then the sequence defined by
$$x_{k+1} = x_k - g(x_k)/g'(x_k)$$
converges quadratically to $x^*$, i.e.,
$$\lim_{k \to \infty} \frac{|x_{k+1} - x^*|}{|x_k - x^*|^2} = \left| \frac{g''(x^*)}{2\, g'(x^*)} \right|.$$
Idea of Proof
Using the Taylor formula, and $e_k = x_k - x^*$,
$$0 = g(x^*) = g(x_k - e_k) = g(x_k) - e_k g'(x_k) + \tfrac{1}{2} e_k^2\, g''(\xi_k),$$
so that
$$e_k - \frac{g(x_k)}{g'(x_k)} = \frac{1}{2} e_k^2\, \frac{g''(\xi_k)}{g'(x_k)},$$
i.e.,
$$x_{k+1} - x^* = \frac{1}{2} \frac{g''(\xi_k)}{g'(x_k)} (x_k - x^*)^2, \qquad |x_{k+1} - x^*| \approx \frac{1}{2} \left| \frac{g''(x^*)}{g'(x^*)} \right| |x_k - x^*|^2.$$
Comments
• When $g'(x^*) \neq 0$, $g(x_k)$ also converges quadratically to zero, because:
$$0 = g(x^*) = g(x_k - e_k) = g(x_k) - e_k g'(\xi_k) \;\Rightarrow\; |g(x_k)| = |g'(\xi_k)|\, |x_k - x^*|.$$
• When $g'(x^*) = 0$ (or $\approx 0$), Newton’s method may fail to converge, or only converges linearly.
An Example
For the example
$$g(x) = (x-1)^2 (x-2)(x-3) = 0,$$
the method linearly converges to $x^* = 1$ if initialized at $x_0 = 1.1$;
it quadratically converges to $x^* = 2$ if initialized at $x_0 = 2.1$.
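A short Matlab sketch to observe the two rates (a simulation sketch; the iteration count is arbitrary):

g  = @(x) (x-1).^2 .* (x-2) .* (x-3);
gp = @(x) 2*(x-1).*(x-2).*(x-3) + (x-1).^2.*(x-3) + (x-1).^2.*(x-2);
for x0 = [1.1, 2.1]
    x = x0;
    fprintf('x0 = %.1f:\n', x0);
    for k = 1:8
        x = x - g(x)/gp(x);                   % Newton step
        fprintf('  k = %d, x = %.10f\n', k, x);
    end
end
% From 1.1 the error roughly halves each step (linear rate at the double root);
% from 2.1 the number of correct digits roughly doubles each step (quadratic rate).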
n‐Dimensional Case
Newton’s formula:
$$x_{k+1} = x_k + p, \quad \text{with } \nabla g(x_k)\, p = -g(x_k),$$
or, equivalently,
$$x_{k+1} = x_k - \nabla g(x_k)^{-1} g(x_k),$$
where
$$\nabla g(x_k) = \left[ \frac{\partial g_i}{\partial x_j} \right] \in \mathbb{R}^{n \times n}.$$
An Exercise
Apply Newton’s method along with Matlab simulations to approximate the zero $(0, 1.5)$ of the 2‐dimensional problem:
$$g_1(x_1, x_2) = 3x_1 x_2 + 7x_1 + 2x_2 - 3 = 0,$$
$$g_2(x_1, x_2) = 5x_1 x_2 + 9x_1 + 4x_2 - 6 = 0.$$
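A minimal Matlab sketch for this exercise (the signs of the $x_1$ terms above are assumptions in the reconstruction; $(0, 1.5)$ is a zero in any case, since those terms vanish at $x_1 = 0$):

g = @(x) [3*x(1)*x(2) + 7*x(1) + 2*x(2) - 3;
          5*x(1)*x(2) + 9*x(1) + 4*x(2) - 6];
J = @(x) [3*x(2) + 7, 3*x(1) + 2;             % Jacobian [dg_i/dx_j]
          5*x(2) + 9, 5*x(1) + 4];
x = [0.5; 1];                                 % initial guess (assumed)
for k = 1:20
    x = x - J(x) \ g(x);                      % Newton step: solve J p = -g
    fprintf('k = %d, x = (%.8f, %.8f)\n', k, x(1), x(2));
    if norm(g(x)) < 1e-12, break; end
end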
The Minimization Problem
Apply Newton’s method to the 1st‐order necessary condition for minimization, $\nabla f(x) = 0$:
$$x_{k+1} = x_k - \nabla^2 f(x_k)^{-1} \nabla f(x_k) \triangleq x_k + p_k^N.$$
Comment
At each iteration, $p_k^N$ minimizes the quadratic model (provided $\nabla^2 f(x_k)$ is positive definite):
$$Q(p) = f(x_k) + \nabla f(x_k)^T p + \tfrac{1}{2}\, p^T \nabla^2 f(x_k)\, p.$$
Indeed, setting $\nabla Q(p) = \nabla f(x_k) + \nabla^2 f(x_k)\, p = 0$ yields $p = p_k^N$.
Modified Newton’s Method
To make the classical Newton method more reliable and more cost‐effective, some modifications are required:
$$x_{k+1} = x_k + \alpha_k p_k,$$
with the stepsize $\alpha_k$ chosen s.t.
$$f(x_{k+1}) < f(x_k).$$
Remark: In the classical Newton method, $\alpha_k \equiv 1$ and $p_k := p_k^N$.
Superlinear Convergence
If
H1) on a convex set $S$:
$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \leq L\, \|x - y\|$$
for some $L > 0$, $\forall x, y \in S$;
H2) the sequence $\{x_k\} \subset S$ satisfies $x_{k+1} = x_k + p_k$, $\forall k$, and $x_k \to x^*$;
H3) $\nabla^2 f(x^*)$ is positive definite;
then $x_k \to x^*$ superlinearly, with $\nabla f(x^*) = 0$, iff
$$\lim_{k \to \infty} \frac{\|p_k - p_k^N\|}{\|p_k\|} = 0,$$
with $p_k^N$ the Newton direction.
Search Direction
The search direction $p$ is chosen s.t.
$$f(x + \alpha p) < f(x)$$
for some stepsize $\alpha > 0$. Therefore, using the Taylor formula,
$$p^T \nabla f(x) < 0.$$
Comment
Clearly, requiring $p^T \nabla f(x) < 0$ is weaker than using the Newton direction of the classical method. Indeed, when $\nabla^2 f(x) \succ 0$,
$$p_N^T \nabla f(x) = -\left[ \nabla^2 f(x)^{-1} \nabla f(x) \right]^T \nabla f(x) = -\nabla f(x)^T \nabla^2 f(x)^{-1} \nabla f(x) < 0.$$
Question: What if $\nabla^2 f(x) \not\succ 0$, i.e., it is merely positive semidefinite?
Modified Newton Algorithm
Specify some initial guess $x_0$, and a convergence tolerance $\varepsilon$. For $k = 0, 1, \ldots$:
• If $\|\nabla f(x_k)\| \leq \varepsilon$, stop.
• Compute a modified (Cholesky) factorization
$$\nabla^2 f(x_k) + E = L D L^T \succ 0, \quad E \text{ diagonal},$$
and solve for $p_k$:
$$(L D L^T)\, p_k = -\nabla f(x_k).$$
• Perform a line search to determine $x_{k+1} = x_k + \alpha_k p_k$.
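A Matlab sketch of this loop on an assumed test function (a simple diagonal shift $E = \tau I$, found by retrying chol, stands in here for a full modified $LDL^T$ factorization):

f     = @(x) x(1)^4 + x(1)*x(2) + (1 + x(2))^2;   % assumed test function
gradf = @(x) [4*x(1)^3 + x(2); x(1) + 2*(1 + x(2))];
hessf = @(x) [12*x(1)^2, 1; 1, 2];
x = [0.75; -1.25]; epsilon = 1e-8;
for k = 0:50
    gk = gradf(x);
    if norm(gk) <= epsilon, break; end            % convergence test
    H = hessf(x); tau = 0;
    [L, fail] = chol(H, 'lower');
    while fail ~= 0                               % shift until H + tau*I > 0
        tau = max(2*tau, 1e-4);
        [L, fail] = chol(H + tau*eye(2), 'lower');
    end
    p = -(L' \ (L \ gk));                         % solve (H + tau*I) p = -grad
    alpha = 1;                                    % backtrack until decrease
    while f(x + alpha*p) >= f(x) && alpha > 1e-12
        alpha = alpha/2;
    end
    x = x + alpha*p;
end
fprintf('x = (%.6f, %.6f), ||grad|| = %.2e\n', x(1), x(2), norm(gradf(x)));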
Line‐Search Methods for Convergence
Once the descent direction $p_k$ is chosen, the idea of line‐search methods is to find a suitable step size $\alpha$ such that
$$f(x_{k+1}) < f(x_k), \quad \text{with } x_{k+1} = x_k + \alpha_k p_k,$$
or, the “ideal” choice of $\alpha_k$ as the solution to
$$\min_{\alpha \geq 0} F(\alpha) = f(x_k + \alpha p_k).$$
Impact of the Stepsize
Consider the optimization problem
$$\min f(x) = 5x_1^2 + 7x_2^2 - 3x_1 x_2.$$
Let
$$x_k = (2, 3)^T, \quad p_k = (-5, -7)^T, \quad f(x_k) = 65.$$
If $\alpha_k = 1$, then $f(x_k + \alpha_k p_k) = 121 > f(x_k)$.
If $\alpha_k = 1/2$, then $f(x_k + \alpha_k p_k) = 9/4 < f(x_k)$.
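A short Matlab check of these numbers (a verification sketch):

f = @(x) 5*x(1)^2 + 7*x(2)^2 - 3*x(1)*x(2);
xk = [2; 3]; pk = [-5; -7];
fprintf('f(xk) = %g, alpha = 1: %g, alpha = 1/2: %g\n', ...
        f(xk), f(xk + pk), f(xk + 0.5*pk));   % prints 65, 121, 2.25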
A Motivating Example
The purpose of this example is to show that merely decreasing the value of $f$ at each iteration is not enough:
$$\min f(x) = x^2, \quad \text{Initialization: } x_0 = 3.$$
The choices $p_k = -1$, $\alpha_k = 2^{-k}$ yield
$$x_{k+1} = x_k - 2^{-k},$$
which converges to 1, not 0!
Standing Assumptions
A1) The vectors $p_k$ satisfy a sufficient-descent condition:
$$\frac{p_k^T \nabla f(x_k)}{\|p_k\|\, \|\nabla f(x_k)\|} \leq -\varepsilon < 0, \quad \forall k, \text{ for some } \varepsilon > 0.$$
A2) The search directions are gradient related, and bounded:
$$\|p_k\| \geq m\, \|\nabla f(x_k)\| \quad \text{and} \quad \|p_k\| \leq M, \;\; \forall k \quad (\text{for some } m > 0).$$
A3) Each scalar $\alpha_k$ is chosen as the 1st element of $\left\{1, \tfrac{1}{2}, \tfrac{1}{4}, \ldots\right\}$ to meet a sufficient-decrease condition (see the sketch after this list):
$$f(x_k + \alpha_k p_k) \leq f(x_k) + \mu\, \alpha_k\, p_k^T \nabla f(x_k), \quad \text{for some } 0 < \mu < 1.$$
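A minimal Matlab sketch of the backtracking rule in A3 (save as backtrack.m; the argument names and the constant $\mu = 0.1$ in the usage comment are illustrative assumptions):

% Take the first alpha in {1, 1/2, 1/4, ...} meeting sufficient decrease.
function alpha = backtrack(f, x, p, gradfx, mu)
    alpha = 1;
    while f(x + alpha*p) > f(x) + mu*alpha*(p'*gradfx)
        alpha = alpha/2;
        if alpha < 1e-16, error('no acceptable stepsize'); end
    end
end
% Example use on min f(x) = x^2 at x = 3 with steepest descent p = -f'(x):
% f = @(x) x.^2; x = 3; gradfx = 2*x; p = -gradfx;
% alpha = backtrack(f, x, p, gradfx, 0.1);   % returns 1/2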
A Convergence Result
Under the above assumptions, if the level set of $f$,
$$S = \{x : f(x) \leq f(x_0)\},$$
is bounded, and if
$$\|\nabla f(x) - \nabla f(y)\| \leq L\, \|x - y\|, \quad \text{for some } L > 0,$$
then
$$\lim_{k \to \infty} \|\nabla f(x_k)\| = 0.$$
An Exercise
Consider the quadratic function
$$f(x) = \tfrac{1}{2} x^T Q x - c^T x, \quad \text{with } Q = Q^T \succ 0.$$
Let $p$ be a descent direction for $f$ at $x$. Show that the solution of the exact line‐search problem
$$\min_{\alpha \geq 0} f(x + \alpha p)$$
is:
$$\alpha = -\frac{p^T \nabla f(x)}{p^T Q p}.$$
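A quick Matlab check of the formula on random data (a verification sketch; the minus sign in $c^T x$ is an assumption that does not affect the result, and gives $\nabla f(x) = Qx - c$):

rng(1); n = 4;
A = randn(n); Q = A'*A + n*eye(n);            % random symmetric positive definite Q
c = randn(n, 1); x = randn(n, 1);
gradfx = Q*x - c;
p = -gradfx;                                  % steepest descent is a descent direction
alpha_exact = -(p'*gradfx) / (p'*Q*p);        % closed-form exact stepsize
F = @(a) 0.5*(x + a*p)'*Q*(x + a*p) - c'*(x + a*p);
alpha_num = fminbnd(F, 0, 10);                % numerical 1-D minimization
fprintf('closed form: %.8f, fminbnd: %.8f\n', alpha_exact, alpha_num);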