King’s College London Department of Mathematics
Academic year 2019–2020, Spring term Dr Blanka Horvath
7CCMFM18 Machine Learning Problem sheet 2 — Solutions
Problem 1
Let i = 1,…,n. We have on the one hand, by (⋆), i −1
i −1 un n =αi+βi n ,
whileontheotherhand,by(⋆⋆⋆),uni−1=ui−1,sothat nn
i −1 i −1
αi+βi n =u n . (1)
and,by(⋆)and(⋆⋆⋆),
whence
In the case i = n, equation (2) simply follows from (⋆) and (⋆ ⋆ ⋆). Now (1) and (2) form a system of two linear equation with two unknowns, αi and βi . Subtracting (1) from (2), we get
ii−1 ii−11 un−u n =αi+βin−αi+βi n =βin,
If i < n, by (⋆⋆), we have moreover
α +β i =α +β i
i i n i +1 i +1 n
i i i
αi+1+βi+1n=un n =u n , i i
αi+βin=u n . (2)
so that
Substituting this into (2), we finally get
i i−1 βi=nun−u n .
i i i i i − 1 i − 1 i αi=un −βin=un −iun −u n =iu n −(i−1)un.
1
Problem 2
Leti=1,...,nandletx∈i−1,i .Bysubtractingandaddinguni−1,weget nnn
i −1 i −1 un(x)−u(x)=un(x)−un n +un n −u(x),
where, by (⋆),
un(x)−un
i −1 n
i −1 =αi +βix− αi +βi n
i −1 =βi x− n
i i−1 i−1 =nun−un x−n
(if x = i , invoke also (⋆⋆)) and, by (⋆⋆⋆), n
i −1 i −1
un n −u(x)=u n −u(x).
Now, due to the triangle inequality,
i − 1 i − 1
where
and
|un(x)−u(x)|≤un(x)−un n +un n −u(x),
i−1 i i−1 i−1 un(x)−un n =|n|u n −u n x− n
≤1 n
n n n
i−1 i−1 i−1αK
un n −u(x)=u n −u(x)≤K n −x ≤nα
i i − 1 α K ≤K− =α
by Hölder continuity. Therefore,
|un(x)−u(x)|≤ K + K =2K. (3)
nα nα nα
We assumed that x ∈ i−1, i , but since upper bound in (3) does not depend on i,
nn
we can conclude that, in fact,
∥un −u∥sup,[0,1] = sup |un(x)−u(x)|≤ 2K .
x∈[0,1] nα 2
Problem 3
A neural network fn ∈ N2(1,n,1;ReLU,Id) can be expressed as n
fn(x)=W2 ReLUW1 x+b1+b2, x∈R,
1,i i,1 i 1 i=1
withweightsW1 =W1 ···W1 ′ ∈Rn×1 andW2 =W2 ···W2 ∈R1×n,andbi- 1,1 n,1 1,1 1,n
asesb1 =(b1,...,bn1)∈Rn andb12 ∈R. Withsomegraphicalintuition(discussedin class), we can write
un(x)=α1+β1ReLU(x)+(β2−β1)ReLUx− 1+(β3−β2)ReLUx− 2 nn
+···+(βn −βn−1)ReLUx − n −1, x ∈ [0,1]. n
(This is also analogous to replicating a piecewise affine payoff with a finite set of call options.) We get then un(x) = f (x) for x ∈ [0,1] if we set
W1 :=1 ··· 1′,
W 2 : = β 1 β 2 − β 1 · · · β n − β n − 1 ,
Problem 4
Notethatforε>0,
b1 :=0, 1,…,n−1, nn
b 12 : = α 1 .
1 2K<ε ⇐⇒ n>2Kα.
(4)
nα ε
Consider now un where n satisfies (4). By Problem 3 (and Proposition 2.6 in lecture
notes), we can represent un on [0,1] as f ∈ N2(1,d,1;ReLU,Id) for any d ≥ n. By Problem 3 and (4),
∥u − f ∥sup,[0,1] < ε.
Note that n → ∞ in (4) when ε → 0 or κ → ∞, where the rate of growth depends on
the Hölder exponent α, (smaller α faster growth).
3