The Australian National University Semester 2, 2021 School of Computing Theory Assignment 2 of 4 Liang Zheng
COMP3670/6670: Introduction to Machine Learning
Release Date. 18th August 2021
Due Date. 23:59pm, 19th September 2021
Maximum credit. 100
Errata: In Exercise 4, the loss function included a regulariser term $\|c\|_B^2$, which is undefined due to a dimensionality mismatch. This has been replaced with $\|c\|_A^2$.
Exercise 1   Inner Products induce Norms   20 credits

Let $V$ be a vector space, and let $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ be an inner product on $V$. Define $\|x\| := \sqrt{\langle x, x \rangle}$.

Prove that $\|\cdot\|$ is a norm.

(Hint: To prove the triangle inequality holds, you may need the Cauchy-Schwarz inequality, $\langle x, y \rangle \leq \|x\|\,\|y\|$.)
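As an illustrative sanity check (not part of the assignment; the function names are chosen only for this sketch), the following Python snippet takes the standard dot product on $\mathbb{R}^n$ as a concrete inner product and numerically verifies absolute homogeneity and the triangle inequality of the induced norm on random vectors. It is an illustration, not a proof.

```python
import numpy as np

# Illustration only: with the standard dot product <x, y> = x^T y as the
# inner product on R^n, the induced norm ||x|| = sqrt(<x, x>) should satisfy
# absolute homogeneity and the triangle inequality.
rng = np.random.default_rng(0)

def inner(x, y):
    return float(x @ y)

def norm(x):
    return np.sqrt(inner(x, x))

for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    a = rng.normal()
    assert np.isclose(norm(a * x), abs(a) * norm(x))    # absolute homogeneity
    assert norm(x + y) <= norm(x) + norm(y) + 1e-9      # triangle inequality
print("axioms hold on all sampled vectors")
```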
Exercise 2 Vector Calculus Identities 10+10 credits
1. Let $x, a, b \in \mathbb{R}^n$. Prove that $\nabla_x (x^T a b^T x) = a^T x\, b^T + b^T x\, a^T$.
2. Let $B \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$. Prove that $\nabla_x (x^T B x) = x^T (B + B^T)$.
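Before writing the proofs, it may help to sanity-check both identities numerically. The sketch below (illustrative only; all names are assumptions) compares the stated gradients against central finite differences, treating the gradient as a row vector:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
a, b, x = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
B = rng.normal(size=(n, n))

def num_grad(f, x, eps=1e-6):
    # Central-difference approximation of the gradient (as a row vector).
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

# 1. grad_x (x^T a b^T x) = (a^T x) b^T + (b^T x) a^T
f1 = lambda z: (z @ a) * (b @ z)
g1 = (a @ x) * b + (b @ x) * a
assert np.allclose(num_grad(f1, x), g1, atol=1e-5)

# 2. grad_x (x^T B x) = x^T (B + B^T)
f2 = lambda z: z @ B @ z
g2 = x @ (B + B.T)
assert np.allclose(num_grad(f2, x), g2, atol=1e-5)
print("both identities agree with finite differences")
```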
Exercise 3   Properties of Symmetric Positive Definiteness   10 credits

Let $A$, $B$ be symmetric positive definite matrices.¹ Prove that for any $p, q > 0$, $pA + qB$ is also symmetric and positive definite.
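A quick numerical illustration of this claim (not a substitute for a proof; the names and the SPD construction below are assumptions) samples random symmetric positive definite $A$, $B$ and checks that $pA + qB$ remains symmetric with strictly positive eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_spd(n):
    # M^T M + I is symmetric positive definite for any square M.
    M = rng.normal(size=(n, n))
    return M.T @ M + np.eye(n)

for _ in range(100):
    A, B = random_spd(5), random_spd(5)
    p, q = rng.uniform(0.1, 10, size=2)
    C = p * A + q * B
    assert np.allclose(C, C.T)               # symmetry
    assert np.linalg.eigvalsh(C).min() > 0   # positive definiteness
print("pA + qB was symmetric positive definite in every trial")
```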
Exercise 4 General Linear Regression with Regularisation (10+10+10+10+10 credits)
Let $A \in \mathbb{R}^{N \times N}$ and $B \in \mathbb{R}^{D \times D}$ be symmetric, positive definite matrices. From the lectures, we can use symmetric positive definite matrices to define a corresponding inner product, as shown below. From the previous question, we can also define a norm using these inner products.
$$\langle x, y \rangle_A := x^T A y, \qquad \|x\|_A^2 := \langle x, x \rangle_A$$
$$\langle x, y \rangle_B := x^T B y, \qquad \|x\|_B^2 := \langle x, x \rangle_B$$
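Concretely, these weighted inner products and squared norms can be evaluated as follows (a minimal sketch, assuming the matrices are supplied as numpy arrays; the function names are not part of the assignment):

```python
import numpy as np

def inner_M(x, y, M):
    # <x, y>_M = x^T M y for a symmetric positive definite M.
    return float(x @ M @ y)

def sq_norm_M(x, M):
    # ||x||_M^2 = <x, x>_M.
    return inner_M(x, x, M)
```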
Suppose we are performing linear regression, with a training set $\{(x_1, y_1), \ldots, (x_N, y_N)\}$, where for each $i$, $x_i \in \mathbb{R}^D$ and $y_i \in \mathbb{R}$. We can define the matrix
$$X = [x_1, \ldots, x_N]^T \in \mathbb{R}^{N \times D}$$
and the vector
$$y = [y_1, \ldots, y_N]^T \in \mathbb{R}^N.$$
We would like to find $\theta \in \mathbb{R}^D$ and $c \in \mathbb{R}^N$ such that $y \approx X\theta + c$, where the error is measured using $\|\cdot\|_A$. We avoid overfitting by adding weighted regularization terms, measured using $\|\cdot\|_B$ and $\|\cdot\|_A$. We define the loss function with regularizer:
$$L_{A,B,y,X}(\theta, c) = \|y - X\theta - c\|_A^2 + \|\theta\|_B^2 + \|c\|_A^2$$
For the sake of brevity we write $L(\theta, c)$ for $L_{A,B,y,X}(\theta, c)$. For this question:
¹ A matrix is symmetric positive definite if it is both symmetric and positive definite.

• You may use (without proof) the property that a symmetric positive definite matrix is invertible.
• We assume that there are sufficiently many non-redundant data points for $X$ to be full rank. In particular, you may assume that the null space of $X$ is trivial (that is, the only solution to $Xz = 0$ is the trivial solution, $z = 0$).
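The sketch below (illustrative only; names are assumptions) evaluates the loss exactly as defined above. It can be combined with a finite-difference check to verify the gradients derived in parts 1 and 3:

```python
import numpy as np

def sq_norm(v, M):
    # ||v||_M^2 = v^T M v for a symmetric positive definite M.
    return float(v @ M @ v)

def loss(theta, c, A, B, X, y):
    # L_{A,B,y,X}(theta, c) = ||y - X theta - c||_A^2 + ||theta||_B^2 + ||c||_A^2
    r = y - X @ theta - c
    return sq_norm(r, A) + sq_norm(theta, B) + sq_norm(c, A)
```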
1. Find the gradient $\nabla_\theta L(\theta, c)$.
2. Let $\nabla_\theta L(\theta, c) = 0$, and solve for $\theta$. If you need to invert a matrix to solve for $\theta$, you should prove the inverse exists.
We now compute the gradient with respect to $c$.

3. Find the gradient $\nabla_c L(\theta, c)$.
4. Let $\nabla_c L(\theta, c) = 0$, and solve for $c$. If you need to invert a matrix to solve for $c$, you should prove the inverse exists.
5. Show that if we set $A = I$, $c = 0$, $B = \lambda I$, where $\lambda \in \mathbb{R}$, your answer for 4.2 agrees with the analytic solution for the standard least squares regression problem with L2 regularization, given by
$$\theta = (X^T X + \lambda I)^{-1} X^T y.$$
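As an optional numerical check of this formula (illustrative only; the variable names and the use of numpy are assumptions), the closed-form ridge solution can be compared against a generic least-squares solver applied to the equivalent augmented system:

```python
import numpy as np

rng = np.random.default_rng(3)
N, D, lam = 50, 5, 0.7
X = rng.normal(size=(N, D))
y = rng.normal(size=N)

# Closed-form ridge solution: theta = (X^T X + lambda I)^{-1} X^T y
theta_closed = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

# The same minimiser, obtained from the augmented ordinary least-squares problem
# minimise || [X; sqrt(lambda) I] theta - [y; 0] ||^2
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(D)])
y_aug = np.concatenate([y, np.zeros(D)])
theta_lstsq, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

assert np.allclose(theta_closed, theta_lstsq)
print("closed-form ridge solution matches the augmented least-squares fit")
```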