UNSW COMP9417 Machine Learning and Data Mining Final Exam Question 1

Question 1
Please submit Question1.pdf on Moodle using the Final Exam – Question 1 object. You must submit a
single PDF. You may submit multiple .py files (placed in a single zip file) if you wish. Do not put your
PDF in the zip file. The parts are worth (2 + 4 + 4 + 3) + (3 + 3 + 6) = 25.
(a) (Bias, Variance & MSE) Let X_1, …, X_n be i.i.d. random variables with mean μ and variance σ².
Define the estimator T = ∑_{i=1}^n a_i X_i for some constants a_1, …, a_n.
(i) What condition must the a_i's satisfy to ensure that T is an unbiased estimator of μ? Unbiased
means that E[T] = μ.
What to submit: your working out, either typed or handwritten.
(ii) Under the condition identified in the previous part, which choice of the a_i's will minimize the
MSE of T? Does this choice of a_i's surprise you? Provide some brief discussion. Hint: this is
a constrained minimization problem.
What to submit: your working out, either typed or handwritten, and some commentary.
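As a quick numerical sanity check for this part (not a substitute for the required derivation), the sketch below assumes the X_i are independent with common variance σ², so that var(T) = σ² ∑_i a_i², and compares the equal-weight vector against random weight vectors that also sum to 1. The values n = 10 and σ² = 4 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 10, 4.0

# For independent X_i with common variance sigma^2 and T = sum_i a_i X_i,
# var(T) = sigma^2 * sum_i a_i^2.
def var_T(a):
    return sigma2 * np.sum(a ** 2)

equal = np.full(n, 1.0 / n)  # equal weights, summing to 1

# Random alternative weight vectors, normalised so they also sum to 1
# (and therefore also give an unbiased T).
for _ in range(1000):
    a = rng.random(n)
    a /= a.sum()
    assert var_T(equal) <= var_T(a) + 1e-12  # equal weights never do worse

print(var_T(equal))  # sigma^2 / n
```

Every random unbiased weighting tried has variance at least σ²/n, which is consistent with the constrained-minimization hint.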
(iii) Suppose that instead, we let a_i = 1/(n + b) for i = 1, …, n, where b is some constant. Find the value
of b (in terms of μ and σ²) which minimizes the MSE of T. How does the answer here compare
to the estimator found in the previous part? How do the two compare as the sample size n
increases? Are there any obvious issues with using the result of this question in practice?
What to submit: your working out, either typed or handwritten, and some commentary.
(iv) Suppose now that you are told that σ² = μ². For the choice of b identified in the previous part,
find the MSE of the estimator in this setting. How does the MSE compare to the MSE of the
sample average X̄? (Recall that MSE(X̄) = var(X̄) = σ²/n = μ²/n.) Further, explain whether
or not you can use this choice of b in practice.
What to submit: your working out, either typed or handwritten, and some commentary.
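To illustrate how an estimator of the form T = (1/(n + b)) ∑_i X_i can beat the sample mean in MSE under σ² = μ², here is a Monte Carlo sketch. The values μ = 2, n = 20 and b = 1 are illustrative assumptions only (b = 1 is not being claimed as the answer to the question), and a normal distribution is assumed even though the question only fixes the mean and variance.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, n, trials = 2.0, 20, 200_000
sigma2 = mu ** 2              # the sigma^2 = mu^2 setting of part (iv)
b = 1.0                       # illustrative choice of b only

# trials independent samples of size n (normality is an extra assumption)
X = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))
T = X.sum(axis=1) / (n + b)   # shrunken estimator from part (iii)
Xbar = X.mean(axis=1)         # ordinary sample average

mse_T = np.mean((T - mu) ** 2)
mse_Xbar = np.mean((Xbar - mu) ** 2)
print(mse_T, mse_Xbar)
assert mse_T < mse_Xbar  # a little bias is traded for lower variance
```

The shrunken estimator's estimated MSE lands close to μ²/(n + 1), below the sample mean's μ²/n, which is the kind of comparison the question is asking you to explain.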
(b) (kNN Regression) Consider the usual data generating process y = f(x) + ε, where f is some
unknown function, and ε is a noise variable with mean zero and variance σ². Recall that in kNN
regression, we look at the k nearest neighbours (in our dataset) of an input point x_0, we then
consider their corresponding response values and average them to get a prediction for x_0. Given a
dataset D = {(x_i, y_i)}_{i=1}^n, we can write down the kNN prediction as

m̂(x_0) = (1/k) ∑_{i ∈ N_k(x_0)} y_i,

where N_k(x_0) is the set of indices of the k nearest neighbours of x_0. Without loss of generality, label
the k nearest neighbours of x_0 as x_1, …, x_k and their corresponding response values by y_1, …, y_k.
(i) Show that

bias(m̂(x_0))² = ( f(x_0) − (1/k) ∑_{i=1}^k f(x_i) )².

Throughout, you should treat x_0 as a fixed point (not a random variable). You may use any
results from tutorials, labs or lectures without proof.¹
What to submit: your working out, either typed or handwritten.
(ii) Derive an expression for the variance var(m̂(x_0)).
What to submit: your working out, either typed or handwritten.
¹ Reference any results that you use, e.g. by stating that a particular result follows from Tutorial A, Question B, part (C).
(iii) Using the results so far, write down an expression for the MSE of m̂(x_0). Describe what hap-
pens to the bias of the kNN estimator at x_0 when k is very small (1NN), and what happens
when k is very large (k → ∞). Similarly, what happens to the variance? What does this tell
you about the relationship between bias and variance and the choice of k?
What to submit: your working out, either typed or handwritten, and some commentary.
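The bias–variance trade-off in k can also be seen empirically. The sketch below is purely illustrative: the choice f = sin, the grid of design points, the noise level σ = 0.3 and the query point x_0 = π/2 are all assumptions, not part of the question. It repeatedly regenerates the noise, forms the kNN prediction at the fixed x_0, and estimates the squared bias and variance for several values of k.

```python
import numpy as np

rng = np.random.default_rng(2)
f = np.sin                          # assumed "true" regression function
sigma = 0.3                         # assumed noise standard deviation
x = np.linspace(0.0, np.pi, 200)    # fixed design points
x0 = np.pi / 2                      # fixed query point
fx = f(x)

# The neighbour sets are fixed because the design points are fixed.
order = np.argsort(np.abs(x - x0))

def knn_pred(y, k):
    """Average the responses of the k nearest neighbours of x0."""
    return y[order[:k]].mean()

trials = 5000
results = {}
for k in (1, 10, 100):
    preds = np.array([knn_pred(fx + rng.normal(0.0, sigma, x.size), k)
                      for _ in range(trials)])
    bias2 = (preds.mean() - f(x0)) ** 2   # squared bias at x0
    var = preds.var()                     # should be close to sigma^2 / k
    results[k] = (bias2, var)
    print(k, bias2, var)
```

As k grows, the variance shrinks like σ²/k while the squared bias grows (the neighbourhood average of f drifts away from f(x_0)), which matches the qualitative behaviour the question asks you to describe.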
