SS4850G/9850B Assignment #1 (Due Feb 14 2020 before 13:30)
• Show all your work in detail and provide the code used for your computations.
• In Questions 4 to 6, use the “Biweight” kernel.
• Write down your name, your student ID, and whether you are an undergraduate or a graduate student.
• Please submit your hard copy in class.
1. In Section 2.1, we discussed properties of the LSE $\hat\beta$ and the residual $e = (I_n - X(X^\top X)^{-1}X^\top)y$.
Based on the notation defined in Section 2.1, please show that
(a) the estimator $\hat\beta$ is an unbiased estimator of $\beta$ with $\mathrm{var}(\hat\beta \mid X) = \sigma_\varepsilon^2 (X^\top X)^{-1}$. Moreover, $\hat\beta$ is the Best Linear Unbiased Estimator (BLUE).
(b) $E(e \mid X) = 0$, $\mathrm{var}(e \mid X) = \sigma_\varepsilon^2 (I_n - H)$ with $H = X(X^\top X)^{-1}X^\top$, and $\mathrm{cov}(e, \hat\beta \mid X) = 0$.
(c) Furthermore, find the unbiased estimator of $\sigma_\varepsilon^2$.
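As a numerical sanity check for Question 1, the properties in (a) to (c) can be illustrated by Monte Carlo. The sketch below is only an illustration under assumptions not stated in this handout: it takes the Section 2.1 model to be $y = X\beta + \varepsilon$ with homoscedastic normal errors, uses a hypothetical design matrix and coefficient vector, and takes $e^\top e/(n-p)$ as the candidate variance estimator. It does not replace the proofs.

```python
import numpy as np

def ols_fit(X, y):
    """Return the least-squares estimate, the residuals, and e'e / (n - p)."""
    n, p = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
    resid = y - X @ beta_hat
    return beta_hat, resid, resid @ resid / (n - p)

def monte_carlo_ols(n=50, reps=2000, sigma=1.5, seed=0):
    """Repeat the regression with a fixed design and fresh normal errors."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0, 0.5])              # hypothetical true coefficients
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # hypothetical design
    betas = np.empty((reps, beta.size))
    s2 = np.empty(reps)
    for r in range(reps):
        y = X @ beta + rng.normal(scale=sigma, size=n)
        betas[r], _, s2[r] = ols_fit(X, y)
    return {
        "beta": beta,
        "mean_beta_hat": betas.mean(axis=0),
        "empirical_cov": np.cov(betas, rowvar=False),
        "theoretical_cov": sigma**2 * np.linalg.inv(X.T @ X),
        "mean_s2": s2.mean(),
        "sigma2": sigma**2,
    }

result = monte_carlo_ols()
print("mean of beta_hat:", result["mean_beta_hat"])
print("mean of s^2     :", result["mean_s2"], "vs sigma^2 =", result["sigma2"])
print("max |empirical - theoretical cov|:",
      np.abs(result["empirical_cov"] - result["theoretical_cov"]).max())
```

The averages over repetitions should sit close to $\beta$, $\sigma_\varepsilon^2 (X^\top X)^{-1}$, and $\sigma_\varepsilon^2$; the design is held fixed across repetitions so that the comparison is conditional on $X$.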
2. Suppose that $\{(Y_i, X_i) : i = 1, \cdots, n\}$ is a sequence of independent and identically distributed (i.i.d.) random variables with $Y_i, X_i \in \mathbb{R}$. Assume that $\mathrm{var}(X_i) = \sigma_X^2$. We consider the simple linear regression model
$$Y_i = \beta_0 + X_i \beta_x + \varepsilon_i, \qquad (1)$$
where $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma_\varepsilon^2)$ and $\varepsilon_i$ is independent of $X_i$.
(a) In some applications (e.g., when collecting data), we are not able to measure $X_i$ precisely; instead, we can only observe $X_i^*$. This is called mismeasurement. As a result, we usually have model (1) with $X_i$ replaced by $X_i^*$ in this situation. Please find the estimators of $\beta_x$ based on $X_i$ and $X_i^*$, and denote them by $\hat\beta_x$ and $\tilde\beta_x$, respectively.
(b) We usually describe the relationship between $X_i$ and $X_i^*$ by the following model:
$$X_i^* = X_i + \delta_i, \qquad (2)$$
where $\delta_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma_\delta^2)$ and $\delta_i$ is independent of $X_i$ and $\varepsilon_i$. Based on (2), show that
$$\hat\beta_x \overset{p}{\longrightarrow} \omega_1 \beta_x \quad \text{and} \quad \tilde\beta_x \overset{p}{\longrightarrow} \omega_2 \beta_x$$
for some non-negative values $\omega_1$ and $\omega_2$ as $n \to \infty$, where "$\overset{p}{\longrightarrow}$" represents convergence in probability. Also, you should specify the exact values of $\omega_1$ and $\omega_2$.
(c) Find the variances of $\hat\beta_x$ and $\tilde\beta_x$, i.e., $\mathrm{var}(\hat\beta_x)$ and $\mathrm{var}(\tilde\beta_x)$. Moreover, compare these two variances.
Instructor: L.-P. Chen

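(d) (Do a simple simulation study.) Consider the sample size $n = 1000$. Let $\beta_0 = \beta_x = 1$ and $\sigma_\varepsilon^2 = 1$. Let $X_i$ be generated from $N(4, 1)$. Then the response $Y_i$ can be generated by model (1). In addition, consider $\sigma_\delta^2 = 0.15$, $0.55$, and $0.75$, and then generate $X_i^*$ by (2). Suppose that we run 1000 repetitions. Based on your "artificial" data, calculate numerical results for $\hat\beta_x$, $\tilde\beta_x$, $\mathrm{var}(\hat\beta_x)$, and $\mathrm{var}(\tilde\beta_x)$. Summarize your numerical results in a table like the following and compare them with (a), (b), and (c).
Table 1: Simulation result
| Estimator | Quantity | $\sigma_\delta^2 = 0.15$ | $\sigma_\delta^2 = 0.55$ | $\sigma_\delta^2 = 0.75$ |
|---|---|---|---|---|
| $\tilde\beta_x$ | Bias | | | |
| $\tilde\beta_x$ | var | | | |
| $\hat\beta_x$ | Bias | | | |
| $\hat\beta_x$ | var | | | |
Note: Bias is $\tilde\beta_x - \beta_x$ and $\hat\beta_x - \beta_x$.
(e) Summarize your findings in (a) to (d).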
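One possible way to organize the simulation in (d) is sketched below. It follows the settings given in the question ($n = 1000$, 1000 repetitions, $\beta_0 = \beta_x = 1$, $\sigma_\varepsilon^2 = 1$, $X_i \sim N(4, 1)$); the function names, the seed, and the vectorized layout are assumptions made for illustration, and bias and variance are estimated by the sample mean and sample variance across repetitions. Under model (2) with independent errors, the slope based on $X_i^*$ is expected to be shrunk toward zero, and more so as $\sigma_\delta^2$ grows.

```python
import numpy as np

def ols_slope(x, y):
    """Row-wise least-squares slope of y on x (intercept included)."""
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / (xc**2).sum(axis=1)

def simulate(sigma2_delta, n=1000, reps=1000, beta0=1.0, beta_x=1.0,
             sigma2_eps=1.0, seed=2020):
    """Bias and variance of both slope estimators for one value of sigma_delta^2."""
    rng = np.random.default_rng(seed)                 # hypothetical seed
    x = rng.normal(4.0, 1.0, size=(reps, n))          # X_i ~ N(4, 1)
    eps = rng.normal(0.0, np.sqrt(sigma2_eps), size=(reps, n))
    delta = rng.normal(0.0, np.sqrt(sigma2_delta), size=(reps, n))
    y = beta0 + beta_x * x + eps                      # model (1)
    x_star = x + delta                                # model (2)
    b_hat = ols_slope(x, y)                           # uses the true covariate
    b_tilde = ols_slope(x_star, y)                    # uses the mismeasured covariate
    return {
        "bias_hat": b_hat.mean() - beta_x,
        "var_hat": b_hat.var(ddof=1),
        "bias_tilde": b_tilde.mean() - beta_x,
        "var_tilde": b_tilde.var(ddof=1),
    }

results = {s2: simulate(s2) for s2 in (0.15, 0.55, 0.75)}
for s2, row in results.items():
    print(f"sigma2_delta={s2:.2f}  "
          f"bias_hat={row['bias_hat']:+.4f}  var_hat={row['var_hat']:.5f}  "
          f"bias_tilde={row['bias_tilde']:+.4f}  var_tilde={row['var_tilde']:.5f}")
```

Each repetition is one row of the simulated arrays, so the 1000 regressions are computed at once; a plain loop over repetitions would give the same quantities.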
3. Suppose $f(y)$ is a probability density function (pdf). Let
$$R(f^{(r)}) = \int_{-\infty}^{\infty} \left\{ f^{(r)}(y) \right\}^2 dy,$$
where $f^{(r)}(y)$ is the $r$th derivative of $f(y)$.
(a) When $f$ is the pdf of $N(\mu, \sigma^2)$, please find $R(f^{(2)})$ so that we are able to obtain the bandwidth based on the normal scale rule.
(b) Show that under some conditions,
$$R(f^{(r)}) = (-1)^r \int_{-\infty}^{\infty} f^{(2r)}(y) f(y) \, dy.$$
Which kinds of conditions do we need here? Does the standard normal distribution satisfy these conditions?
4. Consider the wool prices data set (wool.txt), which reports the wool prices at weekly markets. The response of interest is the log price difference between the price of a particular wool, 19 μm (cents per kilogram clean), and the floor wool price (cents per kilogram clean) at markets:
$$y_t = \log(\text{19 μm price} / \text{floor price}),$$
and the covariate $x_t$ is the time in weeks since January 1, 1976.
(a) Fit the data by a simple linear regression model and a polynomial model of order 10. Give a scatterplot of the data and add the two fitted lines, one for the simple linear model and one for the polynomial model. Put clear and proper legends on it.
(b) Fit the data by the local constant kernel estimator and the local linear kernel estimator. Choose the bandwidths in these two estimators by the CV method. Give a scatterplot of the data and add the two fitted lines. Put clear and proper legends on it.

(c) Fit the data by the local linear kernel estimator. Choose the bandwidths by the CV and direct plug-in methods. Give a scatterplot of the data and add the two fitted lines. Put clear and proper legends on it.
(d) Let $\hat y_t$ denote the fitted values determined by the methods in (a) to (c). Compute $\sum_{t=1}^{n} (\hat y_t - y_t)^2$. Finally, summarize your findings.
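The kernel fits and the CV bandwidth choice in Question 4 could be set up along the following lines. This is a minimal sketch under stated assumptions: the data are synthetic stand-ins (wool.txt is not read here), the bandwidth grid is arbitrary, the CV score is taken to be the mean squared leave-one-out prediction error, and all function names are hypothetical. The direct plug-in bandwidth is not covered.

```python
import numpy as np

def biweight(u):
    """Biweight kernel K(u) = (15/16) (1 - u^2)^2 on |u| <= 1."""
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def kernel_fit(x0, x, y, h, degree=1, leave_one_out=False):
    """Local constant (degree=0) or local linear (degree=1) fit at the points x0."""
    d = x[None, :] - x0[:, None]          # d[i, j] = x_j - x0_i
    w = biweight(d / h)
    if leave_one_out:                     # meaningful only when x0 is x itself
        np.fill_diagonal(w, 0.0)
    s0, t0 = w.sum(axis=1), (w * y).sum(axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        if degree == 0:
            return t0 / s0
        s1, s2 = (w * d).sum(axis=1), (w * d**2).sum(axis=1)
        t1 = (w * d * y).sum(axis=1)
        return (s2 * t0 - s1 * t1) / (s0 * s2 - s1**2)   # local intercept

def cv_score(x, y, h, degree):
    """Mean squared leave-one-out error; infinite if some fit is undefined."""
    fit = kernel_fit(x, x, y, h, degree, leave_one_out=True)
    return float(np.mean((y - fit) ** 2)) if np.all(np.isfinite(fit)) else np.inf

def select_bandwidth(x, y, grid, degree):
    """Return the grid value minimizing the CV score, with all scores."""
    scores = np.array([cv_score(x, y, h, degree) for h in grid])
    return float(grid[np.argmin(scores)]), scores

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)                          # hypothetical covariate
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 200)   # hypothetical response
grid = np.linspace(0.05, 0.5, 19)                       # arbitrary candidate bandwidths

summary = {}
for degree, name in [(0, "local constant"), (1, "local linear")]:
    h_cv, _ = select_bandwidth(x, y, grid, degree)
    fitted = kernel_fit(x, x, y, h_cv, degree)
    summary[name] = (h_cv, float(np.sum((fitted - y) ** 2)))
    print(f"{name}: h_cv={h_cv:.3f}, RSS={summary[name][1]:.2f}")
```

Because the biweight kernel has compact support, a very small bandwidth can leave a point with no neighbours in its window; the sketch treats such bandwidths as inadmissible by assigning them an infinite CV score.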
5. Consider the data from patients undergoing corrective spinal surgery. The objective was to determine important risk factors for kyphosis. The response ($y$) is the presence or absence of kyphosis. The risk factor ($x$) is the age in weeks. The data is in kyphosis.txt.
We consider the following model:
$$y_i \sim \mathrm{Bernoulli}\big(\pi(x_i)\big) \quad \text{with} \quad \log\left\{ \frac{\pi(x_i)}{1 - \pi(x_i)} \right\} = f(x_i).$$
(a) Suppose that we use the local constant and local linear kernel methods to estimate $\pi(x)$. Choose the bandwidths $h$ by the CV method, and use $h$ to calculate the estimators $\hat\pi(x_i)$.
(b) Assume that f(x) is a linear function of x. We estimate π(x) under this assumption. Plot this estimator and local constant / linear kernel estimators in the same graph. Put clear and proper legends on it. Does the linear assumption for f(x) look reasonable here?
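For the comparison in Question 5(b), the parametric fit and a kernel fit can be computed side by side. The sketch below is an illustration under assumptions: the data are synthetic (kyphosis.txt is not read), the bandwidth is a placeholder rather than a CV choice, the local constant estimator is taken to be the kernel-weighted proportion of $y_i = 1$, and the linear-logit model is fitted by Newton-Raphson. The function names are hypothetical.

```python
import numpy as np

def biweight(u):
    """Biweight kernel K(u) = (15/16) (1 - u^2)^2 on |u| <= 1."""
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def local_constant_prob(x0, x, y, h):
    """Kernel-weighted proportion of y = 1 around each point of x0."""
    w = biweight((x[None, :] - x0[:, None]) / h)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def logistic_newton(x, y, n_iter=25, tol=1e-10):
    """Fit log{pi/(1 - pi)} = b0 + b1 * x by Newton-Raphson."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        step = np.linalg.solve(info, X.T @ (y - p))   # information^{-1} * score
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(5)
age = rng.uniform(1.0, 200.0, size=300)                 # hypothetical ages
true_pi = 1.0 / (1.0 + np.exp(-(-2.0 + 0.015 * age)))   # hypothetical pi(x)
y = rng.binomial(1, true_pi)

grid = np.linspace(20.0, 180.0, 9)
h = 40.0                                                # placeholder, not a CV choice
pi_kernel = local_constant_prob(grid, age, y, h)
beta = logistic_newton(age, y)
pi_logit = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * grid)))

for g, pk, pl in zip(grid, pi_kernel, pi_logit):
    print(f"x={g:6.1f}  kernel={pk:.3f}  linear-logit={pl:.3f}")
```

Plotting `pi_kernel` and `pi_logit` against `grid` gives the kind of visual comparison the question asks for; a systematic gap between the two curves would suggest that a linear $f(x)$ is too restrictive.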
6. This problem refers to data from a study of nesting horseshoe crabs. Each female horseshoe crab in the study had a male crab attached to her in her nest. The study investigated factors that affect whether the female crab had any other males, called satellites, residing near her. The response outcome y for each female crab is her number of satellites. The covariate (x) is the female crab’s carapace width. The data is in crab.txt.
We consider the following model:
$$y_i \sim \mathrm{Poisson}\big(\mu(x_i)\big) \quad \text{with} \quad \log\{\mu(x_i)\} = f(x_i).$$
You are asked to answer the following questions:
(a) Following the idea in Section 3.4, please write down the formulations of $l_i(\beta_0, \beta_1)$ and the performance measure. Also, determine the CV score.
(b) Choose the bandwidth $h$ by the CV score based on the local constant and local linear kernel estimators, and then calculate $\hat\mu(x_i)$ by the local constant and local linear kernel estimators.
(c) Assume that $f(x)$ is a linear function of $x$. Estimate $\mu(x)$ under this assumption and plot it together with the local constant and local linear kernel estimators in the same graph. Put clear and proper legends on it. Does the linear assumption look reasonable here?
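For Question 6, one generic way to obtain a local linear estimate of $\mu(x_0)$ is to maximize a kernel-weighted Poisson log-likelihood, $\sum_i K\{(x_i - x_0)/h\}\,[\,y_i \eta_i - \exp(\eta_i)\,]$ with $\eta_i = \beta_0 + \beta_1 (x_i - x_0)$, and to report $\exp(\hat\beta_0)$. Section 3.4 is not reproduced in this handout, so this formulation is an assumption and may differ from the course's $l_i(\beta_0, \beta_1)$ and CV score. The data are synthetic (crab.txt is not read), the bandwidth is a placeholder, and the function names are hypothetical.

```python
import numpy as np

def biweight(u):
    """Biweight kernel K(u) = (15/16) (1 - u^2)^2 on |u| <= 1."""
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def local_constant_poisson(x0, x, y, h):
    """With beta_1 fixed at 0, the maximizer is the kernel-weighted mean of y."""
    w = biweight((x - x0) / h)
    return float(np.sum(w * y) / np.sum(w))

def local_linear_poisson(x0, x, y, h, n_iter=50, tol=1e-10):
    """Newton-Raphson for the kernel-weighted Poisson log-likelihood at x0."""
    d = x - x0
    w = biweight(d / h)
    Z = np.column_stack([np.ones_like(d), d])
    # Start from the local constant solution with zero slope.
    beta = np.array([np.log(np.sum(w * y) / np.sum(w)), 0.0])
    for _ in range(n_iter):
        mu = np.exp(Z @ beta)
        score = Z.T @ (w * (y - mu))
        info = Z.T @ (Z * (w * mu)[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return float(np.exp(beta[0]))          # estimate of mu(x0)

rng = np.random.default_rng(7)
width = np.linspace(21.0, 33.0, 400)       # hypothetical carapace widths
true_mu = np.exp(-3.0 + 0.16 * width)      # hypothetical mean function
y = rng.poisson(true_mu)

h = 3.0                                    # placeholder, not a CV choice
grid = np.linspace(24.0, 30.0, 7)
mu_lc = np.array([local_constant_poisson(x0, width, y, h) for x0 in grid])
mu_ll = np.array([local_linear_poisson(x0, width, y, h) for x0 in grid])

for g, a, b in zip(grid, mu_lc, mu_ll):
    print(f"x={g:5.1f}  local constant={a:.3f}  local linear={b:.3f}")
```

The starting value requires a positive weighted mean of $y$ in the window; with real count data containing many zeros and a small bandwidth, that condition should be checked before taking the logarithm.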