MATH 4130B 3.0 – Fall 2019 Assignment 1
(Due Date: September 26, 2019)
Question 1: The aim of this question is to review some of the basic concepts you have learned in the prerequisite courses. Consider the model
yi = β + εi,  i = 1, …, n,
where ε1, …, εn are independent and identically distributed with mean 0 and variance σ².
a. Using the least squares method, obtain the estimator of β.
b. What is the variance of the least squares estimator of β?
c. Obtain an estimator for σ².
d. Assuming ε1, …, εn are iid normally distributed, obtain a 95% confidence interval for β.
e. Assuming ε1, …, εn are iid normally distributed, obtain a 95% prediction interval for the response Y.
f. For graduate students only. Re-do parts (a) to (c), using the likelihood method with the assumption that ε1, . . . , εn are iid normally distributed.
Question 2: As in Question 1, the aim of this question is to review some of the basic concepts you have learned in the prerequisite courses. The data set is given in the Excel file (tab “Question 2”). x is the waist (in inches) and y is the body fat (in percentage).
a. Obtain the sample mean, sample variance and sample standard deviation for y.
b. Repeat part (a) but for x.
c. Obtain the sample covariance and sample correlation between x and y. What does the sample correlation tell you about the association between x and y?
d. Let W = 2X − 5Y + 1. Based on the data, obtain a 90% confidence interval for the mean of W. Clearly state all the necessary assumptions that you need in order to answer this question.
e. Verify the assumptions that you stated in part (d).
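
As a rough illustration of the quantities asked for in Question 2, the following sketch computes the sample mean, variance, covariance and correlation with the usual n − 1 divisor, and checks the standard identities for a linear combination W = 2X − 5Y + 1. The function names and the five data pairs are made up for illustration; they are not the waist and body fat data from the Excel file.

```python
import math

def sample_mean(values):
    return sum(values) / len(values)

def sample_cov(xs, ys):
    # Sample covariance with the n - 1 divisor; sample_cov(xs, xs) is the sample variance.
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def sample_corr(xs, ys):
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Made-up numbers, NOT the assignment data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# W = 2X - 5Y + 1 computed observation by observation.
w = [2 * xi - 5 * yi + 1 for xi, yi in zip(x, y)]

# Mean and variance of W obtained from the summary statistics of x and y.
mean_w = 2 * sample_mean(x) - 5 * sample_mean(y) + 1
var_w = 4 * sample_cov(x, x) + 25 * sample_cov(y, y) - 20 * sample_cov(x, y)

print(sample_mean(w), mean_w)
print(sample_cov(w, w), var_w)
print(sample_corr(x, y))
```

Either route to the mean and variance of W gives the same numbers, so the confidence interval in part (d) can be built from the w values directly or from the summary statistics of x and y.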
Question 3: For the data set given in Question 2, a simple linear regression model is used to study the relationship between variable y and variable x.
a. Clearly state the model with all the assumptions.
b. State your predicted model.
c. Is the model a significant model? Give 2 reasons to support your answer.
d. Are any of the assumptions stated in part (a) violated? Why?
e. Is there any evidence that the slope of the linear regression model is different from 1?
f. Based on the model obtained in part (b), predict the value of y at x = 18. Should you trust your prediction? Why?
g. Obtain a 90% confidence interval for the mean response of y at x = 30.
h. Obtain a 90% prediction interval for a new response of y at x = 30.
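
The calculations behind Question 3 can be sketched as follows, assuming the usual simple linear regression formulas: slope Sxy/Sxx, intercept ȳ − slope·x̄, and MSE = SSE/(n − 2). The helper names and the data are hypothetical, not the assignment data, and the t quantile needed to turn a standard error into an interval is left to be looked up in a table or in software.

```python
import math

def fit_simple_linear(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)  # estimate of the error variance
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "slope": slope,
            "intercept": intercept, "mse": mse,
            "se_slope": math.sqrt(mse / sxx)}

def standard_errors_at(fit, x0):
    """Standard errors for the mean response and for a new response at x0."""
    leverage = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    se_mean = math.sqrt(fit["mse"] * leverage)
    se_pred = math.sqrt(fit["mse"] * (1 + leverage))
    return se_mean, se_pred

# Made-up numbers, NOT the assignment data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

fit = fit_simple_linear(x, y)
t_slope_vs_1 = (fit["slope"] - 1) / fit["se_slope"]  # test statistic for H0: slope = 1
se_mean, se_pred = standard_errors_at(fit, 3.0)
print(fit["intercept"], fit["slope"], t_slope_vs_1, se_mean, se_pred)
```

Each interval then has the form estimate ± t·(standard error), with the t quantile taken on n − 2 degrees of freedom. The prediction standard error is always the larger of the two because it also carries the variance of the new error term.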
Question 4: Consider the following stochastic process:
Zt = A cos(ωt + Θ),
where A is a random variable with mean 0 and variance 1, and Θ follows the Uniform(−π, π) distribution. Assume A and Θ are independently distributed.
a. Find the mean of Zt .
b. Obtain the covariance function of lag k.
c. Obtain the correlation function of lag k.
Question 5: The data set is given in the Excel file (tab “Question 5”).
a. Obtain the time plot and describe the trend.
b. Calculate the sample mean.
c. Calculate the sample variance.
d. Calculate the sample autocovariance of lag 1, and the sample autocovariance of lag 2.
e. Calculate the sample autocorrelation of lag 1, and the sample autocorrelation of lag 2.
f. Using the first 10 data points from this data set only, re-do parts (b) to (e) without using any software.
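
For checking hand calculations in Question 5, a minimal sketch of the sample autocovariance and autocorrelation is given below. It assumes the common convention c_k = (1/n) Σ (x_t − x̄)(x_{t+k} − x̄), summed over t = 1, …, n − k, with r_k = c_k / c_0. The course notes may use a different divisor (for example n − 1 for the sample variance in part (c)), so adjust accordingly. The function names and the short series are made up for illustration.

```python
def sample_autocov(series, lag):
    # Assumed convention: divide by n (not n - lag) and centre at the overall mean.
    n = len(series)
    mean = sum(series) / n
    return sum((series[t] - mean) * (series[t + lag] - mean)
               for t in range(n - lag)) / n

def sample_autocorr(series, lag):
    return sample_autocov(series, lag) / sample_autocov(series, 0)

# Made-up series, NOT the assignment data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]

for k in (1, 2):
    print(k, sample_autocov(x, k), sample_autocorr(x, k))
```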
Question 6: (revisit Question 5) The data set is given in the Excel file (tab “Question 5”). Consider the following model:
Zt = φ1 Zt−1 + φ2 Zt−2 + εt,  t = 3, …, 98,
where
Zt = (Xt − E(Xt)) / var(Xt),
E(Xt) and var(Xt) are obtained in Question 5 part (b) and part (c) respectively, and ε3, …, ε98 are iid standard normal.
a. Applying the least squares method, obtain the predicted model. Note that we are using only 96 data points (3rd data point to 98th data point only) for this question.
b. Based on the predicted model in part (a), forecast the Z99 and Z100 values and the corresponding X99 and X100.
c. Compare the forecasted values obtained in part (b) with the true observed values given in the data set. Are they reasonable forecasts? Justify your answer.
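
The least squares fit and the two-step forecast in Question 6 can be sketched as below, assuming a no-intercept regression of Zt on Zt−1 and Zt−2 solved through the 2×2 normal equations. The names `fit_ar2`, `forecast_two_steps` and `to_original_scale` are hypothetical, and the demo series is generated from known coefficients rather than taken from the Excel file. The `scale` argument stands for whatever denominator is used in the definition of Zt.

```python
def fit_ar2(z):
    """Least squares estimates for z[t] = phi1*z[t-1] + phi2*z[t-2] + e[t] (no intercept)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(z)):
        s11 += z[t - 1] * z[t - 1]
        s12 += z[t - 1] * z[t - 2]
        s22 += z[t - 2] * z[t - 2]
        b1 += z[t] * z[t - 1]
        b2 += z[t] * z[t - 2]
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    phi1 = (b1 * s22 - s12 * b2) / det
    phi2 = (s11 * b2 - s12 * b1) / det
    return phi1, phi2

def forecast_two_steps(z, phi1, phi2):
    """One- and two-step-ahead forecasts, with future errors set to their mean 0."""
    z_next = phi1 * z[-1] + phi2 * z[-2]
    z_next2 = phi1 * z_next + phi2 * z[-1]
    return z_next, z_next2

def to_original_scale(z_value, mean, scale):
    # Inverts z = (x - mean) / scale; `scale` is the denominator used to define Zt.
    return mean + scale * z_value

# Noise-free demo series built from known coefficients, NOT the assignment data.
z = [1.0, 2.0]
for _ in range(8):
    z.append(0.5 * z[-1] - 0.3 * z[-2])

phi1, phi2 = fit_ar2(z)
z11, z12 = forecast_two_steps(z, phi1, phi2)
print(phi1, phi2, z11, z12)
```

Because the demo series has no noise, the fit recovers the generating coefficients up to rounding; with real data the estimates and forecasts will carry error. For the assignment, the series passed to the fit would run only up to the 98th observation, so the two forecasts correspond to Z99 and Z100.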