Part I includes three 2019 exam paper questions. Part II has TWO questions which are based on the data set named DiatFat.csv in the folder Assignment 2 in OneDrive.
The data set DiatFat.csv gives the percentage of body fat determined by underwater weighing, together with various body circumference measurements, for 252 men. A variety of popular health books suggest that readers assess their health, at least in part, by estimating their percentage of body fat. The percentage of body fat for an individual can be estimated once body density has been determined. The variables listed in the data set DiatFat.csv, from left to right, are:
1. Percent body fat from Brozek’s compartment model (in short, brozek)
2. Percent body fat from Siri’s compartment model (in short, siri)
3. Density determined from underwater weighing (in short, density)
4. Age (years)
5. Weight (lbs)
6. Height (inches) (in short, height)
7. Adiposity index = weight/(height*height) (in short, adipos)
8. Fat-Free adipose tissue mass (in short, free)
9. Neck circumference (cm) (in short, neck)
10. Chest circumference (cm) (in short, chest)
11. Abdomen 2 circumference (cm) (in short, abdom)
12. Hip circumference (cm) (in short, hip)
13. Thigh circumference (cm) (in short, thigh)
14. Knee circumference (cm) (in short, knee)
15. Ankle circumference (cm) (in short, ankle)
16. Biceps (extended) circumference (cm) (in short,)
17. Forearm circumference (cm) (in short, forearm)
18. Wrist circumference (cm) (in short, wrist)
We aim to study the factors that affect body fat. As the body fat index, we choose brozek. As covariates, we choose the 10 circumference measurements in columns 9 to 18 of the data frame (neck through wrist).
Q1. Inspect whether there is multicollinearity among these 10 regressors using the correlation matrix, the condition index, and the VIF (variance inflation factor), each separately.
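As a hint for Q1, the three diagnostics can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not on DiatFat.csv: the three regressors and the random seed are made up so that the third column is almost a linear combination of the first two. The condition index is computed here from the eigenvalues of the correlation matrix, which is one common definition; your course notes may use a slightly different scaling.

```python
import numpy as np

# Synthetic stand-in data (NOT DiatFat.csv): the third regressor is nearly
# x1 + x2, which forces strong multicollinearity.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# 1) Correlation matrix of the regressors (columns are variables).
R = np.corrcoef(X, rowvar=False)

# 2) Condition indices: sqrt(largest eigenvalue / each eigenvalue) of R.
#    Large values (a common rule of thumb is above 30) signal trouble.
eigvals = np.linalg.eigvalsh(R)  # returned in ascending order
cond_index = np.sqrt(eigvals.max() / eigvals)

# 3) VIFs: the diagonal of the inverse correlation matrix, which equals
#    1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the others.
vif = np.diag(np.linalg.inv(R))

print("correlation matrix:\n", np.round(R, 3))
print("condition indices:", np.round(cond_index, 2))
print("VIFs:", np.round(vif, 2))
```

Note that pairwise correlations alone can understate the problem: here x1 and x2 are almost uncorrelated with each other, yet the VIFs are large because the dependence involves all three columns at once.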
Q2. Fit a multiple linear regression (MLR) of the body fat index brozek on the 10 factors by least squares estimation (LSE).
Q3. Consider modelling the body fat index brozek with the 10 factors by ridge regression.
Q3.1 Fit the ridge regression by ridgeLSE.
Q3.2 Draw the ridge trace plot.
Q3.3 Suggest how to choose the ridge parameter.
Q3.4 Compare the estimates from Q2 and Q3.1. What do you find?
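As a hint for Q3, the idea behind a ridge fit and a ridge trace can be sketched as below. This is an illustration under stated assumptions, not the ridgeLSE routine referred to above: it uses the textbook closed form (Z'Z + kI)^(-1) Z'y on centred, unit-length-scaled regressors, the helper name ridge_coef is hypothetical, and the data are synthetic rather than DiatFat.csv.

```python
import numpy as np

# Synthetic stand-in data (NOT DiatFat.csv) with a near-collinear column.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 - x2 + 0.5 * x3 + rng.normal(size=n)

# Centre and scale each column to unit length, so Z'Z is the correlation
# matrix; centre y so that no intercept is needed.
Z = (X - X.mean(axis=0)) / (X.std(axis=0, ddof=1) * np.sqrt(n - 1))
yc = y - y.mean()


def ridge_coef(Z, yc, k):
    """Hypothetical helper: ridge estimate (Z'Z + k I)^(-1) Z'y."""
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)


# Ridge trace: one row of coefficients per value of the ridge parameter k.
ks = np.linspace(0.0, 1.0, 21)
trace = np.array([ridge_coef(Z, yc, k) for k in ks])

# k = 0 reproduces ordinary least squares on the same scaled data.
ols = np.linalg.lstsq(Z, yc, rcond=None)[0]

for k, b in zip(ks[::5], trace[::5]):
    print(f"k = {k:.2f}: {np.round(b, 3)}")
```

Plotting each column of trace against ks gives the ridge trace. One informal way to choose k is to pick the smallest value at which the coefficient paths have stabilised; cross-validation is a common alternative.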
Q4. Consider modelling the body fat index brozek with the 10 factors by principal component regression. Use the prcomp() function to compute the principal components.
Q4.1 Give the representation of the first principal component.
Q4.2 How much variation can be explained by the first principal component?
Q4.3 How much variation can be explained by the second principal component?
Q4.4 Choosing the first two principal components, estimate the effect of each factor on the body fat index based on principal component regression.
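As a hint for Q4, the steps of principal component regression can be sketched as follows. This is a minimal NumPy illustration on synthetic data, not on DiatFat.csv. It assumes the regressors are standardized before the decomposition, which roughly corresponds to prcomp(X, scale. = TRUE) in R; whether to scale is a modelling choice that you should state in your answer.

```python
import numpy as np

# Synthetic stand-in data (NOT DiatFat.csv) with a near-collinear column.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 - x2 + 0.5 * x3 + rng.normal(size=n)

# Standardize the regressors (assumption: scaled PCA).
mean = X.mean(axis=0)
sd = X.std(axis=0, ddof=1)
Z = (X - mean) / sd

# PCA via SVD: the columns of `loadings` play the role of prcomp's rotation.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
loadings = Vt.T
explained = s**2 / np.sum(s**2)  # proportion of variance per component

# Representation of PC1: a linear combination of the standardized columns.
print("PC1 loadings:", np.round(loadings[:, 0], 3))
print("explained variance:", np.round(explained, 4))

# Regress centred y on the first m component scores.
m = 2
scores = Z @ loadings[:, :m]
gamma = np.linalg.lstsq(scores, y - y.mean(), rcond=None)[0]

# Map the component coefficients back to the original regressors.
beta_std = loadings[:, :m] @ gamma  # effects per standardized regressor
beta = beta_std / sd                # effects per original unit
intercept = y.mean() - mean @ beta
print("PCR coefficients:", np.round(beta, 3), "intercept:", round(intercept, 3))
```

The last step is the one asked for in Q4.4: the regression is fitted on the component scores, and the loadings translate those coefficients into an effect for each original factor.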