EXAMINER: Johanna G. Nešlehová
ASSOC. EXAMINER:
INSTRUCTIONS
GENERALIZED LINEAR MODELS MATH 523 SECTION: 001
START: APRIL 13, 2022, 1 PM EDT END: APRIL 14, 2022, 2 PM EDT
• Open book exam. Calculators and translation dictionaries are permitted.
• The exam should take you about 3h to complete. It is TIMED: you have 4h to complete
it and upload your solutions (unless you have a special arrangement with OSD).
• You must complete the exam entirely independently, without consulting with anyone (in
person, virtually, through an online chat, phone or otherwise).
• The exam is available for 25h, beginning at 1 PM EDT on Wednesday, April 13 and
ending at 2 PM EDT on Thursday, April 14.
• You must submit your exam via Crowdmark. You will receive a notification from Crowdmark when the exam is available. After clicking the “Start the assessment now” button in the notification email from Crowdmark, the timer will start and you can view, solve, and submit the exam.
• Exams submitted after the 4-hour timer has run out, or after the final due date at 2 PM EDT on April 14, will receive a 0 mark, whatever the reason.
• Failure to respond to Question 1 will result in a 0 mark; please see instructions for Question 1 below.
• The instructor is available on Wednesday, April 13 from 1 PM-3:30 PM, 8:30 PM-9:30 PM and on Thursday, April 14 from 9 AM-12 PM to answer questions; clarifications for everyone (if needed) will be posted as announcements on myCourses. Therefore, it is advisable to check the announcements on myCourses prior to starting your exam.
MATH 523 Generalized Linear Models (Winter 2022) Page 1 of 15
Winter 2022 Final Exam
Version Number: 1
1. Carefully review and sign the document
CoverPage-Final.pdf
Upload the SIGNED document on Crowdmark as your response to Question 1. The document is also available on myCourses under Content/Exams.
Attention: Failing to upload the signed document on Crowdmark will result in a 0 mark on the exam. If you have trouble viewing, uploading or signing the document please contact Professor Nešlehová immediately.
2. The inverse Gaussian distribution has density of the form
\[ f(y;\mu,\lambda) = \left(\frac{\lambda}{2\pi y^{3}}\right)^{1/2} \exp\left\{-\frac{\lambda (y-\mu)^{2}}{2\mu^{2} y}\right\} \]
for y > 0, with parameters μ > 0 and λ > 0.
(a) Show that the family of inverse Gaussian distributions is an exponential dispersion family. Identify the functions b(·), c(·) as well as the canonical and the dispersion parameters.
(b) Fill in the gaps in the following sentence (no calculation will be graded): The mean-variance relationship for the inverse Gaussian family is ______.
(c) Identify the canonical link for an inverse Gaussian GLM. Comment on the suitability of the canonical link. What other link functions might be appropriate and why?
(d) Fill in the gaps in the following sentence (no calculation will be graded): When the inverse Gaussian GLM with the identity link and design matrix X is used, the standard errors of the estimators β̂ are the entries of the matrix ______, where ______ is a diagonal matrix with entries w_i = ______.
Using past data on home sales, it is of interest to explain how the price of a home (price, in 1000 USD) depends on the size of the home (size, in square feet), the number of bedrooms (beds) and whether the home is new (new; 1=yes, 0=no).
(e) The summary of the fitted inverse Gaussian GLM m1 is on Page 4, lines 1–25. Fill in the gaps in the following sentence (no calculation will be graded): The predicted price of a new 3000-square-foot home with 4 bedrooms is ______.
(f) A simpler model m2 with only size and the intercept as predictors has been fitted to the data. Using a suitable statistical test at the 5% level and the output on Page 4, decide whether it is a reasonable simplification of m1.
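As a numerical sanity check on the inverse Gaussian density stated in Question 2, the following Python sketch evaluates the density and integrates it with a simple midpoint rule. The parameter values (mu = 1, lam = 2), the truncation point and the number of grid cells are illustrative assumptions, not values taken from the exam; the function names are hypothetical.

```python
import math


def inverse_gaussian_pdf(y, mu, lam):
    """Density f(y; mu, lam) of the inverse Gaussian distribution, y > 0."""
    return math.sqrt(lam / (2.0 * math.pi * y ** 3)) * math.exp(
        -lam * (y - mu) ** 2 / (2.0 * mu ** 2 * y)
    )


def midpoint_moments(mu, lam, upper=50.0, n=200_000):
    """Approximate the total mass and the mean on (0, upper) by the midpoint rule.

    `upper` and `n` are assumed values; the right tail beyond `upper`
    is negligible for the parameters used below.
    """
    h = upper / n
    total = 0.0
    mean = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        f = inverse_gaussian_pdf(y, mu, lam)
        total += f * h
        mean += y * f * h
    return total, mean


# Illustrative (assumed) parameter values.
total, mean = midpoint_moments(mu=1.0, lam=2.0)
print(round(total, 4), round(mean, 4))
```

The total mass should come out close to 1 and the mean close to mu, which is a quick way to confirm that the formula has been transcribed correctly.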
Output for Question 2 (Page 4 of 15)
 2 glm(formula = price ~ size + new + size:new + beds,
 3     family = inverse.gaussian(link = "identity"))
 5 Deviance Residuals:
 6       Min         1Q     Median         3Q        Max
 7 -0.153808  -0.025300  -0.006148   0.009637   0.084056
 9 Coefficients:
10              Estimate Std. Error t value Pr(>|t|)
11 (Intercept)  -9.78572   14.89902  -0.657   0.5129
12 size          0.12754    0.01446   8.821 5.45e-14 ***
13 new         -33.20859   89.40044  -0.371   0.7111
14 beds        -14.06834    8.44154  -1.667   0.0989 .
15 size:new      0.03135    0.05795   0.541   0.5898
17 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
19 (Dispersion parameter for inverse.gaussian family taken to be 0.001045622)
21     Null deviance: 0.24773 on 99 degrees of freedom
22 Residual deviance: 0.10848 on 95 degrees of freedom
23 AIC: 1078.9
25 Number of Fisher Scoring iterations: 25
27 > sum(residuals(m1,"pearson")^2)
28 [1] 0.09940069
30 > deviance(m2)
31 [1] 0.1143216
3. Consider the Binomial GLM with the canonical link.
(a) Decide which of the following statements are TRUE (no calculation will be graded):
(i) The parameter estimates do not depend on the data entry format (grouped or ungrouped).
(ii) The standard errors depend on the data entry format (grouped or ungrouped).
(iii) The deviance depends on the data entry format (grouped or ungrouped).
(iv) The Pearson residuals do not depend on the data entry format (grouped or
ungrouped).
(b) Fill in the gaps in the following sentence (no calculation will be graded): The deviance residuals for a Binomial GLM with the canonical link when the data are entered in the grouped form are given by ______. For the same model, when the data are entered in the ungrouped form, the deviance residuals are ______ (write either “the same as in the grouped case” or provide a formula). 4 MARKS
(c) Will the residuals calculated in part (b) change when the probit link is used? Explain.
Consider a study of car accident records in Florida, reporting I (whether the injury was fatal or not); S (whether a seatbelt was in use, with values Y for yes and N for no); and E (whether the occupant was ejected from the car during the accident, with values Y for yes and N for no).
(d) A number of models have been fitted to the data (for all models, the data were entered in the same format); their description appears on Page 6, lines 1–5. Using the entire output on Page 6, find the most suitable model for these data. Use suitable statistical tests at the 5% level. 5 MARKS
For the remainder of this question, consider the model 1+E+S whose summary is provided on Page 6, lines 10–31.
(e) The calculation in the output on Page 6, lines 33–34 reports a p-value. What can you conclude from it? Why is the p-value meaningful? 1 MARK
(f) What dependence pattern does the model describe? 1 MARK
(g) Fill in the gaps in the following sentence (no calculation will be graded): The odds of a fatal injury when not wearing a seatbelt are ______ times ______ (choose between “higher” or “lower”) compared to when wearing a seatbelt, with a 95% confidence interval ______. The effect of wearing a seatbelt ______ (choose between “does” or “does not”) depend on whether ejected. 4 MARKS
(h) Fill in the gaps in the following sentence (no calculation will be graded): The odds of a fatal injury when ejected are ______ times ______ (choose between “higher” or “lower”) compared to when not ejected, with a 95% confidence interval ______. The effect of being ejected ______ (choose between “does” or “does not”) depend on whether a seatbelt is used. 4 MARKS
Output for Question 3 (Page 6 of 15)
 1 glm(formula = cbind(fatal, nonfatal) ~ 1, family = binomial)
 2 glm(formula = cbind(fatal, nonfatal) ~ 1 + E, family = binomial)
 3 glm(formula = cbind(fatal, nonfatal) ~ 1 + S, family = binomial)
 4 glm(formula = cbind(fatal, nonfatal) ~ 1 + S + E, family = binomial)
 5 glm(formula = cbind(fatal, nonfatal) ~ S * E, family = binomial)
 7 > round(c(deviance(m0),deviance(m1),deviance(m2),deviance(m3),deviance(m4)),3)
 8 [1] 3567.723 1144.636 1680.412    2.854    0.000
11 glm(formula = cbind(fatal, nonfatal) ~ 1 + S + E, family = binomial)
13 Deviance Residuals:
14       1        2        3        4
15 -1.6132   0.3142   0.3256  -0.2165
17 Coefficients:
18             Estimate Std. Error z value Pr(>|z|)
19 (Intercept) -5.04362    0.03120 -161.65   <2e-16 ***
20 SY          -1.71732    0.05402  -31.79   <2e-16 ***
21 EY           2.79779    0.05526   50.63   <2e-16 ***
23 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
25 (Dispersion parameter for binomial family taken to be 1)
27     Null deviance: 3567.723 on 3 degrees of freedom
28 Residual deviance:    2.854 on 1 degrees of freedom
29 AIC: 38.039
31 Number of Fisher Scoring iterations: 3
33 > pchisq(2.854,df=1,lower.tail=FALSE)
34 [1] 0.0911469
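The tail probability printed on lines 33–34 of the Question 3 output can be cross-checked without R. The Python sketch below relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard Normal variable, so its upper tail probability can be written with the complementary error function. The function name is hypothetical and the sketch covers only the case df = 1.

```python
import math


def chi2_upper_tail_df1(x):
    """P(X > x) for X ~ chi-squared with 1 degree of freedom, x >= 0.

    Uses P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for Z ~ Normal(0, 1).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Residual deviance of the model 1 + S + E, as printed in the output.
p_value = chi2_upper_tail_df1(2.854)
print(round(p_value, 7))
```

The printed value should agree with the R result shown in the output up to rounding.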
4. (a) Derive the likelihood equations for a Poisson GLM with the identity link and explain how these equations simplify when the canonical link is used. 4 MARKS
Consider the number of damage incidents and aggregate months of service for different types of cargo ships, broken down by year of construction (with values “60” for 1960–64, “65” for 1965–69, “70” for 1970–74, and “75” for 1975–79) and period of operation (with values “60” for 1960–74 and “75” for 1975–79).
(b) The output of the first model m1 fitted to these data is shown on Page 8, lines 1–25. Identify the model, the response and the predictors; state which predictors are treated as continuous variables and which are factors. 4 MARKS
(c) Why is service treated as offset? 1 MARK
(d) Fill in the gaps in the following sentence using model m1 (no calculation will be graded): The ______ of incidents in cargo ships ______ (choose between “increases” or “decreases”) by ______ for ships operating between 1975 and 1979, with 95% confidence interval ______.
(e) Using m1, ships constructed in which year appear to be the safest? Select from the answers below (no calculation will be graded)
A) 1960-64 B) 1965-69 C) 1970-74 D) 1975-79
(f) Would period still be significant at the 5% level if the quasi-Poisson model were used
with the same predictors as m1? Test using the Wald test.
(g) Is there significant overdispersion present? Assess this using the entire output on
Page 8 and a suitable statistical test at the 5% level.
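For orientation on how a quasi-Poisson fit relates to the Poisson output, the Python sketch below shows one common way to estimate the dispersion: divide the sum of squared Pearson residuals by the residual degrees of freedom, then inflate the Poisson standard errors by the square root of that estimate. The numbers are copied from the Question 4 output; the variable names are hypothetical, and this is a sketch of the arithmetic rather than a complete test.

```python
import math

# Values copied from the Poisson output for m1 (Question 4).
pearson_chi2 = 82.73714   # sum of squared Pearson residuals
residual_df = 29          # residual degrees of freedom
estimate = 0.3875         # coefficient of as.factor(period)75
poisson_se = 0.1181       # its standard error under the Poisson model

# Moment estimate of the dispersion parameter.
dispersion = pearson_chi2 / residual_df

# Quasi-Poisson standard error: Poisson standard error scaled by sqrt(dispersion).
quasi_se = poisson_se * math.sqrt(dispersion)

# Wald statistic for the period effect under the quasi-Poisson model.
wald_stat = estimate / quasi_se

print(round(dispersion, 3), round(quasi_se, 4), round(wald_stat, 3))
```

The point estimates themselves are unchanged by this rescaling; only the standard errors, and hence the Wald statistics, are affected.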
Output for Question 4 (Page 8 of 15)
glm(formula = incidents ~ offset(log(service)) + as.factor(period) +
    as.factor(year), family = poisson, data = ships)
Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)          -6.9477     0.1269 -54.733  < 2e-16 ***
as.factor(period)75   0.3875     0.1181   3.281  0.00104 **
as.factor(year)65     0.7542     0.1488   5.070 3.99e-07 ***
as.factor(year)70     1.0509     0.1576   6.669 2.57e-11 ***
as.factor(year)75     0.7041     0.2203   3.196  0.00139 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 146.328 on 33 degrees of freedom
Residual deviance:  62.365 on 29 degrees of freedom
AIC: 170.23
> logLik(m2)
'log Lik.' -80.11592 (df=5)
> sum(residuals(m1,"pearson")^2)
[1] 82.73714

glm.nb(formula = incidents ~ offset(log(service)) + as.factor(period) +
    as.factor(year), data = ships, link = log, init.theta = 7.67193894)
Coefficients:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)          -6.9335     0.2812 -24.655  < 2e-16
as.factor(period)75   0.3536     0.2271   1.557  0.11937
as.factor(year)65     1.0125     0.3251   3.114  0.00184
as.factor(year)70     1.2551     0.3090   4.061 4.88e-05
as.factor(year)75     0.7595     0.3893   1.951  0.05106
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for Negative Binomial(7.6719) family taken to be 1)
    Null deviance: 55.576 on 33 degrees of freedom
Residual deviance: 36.847 on 29 degrees of freedom
AIC: 167.75
              Theta: 7.67
          Std. Err.: 4.78
 2 x log-likelihood: -155.748
5. Consider the Baseline Category Logit model for a response with categories {1, ..., K}, with category K as the baseline category. Denote the parameters in this model β_j, j ≠ K. Now suppose that the baseline has been changed to category K* ∈ {1, ..., K} instead of category K, K ≠ K*.
(a) Show how the parameters change when the baseline is K* instead of K. 4 MARKS
(b) Show that the probabilities π_j(X_i), j = 1, ..., K, are the same when the baseline is K* as when the baseline is K. 6 MARKS
In a knee injury study, pain after 10 days following the injury is measured on a four-point scale representing the severity of PAIN (from 1 = no pain to 4 = severe pain). The explanatory variables collected are TH (0 placebo, 1 treatment), GEN (gender with 0 = male, 1 = female), and AGE (age in years).
(c) A model has been fitted to these data whose summary appears on Page 10, lines 1–17. Specify the model that has been used, the response, and the predictors. 2 MARKS
(d) Using the output on Page 10 and a suitable statistical test at the 5% level, decide whether GEN and AGE are significant predictors. 4 MARKS
(e) Consider the simpler model with the intercept and TH. Using the output on Page 10, lines 20–36, calculate the fitted probabilities for each level of pain on the four-point scale (1–4) for this model. 4 MARKS
(f) Sketch the fitted probabilities calculated in part (e) and provide a qualitative summary of the treatment effect. 3 MARKS
(g) Which other type of model could have been used to analyze these data? Explain.
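The invariance discussed in parts (a) and (b) of Question 5 can be illustrated numerically, although such a check is not a proof. The Python sketch below uses K = 3 categories, a single scalar covariate, and made-up parameter values; all names and numbers are hypothetical. It re-expresses the parameters relative to a new baseline by subtracting the new baseline's parameters from every category and confirms that the category probabilities coincide.

```python
import math


def bcl_probabilities(betas, x):
    """Category probabilities in a baseline category logit model.

    betas[j] = (intercept, slope) for category j; the baseline category
    carries (0.0, 0.0). Probabilities are proportional to exp(linear predictor).
    """
    scores = [math.exp(a + b * x) for a, b in betas]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical parameters with the last category (index 2) as baseline.
betas_K = [(0.5, -1.0), (-0.2, 0.3), (0.0, 0.0)]

# Switch the baseline to the first category (index 0): subtract its
# parameters from those of every category.
k_star = 0
a_star, b_star = betas_K[k_star]
betas_Kstar = [(a - a_star, b - b_star) for a, b in betas_K]

x = 1.7  # arbitrary covariate value
p_K = bcl_probabilities(betas_K, x)
p_Kstar = bcl_probabilities(betas_Kstar, x)

print([round(p, 6) for p in p_K])
print([round(p, 6) for p in p_Kstar])
```

Both printed lists should coincide, since subtracting a common term from every linear predictor cancels in the ratio that defines each probability.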
Output for Question 5 (Page 10 of 15)
 2 multinom(formula = PAIN ~ GEN + AGE + TH)
 4 Coefficients:
 5    (Intercept)         GEN         AGE         TH
 6 2 0.0003548329 -1.18840437 -0.01822593  1.3407833
 7 3 0.3961681871 -0.65176745 -0.01131277 -0.2459782
 8 4 1.5756946211  0.06707541 -0.03802770 -1.2622678
10 Std. Errors:
11   (Intercept)       GEN        AGE        TH
12 2   0.9734467 0.6007834 0.02742365 0.5708325
13 3   0.9651109 0.6014755 0.02798168 0.5481900
14 4   0.9105105 0.5459158 0.02732269 0.5547571
16 Residual Deviance: 311.1885
17 AIC: 335.1885
21 multinom(formula = PAIN ~ TH)
23 Coefficients:
24   (Intercept)         TH
25 2  -0.8266765  1.2515469
26 3  -0.1335288 -0.3018058
27 4   0.4054703 -1.1592485
29 Std. Errors:
30   (Intercept)        TH
31 2   0.4531638 0.5501292
32 3   0.3659628 0.5325988
33 4   0.3227487 0.5366464
35 Residual Deviance: 319.1105
36 AIC: 331.1105
Table of the Normal distribution
Entries in the table are the values of the cumulative distribution function Φ of the Normal(0, 1) distribution, evaluated at z.
  z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1  0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2  0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3  0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4  0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5  0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6  0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7  0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8  0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9  0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0  0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1  0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2  0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3  0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4  0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5  0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6  0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7  0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1  0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2  0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3  0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4  0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
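Entries of the Normal table can be spot-checked with the error function from the Python standard library. The sketch below is only a convenience for verifying a few cells; the function name and the chosen z values are illustrative assumptions.

```python
import math


def std_normal_cdf(z):
    """Cumulative distribution function of the Normal(0, 1) distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Row label plus column label gives z, e.g. row 1.9 and column 0.06 give z = 1.96.
for z in (0.0, 1.0, 1.96, 2.58):
    print(z, round(std_normal_cdf(z), 4))
```

Each printed value, rounded to four decimals, should match the corresponding table cell.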
Table of the Chi-squared distribution
Entries in the table are χ²_α(ν): the upper α tail quantile of the Chi-squared(ν) distribution.
α given in columns, ν given in rows.
  ν    0.99500    0.99000    0.97500    0.95000    0.90000    0.10000    0.05000
  1               0.00016    0.00098    0.00393    0.01579    2.70554    3.84146
  2               0.02010    0.05064    0.10259    0.21072    4.60517    5.99146
  3               0.11483    0.21580    0.35185    0.58437    6.25139    7.81473
  4               0.29711    0.48442    0.71072    1.06362    7.77944    9.48773
  5               0.55430    0.83121    1.14548    1.61031    9.23636   11.07050
  6               0.87209    1.23734    1.63538    2.20413   10.64464   12.59159
  7               1.23904    1.68987    2.16735    2.83311   12.01704   14.06714
  8               1.64650    2.17973    2.73264    3.48954   13.36157   15.50731
  9               2.08790    2.70039    3.32511    4.16816   14.68366   16.91898
 10    2.15586    2.55821    3.24697    3.94030    4.86518   15.98718   18.30704
 11    2.60322    3.05348    3.81575    4.57481    5.57778   17.27501   19.67514
 12    3.07382    3.57057    4.40379    5.22603    6.30380   18.54935   21.02607
 13    3.56503    4.10692    5.00875    5.89186    7.04150   19.81193   22.36203
 14    4.07467    4.66043    5.62873    6.57063    7.78953   21.06414   23.68479
 15    4.60092    5.22935    6.26214    7.26094    8.54676   22.30713   24.99579
 16    5.14221    5.81221    6.90766    7.96165    9.31224   23.54183   26.29623
 17    5.69722    6.40776    7.56419    8.67176   10.08519   24.76904   27.58711
 18    6.26480    7.01491    8.23075    9.39046   10.86494   25.98942   28.86930
 19    6.84397    7.63273    8.90652   10.11701   11.65091   27.20357   30.14353
 20    7.43384    8.26040    9.59078   10.85081   12.44261   28.41198   31.41043
 21    8.03365    8.89720   10.28290   11.59131   13.23960   29.61509   32.67057
 22    8.64272    9.54249   10.98232   12.33801   14.04149   30.81328   33.92444
 23    9.26042   10.19572   11.68855   13.09051   14.84796   32.00690   35.17246
 24    9.88623   10.85636   12.40115   13.84843   15.65868   33.19624   36.41503
 25   10.51965   11.52398   13.11972   14.61141   16.47341   34.38159   37.65248
 26   11.16024   12.19815   13.84390   15.37916   17.29188   35.56317   38.88514
 27   11.80759   12.87850   14.57338   16.15140   18.11390   36.74122   40.11327
 28   12.46134   13.56471   15.30786   16.92788   18.93924   37.91592   41.33714
 29   13.12115   14.25645   16.04707   17.70837   19.76774   39.08747   42.55697
 30   13.78672   14.95346   16.79077   18.49266   20.59923   40.25602   43.77297
 40   20.70654   22.16426   24.43304   26.50930   29.05052   51.80506   55.75848
 50   27.99075   29.70668   32.35736   34.76425   37.68865   63.16712   67.50481
 60   35.53449   37.48485   40.48175   43.18796   46.45889   74.39701   79.08194
 70   43.27518   45.44172   48.75756   51.73928   55.32894   85.52704   90.53123
 80   51.17193   53.54008   57.15317   60.39148   64.27784   96.57820  101.87947
 90   59.19630   61.75408   65.64662   69.12603   73.29109              113.14527
100   67.32756   70.06489   74.22193   77.92947   82.35814              124.34211