May 2022 Project
ST227 Survival Analysis
2020/21 and 2021/22 syllabus only
Instructions to candidates
This paper contains four questions. Answer ALL FOUR.
Question 1: 25 marks
Question 2: 30 marks
Question 3: 25 marks
Question 4: 20 marks
The marks in brackets reflect marks for each part of a question.
Time allowed Reading Time: Writing Time:
©LSE LT 2022/ST227
Page 1 of 2
1. Consider the following mortality intensity function:
$$\mu(t) = \frac{\alpha\gamma t^{\gamma-1}}{1+\alpha t^{\gamma}}, \qquad \alpha = 3.757\times 10^{-3}, \quad \gamma = 1.4243.$$
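As a reminder rather than a model answer, the survival probability follows from the intensity through the standard identity sketched below. Reading the argument of $\mu$ as attained age is an assumption about the paper's convention. Under that reading the integral has a closed form, because the numerator of $\mu$ is the derivative of its denominator.

```latex
\begin{align*}
\mu(s) &= \frac{\alpha\gamma s^{\gamma-1}}{1+\alpha s^{\gamma}}
        = \frac{d}{ds}\ln\left(1+\alpha s^{\gamma}\right), \\
{}_tp_x &= \exp\left(-\int_x^{x+t}\mu(s)\,ds\right)
         = \exp\left(-\Big[\ln\left(1+\alpha s^{\gamma}\right)\Big]_x^{x+t}\right)
         = \frac{1+\alpha x^{\gamma}}{1+\alpha (x+t)^{\gamma}}.
\end{align*}
```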
(a) Define in R the survival probability function $(t, x) \mapsto {}_tp_x$ and calculate the probability of surviving the next 3 years for a 20-year-old individual.
[5 marks]
(b) Define in R the density function for $T_{20}$. This definition may involve a numerical
[5 marks]
(c) Calculate the expected remaining lifetime for a 20-year-old individual.
[5 marks]
(d) Define in R the cumulative distribution function of $T_{20}$. This definition may involve a numerical integral.
[5 marks]
(e) Discuss how one can numerically find the median of this distribution. Outline the approach only; you are not required to solve for the median.
[5 marks]
[Total 25 marks]
2. This question is divided into two parts. Both parts use the same data set of fully observed lifetimes given below:
80 75 38 45 62 65 77 92 65 60 55 67 72 46 64 35 68 52 45 94
(a) Let us suppose that this data set comes from a Log-Normal distribution, i.e.:
$$f(x \mid \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln(x)-\mu)^2}{2\sigma^2}\right) \tag{1}$$
i. Using the results:
$$E(X) = \exp\left(\mu + \frac{\sigma^2}{2}\right), \qquad \mathrm{Var}(X) = \left(\exp(\sigma^2) - 1\right)\exp\left(2\mu + \sigma^2\right), \tag{2}$$
derive the method of moment estimators for $\mu$ and $\sigma$.
[5 marks]
ii. Using your MMEs above as the initial values for optim or otherwise, derive the MLE for $\mu$ and $\sigma$.
[10 marks]
(b) We propose a lifetime model with the following mortality intensity function:
$$\mu(t) = \lambda\gamma(\lambda t)^{\gamma-1}, \quad t \ge 0.$$
i. Derive algebraically the probability density function for lifetime and write down the joint likelihood of the given sample.
ii. Using optim and the initial values $\lambda_0 = 67$ and $\gamma_0 = 0.2233$, numerically obtain the maximum likelihood estimators of the model parameters.
iii. For the purpose of lifetime modelling, what range of values for $\gamma$ would yield a sensible model?
[2 marks]
[Total 30 marks]
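One possible route for part 2(b)(i), sketched under the assumption that the lifetime $T$ is measured from $t = 0$: integrate the intensity to obtain the cumulative hazard, then use $f(t) = \mu(t)\,S(t)$. The symbols $S$, $f$ and $L$, and the notation $t_1, \dots, t_n$ for the $n = 20$ observed lifetimes, are introduced here for illustration and do not appear in the paper.

```latex
\begin{align*}
\int_0^t \mu(s)\,ds &= \int_0^t \lambda\gamma(\lambda s)^{\gamma-1}\,ds = (\lambda t)^{\gamma}
  \qquad (\gamma > 0), \\
S(t) &= P(T > t) = \exp\left(-(\lambda t)^{\gamma}\right), \\
f(t) &= \mu(t)\,S(t) = \lambda\gamma(\lambda t)^{\gamma-1}\exp\left(-(\lambda t)^{\gamma}\right), \\
L(\lambda, \gamma) &= \prod_{i=1}^{n} f(t_i)
  = \prod_{i=1}^{n} \lambda\gamma(\lambda t_i)^{\gamma-1}\exp\left(-(\lambda t_i)^{\gamma}\right).
\end{align*}
```

This is the Weibull form. In practice one would usually minimise the negative log of $L$ rather than work with $L$ itself, since a product of 20 small densities can underflow.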
3. Cancer patients who are in remission are observed and the number of days until the symptoms reappear is recorded. Some records have been right-censored. The data set is provided in a spreadsheet named cancer.xlsx and the columns therein are:
• time: the time until reappearance of symptoms in number of days.
• event: an indicator variable taking value 0 if the record has been right-censored and 1 if fully observed.
• fullyObserved: logical variable indicating whether the record has been fully observed.
• sex: categorical variable with value 0 for male (the reference group) and 1 for female.
(a) Calculate the Kaplan-Meier estimate for survival probabilities.
[10 marks]
(b) Denote by $T$ the time until reappearance of cancer. Using the following formula:
$$E(T^n) = \int_0^\infty n t^{n-1} P(T > t)\,dt,$$
propose and calculate a suitable estimation for $\mathrm{Var}(T)$. (Hint: you can use the Midpoint Rule, the Trapezoidal/Trapezium Rule or any geometric method of approximating the area under the curve.)
[15 marks]
[Total 25 marks]
4. In this question, we will fit a Cox Proportional Hazard model on the same data set as in Question 3, with time as the response variable and sex as the categorical covariate.
(a) By using the survival package, calculate the MLE for the Cox Proportional Hazard Model.
[10 marks]
(b) Based on the output you have generated, perform the z-test, Score test, and Likelihood Ratio test on the following hypotheses:
$$H_0: \beta = 0 \quad \text{vs} \quad H_1: \beta \neq 0.$$
[7 marks]
(c) It is hypothesised that cases with remission time greater than 125 belong to a different class of cancer. Create a data frame containing this subset of cases.
[3 marks]
[Total 20 marks]
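For reference, a sketch in standard notation of the model and the three test statistics that Question 4 refers to. The symbols $h_0$, $\hat\beta$, $\ell$, $U$ and $I$ are assumptions introduced here: $\ell$ is the partial log-likelihood, $U$ its first derivative (the score) and $I$ the observed information. With a single covariate, each statistic is compared against its usual one-degree-of-freedom reference distribution under $H_0$.

```latex
\begin{align*}
h(t \mid \text{sex}) &= h_0(t)\exp(\beta \cdot \text{sex}), \\
\text{z-test (Wald):}\quad z &= \frac{\hat\beta}{\operatorname{se}(\hat\beta)},
  \qquad \operatorname{se}(\hat\beta) = I(\hat\beta)^{-1/2},
  \qquad z \;\dot\sim\; N(0, 1), \\
\text{Score test:}\quad \frac{U(0)^2}{I(0)} &\;\dot\sim\; \chi^2_1, \\
\text{Likelihood Ratio test:}\quad 2\left(\ell(\hat\beta) - \ell(0)\right) &\;\dot\sim\; \chi^2_1.
\end{align*}
```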