ST227: Applications of R in Life Insurance
Functions in R
27/02/2021

▶ These ST227 workshops explore R’s application in the context of Life Insurance. The students are assumed to be familiar with the following topics:
▶ Data Types (Numeric, Character and Logical)
▶ Data Structures (Vector, List and Data Frames)
▶ Basic Iterations (for loops and apply family of functions).
▶ If you are not familiar with the topics listed above, we recommend consulting one of:
▶ ST226 course material,
▶ LSE’s pre-sessional R course,
▶ Office hours!

Motivation
▶ Why programming?
▶ Data revolution: modern needs to tackle big, diverse and complex data sets.
▶ Reusable code and minimising human error.
▶ Very much in demand in the job market.
▶ Why R?
▶ High-level language and simple syntax. Minimal communication with the underlying
▶ Functional Programming style focusing on complex operations and simple data structures.
▶ The de-facto official language of statistical analysis.
▶ Interfaces with Python, C, C++ and more.
▶ The actual official language of the IFoA.

How to be a successful R programmer?
Most R programmers (including myself) are not very good at programming. Reason: R “programmers” are mostly trained statisticians and amateur programmers.
If that sounds like you, it’s important to expect the challenges:
▶ It is application-driven.
▶ Difficult to maintain readability of code.
▶ Perfect code can eliminate human error. However, writing code is itself very error-prone.
▶ You spend more time debugging than writing code.
How to overcome these challenges?
▶ Study R code in context.
▶ Modularise code into functions (more later).
▶ Frequent testing.
▶ Again, frequent testing.
Ultimately, programming is mostly mental muscle memory. Practice often.
All LSE academics in the department of statistics are experienced R users. Chase after your instructors. Utilise your resources!

▶ R is a functional programming language:
▶ We focus on the many operations performed on data (as opposed to object-oriented languages).
▶ Almost all tasks in R are achieved by defining, composing and reusing functions.
▶ What exactly are functions?
▶ Mathematical definition: a rule that maps each input value in the domain to a corresponding output in the co-domain. For example: $f: \mathbb{R} \to \mathbb{R}$, $x \mapsto x^2$.
▶ Programming definition: a set of instructions that can be executed whenever called for. Let’s see an example:
f <- function(x){
  x^2   # the value of the last expression is returned
}
▶ Anatomy of a function:
▶ A name so we can refer to it. (There are anonymous functions, but let’s not worry about them for now.)
▶ An input argument or several arguments.
▶ The body of the function, which contains instructions.
▶ When we call this function, its body is executed:
f(x = 3)
f(2)
▶ An example of a function with multiple input arguments:
g <- function(x,y){
  x^2 + y^2
}
g(x=1,y=2)
▶ Major differences from mathematical functions:
1. A programming function can have no input at all.
2. It might modify the context around it.
3. The input might not uniquely determine the output.
▶ Example:
g <- function(){
  print("Hello World!")
  print("Are you enjoying ST227?")
}
▶ An example of non-unique output:
h <- function(){
  rnorm(1,mean=0,sd=1)   # one draw from the standard normal distribution
}
▶ Technically, if you know the seed and the random number algorithm, then the output is unique. But conceptually we can think of this as a random-output function.
▶ Modifying external context and non-unique outputs can be undesirable behaviours. Document instances of them and proceed with caution.

Modelling Lifetime
▶ Let us fix some notation:
▶ $T_x$: remaining lifetime of an individual aged $x$ (an $\mathbb{R}_+$-valued random variable).
▶ ${}_t p_x = P(T_x \ge t)$, i.e. the $t$-year survival probability of a life aged $x$.
▶ ${}_t q_x = P(T_x < t) = 1 - {}_t p_x$, i.e. the probability that a life aged $x$ dies within $t$ years.
▶ $f_x(t) = -\frac{d}{dt}\, {}_t p_x$, i.e. the density of $T_x$.
▶ Assume the distribution of $T_0$ is known. The force of mortality is defined by:
$\mu_x = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon} P(T_x \le \varepsilon).$
▶ The reverse is also true. If we are given a function $\mu: \mathbb{R}_+ \to \mathbb{R}_+$, then we can recover ${}_t p_x$ by:
${}_t p_x = \exp\left(-\int_x^{x+t} \mu_s \, ds\right) = \exp\left(-\int_0^t \mu_{x+s} \, ds\right).$
▶ The remaining lifetime distributions for all ages are completely determined by the force of mortality.
▶ From a computational point of view, we have reduced a two-argument function, ${}_t p_x : \mathbb{R}_+ \times \mathbb{R}_+ \to [0, 1]$, to a one-argument function, $\mu: \mathbb{R}_+ \to \mathbb{R}_+$.
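The recovery formula for ${}_t p_x$ can be justified in two lines. The sketch below assumes that $T_x$ has the distribution of $T_0 - x$ conditional on $T_0 > x$ (the usual convention, not spelled out above) and that ${}_t p_x$ is differentiable in $t$.

```latex
\begin{align*}
\mu_{x+t}
  &= \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\, P(T_{x+t} \le \varepsilon)
   = \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\,
     \frac{{}_t p_x - {}_{t+\varepsilon} p_x}{{}_t p_x}
   = -\frac{1}{{}_t p_x}\, \frac{d}{dt}\, {}_t p_x
   = -\frac{d}{dt} \log {}_t p_x, \\[4pt]
\int_0^t \mu_{x+s}\, ds
  &= -\log {}_t p_x + \log {}_0 p_x
   = -\log {}_t p_x
  \quad\Longrightarrow\quad
  {}_t p_x = \exp\left(-\int_0^t \mu_{x+s}\, ds\right).
\end{align*}
```

The substitution $s \mapsto s - x$ turns the last integral into $\int_x^{x+t} \mu_s \, ds$, which is the other form quoted above. The first line also shows that the density can be written as $f_x(t) = {}_t p_x \, \mu_{x+t}$.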
▶ Constructing a mortality function from scratch is beyond the scope of this course. We will examine a few commonly used forms.
▶ Let us consider the simplest case, constant mortality. Suppose $\mu_t \equiv 0.05$. Calculate the probability that individuals aged 20, 40 and 80 will survive the next 20 years.
▶ Direct calculation is possible: ${}_{20} p_x = e^{-0.05 \times 20} = e^{-1} \approx 0.368$ for every $x$.
▶ We will use the “complicated” method to obtain a general template for this type of problem.
mu <- function(t){
  rep(0.05,times=length(t))   # one value per element of t
}
tpx <- function(t,x){
  exp(-integrate(mu,lower=x,upper=t+x)$value)
}
tpx(t=20,x=20)
tpx(t=20,x=40)
tpx(t=20,x=80)
▶ We duplicate the output of $\mu$ for vectorisation purposes. We’ll revisit this shortly.
▶ We see that the current age does not matter: the person has the same chance of surviving the next 20 years. This is called the memoryless property. Is this a good way to model human lifetime?

Modelling Lifetime | Example: Makeham’s mortality
▶ A commonly considered class of mortality is Gompertz-Makeham’s (or simply Makeham’s) mortality, which has the form:
$\mu_x = A + B c^x.$
▶ Exercise: using Makeham’s mortality with $A = 5 \times 10^{-4}$, $B = 7.5858 \times 10^{-5}$ and $c = 1.09144$, calculate the probability that individuals aged 20, 40 and 80 will survive the next 20 years.
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
mu <- function(t){
  A + B*c^t   # Makeham's force of mortality at age t
}
tpx <- function(t,x){
  exp(-integrate(mu,lower=x,upper=t+x)$value)
}
tpx(t=20,x=20)
tpx(t=20,x=40)
tpx(t=20,x=80)

A quick note on vectorisation
▶ In programming, one often needs to apply a function $f: \mathbb{R} \to \mathbb{R}$ to a vector $x = (x_1, \ldots, x_n)$ on an element-by-element basis, i.e. $f(x) = (f(x_1), f(x_2), \ldots, f(x_n))$.
▶ This is called vectorisation. Many base functions in R have been vectorised by design.
▶ The integrate routine requires a vectorised function (see ?integrate).
▶ The function tpx defined above has not been vectorised. If you try tpx(t=20:25,x=20), it will throw an error.
▶ We will need to vectorise it in t for later applications of integrate.
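The points in this note can be illustrated with a minimal sketch in base R. The names mu_scalar, mu_vec and attempt are illustrative and do not come from the course material. Vectorize() is shown as one possible way to wrap a scalar function, not necessarily the approach the course takes.

```r
# Base arithmetic and maths functions act element by element.
x <- c(1, 4, 9)
sqrt(x)     # 1 2 3
x^2 + 1     # 2 17 82

# A constant hazard written without rep() returns a single number,
# whatever the length of its input, so it is not vectorised.
mu_scalar <- function(t) 0.05

# integrate() evaluates its integrand on a whole vector of points at once,
# so it rejects mu_scalar. tryCatch() captures the error object so that
# the script keeps running.
attempt <- tryCatch(
  integrate(mu_scalar, lower = 0, upper = 1),
  error = function(e) e
)
inherits(attempt, "error")   # TRUE

# Vectorize() wraps a scalar function so that it is applied to each element.
mu_vec <- Vectorize(mu_scalar)
mu_vec(c(10, 20, 30))                          # 0.05 0.05 0.05
integrate(mu_vec, lower = 0, upper = 1)$value  # approximately 0.05
```

This is why the constant mortality function uses rep(0.05, times = length(t)): it returns one value per input point. The same consideration applies to tpx, which still needs to be vectorised in t.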
Let us redefine tpx:
tpx <- function(t,x){
  # apply the scalar calculation to each element of t
  sapply(t, function(t){
    exp(-integrate(mu,lower=x,upper=t+x)$value)
  })
}

Curtate Lifetime
▶ The curtate lifetime of an individual is the integer part of their total lifetime. For instance, if a person dies at age 85 years and 6 months, then their curtate lifetime is 85.
▶ Denote by $K_x$ the remaining curtate lifetime of an individual aged $x$. Then:
$K_x = \lfloor T_x \rfloor,$
where the mapping $t \mapsto \lfloor t \rfloor$ is the floor function.
▶ Exercise: calculate the expected curtate lifetime of a 20-year-old individual with Makeham’s mortality function.
▶ The expected curtate lifetime is:
$E(K_{20}) = E(\lfloor T_{20} \rfloor) = \int_0^\infty \lfloor t \rfloor \, \mu_{20+t} \, {}_t p_{20} \, dt.$
▶ Let us define the integrand. The floor function is built into R, so we can use it:
integrand <- function(t){
  floor(t)*mu(20+t)*tpx(t,20)
}
integrate(integrand,lower=0,upper=100)
▶ Handling an integral over $(0, \infty)$ is a bit tricky. We replace $\infty$ with 100 as an approximation.
▶ It can be shown that:
$E(K_x) = \sum_{n=1}^{\infty} {}_n p_x.$
▶ Use this formula to calculate the expected curtate lifetime:
probs <- sapply(1:100,function(n){tpx(n,20)})
sum(probs)

Life tables
▶ Given an initial age $x_0$ and maximum age $\omega$, let $l_{x_0}$ be an arbitrary positive number (usually 100,000). For any $0 \le t \le \omega - x_0$, define:
$l_{x_0+t} = l_{x_0} \times {}_t p_{x_0}.$
▶ $l_{x_0+t}$ can be interpreted as the number of survivors at time $t$ out of $l_{x_0}$ individuals aged $x_0$ at time 0.
▶ Moreover, define:
$d_x = l_x \times q_x = l_x \times (1 - p_x).$
▶ A life table is typically expressed in the following format (for concreteness, assume $x_0 = 10$ and $\omega = 100$):
x      l_x      d_x
10     l_10     d_10
11     l_11     d_11
...    ...      ...
100    l_100    d_100
▶ Given a mortality function $\mu$ and boundary ages $x_0$ and $\omega$, we can construct a life table:
x0 <- 10; omega <- 100
x <- x0:omega
lx <- numeric(length(x))
lx[1] <- 1e05   # 1e05 = 10^5
for(i in 2:length(lx)){
  print(x[i-1])   # progress indicator
  lx[i] <- lx[i-1]*tpx(1,x[i-1])
}
▶ We will also need the sequence of probabilities $p_x$ and $q_x$ for $x = 10, \ldots, 100$.
px <- sapply(x,function(n){tpx(t=1,x=n)})
qx <- 1 - px
dx <- lx*qx
▶ The last step is to assemble the data frame:
lifeTable <- data.frame(x,lx,dx)
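Since frequent testing was recommended earlier, here is a hedged sketch of a few sanity checks on the life table. It rebuilds the columns so that it runs on its own. It builds lx with cumprod() instead of the loop, which is an alternative construction and not the one used above. The names c_par (the Makeham parameter c, renamed here to avoid confusion with the base function c()), lx_direct and n_rows are illustrative.

```r
# Makeham hazard and vectorised survival function, repeated from the text
A <- 5e-04; B <- 7.5858e-05; c_par <- 1.09144
mu <- function(t) A + B * c_par^t
tpx <- function(t, x) {
  sapply(t, function(s) exp(-integrate(mu, lower = x, upper = x + s)$value))
}

x0 <- 10; omega <- 100
x <- x0:omega

# One-year survival probabilities p_x for x = x0, ..., omega
px <- sapply(x, function(n) tpx(t = 1, x = n))

# Recursive construction without an explicit loop: l_{x+1} = l_x * p_x
lx <- 1e05 * cumprod(c(1, px[-length(px)]))
dx <- lx * (1 - px)

# Check 1: the direct definition l_{x0+t} = l_{x0} * tp_{x0} should give
# the same column (t = 0 is handled separately, since 0p_x = 1)
lx_direct <- 1e05 * c(1, tpx(t = (x - x0)[-1], x = x0))
max(abs(lx - lx_direct))   # should be negligible relative to 1e05

# Check 2: deaths equal the drop in survivors, d_x = l_x - l_{x+1}
n_rows <- length(lx)
max(abs(dx[-n_rows] - (lx[-n_rows] - lx[-1])))

# Check 3: the number of survivors never increases with age
all(diff(lx) <= 0)

lifeTable <- data.frame(x, lx, dx)
head(lifeTable)
```

If check 1 or check 2 reports a large discrepancy, the most likely cause is a tpx that is not vectorised in t or a mu that has been overwritten by an earlier definition.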