ST227: Applications of R in Life Insurance
Functions in R
27/02/2021
▶ These ST227 workshops explore R’s application in the context of Life Insurance. The students are assumed to be familiar with the following topics:
▶ Data Types (Numeric, Character and Logical)
▶ Data Structures (Vector, List and Data Frames)
▶ Basic Iterations (for loops and apply family of functions).
▶ If you are not familiar with the topics listed above, we recommend consulting:
▶ ST226 course material,
▶ LSE’s pre-sessional R course,
▶ Office hours!
Motivation
▶ Why programming?
▶ Data revolution: the modern need to tackle big, diverse and complex data sets.
▶ Reusable code and minimising human error.
▶ In high demand and often required in the job market.
▶ Why R?
▶ High-level language and simple syntax. Minimal communication with the underlying
▶ Functional Programming style focusing on complex operations and simple data structures.
▶ The de-facto official language of statistical analysis.
▶ Interfaces with Python, C, C++ and more.
▶ The actual official language of the IFoA.
How to be a successful R programmer?
Most R programmers (including myself) are not very good at programming. Reason: R “programmers” are mostly trained statisticians and amateur programmers.
If that sounds like you, it’s important to expect the challenges:
▶ It is application-driven.
▶ It is difficult to keep code readable.
▶ Perfect code can eliminate human error. However, writing code can be very error-prone.
▶ You spend more time debugging than writing code.
How to overcome these challenges?
▶ Study R code in context.
▶ Modularise code into functions (more later).
▶ Frequent testing.
▶ Again, frequent testing.
Ultimately, programming is mostly mental muscle memory. Practice often.
All LSE academics in the Department of Statistics are experienced R users. Chase after your instructors. Utilise your resources!
▶ R is a functional programming language:
▶ We focus on the many operations performed on data (as opposed to object-oriented languages).
▶ Almost all tasks in R are achieved by defining, composing and reusing functions.
▶ What exactly are functions?
▶ Mathematical definition: a rule that maps each input value in the domain to a corresponding output in the co-domain. For example: $f:\mathbb{R}\to\mathbb{R}$, $x \mapsto x^2$.
▶ Programming definition: a set of instructions that can be executed whenever called for. Let’s see an example:
f <- function(x){
  x^2
}
▶ Anatomy of a function:
▶ A name so we can refer to it. (There are anonymous functions, but let’s not worry about them for now.)
▶ An input argument or several arguments.
▶ The body of the function which contains instructions.
▶ When we call this function, its body is executed:
f(x = 3)
f(2)
▶ An example of a function with multiple input arguments:
g <- function(x, y){
  x^2 + y^2
}
g(x = 1, y = 2)
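The anatomy described above (name, arguments, body) can be inspected directly in R. The following is a minimal sketch using the base functions `formals()` and `body()`; it also illustrates that named arguments may be supplied in any order.

```r
# Same two-argument function as in the example above
g <- function(x, y) {
  x^2 + y^2
}

# The argument list and the body are ordinary R objects
arg_names <- names(formals(g))  # names of the input arguments
arg_names
body(g)                         # the instructions that run when g is called

# Positional and named calls give the same result
by_position <- g(1, 2)
by_name     <- g(y = 2, x = 1)  # named arguments can appear in any order
by_position
by_name
```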
▶ Major differences from mathematical functions:
1. A programming function can have no input at all.
2. It might modify the context around it.
3. The input might not uniquely determine the output.
▶ Example:
g <- function(){
  print("Hello World!")
  print("Are you enjoying ST227?")
}
▶ An example of non-unique output:
h <- function(){
  rnorm(1, mean = 0, sd = 1)
}
▶ Technically, if you know the seed and the random number algorithm, then the output is unique. But conceptually we can think of this as a random-output function.
▶ Modifying external context and non-unique outputs can be undesirable behaviours. Document instances of them and proceed with caution.
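Both behaviours can be made concrete with a short sketch. The seed value 227 and the names `counter` and `increment` are illustrative assumptions, not part of the course code. Fixing the seed with `set.seed()` makes the "random" output reproducible, while the super-assignment operator `<<-` is one way a function can modify the context around it.

```r
# A function with no input and random output
h <- function() {
  rnorm(1, mean = 0, sd = 1)
}

# With the same seed, the output is reproduced exactly
set.seed(227)  # arbitrary illustrative seed
first_draw <- h()
set.seed(227)
second_draw <- h()
identical(first_draw, second_draw)

# A function that modifies a variable outside its own body (a side effect)
counter <- 0
increment <- function() {
  counter <<- counter + 1  # assigns to 'counter' in the enclosing environment
  invisible(counter)
}
increment()
increment()
counter
```

Side effects like the one in `increment` are easy to lose track of, which is why documenting them is recommended.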
Modelling Lifetime
▶ Let us fix some notation:
▶ $T_x$: remaining lifetime of an individual aged $x$ (an $\mathbb{R}_+$-valued random variable).
▶ ${}_tp_x = P(T_x \ge t)$, i.e. the $t$-year survival probability of a life aged $x$.
▶ ${}_tq_x = P(T_x < t) = 1 - {}_tp_x$, i.e. the probability that a life aged $x$ dies within $t$ years.
▶ $f_x(t) = -\frac{d}{dt}\,{}_tp_x$, i.e. the density of $T_x$.
▶ Assuming the distribution of $T_0$ is known, the force of mortality is defined by:
$$\mu_x = \lim_{\varepsilon \to 0^+} \frac{1}{\varepsilon} P(T_x \le \varepsilon).$$
▶ The reverse is also true. If we are given a function $\mu: \mathbb{R}_+ \to \mathbb{R}_+$, then we can recover ${}_tp_x$ by:
$${}_tp_x = \exp\left(-\int_x^{x+t} \mu_s \, ds\right) = \exp\left(-\int_0^t \mu_{x+s} \, ds\right).$$
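The two integral forms of the survival probability can be checked numerically. The sketch below assumes a hypothetical linear force of mortality `mu_demo` (chosen only because its integral has a simple closed form) and illustrative values `age` and `years`; it is not one of the mortality models used in the course.

```r
# Hypothetical force of mortality, for illustration only
mu_demo <- function(s) 0.001 + 0.0005 * s

age   <- 30  # x
years <- 10  # t

# First form: integrate mu_s over [x, x + t]
p_first <- exp(-integrate(mu_demo, lower = age, upper = age + years)$value)

# Second form: integrate mu_{x+s} over [0, t]
p_second <- exp(-integrate(function(s) mu_demo(age + s),
                           lower = 0, upper = years)$value)

# Closed form of the integral for this particular linear mu_demo
p_closed <- exp(-(0.001 * years + 0.0005 * ((age + years)^2 - age^2) / 2))

c(p_first, p_second, p_closed)
```

All three values should agree up to numerical integration error.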
▶ The remaining lifetime distributions for all ages are completely determined by the force of mortality.
▶ From a computational point of view, we have reduced a two-argument function, i.e. ${}_tp_x : \mathbb{R}_+ \times \mathbb{R}_+ \to [0, 1]$, to a one-argument function, i.e. $\mu : \mathbb{R}_+ \to \mathbb{R}_+$.
▶ Constructing a mortality function from scratch is beyond the scope of this course. We will examine a few commonly used forms.
▶ Let us consider the simplest case, constant mortality. Suppose $\mu_t \equiv 0.05$. Calculate the probability that individuals aged 20, 40 and 80 will survive the next 20 years.
▶ Direct calculations are possible.
▶ We will use the “complicated” method to obtain a general template for this type of problem.
mu <- function(t){
  rep(0.05, times = length(t))
}
tpx <- function(t, x){
  exp(-integrate(mu, lower = x, upper = t + x)$value)
}
tpx(t = 20, x = 20)
tpx(t = 20, x = 40)
tpx(t = 20, x = 80)
▶ We repeat the output of μ once per element of t so that the function is vectorised. We’ll revisit this shortly.
▶ We see that the current age does not matter: this person has the same chance of surviving the next 20 years. This is called the memoryless property. Is this a good way to model human lifetime?
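For constant mortality the direct calculation is short: the integral of a constant $\mu$ over an interval of length $t$ is $\mu t$, so ${}_tp_x = e^{-\mu t}$ for every age $x$. The sketch below compares this closed form with the numerical template; the names `ages`, `numerical` and `closed_form` are introduced here for illustration.

```r
# Constant force of mortality, as in the example above
mu <- function(t) {
  rep(0.05, times = length(t))
}
tpx <- function(t, x) {
  exp(-integrate(mu, lower = x, upper = t + x)$value)
}

ages <- c(20, 40, 80)

# Numerical template applied to each age
numerical <- sapply(ages, function(age) tpx(t = 20, x = age))

# Closed form: exp(-mu * t), which does not depend on the age
closed_form <- exp(-0.05 * 20)

numerical
closed_form
```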
Modelling Lifetime | Example: Makeham’s mortality
▶ A commonly used class of mortality models is Gompertz-Makeham (or simply Makeham) mortality, which has the form:
$$\mu_x = A + B c^x.$$
▶ Exercise: using Makeham’s mortality with $A = 5 \times 10^{-4}$, $B = 7.5858 \times 10^{-5}$ and $c = 1.09144$, calculate the probability that individuals aged 20, 40 and 80 will survive the next 20 years.
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
mu <- function(t){
  A + B * c^t
}
tpx <- function(t, x){
  exp(-integrate(mu, lower = x, upper = t + x)$value)
}
tpx(t = 20, x = 20)
tpx(t = 20, x = 40)
tpx(t = 20, x = 80)
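Makeham’s force of mortality can be integrated by hand, which gives a useful check on the numerical answers: $\int_0^t (A + Bc^{x+s})\,ds = At + Bc^x (c^t - 1)/\log c$. The sketch below compares the two approaches; the function names `tpx_num` and `tpx_closed` are introduced here for illustration.

```r
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
# Note: naming a number 'c' does not break calls to the function c(),
# because R skips non-function objects when looking up a function call.

mu <- function(t) {
  A + B * c^t
}

# Numerical survival probability (same template as above)
tpx_num <- function(t, x) {
  exp(-integrate(mu, lower = x, upper = t + x)$value)
}

# Closed-form survival probability under Makeham's law
tpx_closed <- function(t, x) {
  exp(-A * t - B * c^x * (c^t - 1) / log(c))
}

ages <- c(20, 40, 80)
numerical <- sapply(ages, function(age) tpx_num(20, age))
closed    <- tpx_closed(20, ages)

data.frame(age = ages, numerical = numerical, closed = closed)
```

Unlike the constant-mortality case, the 20-year survival probability now decreases with age.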
A quick note on vectorisation
▶ In programming, one often needs to apply a function $f:\mathbb{R}\to\mathbb{R}$ to a vector $x = (x_1, \ldots, x_n)$ on an element-by-element basis, i.e.
$$f(x) = (f(x_1), f(x_2), \ldots, f(x_n)).$$
▶ This is called vectorisation. Many base functions in R have been vectorised by design. For instance, sqrt(c(1, 4, 9)) returns the vector 1 2 3.
▶ The integrate routine requires a vectorised function (see ?integrate).
▶ The function tpx defined above has not been vectorised. If you try tpx(t=20:25,x=20), it will throw an error.
▶ We will need to vectorise it in t for later applications of integrate. Let us redefine it:
tpx <- function(t, x){
  sapply(t, function(s){
    # one survival probability per element of t
    exp(-integrate(mu, lower = x, upper = s + x)$value)
  })
}
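The sketch below illustrates two points from this section. First, `integrate` rejects an integrand that does not return one value per input point, which is why the constant mortality function earlier used `rep`. Second, wrapping a scalar function with the base function `Vectorize()` is an alternative to the explicit `sapply` wrapper. The names `tpx_scalar`, `tpx_v` and `bad_mu` are introduced here for illustration.

```r
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
mu <- function(t) {
  A + B * c^t
}

# Scalar version: only valid for a single value of t
tpx_scalar <- function(t, x) {
  exp(-integrate(mu, lower = x, upper = t + x)$value)
}

# Option 1: explicit sapply over t, as in the redefinition above
tpx <- function(t, x) {
  sapply(t, function(s) tpx_scalar(s, x))
}

# Option 2: let Vectorize() build the wrapper
tpx_v <- Vectorize(tpx_scalar, vectorize.args = "t")

via_sapply    <- tpx(20:25, 20)
via_vectorize <- tpx_v(20:25, 20)
via_sapply
via_vectorize

# A non-vectorised integrand: always returns one number, whatever the input length
bad_mu <- function(t) 0.05
result <- tryCatch(
  integrate(bad_mu, lower = 0, upper = 1),
  error = function(e) conditionMessage(e)  # capture the error message instead of stopping
)
result
```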
Curtate Lifetime
▶ The curtate lifetime of an individual is the integer part of their total lifetime. For instance, if a person dies at age 85 years and 6 months, then their curtate lifetime is 85.
▶ Denote by $K_x$ the remaining curtate lifetime of an individual aged $x$. Then:
$$K_x = \lfloor T_x \rfloor,$$
where the mapping $t \mapsto \lfloor t \rfloor$ is the floor function.
▶ Exercise: calculate the expected curtate lifetime of a 20-year-old individual with Makeham’s mortality function.
▶ The expected curtate lifetime is:
$$E(K_{20}) = E(\lfloor T_{20} \rfloor) = \int_0^\infty \lfloor t \rfloor \times \mu_{20+t} \times {}_tp_{20} \, dt.$$
▶ Let us define the integrand. The floor function is built into R, so we can use it:
integrand <- function(t){
  floor(t) * mu(20 + t) * tpx(t, 20)
}
integrate(integrand, lower = 0, upper = 100)
▶ Handling an integral over $(0, \infty)$ is a bit tricky. We replace $\infty$ with 100 as an approximation.
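Because $\lfloor t \rfloor$ jumps at every integer, the integrand above is discontinuous, which adaptive quadrature can find awkward. One possible alternative, sketched below under the same truncation at 100, is to integrate the smooth density $\mu_{20+t}\,{}_tp_{20}$ over each interval $[k, k+1]$ separately and weight the result by $k$. The names `density20` and `pieces` are introduced here for illustration.

```r
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
mu <- function(t) {
  A + B * c^t
}

# Survival probability, vectorised in t
tpx <- function(t, x) {
  sapply(t, function(s) {
    exp(-integrate(mu, lower = x, upper = s + x)$value)
  })
}

# Density of T_20: smooth, so each unit interval is easy to integrate
density20 <- function(t) {
  mu(20 + t) * tpx(t, 20)
}

# On [k, k + 1) the floor function equals k, so weight each piece by k
pieces <- sapply(0:99, function(k) {
  k * integrate(density20, lower = k, upper = k + 1)$value
})
expected_K20 <- sum(pieces)
expected_K20
```

The nested calls to `integrate` make this slower than a single call, so it is a trade-off between speed and robustness.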
▶ It can be shown that:
$$E(K_x) = \sum_{n=1}^{\infty} {}_np_x.$$
▶ Utilise this formula to calculate the expected curtate lifetime:
probs <- sapply(1:100, function(n){tpx(n, 20)})
sum(probs)
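The identity $E(K_x) = \sum_{n \ge 1} {}_np_x$ is an instance of the tail-sum formula for a non-negative integer-valued random variable. A short sketch of the argument, using ${}_np_x = P(T_x \ge n)$ and the fact that $K_x \ge n$ exactly when $T_x \ge n$ for integer $n$:

```latex
\begin{align*}
E(K_x) &= \sum_{k=0}^{\infty} k \, P(K_x = k)
        = \sum_{k=1}^{\infty} \sum_{n=1}^{k} P(K_x = k) \\
       &= \sum_{n=1}^{\infty} \sum_{k=n}^{\infty} P(K_x = k)
        = \sum_{n=1}^{\infty} P(K_x \ge n) \\
       &= \sum_{n=1}^{\infty} P(T_x \ge n)
        = \sum_{n=1}^{\infty} {}_np_x .
\end{align*}
```

The exchange of the order of summation is justified because all terms are non-negative.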
Life tables
▶ Given an initial age $x_0$ and maximum age $\omega$, let $l_{x_0}$ be an arbitrary positive number (usually 100,000). For any $0 \le t \le \omega - x_0$, define:
$$l_{x_0+t} = l_{x_0} \times {}_tp_{x_0}.$$
▶ $l_{x_0+t}$ can be interpreted as the number of survivors at time $t$ out of $l_{x_0}$ individuals aged $x_0$ at time 0.
▶ Moreover, define:
$$d_x = l_x \times q_x = l_x \times (1 - p_x).$$
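These definitions imply a one-step recursion for $l_x$, which is what makes a life table easy to build age by age. A sketch of the reasoning, writing $p_x = {}_1p_x$ and using the exponential formula for ${}_tp_x$ given earlier:

```latex
\begin{align*}
{}_{s+t}p_x
  &= \exp\left(-\int_x^{x+s} \mu_u \, du - \int_{x+s}^{x+s+t} \mu_u \, du\right)
   = {}_sp_x \times {}_tp_{x+s}, \\
l_{x+1}
  &= l_{x_0} \times {}_{x+1-x_0}p_{x_0}
   = l_{x_0} \times {}_{x-x_0}p_{x_0} \times p_x
   = l_x \times p_x, \\
d_x
  &= l_x \times (1 - p_x)
   = l_x - l_{x+1}.
\end{align*}
```

So $d_x$ can be read as the number of deaths between ages $x$ and $x+1$.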
▶ A life table is typically expressed in the following format (for concreteness, assume $x_0 = 10$ and $\omega = 100$):
x     l_x     d_x
10    l_10    d_10
11    l_11    d_11
...   ...     ...
100   l_100   d_100
▶ Given a mortality function μ and boundary ages x0 and ω, we can construct a life table:
x0 <- 10; omega <- 100
x <- x0:omega
lx <- numeric(length(x))
lx[1] <- 1e05 # 1e05 = 10^5
for(i in 2:length(lx)){
  print(x[i-1]) # progress indicator: the age being processed
  lx[i] <- lx[i-1] * tpx(1, x[i-1])
}
▶ We will also need the sequence of probabilities px and qx for x = 10, ..., 100.
px <- sapply(x, function(n){tpx(t = 1, x = n)})
qx <- 1 - px
dx <- lx * qx
▶ The last step is to assemble the data frame:
lifeTable <- data.frame(x, lx, dx)
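The whole construction can also be written without a loop. The sketch below is one possible self-contained variant, assuming the Makeham parameters from the earlier exercise; it replaces the `for` loop with `cumprod` and then checks the table against the definitions.

```r
A <- 5e-04; B <- 7.5858e-05; c <- 1.09144
mu <- function(t) {
  A + B * c^t
}
tpx <- function(t, x) {
  sapply(t, function(s) {
    exp(-integrate(mu, lower = x, upper = s + x)$value)
  })
}

x0 <- 10; omega <- 100
x <- x0:omega

# One-year survival and death probabilities at each age
px <- sapply(x, function(age) tpx(t = 1, x = age))
qx <- 1 - px

# l_{x+1} = l_x * p_x, so l_x is a running product of one-year survival probabilities
lx <- 1e05 * cumprod(c(1, px[-length(px)]))
dx <- lx * qx

lifeTable <- data.frame(x, lx, dx)
head(lifeTable)

# Consistency checks against the definitions
deaths_from_lx <- -diff(lx)              # l_x - l_{x+1}
direct_l20     <- 1e05 * tpx(t = 10, x = 10)  # l_{x0+t} = l_{x0} * tpx0 with t = 10
```

Here `deaths_from_lx` should match `dx` for every age except the last, and `direct_l20` should match the table entry for age 20.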