Rizzo 6.1, 6.2
STAT 513/413: Lecture 13 Monte Carlo integration
(the real stuff)
Include code, please
Probability of occurrence of runs of length k = 2, …, 7 (rows) in sequences of length 50 and 100 (columns):

       50      100
2 1.00000  1.00000
3 0.99998  1.00000
4 0.98149  0.99966
5 0.82057  0.97149
6 0.54471  0.80939
7 0.30812  0.54405

Expected number of such runs:

       50      100
2 24.50000 49.50000
3 12.00000 24.50000
4  5.87500 12.12500
5  2.87500  6.00000
6  1.40625  2.96875
7  0.68750  1.46875
But not like this
Even this does not help
Finally, good to be in R spirit
What is this formatting? Python? Check any book on R to see how R code looks in print!
Also, if you want to be serious with R: this is not R programming style! Loops? In R, you can do all of the “Main Code” in three lines…
Are your computers that slow?
N = 100 is almost never a big deal; often not even N = 10000
(You understand that probability is about large numbers? If not yet, the hope is that after this course you will. It is the most important knowledge you will get here, much more important than coding skills or Monte Carlo technicalities.)
Last but not least
So, for full credit:
– you answer all questions asked
– if it is mathematics, include some justification
– if it is testing in R, include the transcript showing how it was tested
– if it is an algorithm, implement it, and include the R code
– and then include the transcript showing how that code ran, the result that code produced and, if applicable, the pictures that code produced
ALL COMPUTER WORK MUST BE REPRODUCIBLE – otherwise no points
The programming part can be all monospaced
Probability of occurrence and expected number of runs
of length k = 2, 3, 4, 5, 6, 7 in sequences of length 50 and 100
The expected values are the easier part: there is a closed-form solution. Each of the n − k + 1 windows of k consecutive tosses is all heads or all tails with probability 2·(1/2)^k = 2^(−(k−1)), so by linearity of expectation the expected number of runs of length k is (n − k + 1)/2^(k−1):
> rr=cbind(2^(-(1:6))*(49:44),2^(-(1:6))*(99:94))
> rownames(rr)=as.character(2:7)
> colnames(rr)=as.character(c(50,100))
> rr
50 100
2 24.50000 49.50000
3 12.00000 24.50000
4 5.87500 12.12500
5 2.87500 6.00000
6 1.40625 2.96875
7 0.68750 1.46875
My R function generating n coin tosses and counting runs (16 lines):
runs = function(r, n=50, N=100000)
## the numbers of runs of r same elements in n coin tosses
## repeated N times
{
    runs = numeric(N)
    for (k in 1:N) {
        y = diff(floor(2*runif(n)))
        … and so on…
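The slide truncates the function there. A possible completion, as a sketch: the counting logic below is my reconstruction, chosen so that runs of r equal tosses are counted window-wise, matching the closed-form expected values (n − r + 1)/2^(r−1) above; it is not necessarily the original 16 lines.

runs2 = function(r, n=50, N=100000)
## sketch of a reconstruction: counts of runs of r same elements
## in n coin tosses, repeated N times
{
    runs = numeric(N)
    for (k in 1:N) {
        y = diff(floor(2*runif(n)))        # 0 marks two equal neighbouring tosses
        b = rle(y == 0)                    # maximal blocks of equal neighbours
        m = b$lengths[b$values]            # m zeros in y = m+1 equal tosses in a row
        runs[k] = sum(pmax(m - r + 2, 0))  # windows of r equal tosses in each block
    }
    runs
}

> z = runs2(3, n=50)
> mean(z)        # compare with the expected value 12.00000 in the table
> mean(z > 0)    # compare with the probability 0.99998 in the table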
And now to the topic: expected value = integral
Suppose that c·l(x) is a probability density of X and we would like to compute the integral

    θ = ∫_A g(x) l(x) dx = (1/c) ∫_A g(x) · c·l(x) dx = (1/c) E(g(X))

Notation: let μ = E(g(X))

Simple Monte Carlo algorithm:
If all the Xi follow the distribution with density c·l(x), then μ = E(g(X1)) = E(g(X2)) = . . . is estimated by

    SN = (1/N) Σ_{i=1}^N g(Xi)

(Note: SN is the mean of the g(Xi), written as g(X) with the bar over the whole expression; do not confuse it with g(X̄), the function g evaluated at the mean)

The desired integral θ is then estimated by

    TN = (1/c) SN = (1/(cN)) Σ_{i=1}^N g(Xi)
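A minimal transcript-style illustration (my example, not from the slides): take A = [0, ∞), l(x) = e^(−x), so c = 1 and c·l(x) is the Exp(1) density, and g(x) = x²; then θ = ∫_0^∞ x² e^(−x) dx = 2:

> x = rexp(100000)    # the Xi, with density c*l(x) = exp(-x)
> mean(x^2)           # SN; since c = 1, TN = SN, and it should be close to 2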
The typical instance
A “typical case” is when l(x) = 1 and A = [a, b],
and the desired integral is

    ∫_a^b g(x) l(x) dx = ∫_a^b g(x) dx

Then the appropriate density has c = 1/(b − a), so that

    c·l(x) = 1/(b − a)

is the density of the uniform distribution on [a, b]. The integral

    θ = ∫_a^b g(x) dx = (b − a) ∫_a^b g(x) · (1/(b − a)) dx = (b − a) ∫_a^b g(x) · c·l(x) dx = (b − a) μ

is estimated by

    TN = (b − a) SN = (b − a) (1/N) Σ_{i=1}^N g(Xi)
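For illustration, a sketch with a function whose integral is known (my example): ∫_0^π sin(x) dx = 2:

> ngen = 10000
> x = runif(ngen, 0, pi)
> (pi - 0) * mean(sin(x))    # TN = (b-a) SN; should be close to 2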
Unbiased estimation
The target of interest, the integral, is defined outright as an expected value
So every unbiased estimator of that expected value is estimating the integral
In particular, the simple Monte Carlo method yields an unbiased estimator (as we have already seen before):

    E(SN) = E( (1/N) Σ_{i=1}^N g(Xi) ) = (1/N) Σ_{i=1}^N E(g(Xi)) = μ

For the “typical case” we get

    E(TN) = E( (b − a) SN ) = (b − a) E( (1/N) Σ_{i=1}^N g(Xi) ) = (b − a) μ = θ
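Unbiasedness can be seen empirically by averaging many independent realizations of TN (a sketch, reusing the sin example above; even a modest N per realization gives the right mean):

> tn = replicate(1000, pi * mean(sin(runif(100, 0, pi))))
> mean(tn)    # average of 1000 realizations of TN; should be close to theta = 2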
The measure of precision: variance
Once E(g(X)) [exists and] is finite, then all the theory applies: the important thing to watch then is

    Var(SN) = (1/N²) Σ_{i=1}^N Var(g(Xi)) = Var(g(X)) / N

Note once again: Var(g(X)) is not g(Var(X)); it depends on g
But we may estimate Var(g(X)) by the sample variance

    (1/(N − 1)) Σ_{i=1}^N (g(Xi) − SN)²

And for the “typical case”, the variance is

    Var(TN) = Var( (b − a) SN ) = (b − a)² Var(SN) = (b − a)² Var(g(X)) / N
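A sketch of both routes, again with the sin example (R's var() uses the N − 1 denominator, as above): the plug-in estimate of Var(TN) from one sample, against the empirical variance over many realizations:

> N = 100
> x = runif(N, 0, pi)
> pi^2 * var(sin(x)) / N                                   # (b-a)^2 Var(g(X)) / N, estimated
> var(replicate(10000, pi * mean(sin(runif(N, 0, pi)))))   # empirical check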
An alternative: the hit-and-miss way of integration
Often the function g is bounded on [a, b]: say by 0 from below and by c from above (the example looks similar to the acceptance/rejection method, but note the subtle difference)
[Plot: points (U, V) scattered over the rectangle, marked “inside” or “outside” according to whether they fall under the curve; U between a = 0 and b = 2, V between 0 and c = 0.75]
Generate points (U, V), with U uniform on [a, b] and V uniform on [0, c); the desired result is

    (number of points inside) / (number of points altogether, outside or inside) × area

where area = c(b − a)
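As a sketch, this recipe wrapped in a function (the name and interface are mine; the next slide does the same computation inline):

hitmiss = function(g, a, b, cmax, N = 10000)
## hit-and-miss estimate of the integral of g over [a, b],
## assuming 0 <= g(x) <= cmax on [a, b]
{
    u = runif(N, a, b)                 # horizontal coordinates
    v = runif(N, 0, cmax)              # vertical coordinates
    mean(v <= g(u)) * cmax * (b - a)   # fraction inside times rectangle area
}

> hitmiss(function(x) (3/4)*x*(2-x), 0, 2, 0.75)    # should be close to 1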
Which one is better?
Let us do the simple Monte Carlo first, and then the hit-and-miss
> ngen = 10000
> x=runif(ngen,0,2); mean((3/4)*x*(2-x))*2
[1] 0.9965935
> x=runif(ngen,0,2); mean((3/4)*x*(2-x))*2
[1] 1.000751
> x=runif(ngen,0,2); mean((3/4)*x*(2-x))*2
[1] 1.002049
> ngen = 10000
> u = runif(ngen,0,2); v=runif(ngen,0,0.75); mean(v <= (3/4)*u*(2-u))*2*0.75
[1] 1.01865
> u = runif(ngen,0,2); v=runif(ngen,0,0.75); mean(v <= (3/4)*u*(2-u))*2*0.75
[1] 1.0035
> u = runif(ngen,0,2); v=runif(ngen,0,0.75); mean(v <= (3/4)*u*(2-u))*2*0.75
[1] 0.9894
Every difference here is in the eye of the beholder (and also due to chance), so let us try to estimate the variances; note, however, that we used twice as many random numbers for the hit-and-miss method
Variances: simple Monte Carlo
Note that we need to consider the variance of the integral estimate: that is, if the mean (or probability) is multiplied by c to obtain the integral, the variance has to be multiplied by c².
Everything is eventually divided by N, but to see the numbers better, we look at them before that division. In other words, our variances here are the variances of the estimators, multiplied by N
First, the simple Monte Carlo; the estimate of variance is
> var((3/4)*x*(2-x))*2^2
[1] 0.1969241
It is also possible to obtain this theoretically: knowing that

    2 E[(3/4) X(2−X)] = 2 ∫_0^2 (3/4) x(2−x) · (1/2) dx = ∫_0^2 (3/4) x(2−x) dx = 1

we obtain

    2² Var[(3/4) X(2−X)] = 4 ∫_0^2 ( (3/4) x(2−x) − 1/2 )² · (1/2) dx = 0.2
(But that is because we have rather a toy setup here)
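The 0.2 can also be checked numerically in R (a sketch using integrate(), not from the slides):

> 4 * integrate(function(x) ((3/4)*x*(2-x) - 1/2)^2 * (1/2), 0, 2)$value    # 0.2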
Variances: hit-and-miss
And then the hit-and-miss method, where we use the p̂(1 − p̂) estimate for σ² = p(1 − p)
> p=mean(v <= (3/4)*u*(2-u))
> p*(1-p)*(2*0.75)^2
[1] 0.5069691
Also here we know the exact solution, as we know that p = 1/(2 · 0.75) = 1/1.5 = 2/3, and therefore

    p(1 − p)(2 · 0.75)² = (2/3)(1 − 2/3)(1.5)² = (2/3)(1/3)(9/4) = 1/2 = 0.5
So “hit-and-miss” has the larger variance, 0.5 versus 0.2, and also requires twice as many random numbers…
Note: we are using the p̂(1 − p̂) estimate because we would like to evaluate “hit-and-miss” against “simple”. If we wanted a performance guarantee, we could use the upper bound 1/4; but then hit-and-miss would come out even worse
> (1/4)*(2*0.75)^2
[1] 0.5625
Remark
Note that both averages have the same number of summands, N, so both variances will be divided by N. “Twice as many” is in terms of computational complexity: for every one uniformly distributed random number generated in simple Monte Carlo, the “hit-and-miss” method has to generate two, one for each coordinate. So in terms of n, the number of times the random number generating procedure is run, it is n = N versus n = 2N. At an equal budget of n generated numbers, simple Monte Carlo thus has variance about 0.2/n here, while hit-and-miss, with only N = n/2 points, has about 0.5/(n/2) = 1/n: five times larger.
How about the Hoeffding inequality here?
Recall: the bound is

    2 exp( −2Nε² / (b − a)² )

Given the same N and ε, it is all about (b − a)², the squared range of the summands (the lower, the better)
In our particular case, simple Monte Carlo still fares better: its summands satisfy g(Xi) ∈ [0, 0.75), so (b − a)² = 0.75² = 0.5625,
while the hit-and-miss method, working with 0-1 variables, has (b − a)² = 1
Note also that for both methods the Hoeffding inequality gives pretty exact performance guarantees, without any need to estimate the variance
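For instance, plugging some numbers into the two bounds (a transcript-style sketch; the values of N and ε are mine):

> N = 1000; eps = 0.02
> 2 * exp(-2 * N * eps^2 / 0.5625)   # simple Monte Carlo: summands in [0, 0.75)
> 2 * exp(-2 * N * eps^2 / 1)        # hit-and-miss: 0-1 summands
> ## the first bound is the smaller one, i.e. the stronger guarantee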
(Inspiration: Monte Carlo: Concepts, Algorithms, and Applications, by G. S. Fishman)