COMP9334
Capacity Planning for Computer Systems and Networks
Week 5B_2: Discrete event simulation (4): Generating random numbers
COMP9334
1
This lecture
• Discrete event simulation
• Week4B:Howtostructurethesimulation • Weeks5A,5B_1:Statisticalanalysis
• This lecture
• Backgroundonrandomnumbersandhowtheyaregenerated
• Howtogeneraterandomnumbersofanyprobabilitydistribution • Reproducibility
• Motivation
• ThePythonrandomlibrarycangeneraterandomnumbersfrom many probability distributions but sometimes you may need a distribution that the library does not have
T1,2021 COMP9334 2
Random number generator in C
• In C, the function rand() generates random integers between 0 and RAND_MAX
• E.g. The following program generates 10 random integers:
Let us generate 10,000 random integers using rand() and see how they are distributed
#include
int main () {
int i;
for (i = 0; i < 10; i++) printf("%d\n",rand());
return; }
This C file “genrand1.c” is available from the course web site.
T1,2021 COMP9334
3
Distribution of 10000 entries from rand()
Sort into 50 bins
If the numbers are really uniformly distributed, we expect 200 numbers
in each bin.
The numbers are almost uniformly distributed
T1,2021 COMP9334 4
LCG
• TherandomnumbergeneratorinCisaLinearCongruentialGenerator (LCG)
• LCG generates a sequence of integers {Z1, Z2, Z3, ...} according to the recursion
Zk =aZk-1 +c(modm) where a, c and m are integers
• Bychoosinga,c,m,Z1appropriately,wecanobtainasequenceof seemingly random integers
• Ifa=3,c=0,m=5,Z1 =1,LCGgeneratesthesequence1,3,4,2,1, 3, 4, 2, ...
• Fact:ThesequencegeneratedbyLCGhasacycleofm-1
• Wemustchoosemtobealargeinteger
• ForC,m=231
• Thepropernameforthenumbersgeneratedispseudo-random
numbers
T1,2021 COMP9334 5
Seed
• LCG generates a sequence of integers {Z1, Z2, Z3, ...} according to the recursion
Zk =aZk-1 +c(modm) where a, c and m are integers
• The term Z1 is call a seed
• Bydefault,Calsouses1astheseedanditwillgeneratethesame
random sequence
• However,sometimesyouneedtogeneratedifferentrandom sequences and you can change the seed by calling the function srand() before using rand()
• Demo genrand1.c, genrand2.c and genrand3.m
• genrand1.c – uses the default seed
• genrand2.c – sets the seed using command line argument
• genrand3.c – sets the seed using current time
T1,2021 COMP9334 6
Uniformly distributed random numbers between (0,1)
• With rand() in C, you can generate uniformly distributed random numbers in between 1 and 231-1(= RAND_MAX)
• BydividingthenumbersbyRAND_MAX,yougetrandomlydistributed numbers in (0,1)
• In Python, uniformly distributed random numbers in (0,1) can be generated by random.random() or numpy.random.random()
• BothlibrariesusestheMersenneTwisterrandomnumbergenerator with a period of 219937 - 1
• If you use 109 random number in a second, the sequence will only repeat after 105985 years
• Why are uniformly distributed random numbers important? • Ifyoucangenerateuniformlydistributedrandomnumbersbetween
(0,1), you can generate random numbers for any probability distribution
T1,2021 COMP9334 7
Random numbers generated by numpy
• 10,000 numbers generated by numpy.random.random() • Codeinrand_uni.py
T1,2021 COMP9334 8
Fair coin distribution
• You can generate random numbers between 0 and 1
• You want to use these random numbers to imitate fair coin tossing, i.e.
• ProbabilityofHEAD=0.5 • ProbabilityofTAIL=0.5
• You can do this using the following algorithm • Generatearandomnumberu
• Ifu<0.5,outputHEAD
• Ifu≥0.5,outputTAIL
T1,2021 COMP9334 9
A loaded die
• You want to create a loaded die with probability mass function 0.3
0.2 0.1
• The algorithm is:
• Generatearandomnumberu
• If u<0.1,output1 • If0.1≤u<0.3,output2 • If0.3≤u<0.4,output3
6
• If0.4≤u<0.7,output4 • If0.7≤u<0.8,output5 • If0.8≤u ,output6
12345
T1,2021 COMP9334
10
Cumulative probability distribution
0.3 0.2
0.1
6 Probability that the dice gives a value ≤ x
12345
Ex: Can you work out what these levels should be
0.1
x 6
1 0.8 0.7
0.4 0.3
T1,2021 COMP9334
11
12345
Comparing algorithm with cumulative distribution
• The algorithm is:
• Generatearandomnumberu
• If u<0.1,output1 • If0.1≤u<0.3,output2 • If0.3≤u<0.4,output3
Ex: What do 1
• If0.4≤u<0.7,output4 • If0.7≤u<0.8,output5 • If0.8≤u ,output6
Probability that the dice gives a value ≤ x
you notice
about the
intervals in 0.4 the algorithm 0.3 and the 0.1 cumulative distribution?
1
2 3 4
5
x 6
0.8 0.7
T1,2021
COMP9334
12
Graphical interpretation of the algorithm
• The algorithm is:
• Generatearandomnumberu
• If u<0.1,output1 • If0.1≤u<0.3,output2 • If0.3≤u<0.4,output3
Ex: Let us 1
0.8 0.7
0.4 0.3
0.1
• If0.4≤u<0.7,output4 • If0.7≤u<0.8,output5 • If0.8≤u ,output6
Probability that the dice gives a value ≤ x
assume
u = 0.5126, what should the algorithm output?
12345
Output 4
x 6
T1,2021
COMP9334
13
Graphical representation of inverse transform method
• Consider the cumulative density function (CDF) y = F(x), showed in the figure below
For this particular F(x), if u = 0.7 is generated
then F-1(0.7) is 6.8
T1,2021 COMP9334 14
Inverse transform method
• Amethodtogeneraterandomnumberfromaparticulardistributionis the inverse transform method
• Ingeneral,ifyouwanttogeneraterandomnumberswithcumulative density function (CDF) F(x) = Prob[X £ x], you can use the following procedure:
• Generateanumberuwhichisuniformlydistributedin(0,1) • ComputethenumberF-1(u)
• Example: Let us apply the inverse transform method to the exponential distribution
• CDFis1-exp(-lx)
T1,2021 COMP9334 15
Generating exponential distribution
• Givenasequence{U1,U2,U3,...}whichisuniformlydistributedin(0,1) • Thesequence-log(1-Uk)/lisexponentiallydistributedwithratel
• (Python file hist_expon.py)
1. Generate 10,000 uniformly
distributed numbers in (0,1)
2. Compute -log(1-uk)/2 where uk are the numbers generated in Step 1
3. The plot shows
1. The histogram of the numbers generated in Step 2 in 50 bins
2. The red line show the expected number of exponential distributed numbers in each bin
T1,2021 COMP9334 16
Reproducible simulation – motivation
• You may recall that when we run the simulation sim_mm1.py, each simulation run gives a different result because different set of random numbers is used
• Doing simulation is like performing a scientific experiment
• Good science demands reproducibility
• E.g.,Ifyouclaimthatthemeanresponsetimeofasimulationrunis say 1.3579, other people should be able to reproduce your result
T1,2021 COMP9334 17
Reproducible simulation
• In order to realise reproducibility of results, you need to save the state of the random number generator before simulation If you reuse the setting later, you can reproduce the result
• ThestateoftheMersenneTwisterplaysasimilarroletoaseedinthe generator used by C
• Demo: sim_mm1.py
# obtain setting and save it in a file
rand_state = random.getstate()
pickle.dump( rand_state, open( "rand_state_mm1.p", "wb" ) )
# load the saved setting and apply it
rand_state = pickle.load( open( "rand_state_mm1.p", "rb" ) ) random.setstate(rand_state)
T1,2021 COMP9334 18
Random number generators in Python
• Although both the random and numpy.random libraries use the Mersenne Twister generator, the generator for the libraries are separate
• The numpy.random library
• Youcangenerateanarrayofrandomnumbers
• Thefunctionstogetandsetthestateare numpy.random.get_state() and numpy.random.set_state()
• Fewerdistributionscomparedtotherandomlibrary
T1,2021 COMP9334 19
Summary
• Basic concepts on pseudo-random number generators • Using the inverse transform method to produce random
numbers of different probability distributions • Reproducibility – why and how
T1,2021 COMP9334 20
References
• Generation of random numbers
• RajJain,“TheArtofComputerSystemsPerformanceAnalysis”
• •
Sections 26.1 and 26.2 on LCG
Section 28.1 on the inverse transform methods
T1,2021
COMP9334 21