CS计算机代考程序代写 matlab python chain capacity planning COMP9334

COMP9334
Capacity Planning for Computer Systems and Networks
Week 5A: Discrete event simulation (2). Independent replications. Confidence interval.
COMP9334
1

Week 4B
• Two topics
• Howtostructurediscreteeventsimulationofqueues
• HowtousethePythonrandomlibrarytogeneraterandom numbers for inter-arrival and service times
• You should be able to simulate
• Manytypesofqueues
• Single-server or multi-server
• Different queueing disciplines
• Manyinter-arrivaltimeandservicetimedistributions • However, there are a number of problems …
T1, 2021 COMP9334 2

Problem: data interpretation, simulation length
• Week4B’srevision problem #1. The problem asks you to:
• use 4 different simulation end times: 1000, 5000, 10000, 50000
• For each end time, do 20 runs
• Figureontherightshows the solution
• Each dot is the average response time for each run
• The red horizontal line shows the theoretical
• Whatobservationscanyou make?
T1, 2021 COMP9334 3

Problem: How do we compare 2 alternative choices?
• Week4B’sRevision Problem #2. The question asks you to simulate each of the following 2 queues 10 times:
• M/M/1 queue with l = 0.9 and μ=1
• M/M/2 queue with l = 0.9 and μ = 0.5
• FromQueueingtheory,we expect the M/M/1 system to have a lower mean response time but do the simulation results suggest that?
T1, 2021 COMP9334 4

Analysis of simulation results
A very important topic but it is very often ignored
Simulation is not simply about
Writing a computer program
Verifying the correctness of the program
Running the simulation once and present the results
Verifying the correctness of the simulation program is important
It is equally important to do sound statistical analysis on the simulation results obtained
T1, 2021
COMP9334
5

This lecture
• Analysis of simulation results
• How to choose simulation parameters?
• HowlongshouldIsimulatefor?
• HowmanytimesshouldIrepeatthesimulation?
• Confidence interval
T1, 2021 COMP9334 6

Analysis of simulation data
• There are many statistical methods to analyse data depending on the situation
• We will focus on analysing steady state mean value only • For example, we are interested to find the steady state
mean response time of a queue
• Recall that we talked about
• TransientandsteadystatebehaviourofqueueinWeek2B • SteadystateofMarkovchaininWeek3B
T1, 2021 COMP9334 7

What is steady state? (1)
• Let us simulate an M/M/1 queue with • Arrivalratel=0.7
• Servicerateμ=1
• Simulationendswhenmasterclockis50000s
• In this simulation, we record the response time for each job • Let X(k) = Response time of kth job
• ThenextpageshowsX(k)changescontinuously
• Let N denote the number of jobs in the simulation • N=35000foroursimulation
• In Week 5A, we computed the mean response time using
T1, 2021 COMP9334 8

Response time continuously changes over time
• Thisgraph shows response time of X(k) of the k-th job where k = 1 to 35000
• Noteresponse time continuously varies
• Responsetime does not settle to a constant value
• Butmean response time does settle
T1, 2021
COMP9334 9

What is steady state? (2)
• Let us instead compute the running mean M(k) where
• For example, if k = 5, then
• Thus M(5) is the mean response time of the first 5 jobs
• In general, M(k) is the mean response time of the first k jobs • Let us plot M(k) – see the next slide
T1, 2021 COMP9334 10

Transient behaviour versus steady state behaviour
T1, 2021 COMP9334 11
jobs

Transient removal: Introduction (1)
• The early part of the simulation displays transient (= non- steady state behaviour)
• The later part of the simulation converges or fluctuates around the steady state value
• Since we are interested in the steady state value, we should not use the transient part of the data to compute the steady state value
• We should remove the transient part and only use the steady state part to compute the mean
• One method to identify the transient part is to use visual inspection
• Note:Inthepreviousslide,wehavethetheoreticalvaluetoguide us but in practice you don’t, you will learn a transient removal method based on batch means in Revision Problem 5A
T1, 2021 COMP9334 12

Transient removal: Introduction (2)
• Let us assume that the first m jobs constitute the transient part and there are N jobs altogether, we should revise the formula to compute the mean to
• Note: We used too simple a method to compute the mean in Week 4B but I didn’t want to complicate things at that time!
• Important: You must run the simulation long enough so that you have a good number of data points (or jobs) in the steady state part.
T1, 2021 COMP9334 13

Independent replications
• Assume that we carry out simulations to find out what the steady state mean response time of a queueing system is
• Importantnote:Wecannotgetexactanswerfromsimulation
• Weexpressoursimulationresultsase.g.thereisaprobabilityof
95% that the mean response time is in the interval [3.1,3.3]. • Wecalltheinterval[3.1,3.3]the95%confidenceinterval.
• Independent replications: Repeat the simulation a number of times using different sets of random numbers
• Why independent replications?
• Independentreplicationsallowustousestatisticalmethodto
estimate a confidence interval of steady state mean response time
T1, 2021 COMP9334 14

Example: Independent replications
• We want to use simulation to estimate the mean response time of an M/M/1 queue with
• Arrivalratel=0.7
• Servicerateμ=1
• Simulationendswhenmasterclockis16000s
• We repeat the experiment 30 times using different sets of random numbers
• For each independent experiments
• Werecordtheresponsetimeofallthejobs
• Removethetransientpart
• Computethemeanresponsetimeusingthesteadystatesection
• We obtain 30 different estimates of the mean response time, one from each independent experiment
• These independent estimates allow us to find a confidence interval
T1, 2021 COMP9334 15

Example (Cont’d)
• The blue circles show the estimated mean response time from the 30 independent experiments
• The red line is the 95% confidence interval
• Thereisa95%probabilitythat the true mean response time that we want to estimate is in the interval [3.30,3.62]
• The green line is the theoretical mean response time (which you should not normally know).
T1, 2021 COMP9334 16

Computing the confidence interval (1)
• Assume that you do n independent replications
• In each replication, you remove the transient part and compute an estimate of the mean steady state response time
• Let us call your estimate from the kth replication, T(k)
• Compute the sample mean
• And the sample standard deviation
Note: for sample standard deviation, (n-1) is in the denominator, not n.
See also the note on p.21.
T1, 2021 COMP9334 17

Computing confidence interval (2)
• There is a probability (1-a) that the mean response time that you want to estimate lies in the interval
• If a is 0.05, then it means there is a 0.95 probability that the mean response time is in the calculated confidence interval
T1, 2021 COMP9334 18

Computing confidence interval (3)
• The value can be obtained from looking up the Student t distribution table
• Note:AStudentttablehasbeenprovidedonthewebsite • There are also programs that compute it
• InPython
• InMatlab,youcanusetinv(1-alpha/2,n-1)
T1, 2021 COMP9334 19

Example: Independent replications (cont’d)
• Five replications with mean response times: • 0.31,0.37,0.34,0.36,0.39
• The sample mean of (n = ) 5 replications = 0.354
• The sample standard deviation of 5 replications is 0.0305
• If we want to compute the 95% confidence interval, a = 0.05 • Sincewedid5independentexperimentsandwant95%confidence
interval, we use t4,0.975
• From the t-distribution table, the value of t4,0.975 is 2.776, the 95% confidence interval is
0.354 − 2.776 !.!#!$ , 0.354 − 2.776 !.!#!$ = [0.316, 0.392] $$
T1, 2021 COMP9334
20

Sample program and caution
• The sample program comp_confidence_interval.py shows you how you can do the calculations on the previous slide using Python libraries numpy and scipy.stats
• Caution: We need to compute sample standard deviation
which means you need to divide by (n-1), not divide by n. Some Python functions divide by (n-1) by default but some others need to use an option. See the sample program. If in doubt, use a simple example to check.
T1, 2021 COMP9334 21

More on confidence interval
• Confidence interval
a % confidence interval mid-point
What happens if you make n bigger, i.e. do more independent replications ?
T1, 2021
22
COMP9334

What can we get from simulation?
• If your queueing problem has a mathematical solution, you will get one value for the steady state mean response time
• If you simulate a queue to try to estimate the mean response time, you will not know the exact value of the steady state mean response time
• Simulation can only give you a confidence interval of what you want to estimate
• You can reduce the confidence interval by doing many independent replications!
T1, 2021 COMP9334 23

Choice of simulation parameters (1)
• Simulation parameters • Lengthofsimulation
• Numberofreplications • Accuracy
• Unfortunately, there are no hard rules to choose them. You will need to do some trial and error
• Ifthelengthofsimulationisnotlongenough,youwillneedto increase it
• Ifthenumberofreplicationsisnotenoughtogiveyouthedesired accuracy, you will need to increase it
T1, 2021 COMP9334 24

Choice of simulation parameters (2)
• Length of simulation
• Mustbelongerthanthetransient
• Shouldhaveagoodnumberofdatapointinthesteadystatepart • Hard to say what “good” is. Get a few hundred if you can. The more
the better but of course your simulation will run longer
• Number of replications
• Youmaywanttohave5replicationstostartwith
• Afterremovingthetransient,computetheconfidenceintervalfor your estimate.
• Comparethewidthofyourconfidenceintervalwithyourdesired accuracy. If the confidence interval that you have obtained is too wide, you will need to increase the number of replications.
• Progressively(basicallybytrial-and-error),increasethenumberof replications until you get the desired level of accuracy
T1, 2021 COMP9334 25

Summary
• Simulation is not just a computer programming exercise • You need to make sure that your program is correct
• It is also important to analyse your results statistically
• Methods discussed include
• Transientremovaltechnique
• Confidenceinterval
• Determiningnumberofreplications
T1, 2021 COMP9334 26

References
• TheprimaryreferenceisLawandKelton,“SimulationModellingand Analysis”
• Transient removal, Sections 9.1, 9.2 and 9.5
• Replication method, Section 9.5.2
• RajJain,“TheArtofComputerSystemsPerformanceAnalysis”has
materials on
• Transient removal methods, Section 25.3
• Calculating confidence interval, Section 13.2
• If you are interested to know the mathematical background on confidence interval, student t-distribution etc., a possible reference is Wackerly et al, “Mathematical Statistics with Applications”.
T1, 2021 COMP9334 27