COMP9334
Capacity Planning for Computer Systems and Networks
Week 7A_2: Using queueing theory to improve system performance
Applications of queueing
• There are plenty of examples of using queueing to improve system performance
• We will look at a few examples
• The technical papers can be downloaded from the course website
• Good resource:
• Sigmetrics
• http://www.sigmetrics.org
• A leading conference on performance evaluation of computer systems and networks
• Performance Evaluation
• A journal devoted to the topic of performance evaluation
T1,2021 COMP9334 2
Priority queueing
[Diagram: two queueing systems for two job classes, each with its own arrivals and departures — high priority jobs (Class 1) and low priority jobs (Class 2).]
Using priority queueing to reduce response time (1)
• Revision Problem 4A, Question 1
• Customers arrive at a grocery store's checkout counter according to a Poisson process with rate 1 per minute.
• Each customer carries a number of items that is uniformly distributed between 1 and 40.
• The store has 2 checkout counters, each capable of processing items at a rate of 15 per minute.
• To reduce customer waiting time in queue, the store manager considers dedicating one of the two counters to customers with x items or less and dedicating the other counter to customers with more than x items.
• Write a small computer program to find the value of x that minimises the average customer waiting time.
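One way to write such a program (a sketch, not the official solution): model each dedicated counter as an M/G/1 queue, use the Pollaczek–Khinchine formula W = λ·E[S²]/(2(1−ρ)) for the mean waiting time in queue, and search over every threshold x. Splitting a Poisson stream by item count gives two independent Poisson streams.

```python
# Sketch: find the item-count threshold x minimising average waiting time,
# treating each dedicated counter as an M/G/1 queue and applying the
# Pollaczek-Khinchine formula  W = lam * E[S^2] / (2 * (1 - rho)).
# Parameters from the problem: Poisson arrivals at 1/min, items
# Uniform{1..40}, each counter processes 15 items/min.

LAM = 1.0        # total arrival rate (customers per minute)
RATE = 15.0      # items processed per minute at each counter
N_ITEMS = 40     # items per customer: Uniform{1, ..., 40}

def mg1_wait(lam, items):
    """Mean waiting time in queue for an M/G/1 counter whose customers'
    item counts are drawn uniformly from the list `items`."""
    es = sum(i / RATE for i in items) / len(items)          # E[S]
    es2 = sum((i / RATE) ** 2 for i in items) / len(items)  # E[S^2]
    rho = lam * es
    if rho >= 1.0:
        return float('inf')    # unstable: queue grows without bound
    return lam * es2 / (2.0 * (1.0 - rho))

best_x, best_w = None, float('inf')
for x in range(1, N_ITEMS):
    lam_a = LAM * x / N_ITEMS              # customers with <= x items
    lam_b = LAM - lam_a                    # customers with  > x items
    w_a = mg1_wait(lam_a, list(range(1, x + 1)))
    w_b = mg1_wait(lam_b, list(range(x + 1, N_ITEMS + 1)))
    w = (x / N_ITEMS) * w_a + (1 - x / N_ITEMS) * w_b   # customer average
    if w < best_w:
        best_x, best_w = x, w

print(best_x, round(best_w, 3))
```

Note that many thresholds make one of the two counters unstable (ρ ≥ 1), so the search must treat those as infinite waiting time.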
Using priority queueing to reduce response time (2)
• At a website, use a dispatcher to classify the HTTP requests depending on their service time requirement
• Requests whose service time is below a threshold go to one server
• The rest of the requests go to the other server
• See diagram on the next slide
• Need prior knowledge of the service time of the requests
• Reference: B. Schroeder and M. Harchol-Balter, "Web servers under overload: How scheduling can help", ACM Transactions on Internet Technology (TOIT), Volume 6, Issue 1, pages 20–52, 2006. http://doi.acm.org/10.1145/1125274.1125276
[Diagram: arriving requests reach a dispatcher; requests requiring less service time are sent to System 1, and requests requiring more service time are sent to System 2.]
Determining multi-programming level
• Reference: B. Schroeder, M. Harchol-Balter, A. Iyengar, E. Nahum and A. Wierman, "How to determine a good multi-programming level for external scheduling", Proceedings of the 22nd International Conference on Data Engineering (ICDE'06).
• From the abstract: "Scheduling/prioritization of DBMS transactions is important for many applications that rely on database backends. A convenient way to achieve scheduling is to limit the number of transactions within the database, maintaining most of the transactions in an external queue, which can be ordered as desired by the application. While external scheduling has many advantages in that it doesn't require changes to internal resources, it is also difficult to get right in that its performance depends critically on the particular multiprogramming limit used (the MPL), i.e. the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling."

DB server – Multi-programming level
• Some database server management systems (DBMS) set an upper limit on the number of active transactions within the system
• This upper limit is called the multi-programming level (MPL)
• A help page from SAP explaining MPL:
• http://dcx.sap.com/1200/en/dbadmin_en12/running-s-3713576.html
• Picture from Schroeder et al., "How to determine a good multi-programming level for external scheduling":
[Figure 1 from the paper: Simplified view of the mechanism used in external scheduling. Incoming transactions enter the DBMS until a fixed limit (MPL=4) is reached; the remaining transactions are held back in an external queue. Response time is the time from when a transaction arrives until it completes, including time spent queueing externally to the DBMS.]
The problem
• To choose a good MPL, you want to know the mean response time for different choices of MPL
• If MPL = 1, what is the response time?
• If MPL = 2, what is the response time?
• …
• Question: Let us assume that the arrival is Poisson and the service time is exponential. Can you suggest how we can determine the mean response time?
• Markov chain
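One possible Markov-chain answer, under the simplifying assumption that the DBMS can serve up to MPL transactions in parallel, each at exponential rate μ: the total number of transactions (inside the DBMS plus in the external queue) is then a birth–death chain with birth rate λ and death rate μ·min(n, MPL). The sketch below (hypothetical rates, not the paper's model) truncates the chain, solves for the stationary distribution, and applies Little's law. Note it only captures the throughput side of the trade-off; it does not model the scheduling control that a low MPL provides.

```python
# Sketch (assumed model): Poisson arrivals at rate lam; exponential service
# at rate mu per transaction; at most `mpl` transactions served in parallel
# inside the DBMS, with the rest waiting in the external queue. The total
# number in the system is a birth-death Markov chain with birth rate lam
# and death rate mu * min(n, mpl). Solve a truncated chain numerically.

def mean_response_time(lam, mu, mpl, n_max=500):
    # unnormalised stationary probabilities via detailed balance:
    # p[n] = p[n-1] * lam / (mu * min(n, mpl))
    p = [1.0]
    for n in range(1, n_max + 1):
        p.append(p[-1] * lam / (mu * min(n, mpl)))
    total = sum(p)
    mean_n = sum(n * pn / total for n, pn in enumerate(p))
    return mean_n / lam          # Little's law: E[T] = E[N] / lam

# Example with made-up rates: compare a few MPL choices
for mpl in (1, 2, 4, 8):
    print(mpl, round(mean_response_time(0.8, 1.0, mpl), 3))
```

With MPL = 1 this reduces to M/M/1 (E[T] = 1/(μ−λ)), and as MPL grows it approaches M/M/∞ (E[T] = 1/μ).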
Optimal power allocation
• Reference: A. Gandhi, M. Harchol-Balter (Carnegie Mellon University), R. Das (IBM Research) and C. Lefurgy (IBM Research), "Optimal Power Allocation in Server Farms".
• From the abstract: the right power allocation can "improve server farm performance, by a factor of typically 1.4 and as much as a factor of 5 in some cases."
• From the introduction: "Server farms today consume more than 1.5% of the total electricity in the U.S. at a cost of nearly $4.5 billion. Given the rising cost of energy, many industries are now seeking solutions for how to best make use of their available power. An important question which arises in this context is how to distribute available power among servers in a server farm so as to get maximum performance. By giving more power to a server, one can get higher server frequency (speed). Hence it is commonly believed that, for a given power budget, performance can be maximized by operating servers at their highest power levels. However, it is also conceivable that one might prefer to run servers …"
Server farm power allocation: Introduction
• A server farm consists of multiple servers
• See next slide
• The servers can run at
• Higher clock speed with higher power
• Lower clock speed with lower power
• Illustration: The slide after next
• Ex: Given
• Higher power = 250W, lower power = 150W
• Power budget = 3000W
• You can have
• 12 servers at highest clock speed
• 20 servers at lowest clock speed
• Other combinations
• Which combination is best?
Server farm
[Diagram: arriving requests pass through a high-speed router to multiple processor-sharing (PS) servers at the back-end.]
CPU Power-speed tradeoff
[Figure from Gandhi et al.: power-to-frequency curves for DFS, DVFS, and DVFS+DFS for the CPU bound LINPACK workload. Panel (a) shows measurements for DFS and DVFS: in both mechanisms, the server frequency is linearly related to the power allocated to the server. Panel (b) shows measurements for DVFS+DFS, where the power-to-frequency curve is better approximated by a cubic relationship.]
M/G/1/PS
• Poisson arrivals with mean arrival rate λ
• General service time distribution with mean rate μ
• Processor sharing (PS): the server is shared equally among all jobs present
• Mean response time: E[T] = 1/(μ − λ), the same as M/M/1; for PS, the mean response time is insensitive to the service time distribution beyond its mean
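The M/G/1/PS mean response time can be derived in two short steps (a standard result, stated here for reference):

```latex
% M/G/1/PS with arrival rate \lambda, mean service rate \mu, load \rho = \lambda/\mu < 1.
% Step 1: the queue-length distribution is geometric, exactly as in M/M/1:
\Pr[N = n] = (1 - \rho)\,\rho^{n}, \qquad E[N] = \frac{\rho}{1 - \rho}
% Step 2: Little's law E[N] = \lambda\, E[T] gives the mean response time:
E[T] = \frac{E[N]}{\lambda} = \frac{1}{\mu - \lambda}
```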
Power allocation for 2 servers
• To be worked out during the lecture
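As a warm-up for the exercise, here is a sketch under assumed numbers: following the linear DFS/DVFS power-to-frequency curve, suppose server speed is s(P) = a + bP and a server at speed s serves jobs at rate μ = s; each server is M/G/1/PS, so E[T_i] = 1/(μ_i − λ_i); then grid-search over both the power split and the traffic split. The parameters a, b, λ, and the power budget are all hypothetical.

```python
# Sketch (assumed model, not the lecture's worked solution): two M/G/1/PS
# servers share a power budget p_total. Assume a linear power-to-speed
# curve s(P) = a + b*P (hypothetical a, b), service rate mu_i = s(P_i),
# and a fraction q of the Poisson traffic routed to server 1.

def mean_T(lam, q, p1, p_total, a=0.5, b=0.01):
    """Overall mean response time for power split p1 and traffic split q."""
    mu1 = a + b * p1
    mu2 = a + b * (p_total - p1)
    l1, l2 = q * lam, (1 - q) * lam
    if l1 >= mu1 or l2 >= mu2:
        return float('inf')        # unstable split
    # weighted average of the two M/G/1/PS response times 1/(mu - lam)
    return q / (mu1 - l1) + (1 - q) / (mu2 - l2)

# grid search over power and traffic splits
lam, p_total = 2.0, 400.0
best = min(
    ((mean_T(lam, q / 100, p1, p_total), p1, q / 100)
     for p1 in range(0, 401, 10) for q in range(1, 100)),
    key=lambda t: t[0],
)
print(best)   # (mean response time, power to server 1, traffic fraction to server 1)
```

With these made-up parameters the search need not land on the symmetric split; whether an equal or unequal allocation wins depends on the shape of the power-to-speed curve, which is exactly the paper's point.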
Server farms with setup costs
• Reference: A. Gandhi, M. Harchol-Balter (Computer Science Department, Carnegie Mellon University) and I. Adan (Department of Mathematics and Computer Science, Eindhoven University of Technology), "Server farms with setup costs", Performance Evaluation 67 (2010) 1123–1138. Available online 6 August 2010.
• Keywords: Setup times; Power management; Data centers; M/M/k; Staggered boot up
• Abstract: In this paper we consider server farms with a setup cost. This model is common in manufacturing systems and data centers, where there is a cost to turn servers on. Setup costs always take the form of a time delay, and sometimes there is additionally a power penalty, as in the case of data centers. Any server can be either on, off, or in setup mode. While prior work has analyzed single servers with setup costs, no analytical results are known for multi-server systems. In this paper, we derive the first closed-form solutions and approximations for the mean response time and mean power consumption in server farms with setup costs. We also analyze variants of server farms with setup, such as server farm models with staggered boot up of servers, where at most one server can be in setup mode at a time, or server farms with an infinite number of servers. For some variants, we find that the distribution of response time can be decomposed into the sum of response time for a server farm without setup and the setup time. Finally, we apply our analysis to data centers, where both response time and power consumption are key metrics. Here we analyze policy design questions such as whether it pays to turn servers off when they are idle, whether staggered boot up helps, how to optimally mix policies, and other questions related to the optimal data center size.
Server farm with setup cost: The problem
• A data centre consists of many servers
• An ON server consumes peak power
• An idling server consumes 60% of peak power
• Not all servers are needed all the time
Alternative 1: Idling servers remain ON all the time
• Servers 1–3 are ON
• Servers 4–5 are idling
• When a new job arrives, it is allocated to an idling server immediately
• Little time delay
• Idling servers consume power
[Diagram: five servers, with servers 1–3 busy and servers 4–5 idling.]
Alternative 2: Turn idling server OFF
• Servers 1–3 are ON
• Servers 4–5 are OFF
• Once a server becomes idle, turn it off immediately to save power.
• When a new job arrives, need to turn an OFF server ON
• Longer time delay
• Consumes less power
[Diagram: five servers, with servers 1–3 busy and servers 4–5 off.]
Trading off between power consumption and delay
• Alternative 1: Keep idling server ON
• Short delay but high power consumption
• Alternative 2: Turn idling server OFF immediately
• Long delay but lower power consumption
• Can you suggest some alternatives?
• Given what you have learnt so far, how would you carry out a study on this problem?
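One way to start such a study (a single-server simulation sketch with made-up rates; the alternatives above concern a whole farm): compare "keep the server on when idle" against "turn it off and pay a setup time", measuring both mean response time and mean power draw, with idle power at 60% of peak as on the earlier slide.

```python
import random

def simulate(turn_off, lam=0.5, mu=1.0, mean_setup=2.0, n_jobs=100_000, seed=1):
    """Single-server FCFS sketch. If turn_off is True, an idle server powers
    off and pays an exponential setup time (mean `mean_setup`, drawing peak
    power) when the next job arrives; otherwise it idles at 60% of peak.
    Returns (mean response time, mean power as a fraction of peak)."""
    rng = random.Random(seed)
    t = free_at = 0.0
    total_resp = busy = setup_tot = gap_tot = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(lam)             # next arrival time
        service = rng.expovariate(mu)
        if t >= free_at:                      # job finds the system empty
            gap_tot += t - free_at            # idle (or off) period just ended
            if turn_off:
                s = rng.expovariate(1.0 / mean_setup)
                setup_tot += s
                start = t + s                 # wait for the server to boot
            else:
                start = t
        else:
            start = free_at                   # wait behind earlier jobs (FCFS)
        free_at = start + service
        busy += service
        total_resp += free_at - t
    span = free_at                            # total simulated time
    energy = busy + setup_tot + (0.0 if turn_off else 0.6 * gap_tot)
    return total_resp / n_jobs, energy / span

print(simulate(turn_off=False))   # Alternative 1: keep the idle server on
print(simulate(turn_off=True))    # Alternative 2: turn the idle server off
```

With these rates, turning the server off saves power but lengthens response times; sweeping λ, the setup time, or the idle-power fraction shows where each alternative wins, which is the kind of study the question asks for.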
Conclusions
• Queueing theory has many applications
• You have learnt the basics of analysis and simulation
• There is a lot of advanced theory and many methods that we cannot cover, but the basics will enable you to learn more