4. GREEDY ALGORITHMS I
‣ coin changing
‣ interval scheduling
‣ scheduling to minimize lateness
‣ optimal caching
Lecture slides by Kevin Wayne
Copyright © 2005 Pearson-Addison Wesley
Copyright © 2013 Kevin Wayne
http://www.cs.princeton.edu/~wayne/kleinberg-tardos
Last updated on Sep 8, 2013 6:30 AM
SECTION 4.1
Interval scheduling
・Job j starts at sj and finishes at fj.
・Two jobs compatible if they don’t overlap.
・Goal: find maximum subset of mutually compatible jobs.
[Figure: jobs a through h drawn as intervals on a timeline from 0 to 11; jobs d and g are incompatible.]
Interval scheduling: greedy algorithms
Greedy template. Consider jobs in some natural order.
Take each job provided it’s compatible with the ones already taken.
・[Earliest start time] Consider jobs in ascending order of sj.
・[Earliest finish time] Consider jobs in ascending order of fj.
・[Shortest interval] Consider jobs in ascending order of fj – sj.
・[Fewest conflicts] For each job j, count the number of conflicting jobs cj. Schedule in ascending order of cj.
[Figure: counterexamples for earliest start time, shortest interval, and fewest conflicts.]
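The greedy template can be sketched in a few lines of Python to see how the ordering rules differ. The two instances below are hypothetical (they are not the slide's own counterexamples), and the helper name greedy_schedule is an assumption made for illustration. Jobs are treated as half-open intervals [s, f).

```python
def greedy_schedule(jobs, key):
    """Greedy template: scan jobs in the order given by `key` and keep
    each job that is compatible with the ones already taken."""
    chosen = []
    for s, f in sorted(jobs, key=key):
        # Compatible with a kept job (cs, cf) iff one ends before the other starts.
        if all(f <= cs or s >= cf for cs, cf in chosen):
            chosen.append((s, f))
    return chosen

by_start = lambda job: job[0]
by_finish = lambda job: job[1]
by_length = lambda job: job[1] - job[0]

# Hypothetical instances chosen to break two of the rules.
long_first = [(0, 10), (1, 2), (3, 4), (5, 6)]
short_middle = [(0, 5), (4, 7), (6, 11)]

print(len(greedy_schedule(long_first, by_start)))     # 1
print(len(greedy_schedule(long_first, by_finish)))    # 3
print(len(greedy_schedule(short_middle, by_length)))  # 1
print(len(greedy_schedule(short_middle, by_finish)))  # 2
```

Earliest start time is fooled by one long early job, and shortest interval is fooled by a short job that straddles two longer compatible ones.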
Interval scheduling: earliest-finish-time-first algorithm
EARLIEST-FINISH-TIME-FIRST (n, s1, s2, …, sn, f1, f2, …, fn)
  SORT jobs by finish time so that f1 ≤ f2 ≤ … ≤ fn.
  A ← ∅    (set of jobs selected)
  FOR j = 1 TO n
    IF job j is compatible with A
      A ← A ∪ { j }
  RETURN A
Proposition. Can implement earliest-finish-time first in O(n log n) time.
・Keep track of job j* that was added last to A.
・Job j is compatible with A iff sj ≥ fj*.
・Sorting by finish time takes O(n log n) time.
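A minimal Python sketch of the O(n log n) implementation, assuming jobs are given as (start, finish) pairs. The function name and the sample instance are illustrative assumptions, not part of the slides.

```python
def earliest_finish_time_first(jobs):
    """jobs: list of (start, finish) pairs.
    Returns a largest set of mutually compatible jobs, in finish-time order."""
    selected = []
    last_finish = float("-inf")          # finish time of j*, the job added last
    for start, finish in sorted(jobs, key=lambda job: job[1]):
        if start >= last_finish:         # compatible with A iff s_j >= f_j*
            selected.append((start, finish))
            last_finish = finish
    return selected

# Hypothetical instance with eight jobs.
jobs = [(0, 6), (1, 4), (3, 5), (3, 8), (4, 7), (5, 9), (6, 10), (8, 11)]
print(earliest_finish_time_first(jobs))  # [(1, 4), (4, 7), (8, 11)]
```

The sort dominates the running time; the scan itself does constant work per job because only the last selected finish time needs to be remembered.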
Theorem. Earliest-finish-time-first is optimal: it returns a schedule A that maximizes the number of jobs that we can run on a single computer, among all schedules.
Let A = (i_1, i_2, …, i_k) be the schedule returned by the algorithm and let O = (j_1, j_2, …, j_m) be an optimal schedule, both listed in ascending order of finish time.
Lemma. For all r ≤ k, f(i_r) ≤ f(j_r).
Pf. [by induction on r]
・Base case (r = 1): f(i_1) is the min finish time among all jobs, so f(i_1) ≤ f(j_1).
・Inductive step: assume f(i_r) ≤ f(j_r). Since O is feasible, f(j_r) ≤ s(j_{r+1}), so f(i_r) ≤ f(j_r) ≤ s(j_{r+1}).
・Thus, when A considers its (r + 1)st job, it will be able to add job j_{r+1}. Because A considers jobs in ascending order of finish time, f(i_{r+1}) ≤ f(j_{r+1}). ▪
Pf of theorem. [by contradiction]
・Suppose greedy is suboptimal, so k < m.
・From the lemma, f(i_k) ≤ f(j_k) ≤ s(j_{k+1}).
・But then job j_{k+1} is compatible with (i_1, i_2, …, i_k), so A could have also added job j_{k+1}: a contradiction. ▪
Interval scheduling: analysis of earliest-finish-time-first algorithm
Theorem. The earliest-finish-time-first algorithm is optimal.
Pf. [by contradiction]
・Assume greedy is not optimal, and let's see what happens.
・Let i1, i2, …, ik denote the set of jobs selected by greedy.
・Let j1, j2, …, jm denote the set of jobs in an optimal solution with i1 = j1, i2 = j2, …, ir = jr for the largest possible value of r.
・Job ir+1 exists and finishes before jr+1. Why not replace job jr+1 with job ir+1?
・The solution is still feasible and optimal (but contradicts maximality of r). ▪
[Figure: Greedy: i1, i2, …, ir, ir+1, …, ik. OPT: j1, j2, …, jr, jr+1, …, jm, shown before and after replacing jr+1 with ir+1.]
Interval partitioning
Interval partitioning.
・Lecture j starts at sj and finishes at fj.
Goal: find minimum number of classrooms to schedule all lectures so that no two lectures occur at the same time in the same room.
Ex. This schedule uses 4 classrooms to schedule 10 lectures.
[Figure: lectures a through j assigned to classrooms 1 to 4, on a time axis from 9 to 4:30.]
Ex. This schedule uses 3 classrooms to schedule 10 lectures.
[Figure: lectures a through j assigned to classrooms 1 to 3, on a time axis from 9 to 4:30.]
Interval partitioning: greedy algorithms
Greedy template. Consider lectures in some natural order. Assign each lecture to an available classroom (which one?); allocate a new classroom if none are available.
・[Earliest start time] Consider lectures in ascending order of sj.
・[Earliest finish time] Consider lectures in ascending order of fj.
・[Shortest interval] Consider lectures in ascending order of fj – sj.
・[Fewest conflicts] For each lecture j, count the number of conflicting lectures cj. Schedule in ascending order of cj.
[Figure: counterexamples for earliest finish time, shortest interval, and fewest conflicts, each drawn with classrooms 1 to 3.]
Interval partitioning: earliest-start-time-first algorithm
EARLIEST-START-TIME-FIRST (n, s1, s2, …, sn, f1, f2, …, fn)
  SORT lectures by start time so that s1 ≤ s2 ≤ … ≤ sn.
  d ← 0    (number of allocated classrooms)
  FOR j = 1 TO n
    IF lecture j is compatible with some classroom
      Schedule lecture j in any such classroom k.
    ELSE
      Allocate a new classroom d + 1.
      Schedule lecture j in classroom d + 1.
      d ← d + 1
  RETURN schedule.
Interval partitioning: earliest-start-time-first algorithm
Proposition. The earliest-start-time-first algorithm can be implemented in O(n log n) time.
Pf. Store classrooms in a priority queue (key = finish time of its last lecture).
・To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
・To add lecture j to classroom k, increase key of classroom k to fj.
・Total number of priority queue operations is O(n).
・Sorting by start time takes O(n log n) time. ▪
Remark. This implementation chooses the classroom k whose finish time of its last lecture is the earliest.
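The priority-queue implementation can be sketched with Python's heapq module. This is a minimal sketch under stated assumptions: lectures are (start, finish) pairs, a lecture may start exactly when the previous one in the room finishes, and the "increase key" step is done as a pop followed by a push. The function name and the ten-lecture instance are illustrative, with times written as decimal hours.

```python
import heapq

def earliest_start_time_first(lectures):
    """lectures: list of (start, finish) pairs.
    Returns (d, schedule), where schedule lists (start, finish, classroom)."""
    rooms = []       # min-heap of (finish time of last lecture, classroom id)
    schedule = []
    for start, finish in sorted(lectures):        # ascending start time
        if rooms and rooms[0][0] <= start:
            # The classroom that frees up earliest is compatible: reuse it.
            _, room = heapq.heappop(rooms)
        else:
            room = len(rooms) + 1                 # allocate classroom d + 1
        heapq.heappush(rooms, (finish, room))     # key becomes f_j
        schedule.append((start, finish, room))
    return len(rooms), schedule

# Hypothetical instance: ten lectures.
lectures = [(9, 10.5), (9, 12.5), (9, 10.5), (11, 12.5), (11, 14),
            (13, 14.5), (13, 14.5), (14, 16.5), (15, 16.5), (15, 16.5)]
d, schedule = earliest_start_time_first(lectures)
print(d)  # 3
```

Every classroom stays in the heap once allocated, so the heap size is always the number of classrooms d.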
Interval partitioning: lower bound on optimal solution
Def. The depth of a set of open intervals is the maximum number of intervals that contain any given time.
Key observation. Number of classrooms needed ≥ depth.
Q. Does number of classrooms needed always equal depth?
A. Yes! Moreover, earliest-start-time-first algorithm finds a schedule that uses exactly depth classrooms. (At end of algorithm, d = depth; to be proved.)
[Figure: the 3-classroom schedule of lectures a through j on a time axis from 9 to 4:30; depth = 3.]
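Depth can be computed with a sweep over interval endpoints. The sketch below assumes open intervals given as (start, finish) pairs, so an interval ending at time t does not overlap one starting at t. The function name and the sample instance are illustrative assumptions.

```python
def depth(intervals):
    """Maximum number of open intervals (start, finish) containing a common time."""
    events = []
    for start, finish in intervals:
        events.append((start, +1))
        events.append((finish, -1))
    # Tuples sort by time, then by delta, so at equal times the -1 (end)
    # events come before the +1 (start) events, as open intervals require.
    events.sort()
    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best

# Hypothetical instance: ten lectures, times in decimal hours.
lectures = [(9, 10.5), (9, 12.5), (9, 10.5), (11, 12.5), (11, 14),
            (13, 14.5), (13, 14.5), (14, 16.5), (15, 16.5), (15, 16.5)]
print(depth(lectures))  # 3
```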
Proof that d = d* at end of algorithm, where d* denotes the depth.
・Suppose we are considering the jth lecture. We open a new classroom only if all currently open classrooms are running lectures that intersect lecture j.
・At most d* – 1 such lectures intersect lecture j (by def. of depth).
・So, once d* classrooms are open, one of them will be available for lecture j.
・So, we never open more than d* classrooms. Combined with the key observation (number of classrooms needed ≥ depth), d = d*. ▪
Interval partitioning: analysis of earliest-start-time-first algorithm
Observation. The earliest-start-time first algorithm never schedules two incompatible lectures in the same classroom.
Theorem. Earliest-start-time-first algorithm is optimal.
Pf.
・Let d = number of classrooms that the algorithm allocates.
・Classroom d is opened because we needed to schedule a lecture, say j, that is incompatible with all d – 1 other classrooms.
・These d lectures each end after sj.
・Since we sorted by start time, all these incompatibilities are caused by lectures that start no later than sj.
・Thus, we have d lectures overlapping at time sj + ε.
・Key observation ⇒ all schedules use ≥ d classrooms. ▪
SECTION 4.2
Scheduling to minimize lateness
Minimizing lateness problem.
・Single resource processes one job at a time.
・Job j requires tj units of processing time and is due at time dj.
・If j starts at time sj, it finishes at time fj = sj + tj.
・Lateness: lj = max { 0, fj – dj }.
・Goal: schedule all jobs to minimize maximum lateness L = maxj lj.
Input: n, t1, …, tn, d1, …, dn.
      j    1   2   3   4   5   6
      tj   3   2   1   4   3   2
      dj   6   8   9   9  14  15
[Figure: a schedule on a timeline from 0 to 15 that runs the jobs in the order 3, 2, 6, 1, 5, 4 (d3 = 9, d2 = 8, d6 = 15, d1 = 6, d5 = 14, d4 = 9); job 1 has lateness = 2, job 5 has lateness = 0, and job 4 gives max lateness = 6.]
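The lateness definitions can be checked by direct arithmetic. The sketch below uses the six-job table (tj = 3, 2, 1, 4, 3, 2 and dj = 6, 8, 9, 9, 14, 15) and assumes the pictured schedule runs the jobs back to back in the order 3, 2, 6, 1, 5, 4, as read off the figure labels. The function name is an illustrative assumption.

```python
def lateness_profile(order, t, d):
    """Run jobs back to back in `order`; return {job: lateness}."""
    clock = 0
    late = {}
    for j in order:
        clock += t[j]                      # f_j = s_j + t_j
        late[j] = max(0, clock - d[j])     # l_j = max{0, f_j - d_j}
    return late

t = {1: 3, 2: 2, 3: 1, 4: 4, 5: 3, 6: 2}
d = {1: 6, 2: 8, 3: 9, 4: 9, 5: 14, 6: 15}
late = lateness_profile([3, 2, 6, 1, 5, 4], t, d)
print(late[1], late[5], max(late.values()))  # 2 0 6
```

Job 1 finishes at time 8 against a deadline of 6, and job 4 finishes at time 15 against a deadline of 9, which gives the maximum lateness of 6.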
Minimizing lateness: greedy algorithms
Greedy template. Schedule jobs according to some natural order.
・[Shortest processing time first] Schedule jobs in ascending order of processing time tj.
・[Earliest deadline first] Schedule jobs in ascending order of deadline dj.
・[Smallest slack] Schedule jobs in ascending order of slack dj – tj.
Minimizing lateness: counterexamples
・[Shortest processing time first] counterexample:
      job   1    2
      tj    1   10
      dj  100   10
  Job 1 runs first and is not late: l1 = max { 0, 1 – 100 } = 0. Job 2 then finishes at time 11 and is late: l2 = max { 0, 11 – 10 } = 1. Running job 2 first makes neither job late.
・[Smallest slack] counterexample:
      job   1    2
      tj    1   10
      dj    2   10
  Job 2 has the smaller slack (0 versus 1), so it runs first; job 1 then finishes at time 11 with lateness 9. Running job 1 first gives max lateness 1.
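The two counterexamples can be replayed numerically. This sketch assumes the instances are (tj, dj) = (1, 100), (10, 10) for shortest processing time first and (1, 2), (10, 10) for smallest slack, as read from the extracted tables. The helper names are illustrative.

```python
def max_lateness(order, jobs):
    """jobs maps a job id to (processing time, deadline)."""
    clock = worst = 0
    for j in order:
        t, d = jobs[j]
        clock += t
        worst = max(worst, clock - d)   # starting from 0 gives max{0, f_j - d_j}
    return worst

def order_by(jobs, rule):
    return sorted(jobs, key=lambda j: rule(*jobs[j]))

shortest_time = lambda t, d: t
smallest_slack = lambda t, d: d - t
earliest_deadline = lambda t, d: d

spt_case = {1: (1, 100), 2: (10, 10)}
slack_case = {1: (1, 2), 2: (10, 10)}

print(max_lateness(order_by(spt_case, shortest_time), spt_case))          # 1
print(max_lateness(order_by(spt_case, earliest_deadline), spt_case))      # 0
print(max_lateness(order_by(slack_case, smallest_slack), slack_case))     # 9
print(max_lateness(order_by(slack_case, earliest_deadline), slack_case))  # 1
```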
Minimizing lateness: earliest deadline first
EARLIEST-DEADLINE-FIRST (n, t1, t2, …, tn, d1, d2, …, dn)
  SORT n jobs so that d1 ≤ d2 ≤ … ≤ dn.
  t ← 0
  FOR j = 1 TO n
    Assign job j to interval [t, t + tj].
    sj ← t; fj ← t + tj
    t ← t + tj
  RETURN intervals [s1, f1], [s2, f2], …, [sn, fn].
[Figure: the earliest-deadline-first schedule on a timeline from 0 to 15, running the jobs in the order d1 = 6, d2 = 8, d3 = 9, d4 = 9, d5 = 14, d6 = 15; max lateness = 1.]
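A minimal Python sketch of the pseudocode, assuming 0-based lists t and d of processing times and deadlines. The function name and the return shape (intervals plus max lateness) are illustrative choices.

```python
def earliest_deadline_first(t, d):
    """t[j], d[j]: processing time and deadline of job j (0-based).
    Returns ({job: (start, finish)}, max lateness)."""
    order = sorted(range(len(t)), key=lambda j: d[j])   # ascending deadline
    clock = 0
    intervals = {}
    for j in order:
        intervals[j] = (clock, clock + t[j])            # s_j = t, f_j = t + t_j
        clock += t[j]
    worst = max((max(0, f - d[j]) for j, (_, f) in intervals.items()), default=0)
    return intervals, worst

intervals, worst = earliest_deadline_first([3, 2, 1, 4, 3, 2], [6, 8, 9, 9, 14, 15])
print(worst)  # 1
```

On the six-job instance only the fourth job is late: it finishes at time 10 against a deadline of 9.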
Minimizing lateness: no idle time
Observation 1. There exists an optimal schedule with no idle time.
[Figure: two schedules of three jobs with deadlines d = 4, d = 6, and d = 12 on a timeline from 0 to 11, one with idle time and one without.]
Observation 2. The earliest-deadline-first schedule has no idle time.
Minimizing lateness: inversions
Def. Given a schedule S, an inversion is a pair of jobs i and j such that: i < j but j scheduled before i.
[ as before, we assume jobs are numbered so that d1 ≤ d2 ≤ … ≤ dn ]
[Figure: a schedule in which job j runs before job i (an inversion).]
Observation 3. The earliest-deadline-first schedule has no inversions.
Observation 4. If a schedule (with no idle time) has an inversion, it has one with a pair of inverted jobs scheduled consecutively.
Proof of optimality of our greedy algorithm.
First: all schedules with no inversions and no idle time have the same maximum lateness.
・Two such schedules can differ only in the order of jobs with a common deadline.
・Among two jobs with the same deadline, the max lateness is determined by the finish time of the 2nd job, which will be the same in either order.
Claim. There is an optimal schedule with no inversions and no idle time.
Proof (by "exchange argument").
・Suppose O is an optimal schedule, and suppose O has an inversion.
・Then, ∃ i, j s.t. job i is immediately followed by job j AND d_i > d_j.
・Exchange i and j. Let f(j) be the finish time of job j before the exchange; after the exchange, job i finishes at time f(j).
・New lateness of job i: l'_i = max { 0, f(j) – d_i } ≤ max { 0, f(j) – d_j } = l_j.
・Job j now finishes earlier, so the exchange does not increase the max lateness.
[Figure: before the swap, job j runs immediately before job i and i finishes at time fi; after the swap, i runs first and j finishes at time f'j = fi.]
Claim. Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the max lateness.
Pf. Let l be the lateness before the swap, and let l' be the lateness afterwards.
・l'k = lk for all k ≠ i, j.
・l'i ≤ li.
・If job j is late,
    l'j = f'j – dj    (definition)
        = fi – dj     (j now finishes at time fi)
        ≤ fi – di     (since i and j inverted)
        ≤ li.         (definition) ▪
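The swap claim can be watched in action by repeatedly exchanging adjacent inverted jobs and recording the max lateness after each swap. This is a sketch on the six-job instance (tj = 3, 2, 1, 4, 3, 2; dj = 6, 8, 9, 9, 14, 15), starting from the order 3, 2, 6, 1, 5, 4; the function names are illustrative assumptions.

```python
def max_lateness(order, t, d):
    clock = worst = 0
    for j in order:
        clock += t[j]
        worst = max(worst, clock - d[j])
    return worst

def remove_inversions(order, t, d):
    """Swap adjacent inverted jobs until none remain.
    Returns (final order, max lateness recorded after each swap)."""
    order = list(order)
    history = [max_lateness(order, t, d)]
    swapped = True
    while swapped:
        swapped = False
        for k in range(len(order) - 1):
            a, b = order[k], order[k + 1]
            if d[a] > d[b]:                    # later deadline runs first
                order[k], order[k + 1] = b, a
                history.append(max_lateness(order, t, d))
                swapped = True
    return order, history

t = {1: 3, 2: 2, 3: 1, 4: 4, 5: 3, 6: 2}
d = {1: 6, 2: 8, 3: 9, 4: 9, 5: 14, 6: 15}
final, history = remove_inversions([3, 2, 6, 1, 5, 4], t, d)
print(final)                    # [1, 2, 3, 4, 5, 6]
print(history[0], history[-1])  # 6 1
```

The recorded values never go up from one swap to the next, which is exactly what the claim asserts.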
Minimizing lateness: analysis of earliest-deadline-first algorithm
Theorem. The earliest-deadline-first schedule S is optimal.
Pf. [by contradiction]
Define S* to be an optimal schedule that has the fewest number of inversions, and let's see what happens.
・Can assume S* has no idle time.
・If S* has no inversions, then S = S*.
・If S* has an inversion, let i–j be an adjacent inversion.
・Swapping i and j
  - does not increase the max lateness
  - strictly decreases the number of inversions
・This contradicts definition of S*. ▪
Greedy analysis strategies
Greedy algorithm stays ahead. Show that after each step of the greedy algorithm, its solution is at least as good as any other algorithm's.
Structural. Discover a simple "structural" bound asserting that every possible solution must have a certain value. Then show that your algorithm always achieves this bound.
Exchange argument. Gradually transform any solution to the one found by the greedy algorithm without hurting its quality.
Other greedy algorithms. Gale-Shapley, Kruskal, Prim, Dijkstra, Huffman, ...
SECTION 4.3
Optimal offline caching
Caching.
・Cache with capacity to store k items.
・Sequence of m item requests d1, d2, ..., dm.
・Cache hit: item already in cache when requested.
・Cache miss: item not already in cache when requested: must bring
requested item into cache, and evict some existing item, if full.
Goal. Eviction schedule that minimizes number of evictions.
Ex. k = 2, initial cache = ab, requests: a, b, c, b, c, a, a.
Optimal eviction schedule. 2 evictions.
[Figure: requests and cache contents after each request; the cache misses (evictions) occur at the request for c, which evicts a, and at the next request for a, which evicts c.]
Optimal offline caching: greedy algorithms
LIFO / FIFO. Evict element brought in most (least) recently.
LRU. Evict element whose most recent access was earliest.
LFU. Evict element that was least frequently requested.
      request   cache
      ⋮                       (previous queries)
      a         a w x y z
      d         a w x d z
      a         a w x d z
      b         a b x d z
      c         a b c d z
      e         a b c d e     (current cache)
      g         cache miss (which item to eject?)
      b
      e
      d                       (future queries)
      ⋮
・FIFO: eject a.
・LRU: eject d.
・LIFO: eject e.
Optimal offline caching: farthest-in-future (clairvoyant algorithm)
Farthest-in-future. Evict item in the cache that is not requested until farthest in the future.
[Figure: current cache = a b c d e; the request for f is a cache miss (which item to eject?); future queries: a, b, c, e, g, b, e, d, … FF: eject d.]
Theorem. [Bélády 1966] FF is optimal eviction schedule.
Pf. Algorithm and theorem are intuitive; proof is subtle.
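A minimal sketch of farthest-in-future in Python, assuming the whole request sequence is known in advance. The function names are illustrative, and the next-use search is a plain linear scan rather than an optimized lookup. The second call assumes the pictured cache is a b c d e with future queries a, b, c, e, g, b, e, d.

```python
def choose_victim(cache, future):
    """Return the cached item whose next request lies farthest in the future."""
    def next_use(item):
        return future.index(item) if item in future else float("inf")
    return max(sorted(cache), key=next_use)   # sorted() only makes ties deterministic

def farthest_in_future(requests, initial_cache, k):
    """Count the evictions farthest-in-future makes on `requests`."""
    cache = set(initial_cache)
    evictions = 0
    for pos, item in enumerate(requests):
        if item in cache:
            continue                           # cache hit
        if len(cache) >= k:                    # cache full: must evict
            cache.remove(choose_victim(cache, requests[pos + 1:]))
            evictions += 1
        cache.add(item)
    return evictions

print(farthest_in_future(list("abcbcaa"), "ab", 2))  # 2
print(choose_victim("abcde", list("abcegbed")))      # d
```

The first call replays the k = 2 example with initial cache ab and requests a, b, c, b, c, a, a.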
Reduced eviction schedules
Def. A reduced schedule is a schedule that only inserts an item into the cache in a step in which that item is requested.
[Figure: an unreduced schedule (an item is inserted when not requested) and a reduced schedule for the same sequence of requests, each shown as a table of requests and cache contents.]
Reduced eviction schedules
Claim. Given any unreduced schedule S, can transform it into a reduced schedule S' with no more evictions.
Pf. [by induction on number of unreduced items]
・Suppose S brings d into the cache at time t, without a request.
・Let c be the item S evicts when it brings d into the cache.
・Case 1: d evicted at time t', before next request for d. S' might as well leave c in cache.
・Case 2: d requested at time t' before d is evicted. S' might as well leave c in cache until d is requested. ▪
[Figure: for each case, cache contents of the unreduced schedule S and of S' between time t, when d enters the cache without a request, and time t'.]
Farthest-in-future: analysis
Theorem. FF is optimal eviction algorithm.
Pf. Follows directly from invariant.
Invariant. There exists an optimal reduced schedule S that makes the same eviction schedule as SFF through the first j requests.
Pf. [by induction on j]
・Let S be reduced schedule that satisfies invariant through j requests.
・We produce S' that satisfies invariant after j + 1 requests.
・Consider (j + 1)st request d = dj+1.
・Since S and SFF have agreed up until now, they have the same cache contents before request j + 1.
・Case 1: (d is already in the cache). S' = S satisfies invariant.
・Case 2: (d is not in the cache and S and SFF evict the same element). S' = S satisfies invariant.
Farthest-in-future: analysis
Pf. [continued]
・Case 3: (d is not in the cache; SFF evicts e; S evicts f ≠ e).
  - begin construction of S' from S by evicting e instead of f
  - now S' agrees with SFF on first j + 1 requests; we show that having element f in cache is no worse than having element e
  - let S' behave the same as S until S' is forced to take a different action (because either S evicts e; or because either e or f is requested)
[Figure: caches of S and S' at time j and after request dj+1; afterwards they are the same except that S holds e where S' holds f.]
Farthest-in-future: analysis
Let j' be the first time after j + 1 that S' must take a different action from S, and let g be item requested at time j'. This action involves e or f (or both).
[Figure: caches of S and S' at time j': the same except that S holds e where S' holds f.]
・Case 3a: g = e.
  Can't happen with FF since there must be a request for f before e.
・Case 3b: g = f.
  Element f can't be in cache of S, so let e' be the element that S evicts.
  - if e' = e, S' accesses f from cache; now S and S' have same cache
  - if e' ≠ e, we make S' evict e' and bring e into the cache; now S and S' have the same cache
  We let S' behave exactly like S for remaining requests. S' is no longer reduced, but can be transformed into a reduced schedule that agrees with SFF through step j + 1.
・Case 3c: g ≠ e, f. S evicts e (otherwise S' could have taken the same action). Make S' evict f.
  Now S and S' have the same cache (and we let S' behave exactly like S for the remaining requests). ▪
Caching perspective
Online vs. offline algorithms.
・Offline: full sequence of requests is known a priori.
・Online (reality): requests are not known in advance.
・Caching is among most fundamental online problems in CS.
LIFO. Evict page brought in most recently.
LRU. Evict page whose most recent access was earliest. [FF with direction of time reversed!]
Theorem. FF is optimal offline eviction algorithm.
・Provides basis for understanding and analyzing online algorithms.
・LRU is k-competitive. [Section 13.8]
・LIFO is arbitrarily bad.
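The gap between LRU and LIFO can be seen on a small simulation. This is a sketch under stated assumptions: the cache starts empty, the function counts cache misses (not evictions), and the request sequence is a hypothetical one built to hurt LIFO. The function name and policy labels are illustrative.

```python
def count_misses(requests, k, policy):
    """Simulate an online eviction policy ("LRU" or "LIFO") on a cache of size k."""
    cache = {}       # item -> (time brought in, time of most recent access)
    misses = 0
    for now, item in enumerate(requests):
        if item in cache:
            brought_in, _ = cache[item]
            cache[item] = (brought_in, now)
            continue
        misses += 1
        if len(cache) == k:
            if policy == "LRU":      # most recent access was earliest
                victim = min(cache, key=lambda x: cache[x][1])
            elif policy == "LIFO":   # brought in most recently
                victim = max(cache, key=lambda x: cache[x][0])
            else:
                raise ValueError(policy)
            del cache[victim]
        cache[item] = (now, now)
    return misses

# Hypothetical sequence: fill the cache, then alternate between d and c.
requests = list("abc") + list("dc") * 10
print(count_misses(requests, 3, "LRU"))   # 4
print(count_misses(requests, 3, "LIFO"))  # 23
```

LIFO keeps evicting whichever of c and d it just brought in, so every request misses; lengthening the alternating tail makes its miss count grow without bound while LRU stays at 4.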