Greedy Algorithms
Minimum Lateness Scheduling
2021-01-20 CSC373 Winter 2021 – Sam Toueg
1
Minimizing Lateness Schedule
Problem
Input: ! intervals (“jobs’’): 1, 2, … , !
ØInterval ! requires “! units of time and has deadline #!
Ø If interval ! is scheduled to start at time $! , it will finish at time %! = $! + “! Ø Lateness: l! = max 0, %! − #!
Output: schedule that minimizes the maximum lateness, & = max l/ /
Input: Schedule Example:
!! = 1
!” = 2
!# = 4
!$ = 3
!% = 2 !& = 3
= 4
l”=0 l#=0
l$=0
l%=4 l&=0 l’=1
2 = max l! !
324165
013 7101215
2021-01-20 CSC373 Winter 2021 – Sam Toueg
2
Minimizing Lateness Schedule
• Fact 1: there is an optimal schedule with no gaps
2021-01-20 CSC373 Winter 2021 – Sam Toueg 3
Minimizing Lateness Schedule
• Fact 1: there is an optimal schedule with no gaps
• Brute force algorithm: try all possible permutations of ! intervals 3(5!)
(with no gaps)
• Greedy algorithm (general scheme):
– sort intervals in some order
– schedule all intervals in this order with no gaps
What order minimizes lateness?
Ø Shortest processing time first: increasing order of processing time !! Ø Earliest deadline first: increasing order of due time “!
Ø Smallest slack first: increasing order of “! − !!
2021-01-20 CSC373 Winter 2021 – Sam Toueg 4
Shortest Processing Time First? No!
Input:
Schedule by Shortest Processing Time First:
l% = 0 l# = 1
2 = max l! !
!$ = 1 !” = 10
= 1
Optimal schedule:
l#=0 l%=0 2=maxl! !” = 10 !$ = 1 = 0!
2021-01-20
CSC373 Winter 2021 – Sam Toueg
5
12
01 11
0
21 10 11
Smallest Slack First? No!
Input:
Schedule by Smallest Slack First:
l#=0 l%=9 2=maxl!
Optimal schedule:
0
21 10 11
!$ = 1
!” = 10
l% = 0
l# = 1
2 = max l! = 1!
!” = 10 !$ = 1
= 9!
12
01 11
2021-01-20 CSC373 Winter 2021 – Sam Toueg
6
Minimizing Lateness Schedule
• Greedy by Earliest Deadline First (EDF): increasing order of deadline “! ✅ – sort intervals in order of increasing deadline
– schedule all intervals in this order with no gaps
How do we prove that this outputs an optimal schedule?
• Observation 1: The greedy EDF schedule has no gaps
• Observation 2: The greedy EDF schedule has no inversions
Øinversion: two intervals “, # such that $” > $! but ” is scheduled before #
• Observation 3: If a schedule with no gaps has an inversion,
then it has a pair of inverted intervals that are adjacent
:9! !” !! !”
2021-01-20 CSC373 Winter 2021 – Sam Toueg
7
Contradiction!
Minimizing Lateness Schedule
• Claim 1: Swapping adjacent inverted interval doesn’t increase lateness and reduces the number of inversions by one
• Proof:
Ø Let 9 and ! denote two adjacent inverted intervals in schedule :
o # is scheduled before $ in % but deadline &! > &”
Ø By swapping 9 and !, we get a new schedule :′ with one fewer inversion Ø Let l and l′ denote lateness before/after swap
ØClearly,l( =l)( forall<≠9,!
ØClaim: l! ≥ l!) and l! ≥ l)* l! ≥ max l)*,l!)
o l" ≥ l"# because $ finishes earlier in %′ (i.e., after the swap)
ol"=,"−&"≥,!#−&!=l#! ) ) ) ) Ø2=maxl*,l!,maxl( ≥max l*,l!,maxl( =2
(+*,! :9!
:′
') !9
2021-01-20
!"
""′ !! "!′ CSC373 Winter 2021 - Sam Toueg
8
(+*,!
l'
!" l( !! l( ""
l * = l (*
Minimizing Lateness Schedule
• Claim 2: All schedules with no gaps and no inversions have the same lateness • Proof Sketch:
• Let ! and !$ be two distinct schedules with no gaps and no inversions
• Note: ! and !$ differ only by the schedule of intervals with the same deadline
• Consider the intervals with the same deadline: Ø They are adjacent in both ) and )(
Ø As a group, they have the same max lateness
• The other intervals have the same lateness • So ! and !$ have the same lateness
!! =!" =!# max{l),l',l*} :9!<
:′ < 9 !
max{l*, l), l' }
2021-01-20 CSC373 Winter 2021 - Sam Toueg 12
Minimizing Lateness Schedule
• Claim 1: Swapping adjacent inverted intervals doesn’t increase lateness and reduces the number of inversions by one
• Claim 2: All schedules with no gaps and no inversions have the same lateness
• Proof of optimality:
Ø Suppose, for contradiction, the EDF schedule : is not optimal
Ø Let :∗ be an optimal schedule that has no gaps and has the fewest inversions among all optimal schedules
Ø Case 1: :∗ has no inversions
• Both : and :∗ have no gaps and no inversions
• By Claim 2, : has the same lateness as :∗ and so : is also optimal
Contradiction!
Ø Case 2: :∗ has at least one inversion
• So :∗ has an adjacent inversion (9, !)
• Swap9and!
• By Claim 1, this swap doesn’t increase lateness and reduces the number of
inversions by one
• So the new schedule remains optimal and has 1 fewer inversion than :∗
2021-01-20 CSC373 Winter 2021 - Sam Toueg
13
Contradiction!
Greedy Algorithms Huffman Code
2021-01-20 CSC373 Winter 2021 - Sam Toueg 1
Overview and Motivation • Alphabet Γ: a set of " symbols
e.g.,Γ= $,&,...,(, "= Γ =26 • Text: a string over Γ, e.g. ``nonna’’
• Coding: maps each symbol + ∈ Γ to a unique binary string,
called “codeword” (or simply ``code’’) of +
• Two types of symbol coding: Ø Fixed length
Ø Variable length
2021-01-20 CSC373 Winter 2021 - Sam Toueg 2
Fixed-Length Code •For"distinctsymbols,eachcodewordhas -./!" bits.
For example:
Ø Γ = #,%,&,' , code: #=00, %=01, &=10 and '=11
Ø ASCII code for 256 distinct characters uses 8 bits/character
• Easy to decode
• But not optimal in terms of text coding length:
Ø “(” : 12.7% of letter occurrences in English texts
Ø “)” : 0.07% of letter occurrences in English texts
Ø “(” : appears 180 times more frequently than “)” in English
Ø So giving the same code length to both “(” and “)” is wasteful: o better to give a shorter code to ", even at the cost of a longer one for #
• Goal: minimize the length of text coding
• Idea: give shorter codes to frequent text symbols 2021-01-20 CSC373 Winter 2021 - Sam Toueg 3
Problem:
0$
not a “prefix code”
code of ! is a prefix of code of "!
0101
Variable-Length Code
• Different codewords may have different lengths
Ø Pro: potential to save space
Ø Con: harder to decode,
can be ambiguous if not done correctly
Forexample: Γ= #,%,&,... and #=1,%=01,&=010
&&
• ``Prefix code’’: no codeword is a prefix of any other Ø Scan from left to right till find codeword of some symbol
Ø Unambiguous decoding!
2021-01-20 CSC373 Winter 2021 - Sam Toueg 4
Prefix Code as a Binary tree
Every prefix code can be represented as a binary tree
• Example: Left edges Γ = #,%,&,',(,C labeled0
0 1
Right edges labeled to 1
Coding
0 01
0
1
1
# = 10 % = 001 & = 0000 ' = 01 ( = 11 C = 0001
0
'#( %
&C
• Symbols of Γ = leaves of the binary tree
• Codeword for symbol ? = concatenation of edge labels on the
1
path from the root to leaf ?
2021-01-20 CSC373 Winter 2021 - Sam Toueg 5
H: a preZix code tree for
0 91
Γ = #,%,&,',(,C
01 1'#(
Is T optimal?
No! Swap % and (
Prefix Code as a Binary tree
0 0
1
01 % .2 .38 .12
&C
.1 .05
.15
frequency of “#” (think of it as probability of “#”)
.1 + .05 + .15 + .15 + .2 + .38 + .12 = 1
• Average Depth of H=2·(.2+.38+.12)+3·(.15)+4·(.1+.05)=2.45 Ø Average length of a codeword is 2.45 bits/symbol
• A fixed-length code needs 3 bits/symbol (to code 6 symbols)
• So the variable-length coding given by this H is ≈ 20% better 2021-01-20 CSC373 Winter 2021 - Sam Toueg 6
H: a preZix code tree for
0 91
Γ = #,%,&,',(,C
01
01
Is T optimal?
No! Swap % and (
Prefix Code as a Binary tree
0
(
0
1
'#%
&C
.1 .05
.2 .38 .15 1
.12
• Average Depth of H=2·(.2+.38+.15)+3·(.12)+4·(.1+.05)=2.42 Ø Average length of a codeword is 2.42 bits/symbol
2021-01-20 CSC373 Winter 2021 - Sam Toueg 7
Prefix Code Problem
• Input: alphabet Γ with " symbols ∀+ ∈ Γ, O(+) =frequency of +
(where ∑$ ∈ & O(+) = 1)
• Output: Binary tree 9 representing an optimal
prefix code for Γ
• Weighted average depth of 9: AD(9)=∑$∈&O+ ·QRSTh' +
whereQRSTh' +=depthof+in9
= length of code of + in 9
• Optimal solution: a tree 9 with minimum AD(9) 2021-01-20 CSC373 Winter 2021 - Sam Toueg 8
Useless branch: not used to distinguish the codes in the leaves
2021-01-20
CSC373 Winter 2021 - Sam Toueg 10
Prefix Code as a Binary tree
Fact 1: An Ooptiimalltreeiisafulllbiinarytree,, i.e. each internali.neo. edaechaisntweronaclhniloddrenhas two children
• Proof
internal node with no right child
0
1
%
&C
0
0
internal node
0
%
&C
Shorter codes for !, ", .
CSC373 Winter 2021 - Sam Toueg 11
2021-01-20
1
with no right child
1
Prefix Code as a Binary tree
Fact 1: An optimal tree is a full binary tree,
i.e. each internal node has two children
01 01 0101 0101
'#(01'#(
Prefix Code as a Binary tree
Fact 1: An optimal tree is a full binary tree,
i.e. each internal node has two children
Fact 2: In an optimal tree 9, for any two symbols + , Y O(+) > O(Y) ⟹ QRSTh'(+) ≤ QRSTh'(Y)
Facts 1&2 imply:
Fact 3: If +, Y are symbols with minimum frequency . then there is optimal tree 9 such that
+, Y are siblings and are at maximum depth
have the same parent
2021-01-20 CSC373 Winter 2021 – Sam Toueg 12
Huffman’s Algorithm
• Input: ” symbols Γ with their frequencies O
Sorted by
.8 =.2 +.(4)
)
• FrominstanceΓwith)symbols,we Since 2, 4 are symbols with minimum
01
created an instance Γ!with ) − 1 symbols frequency, by Fact 3 there is an
\? \?
• Do this recursively until we get 2 symbols optimal tree 9 such that 2, 4 are
.(4) = . 2 2021-01-20
(then we have one root with two leaves) siblings at maximum depth in 9
O′ ?\]_)^
“−1 Γ′
increasing
replace 2 and 4 with new symbol 8
frequency
1 2 3 4 …. …. 0 .(2) .(4) .(5) .(6) . 2 + .(4) .(7)
• Note: This (optimal) tree is not unique CSC373 Winter 2021 – Sam Toueg 13
2021-01-20
CSC373 Winter 2021 – Sam Toueg
Γ#!”;<.
freq. .38 .15 .1 .2 .12 .05
Γ′ # ! " ; < >
freq.
.38 .15 .1 .2 .12
.15
.15 > 0
1
.” freq. .38 .15 .1 .2 .12 .15
Γ′ # ! ” ; < h
freq.
.38 .15 .1 .2 .12
.27
.27 h 01
.15 > < .12 01
."
@ .35 2 , assume that:
• (Induction Hypothesis, IH) for any Γ with b symbols and any C, the algorithm produces an optimal tree.
• Let Γ be any alphabet with b + 1 symbols
• Let C(?) be a frequency for each ? ∈ Γ
• Let i = tree output by the algorithm on input (Γ, C)
• We now show that i is optimal
2021-01-20 CSC373 Winter 2021 – Sam Toueg 20
• Let i = tree output by the algorithm on input (Γ, C)
• Let ?,\ ∈ Γ be the two symbols of min frequency at max depth in i
• Let)∉Γbetheparentof?,\ini
• The algorithm constructed i by recursively computing a tree iC for the alphabet and frequency (Γ′, C′) where:
ØΓ′=Γ −{/,1}⋃{#} Ø4′(#)=4 / +4(1) Ø 4′(8) = 4(8), ∀ α ∈ Γ
output of the algorithm on input (Γ′, .′)
i′ = output of alg ?\
C ? +C(\)
)
)
Η
Η′
Optimal by IH
orithm on input (Γ′, C′
• Since i′ has b symbols, by IH, i′ is optimal for (Γ′, C′)
2021-01-20 CSC373 Winter 2021 – Sam Toueg 21
Recap:
Compare DE F and DE F′ Η
‘
) ?\
)
‘−1
Η′
Optimal by IH
•i=outputofthealgorithmoninput Γ,C •i′=outputofthealgorithmoninput Γ′,C′ •i′isoptimalfor(Γ′,C′),thatis:AD(i′)isminimumfor Γ′,C′
• Note that:
ØGH(I)= J+ K ·4(/)+ K·4(1) ØGHI! =J+(K−1)·(4/ +41)
• So:
2021-01-20 CSC373 Winter 2021 – Sam Toueg 22
C ? +C(\)
pqi =pqiC +C? +C\
Recap:
) ?\
)
Η
Η′
Optimal by IH
•i=outputofthealgorithmoninput Γ,C •i′=outputofthealgorithmoninput Γ′,C′ •AD(i′)isminimumfor Γ′,C′
4! # = 4 / + 4(1)
pqi =pqiC +C? +C\
2021-01-20 CSC373 Winter 2021 – Sam Toueg 23
• Recall that ?, \ are symbols of Γ of min frequency in C • By Fact 3: there is an optimal tree H for Γ, C
such that ?, \ are siblings at maximum depth T
T′
Maximum depth
)
?\
4! # = 4 / + 4(1)
• Let H′ be the tree constructed from H by:
Ø removing /, 1 and replacing their parent with symbol #
• Clearly:
Ø X′ is a prefix code tree for Γ′, 4′ , where Γ′, 4′ are defined as before ØGHX =GHX′ +4/ +41
2021-01-20 CSC373 Winter 2021 – Sam Toueg
24
Recap:
) ?\
)
Η′ an optimal tree for (Γ′, .′) DEF =DEF! +.2 +.4
?\
4! # = 4 / + 4(1)
Η algo output for (Γ, .)
4! # = 4 / + 4(1)
T optimal for (Γ, .)
T′ a tree for (Γ′, .′)
DE9 =DE9! +.2 +.4
• !” # =!” #! +& ‘ +& ( SinceF! isoptimalfor Γ!,.! : ≤!”*! +&’+&( DEF′≤DE9!
= !” *
• So!”#≤!”*
• Since*isoptimalfor Γ,& , #isalsooptimalfor Γ,& !
2021-01-20 CSC373 Winter 2021 – Sam Toueg 26
)
1. 2. 3. 4. 5.
Huffman(b, C) [\] ^ = 1 _\ ) `\
6.
H(0 log0)
7. 8.
1, 4 1 = y/dbczd{^) I
9.
É)Ñ”bd(I,(^,4 / + 4 1 ))
Huffman’s Algorithm
Input: symbols 1, 2, … , 0 with frequencies .(1), .(2), … , .(0)
Jb”cd” c e”c4 )fK” ecg”e”K ^
H(0)
I(^) = (^, 4(^)) [Min Heap Array with key 4(^)] nopq`rpstuvw(I)
[\] ^ = ) + 1 _\ 2) − 1 `\ [) − 1 iterations]
/, 4 / = y/dbczd{^) I
H(0)
Jb”cd” c )fK” ecg”e”K ^ ~^dh dh” 4feef~^)Ä d~f zh^eKb”): dh” )fK” ecg”e”K / c)K dh” )fK” ecg”e”K 1
H(0)
H(0 log0 )
Running time = r(b log b)
2021-01-20 CSC373 Winter 2021 – Sam Toueg
27