CSC373
Week 2: Greedy Algorithms
Nisarg Shah
Recap
• Divide & Conquer
➢ Master theorem
➢ Counting inversions in 𝑂(𝑛 log 𝑛)
➢ Finding closest pair of points in ℝ² in 𝑂(𝑛 log 𝑛)
➢ Fast integer multiplication in 𝑂(𝑛^{log₂ 3})
➢ Fast matrix multiplication in 𝑂(𝑛^{log₂ 7})
➢ Finding the 𝑘-th smallest element (in particular, the median) in 𝑂(𝑛)
Greedy Algorithms
• Greedy (also known as myopic) algorithm outline
➢ We want to find a solution 𝑥 that maximizes some objective function 𝑓
➢ But the space of possible solutions 𝑥 is too large
➢ The solution 𝑥 is typically composed of several parts (e.g. 𝑥 may be a set, composed of its elements)
➢ Instead of directly computing 𝑥…
o Compute it one part at a time
o Select the next part “greedily” to get maximum immediate benefit (this needs to be defined carefully for each problem)
o May not be optimal because there is no foresight
o But sometimes this can be optimal too!
Interval Scheduling
• Problem
➢ Job 𝑗 starts at time 𝑠𝑗 and finishes at time 𝑓𝑗
➢ Two jobs are compatible if they don’t overlap
➢ Goal: find maximum-size subset of mutually compatible jobs
Interval Scheduling
• Greedy template
➢ Consider jobs in some “natural” order
➢ Take each job if it’s compatible with the ones already chosen
• What order?
➢ Earliest start time: ascending order of 𝑠𝑗
➢ Earliest finish time: ascending order of 𝑓𝑗
➢ Shortest interval: ascending order of 𝑓𝑗 − 𝑠𝑗
➢ Fewest conflicts: ascending order of 𝑐𝑗, where 𝑐𝑗 is the number of remaining jobs that conflict with 𝑗
Example
• Earliest start time: ascending order of 𝑠𝑗
• Earliest finish time: ascending order of 𝑓𝑗
• Shortest interval: ascending order of 𝑓𝑗 − 𝑠𝑗
• Fewest conflicts: ascending order of 𝑐𝑗, where 𝑐𝑗 is the number of remaining jobs that conflict with 𝑗
Interval Scheduling
• Does it work? Counterexamples exist for earliest start time, shortest interval, and fewest conflicts
[Figures: a counterexample instance for each of these three orderings]
Interval Scheduling
• Implementing greedy with earliest finish time (EFT)
➢ Sort jobs by finish time. Say 𝑓1 ≤ 𝑓2 ≤ ⋯ ≤ 𝑓𝑛
➢ When deciding on job 𝑗, we need to check whether it’s compatible with all previously added jobs
o We only need to check if 𝑠𝑗 ≥ 𝑓𝑖∗, where 𝑖∗ is the last added job
o This is because for any job 𝑖 added before 𝑖∗, 𝑓𝑖 ≤ 𝑓𝑖∗
o So we can simply keep track of the finish time of the last added job
➢ Running time: 𝑂(𝑛 log 𝑛)
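Below is a minimal Python sketch of this EFT greedy (an illustration under the assumptions above; jobs are taken to be (start, finish) pairs, and the names are ours, not from the slides):

```python
def interval_scheduling(jobs):
    """Greedy earliest-finish-time; jobs is a list of (start, finish) pairs."""
    selected = []
    last_finish = float("-inf")  # finish time of the last added job
    for s, f in sorted(jobs, key=lambda job: job[1]):  # sort by finish time
        if s >= last_finish:  # compatible with all previously added jobs
            selected.append((s, f))
            last_finish = f
    return selected
```

For instance, interval_scheduling([(0, 3), (2, 5), (4, 7)]) returns [(0, 3), (4, 7)].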
Interval Scheduling
• Optimality of greedy with EFT
➢ Suppose for contradiction that greedy is not optimal
➢ Say greedy selects jobs 𝑖1, 𝑖2, … , 𝑖𝑘 sorted by finish time
➢ Consider the optimal solution 𝑗1, 𝑗2, … , 𝑗𝑚 (also sorted by finish time) which matches greedy for as long as possible
o That is, we want 𝑗1 = 𝑖1, … , 𝑗𝑟 = 𝑖𝑟 for the greatest possible 𝑟
Interval Scheduling
(Another standard method for such proofs is induction.)
• Optimality of greedy with EFT
➢ Both 𝑖𝑟+1 and 𝑗𝑟+1 were compatible with the previous selection (𝑖1 = 𝑗1, … , 𝑖𝑟 = 𝑗𝑟)
➢ Consider the solution 𝑖1, 𝑖2, … , 𝑖𝑟, 𝑖𝑟+1, 𝑗𝑟+2, … , 𝑗𝑚
o It should still be feasible (since 𝑓𝑖𝑟+1 ≤ 𝑓𝑗𝑟+1)
o It is still optimal
o And it matches with greedy for one more step (contradiction!)
Interval Partitioning
• Problem
➢ Job 𝑗 starts at time 𝑠𝑗 and finishes at time 𝑓𝑗
➢ Two jobs are compatible if they don’t overlap
➢ Goal: group jobs into fewest partitions such that jobs in
the same partition are compatible
• One idea
➢ Find the maximum compatible set using the previous greedy EFT algorithm, call it one partition, recurse on the remaining jobs.
➢ Doesn’t work (check this yourself)
Interval Partitioning
• Think of scheduling lectures for various courses into as few classrooms as possible
• This schedule uses 4 classrooms for scheduling 10 lectures
Interval Partitioning
• Think of scheduling lectures for various courses into as few classrooms as possible
• This schedule uses 3 classrooms for scheduling 10 lectures
Interval Partitioning
• Let’s go back to the greedy template!
➢ Go through lectures in some “natural” order
➢ Assign each lecture to an (arbitrary?) compatible classroom, and create a new classroom if the lecture conflicts with every existing classroom
• Order of lectures?
➢ Earliest start time: ascending order of 𝑠𝑗
➢ Earliest finish time: ascending order of 𝑓𝑗
➢ Shortest interval: ascending order of 𝑓𝑗 − 𝑠𝑗
➢ Fewest conflicts: ascending order of 𝑐𝑗, where 𝑐𝑗 is the number of remaining jobs that conflict with 𝑗
Interval Partitioning
• When each lecture is assigned to an arbitrary compatible classroom, three of these heuristics do not work.
• Earliest start time works! (next slide)
Interval Partitioning
[Figure: the earliest-start-time greedy assignment illustrated on the example lectures]
Interval Partitioning
• Running time
➢ Key step: check if the next lecture can be scheduled in some classroom
➢ Store classrooms in a priority queue
o key = latest finish time of any lecture in the classroom
➢ Is lecture 𝑗 compatible with some classroom?
o Same as “Is 𝑠𝑗 at least as large as the minimum key?”
o If yes: add lecture 𝑗 to the classroom 𝑘 with the minimum key, and increase its key to 𝑓𝑗
o Otherwise: create a new classroom, add lecture 𝑗, and set its key to 𝑓𝑗
➢ 𝑂(𝑛) priority queue operations, 𝑂(𝑛 log 𝑛) time
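A short Python sketch of this implementation (illustrative only; lectures are assumed to be (start, finish) pairs, with heapq playing the role of the priority queue):

```python
import heapq

def interval_partitioning(lectures):
    """Greedy earliest-start-time; returns the number of classrooms used."""
    keys = []  # min-heap: each entry is a classroom's key (latest finish time)
    for s, f in sorted(lectures):  # ascending order of start time
        if keys and keys[0] <= s:       # compatible with the min-key classroom
            heapq.heapreplace(keys, f)  # add the lecture there; key becomes f
        else:
            heapq.heappush(keys, f)     # otherwise open a new classroom
    return len(keys)
```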
Interval Partitioning
• Proof of optimality (lower bound)
➢ # classrooms needed ≥ maximum “depth” at any point
o depth = number of lectures running at that time
➢ We now show that our greedy algorithm uses only this many classrooms!
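As an aside, the depth itself is easy to compute with a sweep over start/end events; a small sketch (assuming touching intervals count as compatible, so ends are processed before starts at ties):

```python
def max_depth(lectures):
    """Maximum number of lectures running at any single point in time."""
    events = []
    for s, f in lectures:
        events.append((s, 1))   # a lecture starts: depth goes up
        events.append((f, -1))  # a lecture ends: depth goes down
    events.sort()               # at equal times, -1 sorts before +1
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best
```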
Interval Partitioning
• Proof of optimality (upper bound)
➢ Let 𝑑 = # classrooms used by greedy
➢ Classroom 𝑑 was opened because some lecture 𝑗 was incompatible with a lecture already scheduled in each of the 𝑑 − 1 other classrooms
➢ All these 𝑑 lectures (the 𝑑 − 1 conflicting ones, plus 𝑗 itself) end after 𝑠𝑗
➢ Since we sorted by start time, they all start at/before 𝑠𝑗
➢ So at time 𝑠𝑗, we have 𝑑 mutually overlapping lectures
➢ Hence, depth ≥ 𝑑
➢ So every schedule uses ≥ 𝑑 classrooms. ∎
Interval Graphs
• Interval scheduling and interval partitioning can be seen as graph problems
• Input
➢ Graph 𝐺 = (𝑉, 𝐸)
➢ Vertices 𝑉 = jobs/lectures
➢ Edge (𝑖, 𝑗) ∈ 𝐸 if jobs 𝑖 and 𝑗 are incompatible
• Interval scheduling = maximum independent set (MIS)
• Interval partitioning = graph colouring
Interval Graphs
• MIS and graph colouring are NP-hard for general graphs
• But they’re efficiently solvable for “interval graphs”
➢ Graphs which can be obtained from incompatibility of intervals
➢ In fact, this holds even when we are not given an interval representation of the graph
• Can we extend this result further?
➢ Yes! Chordal graphs
o Every cycle with 4 or more vertices has a chord
Minimizing Lateness
• Problem
➢ We have a single machine
➢ Each job 𝑗 requires 𝑡𝑗 units of time and is due by time 𝑑𝑗
➢ If it’s scheduled to start at 𝑠𝑗, it will finish at 𝑓𝑗 = 𝑠𝑗 + 𝑡𝑗
➢ Lateness: l𝑗 = max(0, 𝑓𝑗 − 𝑑𝑗)
➢ Goal: minimize the maximum lateness, 𝐿 = max𝑗 l𝑗
• Contrast with interval scheduling
➢ We can decide the start time
➢ There are soft deadlines
Minimizing Lateness
• Example
[Figures: an example input (processing times 𝑡𝑗 and deadlines 𝑑𝑗) and an example schedule]
Minimizing Lateness
• Let’s go back to greedy template
➢ Consider jobs one-by-one in some “natural” order
➢ Schedule jobs in this order (nothing special to do here, since we have to schedule all jobs and there is only one machine available)
• Natural orders?
➢ Shortest processing time first: ascending order of processing time 𝑡𝑗
➢ Earliest deadline first: ascending order of due time 𝑑𝑗
➢ Smallest slack first: ascending order of 𝑑𝑗 − 𝑡𝑗
Minimizing Lateness
• Counterexamples
➢ Shortest processing time first
o Ascending order of processing time 𝑡𝑗
➢ Smallest slack first
o Ascending order of 𝑑𝑗 − 𝑡𝑗
Minimizing Lateness
• By now, you should know what’s coming…
• We’ll prove that earliest deadline first works!
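Here is a minimal Python sketch of EDF for this problem (assuming jobs are given as (𝑡𝑗, 𝑑𝑗) pairs and the machine starts at time 0):

```python
def edf_max_lateness(jobs):
    """Schedule by earliest deadline first; return the maximum lateness."""
    time, max_lateness = 0, 0
    for t, d in sorted(jobs, key=lambda job: job[1]):  # ascending deadline
        time += t                                   # this job finishes at `time`
        max_lateness = max(max_lateness, time - d)  # lateness = max(0, f - d)
    return max_lateness
```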
Minimizing Lateness
• Observation 1
➢ There is an optimal schedule with no idle time
Minimizing Lateness
• Observation 2
➢ Earliest deadline first has no idle time
• Let us define an “inversion”
➢ (𝑖, 𝑗) such that 𝑑𝑖 < 𝑑𝑗 but 𝑗 is scheduled before 𝑖
• Observation 3
➢ By definition, earliest deadline first has no inversions
• Observation 4
➢ If a schedule with no idle time has an inversion, it has a pair of inverted jobs scheduled consecutively
Minimizing Lateness
• Observation 5
➢ Swapping adjacently scheduled inverted jobs doesn’t
increase lateness but reduces #inversions by one
• Proof
➢ Let 𝐿 and 𝐿′ denote lateness before/after swap
➢ Clearly, l𝑘 = l′𝑘 for all 𝑘 ≠ 𝑖, 𝑗
➢ Also, clearly, l′𝑖 ≤ l𝑖
Minimizing Lateness
• Observation 5
➢ Swapping adjacently scheduled inverted jobs doesn’t
increase lateness but reduces #inversions by one
• Proof
➢ l′𝑗 = 𝑓′𝑗 − 𝑑𝑗 = 𝑓𝑖 − 𝑑𝑗 ≤ 𝑓𝑖 − 𝑑𝑖 = l𝑖
➢ 𝐿′ = max(l′𝑖, l′𝑗, max𝑘≠𝑖,𝑗 l′𝑘) ≤ max(l𝑖, l𝑖, max𝑘≠𝑖,𝑗 l𝑘) ≤ 𝐿
Minimizing Lateness
• Observation 5
➢ Swapping adjacently scheduled inverted jobs doesn’t
increase lateness but reduces #inversions by one
• Proof
➢ Check that swapping an adjacent inverted pair reduces the total #inversions by one
Minimizing Lateness
• Proof of optimality of earliest deadline first
➢ Suppose for contradiction that it’s not optimal
➢ Consider an optimal schedule 𝑆∗ with fewest inversions among all optimal schedules
o WLOG, suppose it has no idle time
➢ Because EDF is not optimal, 𝑆∗ must have an inversion (an idle-free schedule with no inversions achieves the same maximum lateness as EDF)
➢ By Observation 4, it has an adjacent inversion (𝑖, 𝑗)
➢ By Observation 5, swapping the adjacent pair keeps the schedule optimal but reduces the #inversions by 1
➢ Contradiction! ∎
Lossless Compression
• Problem
➢ We have a document that is written using 𝑛 distinct labels
➢ Naïve encoding: represent each label using log 𝑛 bits
➢ If the document has length 𝑚, this uses 𝑚 log 𝑛 bits
➢ Example: an English document with no punctuation etc.
➢ 𝑛 = 26, so we can use 5 bits
o 𝑎 = 00000
o 𝑏 = 00001
o 𝑐 = 00010
o 𝑑 = 00011
o ...
Lossless Compression
• Is this optimal?
➢ What if 𝑎, 𝑒, 𝑟, 𝑠 are much more frequent in the
document than 𝑥, 𝑞, 𝑧?
➢ Can we assign shorter codes to more frequent letters?
• Say we assign...
➢ 𝑎 = 0, 𝑏 = 1, 𝑐 = 01, ...
➢ See a problem?
o What if we observe the encoding ‘01’?
o Is it ‘ab’? Or is it ‘c’?
Lossless Compression
• To avoid conflicts, we need a prefix-free encoding
➢ Map each label 𝑥 to a bit-string 𝑐(𝑥) such that for all distinct labels 𝑥 and 𝑦, 𝑐(𝑥) is not a prefix of 𝑐(𝑦)
➢ Then it’s impossible to have a scenario like this:
[Figure: a bit-string whose prefix reads as 𝑐(𝑥) while the whole string reads as 𝑐(𝑦)]
➢ So we can read left to right
o Whenever the part to the left becomes a valid encoding, greedily decode it, and continue with the rest
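This greedy left-to-right decoding is short in code; a minimal sketch (the example code table at the bottom is hypothetical, chosen to match the tree shown later):

```python
def decode(bits, code):
    """Decode a bit-string using a prefix-free code (dict: label -> codeword)."""
    inverse = {cw: label for label, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:  # prefix-freeness: at most one codeword ever matches
            out.append(inverse[buf])  # greedily decode it, continue with the rest
            buf = ""
    return "".join(out)

# decode("0100101", {"a": "0", "e": "100", "f": "101"}) == "aef"
```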
Lossless Compression
• Formal problem
➢ Given 𝑛 symbols and their frequencies (𝑤1, … , 𝑤𝑛), find a prefix-free encoding assigning codeword lengths (l1, … , l𝑛) to the symbols which minimizes Σ𝑖=1..𝑛 𝑤𝑖 ⋅ l𝑖
o Note that Σ𝑖=1..𝑛 𝑤𝑖 ⋅ l𝑖 is the length of the compressed document
• Example
➢ (𝑤𝑎, 𝑤𝑏, 𝑤𝑐, 𝑤𝑑, 𝑤𝑒, 𝑤𝑓) = (42, 20, 5, 10, 11, 12)
➢ No need to remember the numbers ☺
Lossless Compression
• Observation: prefix-free encoding = tree
𝑎 → 0, 𝑒 → 100,
𝑓 → 101, 𝑐 → 1100, 𝑑 → 1101, 𝑏 → 111
Lossless Compression
• Huffman Coding
➢ Build a priority queue by adding (𝑥, 𝑤𝑥) for each symbol 𝑥
➢ While |queue| ≥ 2
o Take the two symbols with the lowest weights, (𝑥, 𝑤𝑥) and (𝑦, 𝑤𝑦)
o Merge them into one symbol “𝑥𝑦” with weight 𝑤𝑥 + 𝑤𝑦
• Let’s see this on the previous example
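A compact Python sketch of this procedure (a minimal illustration, not the lecture's reference implementation; we represent a leaf as a one-element tuple, a merged symbol as a pair, and use a counter to break ties in the heap):

```python
import heapq
from itertools import count

def huffman(freqs):
    """Huffman coding; freqs maps label -> weight (assumes >= 2 labels)."""
    tie = count()  # tie-breaker so equal-weight trees are never compared
    heap = [(w, next(tie), (x,)) for x, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) >= 2:
        w1, _, t1 = heapq.heappop(heap)  # two symbols with the lowest weights
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))  # merged symbol
    codes = {}
    def assign(tree, prefix):
        if len(tree) == 1:
            codes[tree[0]] = prefix  # leaf: a single label
        else:
            assign(tree[0], prefix + "0")  # left child
            assign(tree[1], prefix + "1")  # right child
    assign(heap[0][2], "")
    return codes
```

On the example, huffman({'a': 42, 'b': 20, 'c': 5, 'd': 10, 'e': 11, 'f': 12}) yields codeword lengths (1, 3, 4, 4, 3, 3) for (𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓), matching the final outcome below (up to tie-breaking and swapping 0/1 at internal nodes).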
Lossless Compression
[Figures: the Huffman merging steps on the example, repeatedly combining the two lowest-weight symbols until a single tree remains]
Lossless Compression
• Final Outcome
𝑎 → 0, 𝑒 → 100,
𝑓 → 101, 𝑐 → 1100, 𝑑 → 1101, 𝑏 → 111
Lossless Compression
• Running time
➢ 𝑂(𝑛 log 𝑛)
➢ Can be made 𝑂(𝑛) if the labels are given to you sorted by their frequencies
o Exercise! Think of using two queues...
• Proof of optimality
➢ Induction on the number of symbols 𝑛
➢ Base case: for 𝑛 = 2, assigning 1 bit to each symbol is optimal, which is exactly what Huffman does
➢ Hypothesis: assume Huffman returns an optimal encoding for any 𝑛 − 1 symbols
Lossless Compression
• Proof of optimality
➢ Consider the case of 𝑛 symbols
➢ Lemma 1: If 𝑤𝑥 < 𝑤𝑦, then l𝑥 ≥ l𝑦 in any optimal tree.
➢ Proof:
o Suppose for contradiction that 𝑤𝑥 < 𝑤𝑦 and l𝑥 < l𝑦.
o Swapping 𝑥 and 𝑦 strictly reduces the overall length, as 𝑤𝑥 ⋅ l𝑦 + 𝑤𝑦 ⋅ l𝑥 < 𝑤𝑥 ⋅ l𝑥 + 𝑤𝑦 ⋅ l𝑦 (check!)
o QED!
Lossless Compression
• Proof of optimality
➢ Consider the two symbols 𝑥 and 𝑦 with lowest frequency
which Huffman combines in the first step
➢ Lemma 2: ∃ optimal tree 𝑇 in which 𝑥 and 𝑦 are siblings
(i.e. for some 𝑝, they are assigned encodings 𝑝0 and 𝑝1).
➢ Proof:
1. Take any optimal tree
2. Let 𝑥 be the label with the lowest frequency.
3. If 𝑥 doesn’t have the longest encoding, swap it with one that has
4. Due to optimality, 𝑥 must have a sibling (check!)
5. If it’s not 𝑦, swap it with 𝑦
6. Check that Steps 3 and 5 do not change the overall length. ∎
Lossless Compression
• Proof of optimality
➢ Let 𝑥 and 𝑦 be the two least frequency symbols that
Huffman combines in the first step into “𝑥𝑦”
➢ Let 𝐻 be the Huffman tree produced
➢ Let 𝑇 be an optimal tree in which 𝑥 and 𝑦 are siblings
➢ Let 𝐻′ and 𝑇′ be obtained from 𝐻 and 𝑇 by treating 𝑥𝑦 as one symbol with frequency 𝑤𝑥 + 𝑤𝑦
➢ Induction hypothesis: Length(𝐻′) ≤ Length(𝑇′)
➢ Length(𝐻) = Length(𝐻′) + (𝑤𝑥 + 𝑤𝑦) ⋅ 1
➢ Length(𝑇) = Length(𝑇′) + (𝑤𝑥 + 𝑤𝑦) ⋅ 1
➢ So Length(𝐻) ≤ Length(𝑇) ∎
Other Greedy Algorithms
• If you aren’t familiar with the following algorithms, spend some time checking them out!
➢ Dijkstra’s shortest path algorithm
➢ Kruskal’s and Prim’s minimum spanning tree algorithms