CSC373H Lecture 2
Dan Zingaro
September 19, 2016
Greedy Proof Technique
Let S0 be our state before the first iteration of greedy
Let S1 be our partial solution after the first iteration of greedy
Let S2 be our partial solution after the second iteration of greedy
. . . Let Sn be the solution after all n iterations of greedy
We are going to use induction to show that each Si is part of some optimal solution
Greedy Proof Technique…
Promising: some optimal solution OPT agrees with the greedy algorithm in some way
The particular notion of “promising” depends on the algorithm!
We use the loop invariant “Si is promising” to prove correctness of a greedy algorithm.
For activity scheduling: we say that Si is promising if there is some optimal solution OPT such that Si and OPT agree on all choices made between choice 1 and choice i inclusive
Activity Scheduling: Base Case
Before the loop runs, S0 = ∅.
Take any optimal solution OPT
S0 and OPT trivially agree on the first 0 activities
Activity Scheduling: Inductive Case
Inductive hypothesis: Suppose i ≥ 0 and that Si and some optimal solution OPT agree on all choices made up to and including ai . Inductive step: we must prove that Si+1 is promising; i.e. we must show that some optimal and Si+1 agree on all choices up to and including ai+1.
Activity Scheduling: Inductive Case…
Two choices for what the greedy algorithm does.
(1) If ai+1 overlaps an activity that is already in S, then we exclude ai+1.
Because Si and OPT agree on all choices made up to and including ai, ai+1 must similarly conflict with some activity in OPT, so ai+1 is excluded from OPT too.
Therefore, Si+1 and OPT agree up to and including ai+1.
Activity Scheduling: Inductive Case…
(2) Otherwise, ai+1 doesn’t overlap any previously-added activity, so the greedy algorithm adds ai+1 to Si+1.
Two sub-cases.
2a. OPT includes ai+1. Then Si+1 and OPT already agree up to and including ai+1.
Activity Scheduling: Inductive Case…
2b. OPT does not include ai+1. We’re in trouble if we can’t fix this!
OPT must include some aj that overlaps ai+1. Why?
Well, if we could successfully add ai+1 to OPT, then OPT
wouldn’t be optimal after all!
Wehavej>i+1,becauseOPTandSi agreeuptoi,and
nothing in Si overlaps ai+1 Therefore, aj is not in Si+1
Activity Scheduling: Inductive Case…
2b. …
In OPT, exactly one aj overlaps ai+1. Why?
We finally use the fact that the activities are sorted by finish
time
If OPT contained more than one activity that overlapped ai+1,
then those activities would all start before ai+1 ends and end
after ai+1 ends
But we cannot have this without OPT itself having an overlap!
Exchange lemma: it’s safe to remove aj from OPT and replace it with ai+1!
After the exchange, the new OPT and Si+1 agree up to and including ai+1
Activity Scheduling: Postcondition
Now we know that each Si is promising
In particular, Sn is promising, which means that Sn and OPT
agree up to and including an (i.e. they agree on all activities) Therefore, Sn is optimal
Minimum Spanning Tree (MST)
Input:connected,undirected,weightedgraphG=(V,E), with n vertices and m edges
Each edge e has non-negative weight w(e)
Output: a spanning tree T ⊆ E such that the sum of the costs of the edges in T is minimal
Definition of Spanning Tree
A spanning tree is a connected, acyclic set of edges that connects all vertices.
Connected: contains a path between any two vertices Acyclic: no cycles
Properties of Spanning Trees
For all spanning trees T of graph G: T contains exactly n − 1 edges
The removal of any edge from T disconnects T into two disjoint subtrees
The addition of any edge to T creates a cycle
Kruskal’s Algorithm
Greedy algorithm: add edge of smallest weight, as long as it doesn’t create a cycle.
Sort edges by nondecreasing weight: w(e_1) <= ... <= w(e_m)
T = {}
for i in [1,2,...,m]:
(u,v) = e_i
if T does _not_ contain a path between u and v:
T = T UNION {e_i}
Kruskal’s Algorithm: Example
Use Kruskal to find an MST of this graph.
Kruskal’s Algorithm: Proof
Kruskal generates sets of edges T0, T1, . . ., Tm
We are going to use the same strategy for proving correctness
of greedy algorithms that we used for activity scheduling
Remember that this involves showing that Ti is promising at each step
Promising: Ti and optimal solution OPT agree on whether to include or exclude each of the first i edges
Kruskal’s Algorithm: Base Case
Before the loop runs, T0 = ∅.
Take any optimal solution OPT
T0 and OPT trivially agree on the first 0 edges
Kruskal’s Algorithm: Inductive Case
Inductive hypothesis: Suppose i ≥ 0 and that Ti and some optimal solution OPT agree on all choices made up to and including ei . Inductive step: we must prove that Ti+1 is promising; i.e. we must show that OPT and Ti+1 agree on all choices up to and including ei+1.
Kruskal’s Algorithm: Inductive Case...
Two choices for what Kruskal’s algorithm does.
(1) If ei+1 forms a cycle with the edges in Ti, then we exclude ei+1. But then, because Ti and OPT agree on all choices made up to and including ei, ei+1 must similarly cause a cycle in OPT. As OPT is an MST, ei+1 is excluded from OPT too.
Therefore, Ti+1 and OPT agree up to and including ei+1.
Kruskal’s Algorithm: Inductive Case...
(2) Otherwise, ei+1 doesn’t form a cycle and so is added to Ti+1. Two sub-cases.
2a. OPT includes ei+1. Then Ti+1 and OPT already agree up to and including ei+1.
Kruskal’s Algorithm: Inductive Case...
2b. OPT does not include ei+1. We’re in trouble if we can’t fix this!
Let u and v be the endpoints of ei+1
OPT must contain a path from u to v
Some edge ej on this path is not in Ti. Why?
Otherwise, Ti would already contain a path from u to v, and the algorithm wouldn’t add ei+1
Kruskal’s Algorithm: Inductive Case...
2b. ...
Wehavej>i+1,becauseOPTandTi agreeuptoand
including edge i
Then, because edges are sorted by nondecreasing weight,
w(ej) ≥ w(ei+1)
Construct OPT2 by starting with OPT, adding edge ei+1, and
removing edge ej
OPT2 is a new MST (cost is no worse than OPT), and Ti+1 and OPT2 agree up to and including ei+1
Kruskal’s Algorithm: Postcondition
Now we know that each Ti is promising
In particular, Tm is promising, which means that Tm and OPT
agree up to and including em (i.e. they agree on all edges) Therefore, Tm is optimal