=============================================================================
CSC 263 Lecture Summary for Week 11 Fall 2019
=============================================================================
READING: Chapter 23.
SELF-TEST: Exercise 23.1-1, 23.2-2.
———————-
Minimum Spanning Trees
———————-
– Input: Connected, undirected, weighted graph G=(V,E) with weights w(e)
for each edge e in E.
Output: Find a spanning tree of G with smallest total weight.
– Spanning tree? Connected, acyclic subgraph of G — subset of edges that
is connected (at least one path between any two vertices) and acyclic (no
simple cycle). If G not connected, generalize to minimum spanning forest.
– Applications: circuit wiring, road paving, etc.
– Algorithm idea 1: Start with T = E; remove edges until MST remains.
Q: Which edges to remove?
A: Edges with largest weight whose endpoints are already connected —
in other words, the edge belongs to some cycle.
Possible but, in practice, inefficient.
. Fact: every tree on n vertices contains exactly n-1 edges.
. Fact: the maximum number of edges for an undirected graph on n
vertices is (n choose 2) = n(n-1)/2.
. So the algorithm may have to remove Omega(n^2) many edges to reach a
spanning tree.
– Algorithm idea 2: Start with T = {}; add edges until MST is reached —
generic “greedy” algorithm:
GREEDY-MST(G=(V,E),w:E->R)
T <- {} # invariant: T subset of some MST of G
while T is not a spanning tree:
find e "safe for T"
T <- T u {e}
Loop condition? Test |T| = n-1 (where n = |V|) -- since T is always a
subset of some MST, by the loop invariant.
"Safe" edges? Edge e is "safe for T" iff T u {e} is a subset of some MST
of G. This preserves loop invariant and makes algorithm trivially
correct, but poses one major difficulty: how to find safe edges
efficiently? With no other information, current algorithm is somewhat
circular: to find a MST, you need to know which edges belong to some
MST...
- Theorem: If G is a connected, undirected, weighted graph, T is a subset
of some MST of G, and e is any edge _of_minimum_weight_ whose endpoints
are in different connected components of T, then e is safe for T
(T u {e} is a subset of some MST of G).
Proof (in textbook) uses "exchange argument" -- you will learn more
about these types of proofs in CSC373.
Prim's algorithm:
- Idea: pick a root (any vertex r in V will do) and "grow" MST by
connecting one isolated vertex at a time. [Picture: partial tree T
rooted at r, isolated vertices outside T.] At each step, select edge
with minimum weight from T to any isolated vertex.
How to make this efficient? Priority queue holds isolated vertices v,
priority[v] = minimum weight of any edge to T (oo if no such edge),
pi[v] = parent of v for edge of weight priority[v] (NIL if no such %%%%%%%
edge).
PRIM-MST(G=(V,E),w:E->R):
T <- {}
Q <- empty priority queue
for all v in V:
priority[v] <- oo
pi[v] <- NIL
Enqueue(Q,v)
select vertex r in V
priority[r] <- 0
Update-Priority(Q,r)
while Q is not empty:
u <- ExtractMin(Q)
T <- T u {(pi[u],u)} # problem when u = r: see below
for each v in adj[u]:
if v in Q and w(u,v) < priority[v]:
priority[v] <- w(u,v)
Update-Priority(Q,v)
pi[v] <- u
T <- T - {(NIL,r)} # "clean up"
- Note: Update-Priority(Q,v) called "Decrease-Key" in textbook. Assumes
constant-time access to v's position in Q -- can be achieved by
maintaining additional field pos[v] for each v (set to -1 when v not in
Q anymore, which also makes test "v in Q" efficient).
- Trace (assuming root = C):
Graph priority,parent of v in Q T
E B E D C B
|\2 | oo,NIL oo,NIL 0,NIL oo,NIL {}
1| \ |3 2,C 2,C 3,C {}
| \| 1,E 3,C {{C,E}}
D---C 3,C {{C,E},{E,D}}
2 {{C,E},{E,D},{C,B}}
Resulting MST: {{C,E},{E,D},{C,B}}.
- Running Time? All priority queue operations take time Theta(log n). At
most n ExtractMins performed. For each vertex, adjacency list examined
once; for each edge, perform at most one Update-Priority. Total is
O((n + m) log n) = O(m log n).
Possible to do better by using more complex data structure for priority
queue: "Fibonacci heaps" yield overall time O(m + n log n).
Kruskal's algorithm:
- Idea: look at edges in order of non-decreasing weight, select each edge
that does not create a cycle in T.
KRUSKAL-MST(G=(V,E),w:E->R):
T <- {}
sort edges so w(e_1) <= w(e_2) <= ... <= w(e_m)
for i <- 1 to m:
# let (u_i,v_i) = e_i
if u_i,v_i in different connected components of T:
T <- T u {e_i}
- Difficulty: testing for components could be "expensive".
If we use BFS/DFS for this purpose, each test for connectivity takes
time O(n). The loop iterates O(m) times, for a total of O(mn). This is
in addition to the time to sort: O(m log m). Grand total = O(mn).
- Use disjoint set ADT to keep track of connected components.
KRUSKAL-MST(G=(V,E),w:E->R)
T <- {}
sort edges so w(e_1) <= w(e_2) <= ... <= w(e_m)
for each v in V:
Make-Set(v)
for i <- 1 to m:
# let (u_i,v_i) = e_i
if Find-Set(u_i) != Find-Set(v_i):
Union(u_i,v_i)
T <- T u {e_i}
Trace (assume edge {E,C} gets sorted before edge {D,C}):
Graph Disjoint Sets T
E B {E} {D} {C} {B} {}
|\2 | {E,D} {C} {B} {{E,D}}
1| \ |3 {E,D,C} {B} {{E,D},{E,C}}
| \| {E,D,C} {B} {{E,D},{E,C}}
D---C {E,D,C,B} {{E,D},{E,C},{B,C}}
2
Resulting MST: {{E,D},{E,C},{B,C}}.
Running time? O(m log m) for sorting weights, and O(m * T(n,m)) for main
loop, where T(n,m) is worst-case time for disjoint set operations.