Minimum spanning tree
Based on slides by David Kauchak
Minimum spanning trees
What is the lowest weight set of edges that connects all vertices of an undirected graph with positive weights
Input: An undirected, connected, positive weight graph, G=(V,E) Output: A tree T=(V,E’) where E’ E that minimizes
weight(T) w
e eE’
Spanning trees
• G=(V,E)isaconnectedgraph.|V|=n,|E|=m
• Is there a subset of edges E’ ⊆ E that forms a tree and
still connects all the nodes? • Yes.
Proof: 𝑚 ≥ 𝑛 − 1 (otherwise, G would not be connected). If 𝑚 = 𝑛 − 1, then G is already a tree.
If 𝑚 > 𝑛 − 1 then G has a cycle, and if we delete an edge from a cycle, the graph is still connected.
Keep on deleting edges from cycles till we are left with 𝑛 − 1 edges that make a tree.
But there are many spanning trees we can obtain this way.
We want one with minimum cost.
MST example
A1CE
34245 4
BDF 46
A1CE
245 BDF
4
MSTs
Can an optimal set of edges that span all the nodes have a cycle?
A1CE
245 B4DF
4
MSTs
Can an optimal set of edges that span all the nodes have a cycle?
A1CE
245 BDF
4
Applications?
Connectivity
Networks (e.g. communications) Circuit design/wiring
hub/spoke models (e.g. flights, transportation) Traveling salesman problem?
Algorithm ideas?
A1CE
34245 4
BDF 46
A1CE
245 BDF
4
Cuts
A cut is a partitioning of the vertices into two sets S and V-S
An edge “crosses” the cut if it connects a vertex uV and vV-S
A1CE
34245 4
BDF 46
Minimum cut property
Given a (S, V-S), let edge e be the minimum cost edge that crosses the cut (if there is a single such e). Every minimum spanning tree contains edge e.
Prove this!
A1CE
34245 4
BDF 46
Minimum cut property
Given a cut (S,V-S), let edge e be the edge of minimum cost edge that crosses the partition (if there is a single such e). Every minimum spanning tree contains edge e.
S V-S e
If we add e, there is a cycle that contains both e’ and e
e’
Consider an MST with edge e’ that is not the minimum edge
Minimum cut property
Given a cut (S,V-S), let edge e be the edge of minimum cost edge that crosses the partition (if there is a single such e). Every minimum spanning tree contains edge e.
S
V-S
e
If we add e, there is a cycle that contains both e’ and e
e’
Using e instead of e’, still connects the graph, but produces a tree with smaller weights
Minimum cut property- 2
Given a cut (S,V-S), let edge e of minimum cost edge that crosses the partition. There is a minimum spanning tree that contains edge e.
S
V-S
e
If we add e, there is a cycle that contains both e’ and e
e’
Consider an MST with a crossing edge e’ that is not e
Minimum cut property-2
Given a cut (S,V-S), let edge e of minimum cost edge that crosses the partition. There is a minimum spanning tree that contains edge e.
S
V-S
e
If we add e, there is a cycle that contains both e’ and e
e’
Using e instead of e’, still connects the graph, but produces a tree with smaller or equal
weights
Basic principle in the construction of a MST
• A set E’ of edges is promising if it can be extended to an MST.
• IfSisasetofnodesthatcontainsallendpointsofedgesinpromisingE’,andeisaminweight edge crossing from S to V-S, then E’ ∪ 𝑒 is also promising.
S
V-S
e
Kruskal algorithm
• MainIdea:Selectedgesinorderofincreasingcostandacceptanedgeonlyifitdoesnotcausea cycle
Kruskal’s MST algorithm:
• Put all the vertices into single node trees (also called sets) by themselves
• Sort edges with respect to the weight
• Repeat until |V|-1 edges have been accepted
Look at the next edge in the sorted list
If it forms a cycle with the edges already selected, ignore it; else select the edge
(it will join two existing trees and yield a larger tree) • Returntheacceptededges(theyformthespanningtree)
Disjoint-set data structure (Chapter 21 in Cormen et al.)
• Data structure maintains a collection of sets 𝑆, 𝑆, … , 𝑆 with elements from a universe U
• The sets 𝑆, 𝑆, … , 𝑆 are pairwise disjoint (no element of U can belong to 2 sets)
• Each set is identified by a representative which is an element of the set. • Operations:
MAKE-SET (x) : creates a set with single element x
FIND-SET(x): returns the representative of the set containing x
UNION (x,y): unites the set containing x with the set containing y (and destroys these two sets).
Implementation with trees: each set 𝑆 is implemented as a tree; the root is the representative
Disjoint set data structure – implementation with path compression and union by rank
MAKE-SET(x) // creates set containing the single element x; the parent of x is x; rank is the heigh x.p = x; x.rank =0
FIND-SET(x) // goes up and returns the root of the set containing x; all the nodes on the path are linked to the if x ≠ 𝑥. 𝑝
x.p = FIND-SET(x.p) return x.p
// root
LINK (x,y) // unites the trees with roots x and y; the tree with lower rank becomes a child of the root of the larger one if (x.rank > y.rank) y.p = x
else x.p = y
if (x.rank == y.rank) y.rank = y.rank+1
UNION (x,y) // unites the trees containing x and y LINK(FIND-SET(x), FIND-SET(y))
Theorem 21.14 (in Cormen) A sequence of m MAKE-SET, UNION, and FINd-SET, n of which are MAKE-setareperformedin lessthanO(mloglog𝑛)
Kruskal’s algorithm
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
3424
BDF 46
G
5
4
MST ACE
BDF
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
3424
BDF 46
G
5
4
MST A1CE
BDF
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
3424
BDF 46
G
5
4
MST A1CE
2 BDF
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
3424
BDF 46
G
5
4
MST A1CE
2 BDF
4
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
34245 4
BDF 46
G
MST A1CE
4
4
2 BDF
Kruskal’s algorithm
Add smallest edge that connects
two sets not already connected
A1CE
34245 4
BDF 46
G
MST A1CE
245 BDF
4
Correctness of Kruskal’s
Never adds an edge that connects already connected vertices
At the iteration where we add the i-th edge (u,v): S = nodes in the edges chosen so far + u or v (if none of them already in S)
(u,v) is lowest cost edge that crosses from S to V-S. By basic principle, that edge must be part of the MST
Running time of Kruskal’s
Running time of Kruskal’s
n calls to MakeSet
O(m log m) = O(m log n) 2 m calls to FindSet
n calls to Union
So (2m+2n) ≤ 4𝑚 calls MakeSet, FindSet or Union
Running time of Kruskal’s
Using the Disjoint Set Data structure: 3m calls to FindSet or Union can be
done in 3m log log n.
Total time: O(m log n) (for sorting) + 4m log log n (for the rest) =
O(m log n).
Prim’s algorithm
Idea: Grow a tree by adding an edge from the “known” vertices to the “unknown” vertices. Pick the edge with the smallest weight.
Prim’s algorithm
Builds MST by greedily adding nodes
INITIALIZATION:
Select a node to be the “root” ; mark it as known; Update cost of all its neighbors
MAIN LOOP:
While there are unknown nodes left in the graph
a. Select an unknown node b with the smallest cost from some known node a
b. Mark b as known
c. Add ( a, b) to MST
d. Update cost of all nodes adjacent to b
Prim’s algorithm
Prim’s algorithm
Prim’s algorithm
Prim’s algorithm
Start at some root node and build out the MST by adding the lowest weighted edge at the frontier
Prim’s
A1CE
3424 B4D6FACE
BDF
4
5
MST
Prim’s
A1CE
3424
5
MST
4
BDF 46ACE
BDF
0
Prim’s
A1CE
3424
5
MST
4
BDF45 46ACE
BDF
60
Prim’s
A1CE
3424
5
MST
4
BDF45 46ACE
BDF
60
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Prim’s
A1CE
3424
5
MST
4
BDF145 46ACE
BDF
420
Correctness of Prim’s?
Can we use the basic principle ?
By induction, at each step the set of edges added so far is promising. Proof:
Step 0: The empty tree is promising.
At the iteration where we add (u,v) to the tree:
Let S be the set of vertices added to the tree so far (u,v) is a min weight edge crossing from S to V-S So the edges continue to be a promising set.
Running time of Prim’s
Running time of Prim’s
Θ(|V|)
Θ(|V|)
|V| calls to Extract-Min
|E| calls to Decrease-Key
Running time of Prim’s
Same as Dijkstra’s algorithm (depends on how we implement the heap: array or
binary MIN HEAP)
1 MakeHeap Array O(|V|)
Bin heap O(|V|)
|V| ExtractMin O(|V|2)
O(|V| log |V|)
|E| DecreaseKey O(|E|)
O(|E| log |V|)
Total O(|V|2)
O((|V|+|E|) log |V|) O(|E| log |V|)