Lecture 20: Minimum Spanning Trees
• A connected undirected graph G(V,E) has between V-1 and V(V-1)/2 edges.
• O(logE)=O(logV2)=O(2logV)=O(logV)
• An undirected graph G(V,E) with fewer than V-1 edges is not connected.
Copyright By PowCoder代写 加微信 powcoder
• A connected undirected graph G(V,E) with V-1 edges, is a tree.
• An undirected graph G(V,E) with more than V-1 edges contains at least one cycle.
Minimum Spanning Trees
• Definition
• Prim’s Algorithm
• The Cut Lemma
Correctness of Prim’s Algorithm
Uniqueness of MSTs (under distinct weight assumption)
• Kruskal’s Algorithm
Basic Idea
Union-Find Data Structure
Kruskal’s algorithm
Correctness of Kruskal’s Algorithm
• Removing distinct weight assumption
Minimum Spanning Tree Example
Minimum spanning tree. Given a connected undirected graph
with real-valued edge weights , an MST is a subset of the
edges such that is a tree that connects all nodes whose sum of edge weights is minimized.
969 511 511
8101478 7 21
Applications: telephone networks, electrical and hydraulic systems , TV cables, computer networks, road systems
Prim’s Algorithm: Idea
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
Prim’s Algorithm: Idea (cont)
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
Prim’s Algorithm: Idea (cont)
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
Prim’s Algorithm: Idea (cont)
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
Prim’s Algorithm: Idea (cont)
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
form a tree
Prim’s Algorithm: Idea (cont)
Prim’s algorithm
Initialize = {any one node}.
Add min cost edge Addto.
Repeat until
with and to
Prim’s Algorithm: Example
Prim’s Algorithm: Example (continued)
Prim’s Algorithm: Implementation
Implementation.
Maintain set of explored nodes .
For each unexplored node , maintain cheapest edge from to node in .
Maintain all nodes in a priority queue with this cheapest edge as key
Note: In the end, the parent pointers form the MST.
Running time:
Q: Decrease-key needs the location of the key in the heap. How to get that?
Prim(𝐺, 𝑟):
for each 𝑣∈𝑉 do
𝑣. 𝑘𝑒𝑦 ← ∞,𝑣. 𝑝 ← 𝑛𝑖𝑙,𝑣. 𝑐𝑜𝑙𝑜𝑟 ← 𝑤h𝑖𝑡𝑒 𝑟. 𝑘𝑒𝑦 ← 0
insert all keys in a min priority queue 𝑄 on 𝑉 while 𝑄≠∅
𝑢 ← Extract-Min(𝑄)
𝑢. 𝑐𝑜𝑙𝑜𝑟 ← 𝑏𝑙𝑎𝑐𝑘
for each 𝑣 ∈ 𝐴𝑑𝑗[𝑢] do
if 𝑣. 𝑐𝑜𝑙𝑜𝑟 = 𝑤h𝑖𝑡𝑒 and 𝑣. 𝑝 ← 𝑢
𝑤(𝑢, 𝑣) < 𝑣. 𝑘𝑒𝑦 then Decrease-Key(𝑄, 𝑣, 𝑤(𝑢, 𝑣))
𝑣.𝑘𝑒𝑦←𝑤 𝑢,𝑣
Simplifying assumption. All edge weights are distinct.
Cut lemma. Let be any subset of nodes, and let be the min cost edge with exactly one endpoint in . Then every MST must contain .
If we replace in ∗ with , then ∗
is still a spanning tree, but the total cost will be lowered,
contradicting fact that ∗ is an MST.
Pf of Cut Lemma. (exchange argument)
There is a path in
which must cross cut separating from
∗ be any MST.
and suppose
that connects to
using some other edge
e is in the MST
=> is in every MST!
Kruskal’s Algorithm: Idea
Kruskal’s algorithm.
Starts with an empty tree
Consider edges in increasing order of weight.
Case 1: If adding to creates a cycle, discard .
Case 2: Otherwise, insert
into according to cut lemma
𝑒 creates cycle. Don’t add 𝑒 to 𝑇
𝑒′ doesn’t create cycle. Add 𝑒′ to 𝑇
Kruskal’s Algorithm: Example
Kruskal’s Algorithm: Example (continued)
Kruskal’s Algorithm: Example (continued)
Kruskal’s Algorithm: Implementation
Key question: How to check whether adding to Use DFS?
– Would result in total time.
Can we do the checking in time?
Observations:
will create a cycle?
does not matter. After an edge is added, two sets “union” together.
Need such a “union-find” data structure:
Maintain a collection of sets to support the following two operations:
The actual structure of each component of
– Each component can be considered as a set of nodes.
Find-Set( ): For a given node belongs to.
Union( ): For two given nodes containing and together.
, find which set this node and , merge the two sets
The union-find data structure
Representing each set as a tree:
The trees in the union-find data structure are NOT the same as the partial MST trees!
The root of the tree is the representative node of all nodes in that tree (i.e., use the root’s ID as the unique ID of the set).
Every node (except the root), has a pointer pointing to its parent. – The root has a parent pointer to itself.
– No child pointers, so a node can have many children.
Make-Set(x) and Find-Set(x)
Create-Set(x):
Find-Set(x):
Running time proportional to the height of the tree.
Make-Set(𝑥): 𝑥. 𝑝𝑎𝑟𝑒𝑛𝑡 ← 𝑥
𝑥. h𝑒𝑖𝑔h𝑡 ← 0
Find-Set(𝑥):
while 𝑥! = 𝑥. 𝑝𝑎𝑟𝑒𝑛𝑡 do
𝑥 ← 𝑥. 𝑝𝑎𝑟𝑒𝑛𝑡 return 𝑥
Union(x, y)
Assumption: and are the roots of their trees. If not, do Find-Set first
But, what if…
Solution (union by height):
When we union two trees together, we always make the root of the taller tree the parent of shorter tree.
Need to maintain the height of each tree
Union(𝑥, 𝑦):
𝑎 ← Find-Set(𝑥)
𝑏 ← Find-Set(𝑦)
if 𝑎. h𝑒𝑖𝑔h𝑡 ≤ 𝑏. h𝑒𝑖𝑔h𝑡 then
if 𝑎. h𝑒𝑖𝑔h𝑡 = 𝑏. h𝑒𝑖𝑔h𝑡 then 𝑏. h𝑒𝑖𝑔h𝑡 ← 𝑏. h𝑒𝑖𝑔h𝑡 + 1
𝑎. 𝑝𝑎𝑟𝑒𝑛𝑡 ← 𝑏
𝑏. 𝑝𝑎𝑟𝑒𝑛𝑡 ← 𝑎
The union-find data structure: Analysis
Theorem: The running time of Find-Set and Union is
Pf: We will show (by induction) that for any tree with height is at least (i.e., it contains at least nodes)
At beginning, , and . We have
Suppose the assumption is true for any and before Union( ).
Let the size and height of the resulting tree be , and .
– Case 1: – Case 2:
, we have , we have
, similar to case 1.
()
()
Consider any tree during the running of the algorithm. It must have elements. Since a tree with height has at least n elements
the tree has height . Thus all operations take time.
, its size
Path Compression
We have visited a number of nodes after Find-Set( ), and have reached the root .
We already know that these nodes belong to the set represented by .
Why not just set the parent pointers of these nodes to directly?
– Future operations will be faster!
This results in a running time that is practically a constant (but theoretically not).
See textbook for details (not required).
Running time:
Kruskal’s Algorithm
MST-Kruskal(𝐺):
for each vertex 𝑣 ∈ 𝑉
Make-Set(𝒗)
sort the edges of 𝐺 into increasing order by weight for each edge (𝑢, 𝑣) ∈ 𝐸 taken in the above order
if Find-Set(𝑢)≠Find-Set(𝑣) then output edge (𝑢,𝑣)
Union(𝑢, 𝑣)
Note: If edges are already sorted and we use path compression, then the running time is close to .
Current best MST algorithm:
An algorithm by Seth and Ramachandran (2002) has been shown to be optimal, but its running time is still unknown…
Correctness of Kruskal’s Algorithm
Kruskal’s algorithm.
Starts with an empty tree Consider edges in ascending order of weight.
Case 1: If adding to creates a cycle, discard .
Case 2: Otherwise, insert
into according to cut lemma
Case 1 𝑒 creates cycle. Don’t add 𝑒 to 𝑇
Case 2 𝑒′ doesn’t create cycle. Add 𝑒′ to 𝑇
Simplifying assumption. All edge weights are distinct.
Given: Cut lemma. Let be any subset of nodes, and let be the min cost edge with exactly one endpoint in . Then every MST must contain .
Observations on Prim’s and Kruskal’s algorithms
Both algorithms are Greedy since they make the choice that looks best at the moment
• Prim adds to MST lightest edge from S
• Kruskal adds to MST lightest edge that does not create a cycle.
For both Prim’s and Kruskal’s algorithm, we assumed that all the edges have different weights.
If we remove this assumption (and allow some or even all edges to have the same weight) the algorithms still work. The only thing that needs to be changed is that, instead of choosing the smallest cost edge, we choose a smallest cost edge (breaking ties arbitrarily).
Questions on Minimum Spanning Trees
Q: Suppose that (u, v) is an edge of a graph G and it has the least cost among all edges that are incident to vertex v. Does every minimum spanning tree contain (u, v) assuming that all edges have distinct costs?
Yes. Suppose that T is a minimum spanning tree of G and T does not contain (u, v). If (u, v) is added to T, then a cycle is formed. Assume that (w, v) is the other edge in the cycle that is incident to v. Replacing (w, v) with (u, v) in T will induce a spanning tree whose cost is smaller than T, a contradiction.
Q: Let G be an undirected connected graph with distinct edge weights. Let emax be the edge with maximum weight and emin the edge with minimum weight. Which of the following statements is false?
Every minimum spanning tree of G must contain emin. No minimum spanning tree contains emax
If emax is in a minimum spanning tree, then its removal must disconnect G
4. G has a unique minimum spanning tree 1,3,4 are true
Exercise on Maximum Spanning Trees
Design an algorithm for finding the maximum spanning tree; i.e., the spanning tree that maximizes the sum of edge weights.
Solution. I sort the edges in decreasing order of weights and I keep adding edges, provided that I do not create cycles.
Exercise on Bottleneck Weight
Let G = (V,E) be a connected undirected graph with weights on the edges. The bottleneck weight of any spanning tree T of G is the maximum weight of an edge in T. Prove that the minimum spanning tree (MST) minimizes the bottleneck weight over all spanning trees.
Solution. We prove by contradiction.
Let T be the MST, whose heaviest edge is (a, b).
Removing (a, b) from T divides T into two connected components A (containing a) and B (containing b).
Assume another spanning tree T’ whose bottleneck weight is smaller than that of T.
In T’, there must be a unique path from a to b; therefore there must be an edge (a’, b’) such that a’ A and b’ B. According to the assumption, the heaviest edge in T is heavier than any edge in T’, so (a’, b’) is lighter than (a, b).
Adding (a’, b’) will connect A and B to form a new spanning tree whose total weight is smaller than T’s, which contradicts the assumption that T is MST.
Exercise on Euclidean Traveling Salesman
• Input: Undirected Graph G = (V,E) and a cost C(e) for each edge e.
• Output: A cycle that visits each vertex exactly once and has minimum total cost
• Euclidean Traveling Salesman – Vertices are points on the plane and the cost is the Euclidian distance between them – Implies triangle inequality:
• C(u,v) <= C(u,x) + C(x,v)
• NP-hard problem: No polynomial time algorithms known for optimal solution
• 2-Approximate Solution Using MST
Euclidean Traveling Salesman: Steps
Nodes: Can be thought of as fully connected graph. Edge weight is the Euclidean distance
DFS on T starting from a
1a1a4 2bd4 2bd
5fg 35fg c6h7c6h7
Start from a
Visit nodes in DFS order
Euclidean Traveling Salesman: Proof of 2-approximation
W: DFS Walk (visit nodes using MST edges) H:ComputedTour
H*:OptimalTour
W: DFS Walk 1a 4
C(W)=2C(T)
C(H) ≤ C(W) =2C(T) (triangular inequality) C(T) ≤ C(H*)
C(H) ≤ 2C(H*)
H: Computed Tour 1a4
Euclidean Traveling Salesman: Improving Solution with Local Search (Optional)
Find an initial tour H
1. For every pair of edges (x,y), (u,v) in H
if C(x,u) + C(y,v) < C(x,y) + C(u,v) then
H := H – {(x,y),(u,v)}union {(x,u),(y,v)}
exit for loop and go to 1 Return H
After 1 swap
After 2 swaps
After 3 swaps
bdbd fgfgfg
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com