COMPSCI 330: Design and Analysis of Algorithms November 7, 2017
Lecture 18: Amortized Analysis of Disjoint Sets
Lecturer: Rong Ge Scribe: Yumin Zhang
1 Potential Argument
Problem. We are still talking about the amortized cost of add(x) method in the dynamic array problem (for more details about the problem, please refer to the previous lecture’s note). In addition to the aggregate method, accounting (charging) method, we have this third way to analyze the amortized cost, which is the potential argument.
Potential Argument. Define a ”potential function” Φ, Φ ≥ 0. Suppose an operation took (real) time Ti, changed the status from xi to x(i + 1). The amortized cost A of an operation is the following.
The we claim the following.
Proof for the claim.
Ai = Ti − Φ(xi) + Φ(xi+1)
nn
Ti = Ai + Φ(x1) − Φ(xn+1)
i=1 i=1
Ai = Ti − Φ(xi) + Φ(xi+1) = Ti − (Φ(xi) − Φ(xi+1))
potential difference
A1 = T1 − Φ(x1) + Φ(x2)
A2 = T2 − Φ(x2) + Φ(x3)
…
An = Tn − Φ(xn) + Φ(xn+1)
nn
Ai =Ai +Φ(x1)−Φ(xn+1)
i=1
total amortized
i=1
1.1 Design the Potential Function
Idea. First make sure Φ(xi) − Φ(xi+1) is large.
If i = 2k +1 is a heavy operation. (Ti = 2k+1). Φ is a function of number of elements and capacity, we denote them as l and c
Lecture 18: Amortized Analysis of Disjoint Sets-1
lc xi: 2k 2k
↑
before 2k + 1 add operation, data structure has 2k elements xi+1: 2k +1 2k+1
difference ∆: 1 2k
I want to pay for the cost of Ti = 2k+1 using the difference in number of elements and number of capacity. Since the difference in capacity is much larger than the difference in number of elements, I want to use the difference in capacity. This imply the following idea: if capacity increase by 2k, potential Φ decreases by 2k+1. Then the potential function will look like
Φ= ? −2·c
step2. We need to make sure Φ ≥ 0. We first make the following observation:
l · 2 ≥ c (exceptf ortheinitialstate) 1 + 2 · l ≥ c (includetheinitialstate)
So we can use Φ = 2 + 4l − 2c and we know Φ ≥ 0. compute amortized running time.
1 heavyoperationi=2k+1,Ti=2k+1
Φ(xi) = 2 + 4(2k) − 2(2k)
Φ(xx+1) = 2+4(2k +1)−2(2k+1) Φ(xi)−Φ(xx+1) = 2k+1 −4
Ai =Ti −(Φ(xi)−Φ(xi+1))=2k+1 −(2k+1 −4)=4
2 lightoperationi̸=2k+1,Ti=1
xi: i−1 c 2+4(i−1)−2c
xi+1: i c 2+4i−2c Φ(xi) − Φ(xi+1) = −4
Ai =Ti −(Φ(xi)−Φ(xi+1))=1−(−4)=5 amortized cost = 5 = O(1)
2 Union-Find
2.1 Data Structure for Disjoint Sets
Problem.There are n elements, each in a separate set. Want to build a data structure that supports two operations:
Find.Given an element, find the set it is in. (Each set is identified by a single head element in the set)
Merge.Given two elements, merge the sets that they are in.
Lecture 18: Amortized Analysis of Disjoint Sets-2
lcΦ
2.2 Kruskal’s Algorithm
For each edge we want to know whether adding the edge creates a cycle. set ⇐⇒ connected component in graph
Initially all vertices are separated, the tree has no edges. Each vertex in separated connected component.
add an edge ⇐⇒ F ind(u) = F ind(v) (u,v in the same set) edge(u,v) creates a cycle ⇐⇒ union on two connected components
edge 1 edge 2 edge 3 edge 4 edge 5
: F ind(1) = 1, F ind(2) = 2, all sets = {{1}, {2}, {3}, {4}, {5} U nion(1, 2), all sets = {1, 2}, {3}, {4}, {5}
: Find(3) = 3,Find(4) = 4
U nion(3, 4), all sets = {1, 2}, {3, 4}, {5}
: Find(1) = 1,Find(3) = 3
U nion(1, 3), all sets = {1, 2, 3, 4}, {5}
: Find(2) = 3,Find(4) = 1 do not add the edge
: Find(2) = 1,Find(5) = 5 Union(1,5),all sets = {1,2,3,4,5}
2.3
2.3.1
Implementation of Union Find
Na ̈ıve Implementation
Using Linked Lists.
• Merge: connects two linked lists, O(1)
• Find: needs to find the head/tail of the list, O(length)
Using Array.
• a[i]: the set that item i is in
• Find: O(1)
• Merge(i,j): needs to update many entries in the array, O(length)
Lecture 18: Amortized Analysis of Disjoint Sets-3
2.3.2 Representing Sets using Trees
• For each element, think of it as a node.
• Each subset is a tree, and the head element is the root. • Find: find the root of the tree
• Merge: merge two trees into a single tree.
Example.
• Sets1,3,2,5,6,8,4,7
• Note: not necessarily binary trees.
Find Operation. Follow pointer to the parent until reach the root.
Merge Operation. Make the root of one set as a child for another set Running Time.
Figure 1: {4} merge with {2,8,6,2}
• Find: Depth of the tree.
• Merge: First need to do two find operation, then spend O(1) to link the two trees. • In the worst-case, the tree is just a linked list
• Depth = n.
Lecture 18: Amortized Analysis of Disjoint Sets-4
2.3.3 Union by Rank
• For each root, also store a rank
• When merging two sets, always use the set with higher rank as the new root.
• If two sets have the same rank, increase the rank of the new root after merging.
If we use union-by-rank, then the depth of the tree can be bounded by the following lemma:
Lemma 1. If the root of a tree has rank k, then the tree contains at least 2k vertices.
Proof. This can be proved by induction. Obviously this is true when k = 0.
Suppose this is true for all k ≤ t. Now consider a tree whose root has rank t + 1. The only way a node of rank t + 1 can be created is if two trees of rank t are merged. By induction hypothesis,
both trees have at least 2t nodes, so the merged tree has at least 2t+1 nodes.
The rank is also an upper-bound on the depth of the tree (when we are using only union-by- rank, rank is exactly equal to the depth). As an immediate corollary, the maximum rank/depth of a tree is bounded by log2 n.
Path Compression.
• After a find operation, connect everything along the way directly to the root.
Running Time.
• Union by rank only
– Depth is always bounded by log n
– O(log n) worst-case, O(log n) amortized
Lecture 18: Amortized Analysis of Disjoint Sets-5
• Union by rank + path compression – Worst case is still O(log n)
– Amortized: O(α(n)) = o(log∗n).
Here α is the inverse Ackermann function, which is a function that grows really slowly. The function log∗(n) is defined to be the number of log2 operations that it takes to make n a constant (say 1), which also grows very slowly. For all practical purposes, α(n) ≤ 5 and the data structure has essentially a constant amortized running time.
Lecture 18: Amortized Analysis of Disjoint Sets-6