Algorithms & Data Structures (Winter 2022) Disjoint Sets
• Introduction. • Operations.
Copyright By PowCoder代写 加微信 powcoder
Introduction – Motivation
• You have a set of nodes (numbered 1-9) on a network. You are given a sequence of pairwise connections between them:
3-7 8-2 1-6 5-7 4-8 3-5
Q: Are nodes 2 and 4 (indirectly) connected?
Q: Are nodes 3 and 8 connected?
Q: Are any of the paired connections redundant? Q: How many sub-networks do we have?
Introduction – Motivation
• You have a set of nodes (numbered 1-9) on a network. You are given a sequence of pairwise connections between them:
3-7 8-2 1-6 5-7 4-8 3-5
Data Structure
{3-5-7} {2-4-8} {9} {1-6}
Disjoint sets
Q: Are nodes 2 and 4 (indirectly) connected?
Q: Are nodes 3 and 8 connected?
Q: Are any of the paired connections redundant? Q: How many sub-networks do we have?
Introduction – Motivation
Q: Are nodes 2 and 4 (indirectly) connected?
Q: Are nodes 3 and 8 connected?
Q: Are any of the paired connections redundant? Q: How many sub-networks do we have?
• These kind of questions arises in a huge number of areas. • Networks
• Transistor interconnects
• Compilers
• Image segmentation
• Graph problems (upcoming topic) • Etc
Introduction – Partition
Generalization: Set of object partitioned into disjoint subsets.
10 11 6 13
& S ≠Ø∀i∈{1,…,n} S=S∪S∪…∪S(‘ i
1 2 n () S i ∩ S j = Ø i f f i ≠ j
Introduction – Graph example
Connected component: Set of nodes connected by a path.
Question: Given 2 nodes A & B, are they in the same component?
Introduction – Equivalence relation
A relation that is:
∀a ∈ S, (a, a) ∈ R
• Symmetric∀a,b∈S,(a,b)∈R⇒(b,a)∈R
• Transitive ∀a,b,c∈S,(a,b)∈Rand(b,c)∈R⇒(a,c)∈R Example:
For any undirected graph, the connections define an equivalence relation on vertices.
• ForalluÎV,thereisapathoflength0fromutou.
• Forallu,vÎV,Thereisapathfromutov,iffthereisa
path from v to u.
• Forallu,v,wÎV,ifthereisapathfromutovandapath
from v to w, then there is a path from u to w.
• Reflexive
Introduction – Equivalence relation
Taken from Langer2014
Introduction – Equivalence relation
An equivalence relation is the same as a partition: Each equivalence relation provides a partition of the underlying set into disjoint equivalent classes (Wikipedia).
2 3 4 5 6 7
i is equivalent to j if they belong to the same set.
(more constrained that general relation)
1000011 0110000 0110000 0001000 0000100 1000011 1000011
Introduction – Disjoint set ADT
i is equivalent to j if they belong to the same set.
Each set in the partition has a unique name (a representative member for convenience).
• find(i) returns the representative of the set that contains i.
• sameset(i,j) returns the boolean value find(i)==find(j)
• union(i,j) merges the sets containing i and j.
Introduction – Operations
• Maintain disjoint sets:
• {3,5,7},{4,2,8},{9},{1,6}
• Each set has a representative: • {3,5,7},{4,2,8},{9},{1,6}
• Find(x) returns the representative of the set containing x • Find(1) = 5
• Find(4) = 8
• Union(x,y) takes the union of the two sets that contains x and y • Union(5,1) = {3, 5, 7, 1, 6}, {4, 2, 8}, {9}
• Union(9,6) = {3, 5, 7}, {4, 2, 8}, {9, 1, 6}
Introduction – Operations
• Maintain disjoint sets:
• {3,5,7},{4,2,8},{9},{1,6}
• Each set has a representative: • {3,5,7},{4,2,8},{9},{1,6}
• Q: Are nodes 2 and 4 (indirectly) connected? • Find(2) == Find(4)
• Q: Are nodes 3 and 8 connected? • Find(3) == Find(8)
• Q: How many sub-networks do we have? • Number of representatives.
• Q: Are any of the paired connections redundant?
• Connections – (|S1-1| + |S2-1| + |S3-1| + |S4-1)| = 6 – (2+2+0+1) = 1 13
• Introduction. • Operations.
Operations – find
Let Rep[i] Î { 1, 2, … , n } be the representative of the set containing i. find(i) { return rep[i]; }
Operations – union
• union(i,j) merges the sets containing i and j. Example: union(2,6)
Operations – union
union(i,j) {
if rep[i] != rep[j] {
prevrepi = rep[i];
for (k=1; k<=n; k++) {
if rep[k] == prevrepi {
rep[k] = rep[j];
store value of rep[i] because it may change during the execution of the algorithm.
O(n) running time... slow.
Union – forest & tree representation
• Represent the disjoint sets by a forest of rooted trees.
• Roots are the representative (i.e. find(i) == findroot(i)).
• Each node points to its parent.
The tree structure does not necessarily represent the relationship between the stored objects.
Union – table representation
5 8 2 9 6 10
• Non-root nodes hold index of their parent. • Root nodes store their own value. 19
Operations – Union
union(9,11) Root of the tree of 11 becomes the
parent of the root of the tree of 9.
Operations – Find & Union
if p[i] == i {
return find(p[i]);
union(i,j) {
if find(i) != find(j) {
Remark: Arbitrarily merge the set on i into the set of j.
p[find(i)] = find(j);
Operations – Union - > Worst case
union(1,2)
union(1,3)
union(1,4)
… union(1,n)
Then, find(1) is O(n)…
Union – Heuristic – Definitions
• The depth of a node is the number of edges from the node to the tree’s root node. A root node will have a depth of 0.
• The height of a node is the number of edges on the longest path from the node to a leaf. A leaf node will have a height of 0.
Image and definitions taken from: https://stackoverflow.com/
Union – Heuristic – by size
Heuristic to control the height to the trees after merging.
Idea: Merge tree with smaller number of nodes into the tree with the largest number of nodes (In practice, we can also use rank which is an upper bound on the height of nodes).
Union – Heuristic – by size
Claim: The depth of any node is at most log n.
• If union causes the depth of a node to increase, then this node must belong to the smallest tree.
• Thus, when the depth increases, the size of the (merged) tree containing this node will at least double.
• But we can double the size of a tree at most log n times.
Union – Heuristic – by height
Idea: Merge tree with smaller height into tree with larger height.
Claim: The height of trees obtained by union-by-height is at most log n.
Corollary: An union-by-height tree of height h has at least nh 3 2h nodes.
Proof (Corollary):
• Base case: a tree of height 0 has one node.
• Induction: (hypothesis) nh 3 2h. Show nh+1 3 2h+1.
Heuristic – supplemental material
The base case k=0 is easy, since a tree of height 0 always has just 1=2^0 node (the root). Suppose the claim is true for h=k. Now consider a union- by height tree of height k+1. There must have been a union that brought two trees together and increased the height of one of them from k to k+1. Let those two trees (at the time of that union) be T1 and T2. We know that both T1 and T2 were of height k before the union. [Why? If one of them were of height less than k, then union-by-height would have changed the root of that shorter one to make it point to the root of the taller one, and the height of the unioned tree would still be k. But its not; the unioned tree is of height k+1.] Now we can apply the induction hypothesis: the trees T1 and T2 each have at least 2^k nodes. Thus, the unioned tree has at least 2^k + 2^k = 2^(k+1) nodes.
Taken from Langer 2014
Union – running time
union(i,j)
Quick Union
find Union by size Union by height
Note: These are worst case complexities.
Quick union makes 2 calls to find.
union-find – heuristics – path compression
• Find path = nodes visited during the execution of find() on the trip to the root.
• Make all nodes on the find path direct children of root.
union-find – heuristics – path compression
if p[i] == i {
return find(p[i]);
p[i] = find(p[i]);
return p[i];
union-find – running time
• Use union by size and path compression.
• m union or find operations take O( m α(n) ). What is α(n) ?
n α(n) 0-2 0 31 4-7 2 8 – 2047 3
2048 – A4(1) 4 Where A4(1) >> 1080 !!
union-find – running time
• Use union by size and path compression.
• Huge Practical problem.
• 1010 edges connecting 109 nodes.
• The heuristics reduces time from 3.000 years to 1 minute
• Supercomputer wont help much
• Good algorithm makes solution possible
• Good algorithms makes it possible to solve problems that could not otherwise be addressed.
• Introduction. • Operations.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com