CS 112: Data Structures
Sesh Venugopal
Topological Sorting of Graphs Using DFS or BFS
Precedence Graph
A precedence graph is a Directed Acyclic Graph (DAG), i.e. a directed graph that does not have any cycles
BBB AAA
CCC
acyclic has cycle has cycle
An edge xày means that x precedes y.
Precedence graphs are used in practice for task scheduling in various
domains such as project management, computational graphs, etc.
Each vertex is a task, and a precedence xày means that task y can only start after task x is completed.
Sesh Venugopal CS112: Topological Sorting 2
Sesh Venugopal
Precedence Graph
When there are a bunch of tasks to be done, and there are precedence constraints among them, we need to know in what sequence to do the tasks without violating any of the precedence .
BBBE AAA
CCCD
CS112: Topological Sorting
Possible sequences:
Possible sequences:
– A, B, C – A, C, B
Possible sequences:
– A,B,C,D,E
– A,B,C,E,D
– A,B,E,C,D
– A,C,B,D,E
– A,C,B,E,D
– A,C,D,B,E
–
A, B, C
B A
C
Because this graph has a cycle, there is no way to sequence the vertices without violating a precedence constraint.
has cycle
3
Precedence Graph
To sequence the vertices, we need to assign numbers to vertices – the number is the position of the vertex in the sequence
1113
BBBE 0A0A0A
CCCD
The process of assigning sequence numbers to vertices is called topological sorting, and the sequence numbers are called topological numbers.
2
B
A
C
1
4 14
222
0
0
BE A
CD
23
Sesh Venugopal
CS112: Topological Sorting
4
Plus other possibilities
Topological Sorting using DFS
Sesh Venugopal CS112: Topological Sorting 5
DFS Topsort
In order to assign topological numbers, we need to navigate the topology of the graph to detect the precedence constraints, so that if xày is an edge (a precedence that constrains x to happen before y), the topological number for x is less than that for y.
Could we use DFS to do the topological numbering, assigning numbers from 0..n-1 as we visit the vertices?
0A
Start at A Visit A (0) Visit B (1) Visit C(2)
✓
1
B C
2
✗
Assigning numbers in ascending order does not work. We wouldn’t know to pick B instead of C to visit next, because the next vertex is chosen arbitrarily, depending on storage in adjacency linked lists. And if C is the first neighbor, then we’re stuck. (Also, what if we start at C first?)
Sesh Venugopal
CS112: Topological Sorting 6
Start at A Visit A (0) Visit C (1) Back up to A Visit B (2)
DFS Topsort
The good news is that we can still use DFS, with a clever trick:
– –
Number the vertices NOT when you visit, but when it is a dead end (options exhausted, about to back up), and
Assign numbers in descending order: n-1 for the first number,
n-2 for the second number, etc.
Start at A 1 Visit A
Start at A Visit A
Visit C
Dead end
C: 2
Back up to A Visit B
Dead end
B: 1
Back up to A Dead end
A: 0
Start at C Visit C Dead end C: 2 Restart at B Visit B Dead end B: 1
Restart at A Visit A Dead end A: 0
B 0A
C
2
Visit B
Visit C
Dead end
C: 2 BackuptoB Dead end B:1
Back up to A Dead end
A: 0
Sesh Venugopal
CS112: Topological Sorting
7
Sesh Venugopal
DFS Topsort
The good news is that we can still use DFS, with a clever trick:
CS112: Topological Sorting
– –
Number the vertices NOT when you visit, but when it is a dead end (options exhausted, about to back up), and
Assign numbers in descending order: n-1 for the first number,
n-2 for the second number, etc.
B A
C
Start at A Visit A Visit B Dead end B: 2
Back up to A Visit C
Dead end
C: 1
Back up to A Dead end
A: 0
2
Start at A Visit A
Visit C
Dead end
C: 2
Back up to A Visit B
Dead end
B: 1
Back up to A Dead end
A: 0
Start at C Visit C Dead end C: 2 Restart at B Visit B Dead end B: 1
Restart at A Visit A Dead end A: 0
Start at B Visit B Dead end B:2 Restart at A Visit A
Visit C
Dead end
C: 1
Back up to A Dead end
A: 0
112
B 0A0A0A0A
BBB
C
1
CCC
221
8
DFS Topsort
Start at A Visit A
Visit I
Dead end
I: 11
Back up to A Dead end
A: 10
Restart at C Visit C
Visit E
Visit F
Visit J
Dead end
J: 9
Back up to F Dead end
F: 8
Vertices are numbered 1..11, but actually BackuptoE
they should be 0..10
Visit G
Visit H
Dead end
H: 7
Back up to G Dead end
G: 6
Back up to E Dead end
E: 5
Back up to C Dead end
C: 4
Restart at D
CS112: Topological Sorting …9
Sesh Venugopal
DFS Topsort Implementation
// recursive dfs
private void dfs(int v, boolean[] visited) {
visited[v] = true;
for (Neighbor e=adjLists[v].adjList; e != null; e=e.next) {
if (!visited[e.vertexNum]) {
dfs(e.vertexNum, visited);
} }
}
Modification 1: Need to accept as parameters the current highest topological number, and an array in which topological sequence can be written in
// recursive dfs
private void dfs(int v, boolean[] visited, int num, int[] topSequence) {
visited[v] = true;
for (Neighbor e=adjLists[v].adjList; e != null; e=e.next) {
if (!visited[e.vertexNum]) {
dfs(e.vertexNum, visited, num, topSequence);
} }
}
Sesh Venugopal
CS112: Topological Sorting 10
Sesh Venugopal CS112: Topological Sorting
DFS Topsort Implementation
Modification 2: Need to assign topological number when at a dead end (just about to back up), and decrement the topological number in preparation for assigning to the next vertex
// recursive dfs
private void dfs(int v, boolean[] visited, int num, int[] topSequence) {
visited[v] = true;
for (Neighbor e=adjLists[v].adjList; e != null; e=e.next) {
if (!visited[e.vertexNum]) {
dfs(e.vertexNum, visited, num, topSequence);
} }
topSequence[num] = v; // slot vertex in its sequence position
}
num–;
34
BE
topSequence
A
C
D
B
E
0
A 01234
CD
12
11
Sesh Venugopal CS112: Topological Sorting
DFS Topsort Implementation
Modification 3: But num is a local variable, so when it is decremented, the new value is not going to be carried through to subsequent calls – so we need to return it from the method:
// recursive dfs
private int dfs(int v, boolean[] visited, int num, int[] topSequence) {
visited[v] = true;
for (Neighbor e=adjLists[v].adjList; e != null; e=e.next) {
if (!visited[e.vertexNum]) {
num = dfs(e.vertexNum, visited, num, topSequence);
} }
topSequence[num] = v; // slot vertex in its sequence position
return num-1;
}
Start at A Visit A Visit B Visit E Dead end E: 4
Back up to B Dead end
B: 3
Back up to A Visit C
Visit D
Dead end
D: 2
Back up to C Dead end
C: 1
Back up to A Dead end
A: 0
34
num=4 à num=4 à num=4 à
num=3 ß
num=2 ß num=2 à num=2 à
num=1 ß num=0 ß
0
BE A
CD
12
12
Sesh Venugopal CS112: Topological Sorting
DFS Topsort Implementation
Modification 4: Driver needs to start with the highest number, and also restart with current highest number. It also needs to set up an array for the topological sequence
// driver (this is the method called from any application
public void dfs() {
boolean[] visited = new boolean[adjLists.length];
int[] topSequence = new int[adjLists.length];
int num = adjLists.length-1;
for (int v=0; v < visited.length; v++) {
if (!visited[v]) { // start/restart at v
num = dfs(v, visited, num, topSequence);
} }
}
num=4 à num=4 à
num=3 ß num=2 à
num=2 à num=2 à num=2 à
num=1 ß num=0 ß
Start at B Visit B Visit E Dead end E: 4
Back up to B Dead end
B: 3
Return to driver Restart at A Visit C
Visit D
Dead end
D: 2
Back up to C Dead end
C: 1
Back up to A Dead end
A: 0
34
0
BE A
CD
12
13
DFS Topsort Big O Running Time
Basic DFS running time is O(n+e)
What additional work is done for the topological numbering?
O(n) Recursive dfs
- Assign each vertex to a slot in topSequence array: O(1)
O(1) Driver
- Initialize num
Total: O(n+e)
Sesh Venugopal CS112: Topological Sorting 14
Topological Sorting using BFS
Sesh Venugopal CS112: Topological Sorting 15
BFS Topsort
Since BFS is not recursive, and once a vertex is visited, there is no backing up to it, topological numbers need to be assigned in increasing order 0..n-1
This means we MUST start with vertices that have no incoming edges, i.e. indegree=0 (indegree is number of edges coming in/pointing at a vertex, outdegree is number of edges going out of a vertex)
So before we even start BFS, we need to go through the graph and compute indegree for each vertex
112
B D CF
3 0E
G indegrees ABCDEFG
0
1
1
1
0
3
2
0A
1
Sesh Venugopal
CS112: Topological Sorting 16
BFS Topsort
After the indegrees are computed, we scan the indegrees array.
For each vertex that has indegree=0, we assign it the next higher topological number, and enqueue it
0
1
1
1
0
3
2
112
indegrees
ABCDEFG
0
B A
DG CF
1 0E
3
queue
AE
topSequence
0123456
A
E
-
-
-
-
-
Sesh Venugopal
CS112: Topological Sorting 17
BFS Topsort
Then we run a loop as long as queue is not empty.
In each iteration of the loop, we dequeue a vertex, v.
For each neighbor of v, we decrement the indegree count.
If the count goes to 0, we assign it the next higher topological number, and enqueue it.
Iteration 1
queue
ßAE indegrees
ABCDEFG
topSequence
0123456
queue
EBC
CS112: Topological Sorting 18
112
BDG CF
3 0E
0
0
0
1
0
3
2
0A
1
A
E
B
C
-
-
-
Sesh Venugopal
BFS Topsort
Iteration 2
queue
ßEBC indegrees
ABCDEFG
topSequence
0123456
queue
BC
112
BDG
F
3
0
0
0
1
0
2
2
0A
C
1 0E
A
E
B
C
-
-
-
Sesh Venugopal
CS112: Topological Sorting 19
BFS Topsort
Iteration 3
queue
ßBC indegrees
ABCDEFG
topSequence
0123456
queue
CD
112
BDG
F
3
0
0
0
0
0
2
2
0A
C
1 0E
A
E
B
C
D
-
-
Sesh Venugopal
CS112: Topological Sorting 20
BFS Topsort
Iteration 4
queue
ßCD indegrees
ABCDEFG
topSequence
0123456
queue
D
112
BDG
F
3
0
0
0
0
0
1
2
0A
C
1 0E
A
E
B
C
D
-
-
Sesh Venugopal
CS112: Topological Sorting 21
BFS Topsort
Iteration 5
queue
ßD indegrees
ABCDEFG
topSequence
0123456
queue
F
112
BDG
F
3
0
0
0
0
0
0
1
0A
C
1 0E
A
E
B
C
D
F
-
Sesh Venugopal
CS112: Topological Sorting 22
BFS Topsort
Iteration 6
queue
ßF indegrees
ABCDEFG
topSequence
0123456
queue
G
112
BDG
F
3
0
0
0
0
0
0
0
0A
C
1 0E
A
E
B
C
D
F
G
Sesh Venugopal
CS112: Topological Sorting 23
BFS Topsort
Iteration 7
queue
ßG indegrees
ABCDEFG
topSequence
0123456
queue
112
BDG
F
3
0
0
0
0
0
0
0
0A
C
1 0E
A
E
B
C
D
F
G
Sesh Venugopal
CS112: Topological Sorting 24
BFS Topsort
Sesh Venugopal
CS112: Topological Sorting
25
bfsTopsort()
compute indegrees of all vertices topnum ß 0
for each vertex, v, do
if indegrees[v] == 0 then topSequence[topnum] = v topnum++
enqueue(v)
endif endfor
while queue is not empty do
v ß dequeue()
for each neighbor, w, of v do
indegrees[v]--
if indegrees[v] == 0 then
topSequence[topnum] = v topnum++
enqueue(v)
endif endfor
endwhile
Since we start with all vertices that don’t have any incoming edges, all other vertices should be reachable from these initial vertices. Which means a driver is not needed.
BFS Topsort Big O Running Time
Basic BFS running time is O(n+e)
What additional work is done for the topological numbering?
O(n+e) Indegrees computation
- Go through entire graph and count incoming edges for each vertex
O(n) BFS
- Assign each vertex to a slot in topSequence array: O(1)
- Decrement indegree Total: O(n+e)
Sesh Venugopal CS112: Topological Sorting 26