Graph Traversals
Based on slides by David Kauchak
Tree BFS
QUEUE USE
A ←
Running time of Tree BFS
Adjacency list
• How many times does it visit each vertex? • How many times is each edge traversed? • O(|V|+|E|)
Adjacency matrix
• For each vertex visited, how much work is done? • O(|V|2)
BFS Recursively
Hard to do!
BFS for graphs
What needs to change for graphs?
Need to make sure we don’t visit a node multiple times
A DE
FG
C B
distance variable keeps track of how far from the starting node and whether we’ve seen the node yet
C B
A DE
FG
prey -43=141L . I
prev-43← u C
B DE
A
FG
A DE
FG
set all nodes as unseen
C B
A DE
FG
C B
check if the node has been seen
C• B
set the node as seen and record distance
↳
Oo A
DE
-x
FG
-e
B
DE
C
FG
A
B
DE
C
0
A
FG
Q:A
B
DE
C
0 A
FG
Q:
1
B
11
DE
C
0 A
FG
Q: D, E, B
1 B
11 DE
C
0 A
FG
Q: E, B
1 B
11 DE
C
0 A
FG
Q: B
1 B
11 DE
C
0 A
FG
Q: B
1 B
11 DE
C
0 A
FG
Q:
1 B
11 DE
C
Q:
0 A
FG
Q: F, C
0 A
1A ,
2 l BB C
21B FG
1IA
B DE
1, A
Q
:
C.
G
1it B
2iB C
0 A
1 ( A DE
1 GA
2iB3cF FG
Q
:
G
1 B
11 DE
2 C
23 FG
0 A
Q :
empty
Genie are
dove) .
1 B
11 DE
2 C
23 FG
0 A
Is BFS correct?
Does it visit all nodes reachable from the starting node? Can you prove it?
Assume we “miss” some node ‘u’, i.e. a path exists, but we don’t visit ‘u’
S…U
Is BFS correct?
Does it visit all nodes reachable from the starting node? Can you prove it?
Find the last node along the path to ‘u’ that was visited
S…ZW…U
why do we know that such a node exists?
Is BFS correct?
Does it visit all nodes reachable from the starting node? Can you prove it?
We visited ‘z’ but not ‘w’, which is a contradiction, given the pseudocode
S… …U
contradiction
ZW
Is BFS correct?
Does it correctly label each node with the shortest distance from the starting node?
Is BFS correct?
Does it correctly label each node with the shortest distance from the starting node?
Assume the algorithm labels a node with a longer distance. Call that node ‘u’
S…U
Is BFS correct?
Does it correctly label each node with the shortest distance from the starting node?
Find the last node in the path with the correct distance
S…ZW…U
correct incorrect
Is BFS correct?
Does it correctly label each node with the shortest distance from the starting node?
Find the last node in the path with the correct distance
S… …U
contradiction
ZW
Runtime of BFS
Nothing changed over our analysis of TreeBFS
Runtime of BFS
Adjacency list: O(|V| + |E|) Adjacency matrix: O(|V|2)
¥• #
⇒
nodeisengueuedaud
E a c h
011111 ) The for loop has
TOTAL
degueued
once
:
TOTAL for all u
&) deglu ) iter
RUNTIME : G)+ (*as)
: Ideg(u) u can
–
.
=L) IE
1(Nlt (El)
Depth First Search (DFS)
–
S
:
stack
Depth First Search (DFS)
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
Tree DFS
A BDE
CFG
DFS on graphs
1VO o¥dor oho
DFS from vertex U →
←
w
w ill
visit allthe nodes reachable from u
DFS on graphs
mark all nodes as not visited
DFS on graphs
until all nodes have been visited repeatedly call DFS-Visit
RUNTIME : DFS on graphs
Of
h ilt
IEI
)
What happened to the stack?
←
deg
‘
r.
What does DFS do?
Finds connected components
Each call to DFS-Visit from DFS starts exploring a new set of connected components Helps us understand the structure/connectedness of a graph
Is DFS correct?
Does DFS visit all of the nodes in a graph?
Running time?
Like BFS
• Visits each node exactly once
• Processes each edge exactly twice (for an undirected graph) • O(|V|+|E|)
DFS with timing (useful for directed graphs)
Each vertex u receives tim e ) u.d = the time it is discovered C discovery
u.f = the time its exploration is finished
(finish time)
*4 44
Textbook,page60#
DFS with timing
Following a DFS with timing we can classify the edges of G into:
Tree edges: (u,v) is a tree edge if v was first discovered by exploring edge (u,v)
Forward edges: (u,v) is a forward edge if v is a descendant of u in the DFS tree A tree edge or a forward edge (u,v) satisfies: u.d < v.d < v.f < u.f
Back edges: (u,v) is a back edge if v is an ancestor of u in the DFS tree A back edge (u,v) satisfies: v.d < u.d < u.f < v.f
Cross edges: all other edges
[u.d, u.f] and [v.d, v.f] do not overlap
5F- ¥O←¥
G
DFS. VISIT (GA) 4 tree
• , 16 ¥-2,1
¥0. "
g
go
graph ←
•* ¥eE¥
¥05,6
12,15
5 ¥I±X5
€8←¥ •no* 8: /
- →
-0A
5,6
PR-OPERT.tn:
•
tree
edge forward edge
back edge
→
Howtoclassifyedge (un):
Eu
o r
edge
edge
cross
edge
]
I, I
tree
forward
edge .
?
?
→
→ back edge
E?I}
→ G has a cycle⇒ G hasa back
cross
.
?
1 916 •
4→ 4
"is ,"
¥s
5c-5→
o←¥
4
4
*. ¥÷
¥+0 5
.
*.
↳4 ¥
DAGs
Can represent dependency graphs
underwear
socks
shoes shirt
pants
belt tie
watch
jacket
Topological sort
A linear ordering of all the vertices such that for all edges (u,v) E, u appears before v in the ordering
An ordering of the nodes that “obeys” the dependencies, i.e. an activity can’t happen until it’s dependent activities have happened
1→ 4 watch
underwear pants
shirt belt tie
socks shoes
jacket
underwear
socks
shoes shirt
pants
belt tie
watch
jacket
Topological sort
Topological sort
underwear
socks
shoes shirt
pants
belt tie
watch
jacket
Topological sort
underwear
socks
shoes shirt
pants
belt tie
watch
jacket
Topological sort
underwear
socks
shoes shirt
pants
belt tie
jacket
watch
Topological sort
underwear
pants
socks
shoes shirt
belt tie jacket
watch
Topological sort
underwear
pants
socks
shoes shirt
belt tie jacket
watch
Topological sort
underwear pants
shirt
belt tie jacket
socks shoes
watch
Topological sort
underwear pants
shirt
belt tie jacket
socks shoes
watch
Topological sort
socks shoes
watch
underwear pants
shirt
...
belt tie jacket
Running time?
( Assuming adg .
lis t
representations)
Running time?
O(|V|+|E|)
Running time?
O(E) overall
Running time?
How many calls?
|V|
Running time?
Overall running time?
O(|V|2+|V| |E|)
Can we do better?
Topological sort 2
Topological sort 2
Topological sort 2
Topological sort 2
Running time?
How many times do we process each node? How many times do we process each edge?
O(|V| + |E|)
Connectedness
Given an undirected graph, for every node u V, can we reach all other nodes in the graph?
Run BFS or DFS-Visit (one pass) and mark nodes as we visit them. If we visit all nodes, return true, otherwise false.
Running time: O(|V| + |E|)
Strongly connected
Given a directed graph, can we reach any node v from any other node u?
\visit Iu ) DFS -
Ideas?
&
!{ I
Nodes
Do
→
all the
4
reachable
trance .
Transpose of a graph
Given a graph G, we can calculate the transpose of a graph GR by reversing the direction of all the edges
G A
GR A
B
D
B
D
E C
E C
Running time to calculate GR?
O(|V| + |E|)
Strongly connected
A4 ¥
DFS- VISIT in G
"⇒ 'i''II 4→⇒--- 1→.- - →4
Is it correct?
What do we know after the first pass?
• Starting at u, we can reach every node
What do we know after the second pass?
• All nodes can reach u. Why?
• WecangetfromutoeverynodeinGR,therefore,ifwereversethe edges (i.e. G), then we have a path from every node to u
Which means that any node can reach any other node. Given any two nodes s and t we can create a path through u
s...u...t
Runtime?
O(|V| + |E|)
O(|V| + |E|) O(|V|)
O(|V| + |E|) O(|V| + |E|) O(|V|)
Detecting cycles
Undirected graph
• BFS or DFS. If we reach a node we’ve seen already, then we’ve found a cycle
Directed graph
A
B
D
have to be careful
Detecting cycles
Undirected graph
• BFS or DFS. If we reach a node we’ve seen already, then we’ve found a cycle
Directed graph
• Call TopologicalSort
• If the length of the list returned ≠ |V| then a cycle exists
• Or do DFS with timing: the graph has cycles if and only if it has back edges
ALGORITHM
T
.- - - n. it
" -
-
r
-
-b4, 3I
f.6→ 4
.
.
,
I 5← <
e-4 c
-
-
s -
-
-
-
meta -
FOR
FINDING
STRONGLY
C ota PONENTS
The 7-•B-•
some
SINK
graph
;/↳g¥
L
-
SINK
CONNECTED
-Observation 1
:
If u isin a sink component, if
Observation.
in a
observation In the
A-LGORITHM
component
-
-
; node v me sink of G .
highest
(version 1) DFS x.timing for
GR
The
we DFS (G , u ) , it terminates after
visiting Component
nodes in the strong
source
a ll the
o f
node u with highest finishtime must be
u
reversegraphGR:sourceofGR- ssinkofG
finish time DFS (G, u) : finds thestrong. componentofu;-Ne eliminate
Find next ALGORITHM (final
modes . '
STEP
1
.
x . timing o n
Run DISone G. Processtheverticesin
STEP 2.
is in a
a ll these
.
finish
modev me version ) Run DFS
highest
, DFSCG,u),etc GR .
decreasing
STEP
order of their finish time from
1.
.
.
←
DFSxx. timing.
.