COMP3027: Algorithm Design
Lecture 2: Graphs
William Umboh
School of Computer Science
The University of Sydney
1
Lecture 2: Graphs
– Definitions
– Representations
– Breadth First Search
– Depth First Search
– Applications
The University of Sydney
2
Basic Definitions and Applications
The University of Sydney 3
Undirected Graphs G=(V,E)
– V = {nodes (or vertices)}
– E = {edges between pairs of nodes}
– Captures pairwise (symmetric) relationship between objects – Graph size parameters: n = |V|, m = |E|
The University of Sydney
4
V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { (1,2), (1,3), (2,3), (2,4), (2,5), (3,5), (3,7), (3,8),
(4,5), (5,6), (7,8) }
n=8
m = 11
Directed Graphs G=(V,E)
– Edge (u, v) goes from node u to v (sometimes called an arc) – Captures asymmetric relationship between objects
The University of Sydney 5
World Wide Web
– Node: web page.
– Edge: hyperlink from one page to another.
cnn.com
netscape.com
timewarner.com
hbo.com
sorpranos.com
novell.com
cnnsi.com
The University of Sydney
6
Some Graph Applications
Graph
Nodes
Edges
transportation
street intersections
highways
communication
computers
fiber optic cables
social
people
relationships
World Wide Web
web pages
hyperlinks
food web
species
predator-prey
software systems
functions
function calls
scheduling
tasks
precedence constraints
circuits
gates
wires
Undirected
Directed
The University of Sydney 7
The University of Sydney
8
Graph Representations
Graph Representation: Adjacency Matrix
Adjacency matrix. n-by-n matrix with Auv = 1 if (u, v) is an edge.
– Two representations of each edge (undirected graph). – Space proportional to n2.
– –
Checking if (u, v) is an edge takes Θ(1) time. Identifying all edges takes Θ(n2) time even if m << n.
12345678
1 2 3 4 5 6 7 8
01100000 10111000 11001011 01001000 01110100 00001000 00100001 00100010
The University of Sydney 9
Graph Representation: Adjacency List
Adjacency list. Node indexed array of lists.
– Two representations of each edge (undirected graph).
– Space proportional to m + n. degree = number of neighbors of u – Checking if (u, v) is an edge takes O(deg(u)) time.
–
Identifying all edges takes Θ(m + n) time.
1
2 3 4 5 6 7 8
2
1
1
2
2
5
2
3
4
6
5
3
8
3
7
The University of Sydney
10
3
3
4
5
5
7
8
Paths and Connectivity
Definition: A path in an undirected graph G = (V, E) is a sequence
P of nodes v1, v2, ..., vk-1, vk with the property that each consecutive pair vi, vi+1 is joined by an edge in E.
Definition: A path is simple if all nodes are distinct.
Definition: An undirected graph is connected if for every pair of
nodes u and v, there is a path between u and v.
The University of Sydney 11
Paths and Connectivity
Definition: A path in an undirected graph G = (V, E) is a sequence
P of nodes v1, v2, ..., vk-1, vk with the property that each consecutive pair vi, vi+1 is joined by an edge in E.
Example: 1,2,4,5,6 1,2,4,5,2,3
The University of Sydney
12
Paths and Connectivity
Definition: A path in an undirected graph G = (V, E) is a sequence
P of nodes v1, v2, ..., vk-1, vk with the property that each consecutive pair vi, vi+1 is joined by an edge in E.
Definition: A path is simple if all nodes are distinct. 1,2,4,5,6 1,2,4,5,2,3
The University of Sydney
simple not simple 13
Paths and Connectivity
Definition: A path in an undirected graph G = (V, E) is a sequence
P of nodes v1, v2, ..., vk-1, vk with the property that each consecutive pair vi, vi+1 is joined by an edge in E.
Definition: A path is simple if all nodes are distinct.
Definition: An undirected graph is connected if for every pair of
nodes u and v, there is a path between u and v.
The University of Sydney
14
Connected
Paths and Connectivity
Definition: A path in an undirected graph G = (V, E) is a sequence
P of nodes v1, v2, ..., vk-1, vk with the property that each consecutive pair vi, vi+1 is joined by an edge in E.
Definition: A path is simple if all nodes are distinct.
Definition: An undirected graph is connected if for every pair of
nodes u and v, there is a path between u and v.
The University of Sydney 15
Not connected (or disconnected)
Cycles
Definition: A cycle is a path v1, v2, ..., vk-1, vk in which v1 = vk,
k > 2, and the first k-1 nodes are all distinct.
The University of Sydney
16
cycle C = 1-2-4-5-3-1
Can a simple path contain a cycle? NO
Definition: A path is simple if all nodes are distinct. Definition: A cycle is a path v1, v2, …, vk-1, vk in which v1 = vk,
k > 2, and the first k-1 nodes are all distinct.
The University of Sydney 17
Trees
Definition: An undirected graph is a tree if it is connected and
does not contain a cycle.
Number of edges in a tree?
Theorem: Let G be an undirected graph on n nodes. Any two of the following statements imply the third.
– G is connected.
– G does not contain a cycle. – G has n-1 edges.
Proof. Exercise.
The University of Sydney 18
Rooted Trees
Rooted tree. Given a tree T, choose a root node r and orient each edge away from r.
Importance. Models hierarchical structure.
root r
Notation:
u is parent of v
v is a child of u
r is an ancestor of v v is a descendant of r
u
v
The University of Sydney
19
a tree
the same tree, rooted at 1
The University of Sydney
20
Graph Traversal
Connectivity
s-t connectivity problem. Given two nodes s and t, is there a path between s and t?
Length of path = number of links along path
s-t shortest path problem. Given two nodes s and t, what is the
length of the shortest path between s and t?
Applications.
Many. For example: Fewest number of hops in a communication network.
The University of Sydney 21
Breadth First Search
BFS intuition. Explore outward from s in all possible directions, adding nodes one “layer” at a time.
s
L n-1
BFS algorithm.
L0 ={s}.
L1 = all neighbors of L0.
– – –
–
Alt. intuition. Flooding process starting at s
– Initially, s is flooded.
– In each iteration, every unflooded vertex
that neighbors a previously flooded verteLx
becomes flooded
L2 =allnodesthatdonotbelongtoL0 orL1,andthathaveanedgetoanodein L1.
Li+1 = all nodes that do not belong to an earlier layer, and that have an edge to a node in Li.
The University of Sydney
L3 22
L1
L2
L0 L1
2
Breadth First Search
BFS intuition. Explore outward from s in all possible directions, adding nodes one “layer” at a time.
BFS algorithm.
L0 ={s}.
L1 = all neighbors of L0.
– – –
–
Alt. intuition. Zombie process starting at s
– Initially, s is zombie.
– In each iteration, every human that neighbors a zombie becomes a zombie
The University of Sydney
s
L n-1
L1
L2 =allnodesthatdonotbelongtoL0 orL1,andthathaveanedgetoanodein L1.
Li+1 = all nodes that do not belong to an earlier layer, and that have an edge to a node in Li.
L2
L0 L1
L2
L3 23
Breadth First Search
BFS intuition. Explore outward from s in all possible directions, adding nodes one “layer” at a time.
s
L n-1
L1
BFS algorithm.
– – –
L0 ={s}.
L1 = all neighbors of L0.
L2 =allnodesthatdonotbelongtoL0 orL1,andthathaveanedgetoanodein L1.
Li+1 = all nodes that do not belong to an earlier layer, and that have an edge to a node in Li.
–
Theorem: For each i, Li consists of all nodes at distance exactly i
from s. There is a path from s to t if and only if t appears in some layer.
The University of Sydney 24
L2
def BFS(G,s)
s
L1 L2 Ln-1
layers = []
next_layer = [s]
“mark every vertex except s as not seen”
while “current layer not empty” do
layers.append(current_layer)
for every u in current_layer do
for every v in neighbourhood of u do
if “haven’t seen v yet” then
next_layer.append(v)
The University of Sydney
25
“mark v as seen”
current_layer = next_layer
next_layer = []
return layers
What if G is not connected?
Breadth First Search
BFS produces a tree T rooted at the start vertex on the set of nodes in G reachable from s.
The University of Sydney
Property. LetTbeaBFStreeofG=(V,E), and let (x, y) be an edge of G. Then the level of x and y differ by at most 1.
Proof. If x is level i and y not yet seen, then y is level i + 1.
L0 L1
L2 L3 26
Breadth First Search: Analysis
Theorem: The above implementation of BFS runs in O(m + n) time
if the graph is as an adjacency list.
Proof: Easy to prove O(n2) running time:
• at most n lists L[i]
• each node occurs on at most one list; for loop runs ≤ n times
• when we consider node u, there are ≤ n incident edges (u, v),
and we spend O(1) processing each edge – Actually runs in O(m + n) time:
• •
when we consider node u, there are deg(u) incident edges (u, v) total time processing edges is Σu∈V deg(u) = 2m
each edge (u, v) is counted exactly twice
in sum: once in deg(u) and once in deg(v)
The University of Sydney
27
def BFS(G,s)
layers = []
next_layer = [s]
“mark every vertex except s as not seen”
while “current layer not empty” do
layers.append(current_layer)
for every u in current_layer do
for every v in neighbourhood of u do
if “haven’t seen v yet” then
next_layer.append(v)
“mark v as seen”
current_layer = next_layer
next_layer = []
return layers
This takes O(|V|) time
This loop takes O(|N(u)|) time
The University of Sydney
28
Adding up over all u, we get O(Σu|N(u)|)=O(|E|)
BFS implementation
Complexity? Depends on graph representation
Traverse all neighbours of a node u:
– Adjacencylist:Θ(numberofneighbours)=Θ(|N(u)|) – Adjacencymatrix:Θ(n)
Check if u and v are connected by an edge:
– Adjacency list: Θ(number of neighbours) = Θ(|N(u)|) or Θ(|N(v)|)
– Adjacency matrix: Θ(1)
Space:
– Adjacency list: Θ(|V|+|E|)
– Adjacency matrix: Θ(|V|2)
The University of Sydney 29
Applications of BFS
– Connected component problem:
– Given G and s, output the set of vertices connected to s
– Shortest path problem:
– Given G and s, output the length of the shortest path between s
and all other nodes in V
– Run BFS from s
– dist(s,v) = i, where i is level v appears in
The University of Sydney 30
Connected Component
Find all nodes reachable from s.
Connected component containing node 1
= { 1, 2, 3, 4, 5, 6, 7, 8}
The University of Sydney
31
Transitive closure of a graph
The transitive closure graph of G is a graph G’:
– with the same vertices as G, and
– with an edge between all pairs of nodes that are connected by a path in G
input: abcd
output:
abcd
The University of Sydney
32
How to compute transitive closure?
Closure graph by BFS
Question: What is transitive closure of connected graph? Answer: A complete graph (contains every possible edge)
Question: What is transitive closure of disconnected graph? Answer: A union of complete graphs, one per conn. comp.
The University of Sydney 33
Closure graph by BFS
Algorithm:
1. Compute connected components
2. For each component, make a complete graph
The University of Sydney 34
DFS – Depth first search
Algorithm: Pick a starting vertex, follow outgoing edges that lead
to new vertices, and backtrack whenever “stuck”.
The University of Sydney 35
Algorithm DFS(G,u)
Input: graph G(V,E) and a vertex u in V
begin
mark u as visited
for each edge (u,v) in E do
if v has not been visited then
DFS(G,v)
end
The University of Sydney
36
What if G is not connected?
Algorithm DFS(G,s)
Input: graph G(V,E) and a vertex s in V
begin
initialise a stack S with node s
while S is not empty do
u=pop(S)
if u not visited then
set u as visited
for each edge (u,v) do
add v to S
The University of Sydney
37
end
Properties of DFS
Running time: O(n+m)
Subset of edges in DFS that “discover a new node” form a forest
(a collection of trees).
A graph is connected if and only if DFS results in a single tree. Each tree in the DFS result corresponds to a connected component
The University of Sydney 38
The University of Sydney
39
Bipartite Graphs and Testing Bipartiteness
Bipartite Graphs
Definition: An undirected graph G = (V, E) is bipartite if the nodes can be colored red or blue such that every edge has one red and one blue end.
Applications
– Stable marriage: men = red, women = blue. – Scheduling: machines = red, jobs = blue.
The University of Sydney
40
a bipartite graph
Testing Bipartiteness
Testing bipartiteness. Given a graph G, is it bipartite? – Many graph problems become:
• easier if the underlying graph is bipartite (matching)
• tractable if the underlying graph is bipartite (independent set) – Before attempting to design an algorithm, we need to understand
structure of bipartite graphs.
v2 v3
v2
v4
v1
v6v5v4 v3 v5
v7 v1
v6
v7
The University of Sydney a bipartite graph G another drawing of G 41
An Obstruction to Bipartiteness
Lemma: If a graph G is bipartite, it cannot contain an odd
length cycle.
Proof: Not possible to 2-color the odd cycle, let alone G.
The University of Sydney
42
bipartite
(2-colorable)
not bipartite
(not 2-colorable)
Bipartite Graphs
Lemma: Let G be a connected graph, and let L0, …, Lk be the
layers produced by BFS starting at node s. Exactly one of the following holds.
(i) No edge of G joins two nodes of the same layer, and G is bipartite.
(ii) An edge of G joins two nodes of the same layer, and G contains an odd-length cycle (and hence is not bipartite).
L1 L2 L3 L1 L2 L3
The University of Sydney Case (i) Case (ii) 43
Bipartite Graphs
Lemma: Let G be a connected graph, and let L0, …, Lk be the
layers produced by BFS starting at node s. Exactly one of the following holds.
(i) No edge of G joins two nodes of the same layer, and G is bipartite.
(ii) An edge of G joins two nodes of the same layer, and G contains an odd-length cycle (and hence is not bipartite).
Proof: Case (i)
– Suppose no edge joins two nodes in the same layer.
– By previous lemma, this implies all edges join nodes on adjacent level. – Bipartition: red = nodes on odd levels, blue = nodes on even levels.
L1 L2 L3
The University of Sydney Case (i) 44
Bipartite Graphs
Lemma: Let G be a connected graph, and let L0, …, Lk be the layers
produced by BFS starting at node s. Exactly one of the following holds.
(i) No edge of G joins two nodes of the same layer, and G is bipartite.
(ii) An edge of G joins two nodes of the same layer, and G contains an odd- length cycle (and hence is not bipartite).
Proof: Case (ii)
– Suppose (x, y) is an edge with x, y in same level Lj.
– Let z = lca(x, y) = lowest common ancestor.
–
then path from y to z, then path from z to x.
– Its length is 1 + (j-i) + (j-i), which is odd. ▪
(x,y) pathfrom
pathfrom
y to z z to x
z = lca(x, y)
Let Li be level containing z.
– Consider cycle that takes edge from x to y,
The University of Sydney
45
Obstruction to Bipartiteness
Corollary: A graph G is bipartite if and only if it contain no odd
length cycle.
5-cycle C
The University of Sydney
46
bipartite
(2-colorable)
not bipartite
(not 2-colorable)
Testing bipartiteness
Theorem: Given a graph G=(V,E) one can test if G is bipartitie in
O(n+m) time.
5-cycle C
The University of Sydney
47
bipartite
(2-colorable)
not bipartite
(not 2-colorable)
Cut edges
Definition: In a connected graph, an edge e is called a “cut edge”
if its removal would disconnect the graph • G=(V,E) is connected
• G’ = (V, E\{e}) is not connected
How do we find the cut edges of a graph?
The University of Sydney 48
Finding cut edges
Algorithm 1: (the straightforward one)
For every edge e in G
remove e from G
check if G is connected (running DFS for
example)
Running time? O(m2+mn) More efficient algorithm?
The University of Sydney 49
Structural property of cut edges
Definition: In a connected graph, an edge e is called a “cut edge”
if its removal would disconnect the graph • G=(V,E) is connected
• G’ = (V, E\{e}) is not connected
Lemma. e is a cut edge if and only if e is not contained in a cycle
Proof.
The University of Sydney 50
Finding cut edges
Assumes the input graph G is connected.
Algorithm 2:
Run DFS on the graph G
For each edge in the DFS tree
remove that edge from the graph G
check if G is now disconnected (using DFS)
Running time? O(nm)
Correctness? Need to show that non-tree edges cannot be cut
edges
The University of Sydney 51
Algorithm DFS(G,u)
Input: graph G(V,E) and a vertex u in V
begin
mark u as visited
for each edge (u,v) in E do
if v has not been visited then
DFS(G,v) end
Obs. Non-tree edges = back edges and back edges are part of cycles The University of Sydney 52
Summary: Graphs
Graph representation:
– adjacency matrix or adjacency list
Basic notations and definitions:
– cycle, simple, connected, path, tree, directed,…
Traversing a graph (BFS or DFS): O(n+m)
– Applications of BFS/DFS: shortest path, transitive closure, testing
bipartitness, cut edges…
The University of Sydney 53
For HW1: Minimum Spanning Tree
Def.: Given a graph G = (V,E) with real-valued edge costs ce f a minimum spanning tree (MST) is a spanning tree T whose sum of edge cost is minimised.
4 24 4 62396 9
16
18
511 511
8 14 7 8 7 10
21
G=(V,E) T, Σe∈T ce=50
The University of Sydney
54
This Week’s Todo’s
Tutorial 2:
– to be posted this evening
– make sure you work on it before the tutorial
Quiz 1:
– to be posted this evening
– due Wednesday 13 March 23:59:00 (no late submissions accepted)
Assignment 1:
– posted tonight
– due Wednesday 20 March 23:59:00
– no submissions later than Wednesday 27 March 23:59:00 accepted – START EARLY!
The University of Sydney 55