Analysis of Algorithms, I
CSOR W4231
Computer Science Department
Copyright By PowCoder代写 加微信 powcoder
Columbia University
Strongly connected components, single-origin shortest paths in weighted graphs
Applications of DFS
Strongly connected components
Shortest paths in graphs with non-negative edge weights (Dijkstra’s algorithm)
Correctness Implementations
Finding your way in a maze
Depth-first search (DFS): starting from a vertex s, explore the graph as deeply as possible, then backtrack
1. Try the first edge out of s, towards some node v.
2. Continue from v until you reach a dead end, that is a node
whose neighbors have all been explored.
3. Backtrack to the first node with an unexplored neighbor and repeat 2.
Remark: DFS answers s-t connectivity
Directed graphs: classification of edges
DFS constructs a forest of trees.
Graph edges that do not belong to the DFS tree(s) may be
1. forward: from a vertex to a descendant (other than a child)
2. back: from a vertex to an ancestor
3. cross: from right to left (no ancestral relation), that is
from tree to tree
between nodes in the same tree but on different branches
On the time intervals of vertices u, v
If we use an explicit stack, then
start(u) is the time when u is pushed in the stack
finish(u) is the time when u is popped from the stack (that is, all of its neighbors have been explored).
Intervals [start(u),finish(u)] and [start(v),finish(v)] either contain each other (u is an ancestor of v or vice versa); or they are disjoint.
Classifying edges using time
1. Edge(u,v)∈EisabackedgeinaDFStreeifandonlyif start(v) < start(u) < finish(u) < finish(v).
2. Edge (u,v) ∈ E is a forward edge if
start(u) < start(v) < finish(v) < finish(u).
3. Edge (u, v) ∈ E is a cross edge if
start(v) < finish(v) < start(u) < finish(u).
Applications of DFS
Strongly connected components
Shortest paths in graphs with non-negative edge weights (Dijkstra’s algorithm)
Correctness Implementations
Exploring the connectivity of a graph
Undirected graphs: find all connected components
Directed graphs: find all strongly connected components
SCC(u) = set of nodes that are reachable from u and have a path back to u
SCCs provide a hierarchical view of the connectivity of the graph:
on a top level, the meta-graph of SCCs has a useful and simple structure (coming up);
each meta-vertex of this graph is a fully connected subgraph that we can further explore.
How can we find SCC(u) using BFS?
1. Run BFS(u); the resulting tree T consists of the set of nodes to which there is a path from u.
2. Define Gr as the reverse graph, where edge (i,j) becomes edge (j, i).
3. Run BFS(u) in Gr; the resulting BFS tree T′ consists of the set of nodes that have a path to u.
4. The common vertices in T, T′ compose the strongly connected component of u.
What if we want all the SCCs of the graph?
The meta-graph of SCCs of a directed graph
Consider the meta-graph of all SCCs of G.
Make a (super)vertex for every SCC.
Add a (super)edge from SCC Ci to SCC Cj if there is an edge from some vertex u of Ci to some vertex v of Cj.
What kind of graph is the meta-graph of SCC’s?
The meta-graph of SCCs of a directed graph
Consider the meta-graph of all SCCs of G.
Make a (super)vertex for every SCC.
Add a (super)edge from SCC Ci to SCC Cj if there is an edge from some vertex u of Ci to some vertex v of Cj.
This graph is a DAG.
Is there an SCC we could process first?
Suppose we had a sink SCC of G, that is, an SCC with no outgoing edges.
1. What will DFS discover starting at a node of a sink SCC? 2. How do we find a node that for sure lies in a sink SCC? 3. How do we continue to find all other SCCs?
Easier to find a node in a source SCC!
The node assigned the largest finish time when we run DFS(G) belongs to a source SCC in G.
Example: v5 belongs to source SCC C2.
We will use Lemma 2 below. Let G be a directed graph. The meta-graph of its SCCs is a DAG. For an SCC C, let
finish(C) = max finish(v) v∈C
Example: finish(C1) = finish(v1) = 8.
Let Ci, Cj be SCCs in G. Suppose there is an edge (u,v) ∈ E such that u ∈ Ci and v ∈ Cj. Then finish(Ci) > finish(Cj).
Gr is useful again
Fact 1 provides a direct way to find a node in a source SCC of G: pick the node with largest finish.
ButwewantanodeinasinkSCCofG!
Consider Gr, the graph where the edges of G are reversed.
How do the SCCs of G and Gr compare?
Run DFS on Gr: the node with the largest finish comes from a source SCC of Gr (Fact 1). This is a sink SCC of G!
Using this observation to find all SCCs
We now know how to find a sink SCC in G.
1. Run DFS(Gr); compute finish times.
2. Run DFS(G) starting from the node with the largest finish: the nodes in the resulting tree T form a sink SCC in G.
How do we find all remaining SCCs?
Remove T from G; let G′ be the resulting graph.
The meta-graph of SCCs of G′ is a DAG, hence it has at
least one sink SCC.
Apply the procedure above recursively on G′.
Algorithm for finding SCCs in directed graphs
SCC(G = (V, E))
1. Compute Gr.
2. Run DFS(Gr); compute finish(u) for all u.
3. Run DFS(G) in decreasing order of finish(u).
4. Output the vertices of each tree in the DFS forest of line 3 as an SCC.
1. Running time: O(n + m) —why?
2. Equivalently, we can (i) run DFS(G), compute finish times; (ii) run DFS(Gr) by decreasing order of finish. Why?
A directed graph and its DFS forest with time intervals
1 (1,8) 5 (9,14) 2 (2,5) 4 (6,7) 6 (10,13) 3 (3,4) 7 (11,12)
DFS forest of Gr; nodes are considered by decreasing finish times
(8) (14) (13)
v3 v2 (5) v4 (7) v6 (12)
Still need to prove Lemma 2
Let G be a directed graph. The meta-graph of its SCCs is a DAG.
For an SCC C, let
finish(C) = max finish(v)
Let Ci, Cj be SCCs in G. Suppose there is an edge (u,v) ∈ E such that u ∈ Ci and v ∈ Cj. Then finish(Ci) > finish(Cj).
Proof of Lemma 2
There are two cases to consider:
1. start(u) < start(v) (DFS starts at Ci)
Before leaving u, DFS will explore edge (u, v).
Since v ∈ Cj, all of Cj will now be explored.
All vertices in Cj will be assigned finish times before DFS
backtracks to u and assigns a finish time to u. Thus finish(Cj) < finish(u) ≤ finish(Ci)
Proof of Lemma 2 (cont’d)
2. start(u) > start(v)
Since there is no edge from Cj to Ci (DAG!), DFS will
finish exploring Cj before it discovers u. Thus
finish(Cj) < start(u) < finish(u) ⇒ finish(Cj) < finish(Ci)
Applications of DFS
Strongly connected components
Shortest paths in graphs with non-negative edge weights (Dijkstra’s algorithm)
Correctness Implementations
Weighted graphs
Edge weights represent distances (or time, cost, etc.)
Consider a path P = (v0,...,vk). The length of P is the
sum of the weights of its edges:
w(P ) = w(vi, vi+1).
In weighted graphs, a shortest path from u to v is a path of minimum length among all paths from u to v.
s-tpath: apathfromstot.
dist(s, t): the length of the shortest s-t path;
min w(P) , if exists s-t path P
dist(s, t) =
dist(t): the length of the shortest s-t path, when s is fixed.
We will refer to w(P) as the weight or cost or length of P.
∞ , otherwise
Single-origin (source) shortest-paths problem
aweighted,directedgraphG=(V,E,w),wherefunction w : E → R maps edges to real-valued weights;
anoriginvertexs∈V.
Output: for every vertex v ∈ V
1. the length of a shortest s-v path; 2. a shortest s-v path.
Given an algorithm A for single-origin shortest-paths
We can also solve
single-pair shortest-path problem
single-destination shortest-paths problem: find a shortest path from every vertex to a destination t
all-pairs shortest-paths: find a shortest path between every pair of vertices
Graphs with non-negative weights
aweighted,directedgraphG=(V,E,w);function
w : E → R+ assigns non-negative real-valued weights to edges;
anoriginvertexs∈V.
Output: for every vertex v ∈ V
1. the length of a shortest s-v path; 2. a shortest s-v path.
Dijkstra’s algorithm (Input: G = (V, E, w), s ∈ V )
Output: arrays dist, prev with n entries such that 1. dist[v] = length of the shortest s-v path
2. prev[v] = node before v on the shortest s-v path
At all times, maintain a set S of nodes for which the distance from s has been determined.
Initially,dist[s]=0,S={s}.
Eachtime,addtoSthenodev∈V−Sthat
1. has an edge from some node in S;
2. minimizes the following quantity among all nodes v ∈ V − S
Setprev[v]=u.
d(v) = min {dist[u] + wuv } u∈S:(u,v)∈E
An example weighted directed graph
Dijkstra’s output for example graph
(1) (2) (5)
(4) (3) (4)
The distances (in parentheses) and reverse shortest paths.
Another way of showing optimality of greedy algorithms
Greedy principle: a local decision rule is applied at every step.
Dijkstra’s algorithm is greedy: always form the shortest new s-v path by first following a path to some node u in S, and then a single edge (u, v).
Proof of optimality: it always stays ahead of any other solution; when a path to a node v is selected, that path is shorter than every other possible s-v path.
Correctness of Dijkstra’s algorithm
At all times, the algorithm maintains a set S of nodes for which it has determined a shortest-path distance from s.
Consider the set S at any point in the algorithm’s execution. For each u in S, the path Pu is a shortest s-u path.
Optimality of the algorithm follows from the claim (why?).
Proof of Claim 1
By induction on the size of S.
Base case: |S| = 1, dist(s) = 0.
Hypothesis: suppose the claim is true for |S| = k, that is, for every u ∈ S, Pu is a shortest s-u path.
Step: let v be the k+1-st node added to S. We want to show that Pv, which is Pu for some u ∈ S, followed by the edge (u, v), is a shortest s-v path.
Consider any other s-v path, call it P . P must leave S somewheresincev̸∈S: lety̸=vbethefirstnodeofP in V − S and x ∈ S the node before y in P . Since the algorithm added v in this iteration and not y, it must be thatd(v)≤d(y). Sojustthesubpaths→x→yinP isat least as long as Pv! Hence so is P (why?).
Implementation
Dijkstra-v1(G = (V, E, w), s ∈ V ) Initialize(G, s)
while S ̸= V do
For every x ∈ V − S with at least one edge from S compute d(x) = min {dist[u] + wux}
u∈S,(u,x)∈E
Select v such that d(v) = min d(x) x∈V −S
S = S ∪ {v} dist[v] = d(v) prev[v] = u
Initialize(G, s) forv∈V do
dist[v] = ∞
prev[v] = NIL end for
dist[s] = 0
Improved implementation (I)
Idea: Keep a conservative overestimate of the true length of the shortest s-v path in dist[v] as follows: when u is added to S, update dist[v] for all v with (u, v) ∈ E.
Dijkstra-v2(G = (V, E, w), s ∈ V ) Initialize(G, s)
while S ̸= V do
Pick u so that dist[u] is minimum among all nodes in V − S S = S ∪ {u}
for (u, v) ∈ E do
Update(u, v) end for
Update(u, v)
if dist[v] > dist[u] + wuv then
dist[v] = dist[u] + wuv
prev[v] = u end if
Priority queues and binary heaps
Priority queue: a priority queue is a data structure for maintaining a set S of n elements, each with an associated value called a key.
Operations supported by a min-priority queue Q:
1. BuildQueue({S; keys}): builds a min-priority queue
2. Insert(Q, x): insert element x into Q
3. Extract-min(Q): extract the minimum element from Q 4. Decrease-key(Q, x, k): decrease the key for x to a new
(smaller) value k
We can implement a min-priority queue as a binary min-heap. Then each of the four operations above requires time
O(n), O(log n), O(log n), O(log n) respectively.
See Chapter 6 in your textbook for more details on binary heaps.
Improved implementation (II): binary min-heap
Idea: Use a priority queue implemented as a binary min-heap: store vertex u with key dist[u]. Required operations: Insert, ExtractMin; DecreaseKey for Update; each takes O(log n) time.
Dijkstra-v3(G = (V, E, w), s ∈ V ) Initialize(G, s)
Q = BuildQueue({V ; dist}) S=∅
while Q ̸= ∅ do
u = ExtractMin(Q) S = S ∪ {u}
for (u, v) ∈ E do
Update(u, v) end for
Running time: O(nlogn+mlogn)=O(mlogn) When is Dijkstra-v3() better than Dijkstra-v2()?
Further implementations of Dijsktra’s algorithm
Notation: |V | = n, |E| = m
Implementation
DecreaseKey
Binary heap
O((n+m)logn)
d-ary heap
O((nd+m)logn) log d
Fibonacci heap
O(1) amortized
O(nlogn+m)
Optimal choice is d ≈ m/n (the average degree of the graph) d-ary heap works well for both sparse and dense graphs
If m = n1+x, what is the running time of Dijsktra’s algorithm using a d-ary heap?
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com