CSE 101 Exam 2 Review
Shortest Paths (Ch 4)
• BFS
• Dijkstra
– Priority Queues
• Bellman-Ford
• Shortest Paths in DAGs
Fundamental Shortest Paths Formula
dist(w) = min over edges (v,w) ∈ E of [dist(v) + ℓ(v,w)]
[Figure: a shortest path from s to w consists of a shortest path from s to some neighbor v, of length dist(v), followed by the edge (v,w) of length ℓ(v,w).]
Observation
If there is a length ≤d s-v path, then there is
some w adjacent to v with a length ≤(d-1) s-w
path.
Proof: w is the next to last vertex on the path.
This means that if we know all of the vertices at
distance ≤(d-1), we can find all of the vertices
at distance ≤d.
BFS(G,s)
   For v ∈ V, dist(v) ← ∞
   Initialize Queue Q
   Q.enqueue(s)
   dist(s) ← 0
   While(Q nonempty)
      u ← Q.dequeue()
      For (u,v) ∈ E
         If dist(v) = ∞
            dist(v) ← dist(u)+1
            Q.enqueue(v)
            v.prev ← u
Total runtime:
O(|V|+|E|)
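As a concrete reference, here is a minimal Python rendering of the BFS pseudocode above; the adjacency-dict representation and the name bfs_distances are my own choices, not from the slides.

```python
from collections import deque

def bfs_distances(adj, s):
    """Unweighted shortest-path distances from s.
    adj maps each vertex to a list of its neighbors."""
    dist = {v: float("inf") for v in adj}
    prev = {}
    dist[s] = 0
    Q = deque([s])
    while Q:
        u = Q.popleft()                   # dequeue the front vertex
        for v in adj[u]:
            if dist[v] == float("inf"):   # v seen for the first time
                dist[v] = dist[u] + 1
                prev[v] = u
                Q.append(v)
    return dist, prev

# Example: a small directed graph.
adj = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(bfs_distances(adj, 'a')[0])   # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```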
Edge Lengths
The number of edges in a path is not always the
right measure of distance. Sometimes, taking
several shorter steps is preferable to taking a
few longer ones.
We assign each edge (u,v) a non-negative length
ℓ(u,v). The length of a path is the sum of the
lengths of its edges.
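For concreteness, one simple way to represent the length function ℓ (an illustration of my own, not from the slides) is a dictionary keyed by edges:

```python
# A small weighted digraph: adj gives the edges, l their non-negative lengths.
adj = {'s': ['t', 'a'], 'a': ['t'], 't': []}
l = {('s', 't'): 10, ('s', 'a'): 1, ('a', 't'): 2}

def path_length(path):
    """The length of a path is the sum of the lengths of its edges."""
    return sum(l[(u, v)] for u, v in zip(path, path[1:]))

print(path_length(['s', 't']))        # 10 -- one long edge
print(path_length(['s', 'a', 't']))   # 3  -- two short edges win
```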
Problem: Shortest Paths
Problem: Given a Graph G with vertices s and t
and a length function ℓ, find the shortest path
from s to t.
Algorithm
Distances(G,s,ℓ)
   dist(s) ← 0
   While(not all distances found)
      Find the edge (v,w) ∈ E, with v discovered and w not,
         minimizing dist(v)+ℓ(v,w)
      dist(w) ← dist(v)+ℓ(v,w)
      prev(w) ← v
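Below is a direct (and intentionally slow) Python sketch of this procedure, rescanning every frontier edge in each round; it assumes adj is a dict of adjacency lists and l a dict mapping edges to non-negative lengths, as in the earlier sketches.

```python
def distances(adj, l, s):
    """Shortest-path distances from s, for non-negative edge lengths l.
    Repeatedly adds the cheapest edge leaving the set of discovered vertices."""
    dist, prev = {s: 0}, {}
    while len(dist) < len(adj):
        best = None
        for v in dist:                       # v already discovered
            for w in adj[v]:
                if w not in dist:            # w not yet discovered
                    d = dist[v] + l[(v, w)]
                    if best is None or d < best[0]:
                        best = (d, v, w)
        if best is None:                     # remaining vertices are unreachable
            break
        d, v, w = best
        dist[w], prev[w] = d, v
    return dist, prev
```

Each round rescans up to O(|E|) edges; the priority queue introduced below is exactly what lets Dijkstra's algorithm avoid this repeated scan.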
Why does this work?
Claim: Whenever the algorithm assigns a
distance to a vertex v, that distance is the
length of the shortest path from s to v.
Proof by Induction:
• dist(s) = 0 [the empty path has length 0]
• When assigning distance to w, assume that all
previously assigned distances are correct.
Inductive Step
[Figure: a "bubble" of vertices whose distances are correctly assigned, containing s and v; the edge (v,w) of length ℓ(v,w) leaves the bubble. The path to v of length dist(v) followed by the edge (v,w) is the shortest path from s to any vertex outside the bubble.]
Priority Queue
A Priority Queue is a data structure that stores
elements sorted by a key value.
Operations:
• Insert – adds a new element to the PQ.
• DecreaseKey – Changes the key of an element
of the PQ to a specified smaller value.
• DeleteMin – Finds the element with the
smallest key and removes it from the PQ.
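In Python, the heapq module supplies the underlying binary-heap operations, and DecreaseKey can be simulated lazily by pushing a new entry and skipping stale ones. A sketch of such a wrapper (my own illustration, not a standard class):

```python
import heapq

class PriorityQueue:
    """Minimal priority queue with Insert / DecreaseKey / DeleteMin.
    DecreaseKey is handled lazily: we push a new (key, item) pair and
    ignore outdated pairs when they reach the top of the heap."""

    def __init__(self):
        self._heap = []      # entries are (key, item) pairs
        self._key = {}       # current best key for each item

    def insert(self, item, key):
        self._key[item] = key
        heapq.heappush(self._heap, (key, item))

    def decrease_key(self, item, new_key):
        # Only record strictly smaller keys; the old entry becomes stale.
        if new_key < self._key.get(item, float("inf")):
            self._key[item] = new_key
            heapq.heappush(self._heap, (new_key, item))

    def delete_min(self):
        while self._heap:
            key, item = heapq.heappop(self._heap)
            if self._key.get(item) == key:   # skip stale entries
                del self._key[item]
                return item, key
        raise IndexError("delete_min from empty priority queue")

    def __bool__(self):
        return bool(self._key)
```

Stale entries cost extra pops, but each pop is paid for by the push that created it, so every operation stays O(log n) amortized.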
Dijkstra(G,s,ℓ)
   Initialize Priority Queue Q
   For v ∈ V
      dist(v) ← ∞
      Q.Insert(v)
   dist(s) ← 0
   While(Q not empty)
      v ← Q.DeleteMin()
      For (v,w) ∈ E
         If dist(v)+ℓ(v,w) < dist(w)
            dist(w) ← dist(v)+ℓ(v,w)
            Q.DecreaseKey(w)
Runtime: O(|V|) Inserts + O(|V|) DeleteMins + O(|E|) DecreaseKeys
Binary Heap
Store elements in a balanced binary tree with each element having a smaller key than its children.
[Figure: a binary heap on the keys 0, 1, 3, 7, 5, 8, 6 (level order), with the smallest key at the root and log(n) levels.]
Operations
• Insert – Add the new element at the bottom, then bubble up.
• DecreaseKey – Change the key, then bubble up.
• DeleteMin – Remove the root, move the bottom element to the root, then bubble down.
Runtime: O(log(n)) per operation.
d-ary Heap
• Like a binary heap, but each node has d children.
• Only log(n)/log(d) levels.
• Bubble up faster!
• Bubble down slower – need to compare to more children.
Summary of Priority Queues
[Table comparing priority queue implementations and the resulting Dijkstra runtimes.]
Negative Edge Weights
• Unusual case, but some applications.
• Dijkstra no longer works.
  – It sometimes needs to look ahead to find the savings from negative edges.
Negative Weight Cycles
Definition: A negative weight cycle is a cycle where the total weight of the edges is negative.
• If G has a negative weight cycle, there may be no shortest paths.
  – Go around the cycle over and over.
• Note: For undirected G, a single negative weight edge gives a negative weight cycle by going back and forth over it.
Fundamental Shortest Paths Formula
dist(w) = min over edges (v,w) ∈ E of [dist(v) + ℓ(v,w)]
• A system of equations to solve for the distances.
• When ℓ ≥ 0, Dijkstra gives an order to solve them in.
• With ℓ < 0, there might be no solution.
Algorithm Idea
Instead of finding shortest paths (which may not exist), find shortest paths that use at most k edges:
dist_k(w) = min over edges (v,w) ∈ E of [dist_{k-1}(v) + ℓ(v,w)]
Algorithm
Bellman-Ford(G,s,ℓ)
   dist_0(v) ← ∞ for all v   // can't be reached with 0 edges
   dist_0(s) ← 0
   For k = 1 to n
      For w ∈ V
         dist_k(w) ← min over (v,w) ∈ E of [dist_{k-1}(v)+ℓ(v,w)]
      dist_k(s) ← min(dist_k(s), 0)   // s has the trivial path
Each round takes O(|E|) time. What value of k do we use?
Analysis
Proposition: If n ≥ |V|-1 and G has no negative weight cycles, then for all v, dist(v) = dist_n(v).
• If there is a negative weight cycle, there may be no shortest path.
• If not, we only need to run our algorithm for |V| rounds, for a final runtime of O(|V||E|).
Proof
• If we have a path with |V| edges:
  – It must repeat some vertex.
  – So it contains a loop.
  – Removing the loop gives a path that is no longer (the loop has non-negative weight).
• Thus, there is a shortest path with at most |V|-1 edges.
Cycle Detection
Proposition: For any n ≥ |V|-1, there are no negative weight cycles reachable from s if and only if dist_n(v) = dist_{n+1}(v) for every v ∈ V.
• Detect by running one more round of Bellman-Ford.
• Need to see if any v’s distance changes.
Proof of “Only If”
• Suppose there are no negative weight cycles.
• For any n ≥ |V|-1, dist_n(v) = dist(v).
• So dist_n(v) = dist(v) = dist_{n+1}(v).
Proof of “If”
Suppose dist_n(v) = dist_{n+1}(v) for all v. Then the update has reached a fixed point, so dist_m(v) = dist_n(v) for all m ≥ n.
But if there were a reachable negative weight cycle, the distances would eventually decrease.
Algorithm
ShortestPathsInDAGs(G,s,ℓ)
   TopologicalSort(G)
   For w ∈ V in topological order
      If w = s, dist(w) ← 0
      Else dist(w) ← min over (v,w) ∈ E of [dist(v)+ℓ(v,w)]
         \\ dist(v) for all upstream v already computed
Runtime: O(|V|+|E|)
Divide & Conquer (Ch 2)
• General Technique
• Master Theorem
• Karatsuba Multiplication
• Merge Sort
• Order Statistics
• Binary Search
• Closest Pair of Points
Divide and Conquer
This is the first of our three major algorithmic techniques.
1. Break the problem into pieces
2. Solve the pieces recursively
3. Recombine the pieces to get the answer
Master Theorem
Theorem: Let T(n) be given by the recurrence
   T(n) = a·T(n/b) + O(n^d)
for constants a > 0, b > 1, d ≥ 0. Then we have that
   T(n) = O(n^d)            if d > log_b(a)
   T(n) = O(n^d·log(n))     if d = log_b(a)
   T(n) = O(n^(log_b(a)))   if d < log_b(a)
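Two worked instances of the theorem (the second is exactly the recurrence that comes up for Karatsuba multiplication below):

```latex
% a = 2, b = 2, d = 1: here d = \log_b a, so
T(n) = 2\,T(n/2) + O(n) \;\Longrightarrow\; T(n) = O(n \log n)

% a = 3, b = 2, d = 1: here d < \log_2 3 \approx 1.585, so
T(n) = 3\,T(n/2) + O(n) \;\Longrightarrow\; T(n) = O\!\left(n^{\log_2 3}\right) \approx O\!\left(n^{1.59}\right)
```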
Karatsuba Multiplication
Want to multiply N and M:
1. Let X ≈ √(N+M) be a power of 2.
2. Write N = AX+B, M = CX+D
   – This can be done by just taking the high and low bits.
3. N·M = AC·X² + (AD+BC)·X + BD
       = AC·X² + [(A+B)(C+D) - AC - BD]·X + BD
   – The multiplications by X are just bit shifts.
Algorithm
KaratsubaMult(N,M)
   If N+M < 99, Return Product(N,M)
   Let X ← 2^⌊log(N+M)/2⌋
   Write N = AX + B, M = CX + D
   P1 ← KaratsubaMult(A,C)
   P2 ← KaratsubaMult(B,D)
   P3 ← KaratsubaMult(A+B,C+D)
   Return P1·X² + [P3-P1-P2]·X + P2
Runtime Recurrence
Karatsuba multiplication on inputs of size n spends O(n) time, and then makes three recursive calls to problems of (approximately) half the size. If T(n) is the runtime for n-bit inputs, we have the recursion:
   T(n) = 3T(n/2) + O(n)
Master Theorem:
   T(n) = O(n^(log₂3)) ≈ O(n^1.59)
Note
In divide and conquer, it is important that the recursive subcalls are a constant fraction of the size of the original.
Note on Proving Correctness
There’s a general procedure for proving correctness of a D&C algorithm:
Use Induction: Prove correctness by induction on the problem size.
Base Case: Your base case will be the non-recursive case of your algorithm (which your algorithm does need to have).
Inductive Step: Assuming that the (smaller) recursive calls are correct, show that the algorithm works.
Sorting
Problem: Given a list of n numbers, return those numbers in ascending order.
Divide and Conquer
How do we make a divide and conquer algorithm?
1. Split into Subproblems – Split the list in two.
2. Recursively Solve – Sort each sublist.
3. Combine Answers – ???
Merge
Problem: Given two sorted lists, combine them into a single sorted list.
Merge
Merge(A,B)
   C ← List of length Len(A)+Len(B)
   a ← 1, b ← 1
   For c = 1 to Len(C)
      If (b > Len(B))
         C[c] ← A[a], a++
      Else if (a > Len(A))
         C[c] ← B[b], b++
      Else if A[a] < B[b]
         C[c] ← A[a], a++
      Else
         C[c] ← B[b], b++
   Return C
Runtime: O(|A|+|B|)
MergeSort
MergeSort(L)
   If Len(L) = 1 \\ Base Case
      Return L
   Split L into equal halves L1 and L2
   A ← MergeSort(L1)
   B ← MergeSort(L2)
   Return Merge(A,B)
Runtime:
   T(n) = 2T(n/2)+O(n)
   T(n) = O(n log(n))
Order Statistics
Problem: Given a list L of numbers and an integer k, find the kth largest element of L.
Divide Step
Select a pivot x ∈ L and compare it to the other elements of L, splitting L into the elements > x, the elements = x, and the elements < x.
Which list is our answer in?
• Answer is > x if there are ≥ k elements bigger
than x.
• Answer is x if there are < k elements bigger and
≥ k elements bigger than or equal to x.
• Otherwise answer is less than x.
Order Statistics
Select(L,k)
   Pick x ∈ L
   Sort L into L_{>x}, L_{=x}, L_{<x}
   If Len(L_{>x}) ≥ k
      Return Select(L_{>x}, k)
   Else if Len(L_{>x}) + Len(L_{=x}) ≥ k
      Return x
   Return Select(L_{<x}, k - Len(L_{>x}) - Len(L_{=x}))
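A minimal Python sketch of Select, already using the random pivot discussed on the next slide; the name quickselect and the list-comprehension splits are my own choices.

```python
import random

def quickselect(L, k):
    """Return the k-th largest element of L (k = 1 gives the maximum)."""
    x = random.choice(L)                  # random pivot
    greater = [y for y in L if y > x]     # L_{>x}
    equal   = [y for y in L if y == x]    # L_{=x}
    less    = [y for y in L if y < x]     # L_{<x}
    if len(greater) >= k:                 # answer is bigger than x
        return quickselect(greater, k)
    if len(greater) + len(equal) >= k:    # answer is exactly x
        return x
    # otherwise the answer is smaller than x
    return quickselect(less, k - len(greater) - len(equal))

print(quickselect([2, 9, 4, 6, 7, 1], 3))   # 6, the 3rd largest
```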
Randomization
Fix: Pick a random pivot.
[Figure: the list in sorted order split into the top n/4 elements, the middle n/2, and the bottom n/4; the pivot x lands somewhere in the middle half.]
• There’s a 50% chance that x lands in the middle half of the sorted order.
• If so, no matter where the answer is, the recursive call has size at most 3n/4.
• On average, we need only two tries before the problem shrinks to size at most 3n/4.
Runtime
T(n) = T(3n/4) + O(n)
So…
T(n) = O(n)
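One way to see why this solves to O(n): unrolling the recurrence gives a geometric series (a standard calculation, not spelled out on the slides).

```latex
T(n) \;\le\; cn + T\!\left(\tfrac{3n}{4}\right)
     \;\le\; cn\left(1 + \tfrac{3}{4} + \left(\tfrac{3}{4}\right)^{2} + \cdots\right)
     \;=\; 4cn \;=\; O(n)
```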
Note
The algorithm discussed does give the correct
answer in expected O(n) time.
There are deterministic O(n) algorithms using
similar ideas, but they are substantially more
complicated.
Search
Problem: Given a sorted list L and a number x,
find the location of x in L.
Binary Search
BinarySearch(L,i,j,x)
\\Search between L[i] and L[j]
   If j < i, Return ‘error’
   k ← ⌊(i+j)/2⌋
   If L[k] = x, Return k
   If L[k] > x
      Return BinarySearch(L,i,k-1,x)
   If L[k] < x
      Return BinarySearch(L,k+1,j,x)
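The same routine in Python on a 0-indexed sorted list, with -1 standing in for ‘error’ (a sketch; the default arguments are my own convenience):

```python
def binary_search(L, x, i=0, j=None):
    """Return an index k with L[k] == x, or -1 if x is not in the sorted list L."""
    if j is None:
        j = len(L) - 1
    if j < i:
        return -1                      # 'error': x is not present
    k = (i + j) // 2
    if L[k] == x:
        return k
    if L[k] > x:
        return binary_search(L, x, i, k - 1)
    return binary_search(L, x, k + 1, j)

print(binary_search([1, 3, 5, 7, 9, 11], 7))   # 3
```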
Runtime
T(n) = T(n/2) + O(1)
So…
T(n) = O(log(n))
Binary Search Puzzles
You have 27 coins, one of which is heavier than the others, and a balance. Determine the heavy coin in 3 weighings.
Lots of puzzles have binary search-like answers. As long as you can spend constant time to divide your search space in half (or into thirds), you can use binary search to solve them in O(log(n)) time.
Closest Pair of Points (Ex 2.32)
Problem: Given n points in the plane (x1,y1),…,(xn,yn), find the pair (xi,yi) and (xj,yj) whose Euclidean distance is as small as possible.
Divide and Conquer Outline
• Divide the points into two sets by drawing a line.
• Compute the closest pair on each side.
• What about pairs that cross the divide?
Observation
• Suppose the closest pair on either side is at distance δ.
• Then we only need to care about points within δ of the dividing line.
• We need to know whether some pair of them is closer than δ.
Main Idea
Proposition: Take the points within δ of the dividing line and sort them by y-coordinate. Any one of these points can only be within δ of the 8 closest points on either side of it.
Proof
• Nearby points must have y-coordinate within δ.
• Subdivide the region into δ/2-sided squares.
• At most one point lies in each square (two points in the same square would be at distance < δ).
[Figure: the strip of width 2δ around the dividing line, cut into δ/2-sided squares.]
Algorithm
CPP(S)
   If |S| ≤ 3, Return the closest distance directly
   Find a line L evenly dividing the points
   Sort S into S_left, S_right
   δ ← min(CPP(S_left), CPP(S_right))
   Let T be the points within δ of L
   Sort T by y-coordinate
   Compare each element of T to the 8 closest on either side; let the minimum distance found be δ′
   Return min(δ, δ′)
Runtime
We have the recurrence T(n) = O(n log(n)) + 2T(n/2). This is not quite covered by the Master Theorem, but it can be shown to give T(n) = O(n log²(n)).
Alternatively, if you are more careful and have CPP take the points already sorted by y-coordinate, you can reduce this to O(n log(n)).
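To tie the pieces together, here is a compact Python sketch of CPP in the simpler O(n log²(n)) form: it re-sorts the strip in every call and checks each strip point against the next 8 points by y-coordinate. The helper names and the brute-force base case are my own choices.

```python
import math

def closest_pair(points):
    """Smallest Euclidean distance among a list of (x, y) points."""
    pts = sorted(points)                      # sort by x-coordinate
    return _cpp(pts)

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def _cpp(pts):
    n = len(pts)
    if n <= 3:                                # base case: brute force
        return min(_dist(p, q) for i, p in enumerate(pts) for q in pts[i+1:])
    mid = n // 2
    x_line = pts[mid][0]                      # dividing vertical line
    delta = min(_cpp(pts[:mid]), _cpp(pts[mid:]))
    # Strip of points within delta of the dividing line, sorted by y.
    strip = sorted((p for p in pts if abs(p[0] - x_line) < delta),
                   key=lambda p: p[1])
    best = delta
    for i, p in enumerate(strip):
        # Only a bounded number of later strip points can be within delta.
        for q in strip[i+1:i+9]:
            if q[1] - p[1] >= best:
                break
            best = min(best, _dist(p, q))
    return best

# Example: the closest pair is (5, 5) and (5, 4.5), at distance 0.5.
print(closest_pair([(0, 0), (5, 5), (1, 1), (9, 0), (5, 4.5)]))   # 0.5
```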