Sample undirected graph
Is {a} a vertex cover? No; it misses edge d–e
Is {a, d} a vertex cover? Yes
“Does every edge have a or d as one of its endpoints?”
Is {a, b, c, d, e} a vertex cover? Yes; it’s all of the vertices, so all edges must be accounted for here
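Here's a minimal way to check these answers in code (a sketch; the edge list is my guess at the figure's graph, chosen to be consistent with the answers above):

    # A set of vertices C is a vertex cover iff every edge has an endpoint in C.
    def is_vertex_cover(cover, edges):
        return all(u in cover or v in cover for (u, v) in edges)

    # Hypothetical edge list matching the Q&A above: edges at a, plus d-e.
    edges = [("a", "b"), ("a", "c"), ("a", "e"), ("d", "e")]
    print(is_vertex_cover({"a"}, edges))                       # False: misses d-e
    print(is_vertex_cover({"a", "d"}, edges))                  # True
    print(is_vertex_cover({"a", "b", "c", "d", "e"}, edges))   # True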
———-
Example of clique of 5 vertices:
What’s the minimum vertex cover in this graph?
e.g. {a, b, c, d}.
Minimum is 4
More generally: clique of n vertices; what’s the minimum vertex cover? n-1
Why can’t it be n-2?
Suppose that there _was_ a vertex cover with n-2 vertices.
It’s missing 2 vertices; call those missing ones x and y.
Because it’s a clique, there is an edge between x and y,
and that edge isn’t covered, because neither x nor y is in the cover.
That’s one half of the argument. The other half is that n-1 vertices always suffice: if only one vertex is missing from the cover, then no edge can have both of its endpoints missing, so every edge is covered.
n-1 is correct for every n-clique.
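A quick brute-force sanity check of this claim (my own sketch, not from lecture):

    from itertools import combinations

    # Smallest k such that some k-subset of vertices covers all edges.
    def min_vertex_cover_size(vertices, edges):
        for k in range(len(vertices) + 1):
            for cover in combinations(vertices, k):
                if all(u in cover or v in cover for (u, v) in edges):
                    return k

    n = 5
    vertices = range(n)
    clique_edges = list(combinations(vertices, 2))  # every pair: a clique
    print(min_vertex_cover_size(vertices, clique_edges))  # 4, i.e. n - 1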
———-
OK. So vertex cover is NP-complete. Now what?
1. Run an exponential-time, brute-force algorithm.
Try every subset of vertices and pick the smallest one that covers all edges.
Problem: there are 2^n subsets to try, so this is hopeless on graphs with more than a few dozen vertices, let alone 100 or 1000.
2. Hope that the graphs that you need vertex cover on are small, or have a nice/easy structure (like clique).
3. Come up with an approximation algorithm.
Goals: super fast, and not too far from optimal.
If we’re OK with approximation, then algorithm design techniques that we’ve already learned are back! e.g. greedy, dynamic programming…
———-
Approximation Greedy Algorithm for Vertex Cover
C = {}   # vertex cover being built
F = E    # edges not yet covered (E = all edges of the graph)
while F != {}:
    let (u, v) be some edge in F
    C = C + {u, v}
    remove from F every edge incident to u or v
Sample undirected graph:
Maybe I first pick edge u–v. That adds u and v to the vertex cover.
Maybe next I pick x–y. That adds both x and y to the vertex cover.
Not optimal, but close: greedy used 4 vertices here, while the optimal cover has 3, so it’s off by a factor of 4/3 ≈ 1.33 in this case.
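A runnable version of the greedy pseudocode above (the graph here is illustrative, since the lecture's figure isn't reproduced):

    # Greedy vertex cover: repeatedly pick an uncovered edge, take BOTH endpoints.
    def greedy_vertex_cover(edges):
        cover = set()
        remaining = list(edges)
        while remaining:
            u, v = remaining[0]       # pick some uncovered edge
            cover |= {u, v}           # add both of its endpoints
            remaining = [(a, b) for (a, b) in remaining
                         if a not in cover and b not in cover]
        return cover

    # Illustrative graph: first pick adds u and v, second pick adds x and y.
    print(greedy_vertex_cover([("u", "v"), ("v", "x"), ("x", "y")]))
    # {'u', 'v', 'x', 'y'}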
How bad could it be? What’s the worst-case? What makes the greedy algorithm perform poorly?
1. Feasibility. Does it always return a vertex cover?
C is a set of vertices.
Does it cover all edges? Yes.
We definitely cover (u, v), because both of its endpoints are chosen.
The only other edges we remove from F are those with u or v as an endpoint, and those are covered too.
We repeat until F is empty, so every edge is covered before it gets removed.
2. Optimality. How bad could it be?
Let A be the set of all of the edges chosen by the greedy algorithm.
Let C* be the optimal (i.e. minimum) vertex cover.
C* is guaranteed to have at least one vertex from each edge in A
Proof by contradiction: let’s say that there is an edge (u, v) in A, but C* doesn’t include u and doesn’t include v. But then C* isn’t a vertex cover at all.
Next claim: no edges in A share an endpoint.
Why can’t greedy pick both of these?
Picks (u, v)
Picks (v, w)
Reason: after picking (u, v), greedy removes every remaining edge incident to u or v, so (v, w) is already gone before the next iteration.
|C*| >= |A|
C* must cover every edge in A; since no two edges in A share an endpoint, that takes a distinct vertex per edge. So C*, a set of vertices, contains at least |A| of them.
|C| = 2|A|
The greedy cover has exactly two vertices per chosen edge, and none of them repeat (again because the edges in A are disjoint).
Here’s what we have:
|C*| >= |A| ; equivalently: |A| <= |C*|
|C| = 2|A|
|C| = 2|A| <= 2|C*|
Think of it like this:
the size of greedy's vertex cover C is at most twice the size of C*... greedy is at most twice as big as optimal.
Approximation ratio is 2.
Let's say greedy returns a vertex cover of size 10.
The optimal is then guaranteed to be at least 5.
Theoretical result: approximation ratio 2.
Maybe it's actually 1.5? Maybe there's a better proof.
Can you construct an actual example where greedy does twice as bad as optimal?
Here's such an example: a graph consisting of a single edge u–v.
Greedy: {u, v}
Optimal: {u}
We're twice as bad as optimal.
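Running the earlier sketch on this single-edge graph shows the factor of 2:

    print(greedy_vertex_cover([("u", "v")]))  # {'u', 'v'}: size 2, optimal is 1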
----------
0-1 Knapsack Problem
We have n items; each item i has weight w_i and value v_i.
Knapsack has weight capacity W.
The goal is to get as much value as possible in the knapsack subject to the weight limit W of the knapsack.
I gave you a dynamic-programming algorithm for this problem back in week 4.
What was its runtime?
O(nW); n is number of items, W is knapsack weight capacity
If n = 5, W = 20; algorithm takes c*100 steps
The algorithm is polynomial in the value of W, but exponential in the size (log2 W) of W.
Knapsack is NP-complete, just like vertex cover.
This means that there's likely no polynomial-time algorithm for Knapsack.
O(nW) is what's called pseudopolynomial.
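For reference, here's a compact version of that week-4-style O(nW) dynamic program (my reconstruction, not the exact lecture code):

    # dp[c] = best value achievable with weight capacity c, over items seen so far.
    def knapsack_dp(values, weights, W):
        dp = [0] * (W + 1)
        for v, w in zip(values, weights):
            # go through capacities downward so each item is used at most once
            for c in range(W, w - 1, -1):
                dp[c] = max(dp[c], dp[c - w] + v)
        return dp[W]

    print(knapsack_dp([60, 100, 120], [10, 20, 30], 50))  # 220

The table has W + 1 entries and we sweep it once per item: that's the O(nW), and it blows up when W is a huge number even if n is tiny.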
----------
Let's work on an approximation algorithm for knapsack that IS polynomial-time.
Obvious greedy algorithm: the greedy choice is value/weight ratio.
sort the items from highest to lowest v_i/w_i
for each item in that order:
    if the item fits in the remaining capacity, add it to the knapsack
It always yields a feasible solution: you never take more weight than what can fit in the knapsack
But there are ways to make this algorithm perform TERRIBLY:
Two items:
1. value 1000000, weight 1000000
2. value 2, weight 1
Knapsack weight capacity W: 1000000
Greedy gets value 2; optimal gets 1000000.
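Here's the ratio-greedy as runnable code, demonstrated on that instance (a sketch; names are mine):

    # Greedy by value/weight ratio (attempt #1).
    def greedy_knapsack(items, W):
        # items: list of (value, weight) pairs
        items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
        total_value, remaining = 0, W
        for v, w in items:
            if w <= remaining:
                total_value += v
                remaining -= w
        return total_value

    items = [(1000000, 1000000), (2, 1)]
    print(greedy_knapsack(items, 1000000))  # 2, but the optimum is 1000000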
Why is there no constant approximation ratio?
I can just change item 1 (and the capacity W) to:
value 1000000000, weight 1000000000
Greedy still gets value 2, so the gap to optimal grows without bound; no constant ratio can hold.
What can we do?
1. Make the algorithm better... without making it much slower.
2. We can think about special cases.
e.g. what if all item weights were small.
Greedy algorithm is great in this case.
----------
Greedy Algorithm Attempt #2
sort the items from highest to lowest v_i/w_i
try two ways to fill the knapsack:
solution 1)
    for each item in that order:
        if the item fits in the knapsack, add it
        otherwise, stop
solution 2)
    take only the single most-valuable item that fits in the knapsack by itself
return solution 1) or solution 2), whichever is more valuable
Item 1: w1 = 1, v1 = 2; ratio v1/w1 = 2
Item 2: w2 = 1000, v2 = 1000; ratio is 1. Say the capacity is W = 1000 (not stated above, but consistent with the numbers that follow).
Solution 1) value is 2
Solution 2) value is 1000
It will return max(2, 1000) = 1000... in this case it is optimal
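Runnable sketch of attempt #2, checked against this example (W = 1000 as assumed above):

    # Attempt #2: max of (greedy-until-first-misfit) and (best single item).
    def greedy_knapsack_v2(items, W):
        items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
        # Solution 1: greedy by ratio, stopping at the first item that doesn't fit.
        sol1, remaining = 0, W
        for v, w in items:
            if w > remaining:
                break
            sol1 += v
            remaining -= w
        # Solution 2: the single most valuable item that fits on its own.
        sol2 = max((v for v, w in items if w <= W), default=0)
        return max(sol1, sol2)

    print(greedy_knapsack_v2([(2, 1), (1000, 1000)], 1000))  # 1000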
Let's try to break it...
Item 1: w1 = 1, v1 = 2; ratio v1/w1 = 2
Item 2: w2 = 1000, v2 = 1000; ratio is 1
Item 3: w3 = 1000, v3 = 1000; ratio is 1
With capacity W = 2000: the optimal is 2000 (take items 2 and 3).
Solution 1) here is value 1002
Solution 2) here is value 1000
max(1002, 1000) = 1002
Greedy got 1002 out of an optimal 2000, i.e. roughly half the optimal value.
Equivalently, you could say the algorithm had an approximation ratio of about 2 on this instance (2000/1002 ≈ 2).
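Checking with the sketch above (again with W = 2000):

    print(greedy_knapsack_v2([(2, 1), (1000, 1000), (1000, 1000)], 2000))  # 1002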
----------
Approximation scheme for knapsack
Our existing dynamic programming algorithm for knapsack is fast when the weights of items are small.
You can come up with another dynamic programming algorithm for knapsack that's fast when the values of items are small.
If you do this, then...
v1 = 1746375, w1 = ...
v2 = 2917386, w2 = ...
These are huge values; and we have an algorithm that is fast when values are small.
Let's divide all the values by, say, 100000
v1 = 17, w1 = ...
v2 = 29, w2 = ...
Now the values are small; use dynamic programming on this new instance.
I could have also divided by 2, or 1000000...
The more you divide by, the smaller your values become => faster runtime.
However, the more you divide by, the more accuracy you lose: rounding throws away a bigger chunk of each value.
If you want lots of accuracy, then you’re willing to wait a while for your answer => divide the values by a small number
If you want less accuracy but quickly => divide by big numbers
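Here's a sketch of this scaling idea end to end (my reconstruction; the value-based DP and the sample weights/capacity are assumptions, since the lecture left them out):

    # DP that is fast when VALUES are small:
    # min_weight[val] = least total weight achieving total value exactly val.
    def knapsack_by_value(values, weights, W):
        total = sum(values)
        INF = float("inf")
        min_weight = [0] + [INF] * total
        for v, w in zip(values, weights):
            for val in range(total, v - 1, -1):
                if min_weight[val - v] + w < min_weight[val]:
                    min_weight[val] = min_weight[val - v] + w
        return max(val for val in range(total + 1) if min_weight[val] <= W)

    # Scale values down by K, solve the small instance, scale the answer back.
    # The result is what the rounded values add up to: a lower bound on the
    # true value of the chosen items. Bigger K => smaller table => faster,
    # but more rounding error.
    def approx_knapsack(values, weights, W, K):
        scaled = [v // K for v in values]
        return knapsack_by_value(scaled, weights, W) * K

    # Hypothetical weights and capacity; the scaled values become 17 and 29.
    print(approx_knapsack([1746375, 2917386], [5, 7], 10, 100000))
    # 2900000, close to the true optimum 2917386 (item 2 alone)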