Analysis of Algorithms, I
CSOR W4231
Computer Science Department
Copyright By PowCoder代写 加微信 powcoder
Columbia University
Representative NP-complete problems: TSP, Set Cover
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
Complexity classes P, NP and NP-complete
Definition 1.
We define P to be the set of problems that can be solved by polynomial-time algorithms.
Definition 2.
We define NP to be the set of decision problems that have an efficient certifier.
Definition 4.
A problem X(D) is NP-complete if 1. X(D)∈NP and
2. forallY ∈NP,Y ≤P X.
How do we show that a problem is NP-complete?
Suppose we had an NP-complete problem X.
To show that another problem Y is NP-complete, we use
transitivity of reductions. So we “only” need show that 1. Y ∈NP
The first NP-complete problem Theorem 5 (Cook-Levin). Circuit SAT is NP-complete.
Satisfiability of boolean functions
SAT: Given a formula φ in CNF with n variables and m clauses, is φ satisfiable?
3SAT: Given a formula φ in CNF with n variables and m clauses such that each clause has exactly 3 literals, is φ satisfiable?
Circuit-SAT: Given a boolean combinatorial circuit C, is there an assignment of truth values to its inputs that causes the output to evaluate to 1?
Circuit-SAT ≤P SAT, SAT ≤P 3SAT and 3SAT ≤P IS(D)
Common pitfalls when showing NP-completeness
1. Carry out the reduction in the wrong direction
2. Reduce from a problem not known to be NP-complete
3. Exponential-time transformations
Subsets, permutations
4. Neglect to carefully prove both directions of equivalence of the original and the derived instances; that is, x is a yes instance of X if and only if y = R(x) is a yes instance of Y
5. Neglect to show that the problem is in N P
Suggestions
You should think carefully which problem is most suitable to reduce from
In absence of other ideas, reduce from 3SAT
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
The Traveling Salesman Problem (TSP)
Tour: a simple cycle that visits every vertex exactly once. Definition 7 (TSP(D)).
Given n cities {1, . . . , n}, a set of non-negative distances dij between every pair of cities and a budget B, is there a tour of length ≤ B?
Equivalently, is there a permutation π such that
1. π(1)=π(n+1)=1; thatis,westartandendatcity1 2. the total distance travelled satisfies
dπ(i)π(i+1) ≤ B
Application: Google street view car
Example instance of TSP
Depending on the distances, TSP instances may be
Asymmetric: dij ̸= dji
Symmetric: dij = dji
Metric: satisfy the triangle inequality dij ≤ dik + dkj
Euclidean: e.g., cities are in R2 hence city i corresponds to point (xi,yi);thendij =(xi−xj)2+(yi−yj)2
A related problem and hardness of TSP(D)
Hamiltonian Cycle: Given a graph G = (V, E), is there a simple cycle that visits every vertex exactly once?
Hamiltonian Cycle is NP-complete.
Proof: Reduction from 3SAT (e.g., see your textbook).
Symmetric TSP(D) is NP-complete.
Proof: reduction from undirected Hamiltonian Cycle.
Proof of Claim 2 (Hamiltonian Cycle ≤P TSP(D))
1. Start from an arbitrary instance of undirected Hamiltonian Cycle, that is, an undirected graph G = (V, E).
2. Construct the following instance (G′ = (V ′, E′, w), B) of TSP(D): G′ is a complete weighted graph with V ′ = V such that for every edge e ∈ E′,
1, ife∈E we = 2, otherwise
3. Set the budget B = n.
This completes the reduction transformation.
Equivalence of the instances is straightforward:
If G has a hamiltonian cycle, that cycle is a tour of length n in
If G′ has a tour of length n, it must consist of edges of weight 1 (why?); thus all these edges appear in G.
Concluding remarks on TSP
Claim 1 also holds for directed Hamiltonian cycle. An exact analog of the proof of Claim 2 then shows that asymmetric TSP is NP-complete.
It is possible to reduce Hamiltonian cycle to Euclidean TSP, thus showing that even Euclidean TSP is
N P -complete.
However, these problems are not similar in terms of how well they can be approximated: it is possible to provide very good approximate solutions to Euclidean TSP, which is not the case for Symmetric TSP.
Packing and partitioning problems
Set Packing: given a set U of a elements, a collection S1,S2,…,Sb of subsets of U, and a number k, is there a collection of at least k subsets such that no two of them intersect?
3D-Matching:GivendisjointsetsB,G,H,eachofsizen, andasetoftriplesT ⊆B×G×H,isthereasetofn triples in T , no two of which have an element in common?
Reduction from 3SAT.
Numerical problems
Subset sum: Given natural numbers w1, . . . , wn and a (large) target weight W , is there a subset of w1, . . . , wn that adds up exactly to W?
Applications: cryptography, scheduling
Minimum-weight solution to linear equations: Given a system of linear equations in n variables with integer constants, and an integer B ≤ n, does it have a rational solution with at most B non-zero entries?
Applications: coding theory, signal processing
Similar problems with very different complexities
N P -complete
longest path
shortest path
3D matching
Hamiltonian cycle
Euler cycle
3-colorability
2-colorability
LCS of n sequences
LCS of 2 sequences
More on NP-completeness:
Computers and Intractability: A guide to the theory of
NP-completeness, by Garey and Johnson
Computational Complexity, by C. Papadimitriou
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
Integer Programming
Integer programming (IP(D)): Given a system of linear inequalities in n variables and m constraints with integer coefficients and a integer target value k, does it have an integer solution of value k?
Applications: production planning, scheduling trains, etc.
max cTx subject to Ax ≤ b
Here A is an m × n matrix, b ∈ Rm, c ∈ Rn, x is an integer
vector with n components.
What does the set of feasible solutions look like?
Rounding the LP is often insufficient
max 1.00×1 + 0.64×2 x1 ≥0,×2 ≥0
subject to 50×1 + 31×2 ≤ 250 3×1 − 2×2 ≥ −4
x1, x2 integer
5o o oo o o o 4oooooo 3oooooo 2oooooo 1oooooo
From Integer Programming by L. Wolsey
The optimal linear programming solution is far from the optimal integer solution $(5,0)$.
Optimum LP solution (376/193, 950/193)
Optimum IP solution (5,0)
Is IP(D) hard?
IP(D) is in NP.
We can quickly solve LPs with several thousands of variables and constraints but there exist integer programs with 10 variables and 10 constraints that are very hard to solve.
Is IP(D) hard?
IP(D) is in NP.
We can quickly solve LPs with several thousands of variables and constraints but there exist integer programs with 10 variables and 10 constraints that are very hard to solve.
This is not too surprising: integer programs restricted to solutions x ∈ {0, 1}n model yes/no decisions, which are generally hard.
To formalize this intuition, we will reduce an NP-complete problem to IP(D).
Integer Programs for Vertex Cover and IS
First we formulate integer programs for two NP-hard problems. IP for Independent Set:
subjectto xi+xj ≤1, forevery(i,j)∈E
xi ∈ {0,1}, for every i ∈ V
IP for Vertex Cover:
subjectto xi+xj ≥1, forevery(i,j)∈E
xi ∈ {0,1}, for every i ∈ V
IP(D) is NP-complete
VC(D) ≤P IP(D) Proof.
Reduction from arbitrary instance (G = (V, E), k) of VC(D) to the following integer program with target value k:
subjectto xi+xj ≥1, forevery(i,j)∈E n
xi ≤ k i=1
xi ∈ {0,1}, for every i ∈ V Equivalence of the instances is straightforward.
Similar problems with very different complexities (new)
N P -complete
longest path
shortest path
3D matching
Hamiltonian cycle
Euler cycle
3-colorability
2-colorability
LCS of n sequences
LCS of 2 sequences
integer programming
linear programming
The theory of integer and linear programming and duality can guide the design of approximation algorithms,
and exact solutions, for hard problems.
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
Minimum-weight Set Cover
a set E = {e1,e2,…,en} of n elements
acollectionofsubsetsoftheseelementsS1,S2,…,Sm,
where each Sj ⊆ E
a non-negative weight wj for every subset Sj
A minimum-weight collection of subsets that cover all of E. In symbols: find an I ⊆ {1,…,m} such that ∪i∈ISi = E and
wi is minimum. i∈I
(Unweighted Set Cover: wj = 1 for all j)
Example instance of Set Cover
n = 8 ground elements, m = 6 subsets with weights w1 =w2 =w3 =w4 =1,w5 =w6 =1+ε.
Motivation: detect computer viruses
Motivation (IBM AntiVirus): detect features of boot sector viruses that do not occur in typical applications; then use them to discover more viruses
Ground elements: known boot sector viruses (n ≈ 150)
Sets: labelled by some three-byte sequence occurring in these viruses but not occurring in typical computer applications (m ≈ 21000); each set consisted of all the viruses that contained the three-byte sequence
Output: a small number of such sequences—much smaller than 150—that cover all known viruses
=⇒ use the small set cover as features in a neural classifier to determine presence of a boot sector virus
=⇒ detect new viruses (many boot sector viruses are written by modifying existing ones)
Reduction via generalization
Set-Cover(D) is NP-complete.
Reduction from VC(D). Input instance: (G = (V, E), k).
Set E = {e1,…,em} to be the set of ground elements we
want to cover.
For every vertex j, set Sj to be the set of edges (ground
elements) that are incident to–hence covered by–vertex j.
Setwj =1forall1≤j≤n.
Equivalence of instances: input graph has a vertex cover of size k if and only if E has a set cover of weight k.
Forming the integer program for Set Cover
Variables: we introduce one variable per set Sj; intuitively, 1, if Sj is included in the solution
xj = 0, otherwise Constraints: ensure that every element is covered:
for every element ei, at least one of the sets Sj containing ei appears in the final solution
Objective function: minimize the sum of the weights of the sets included in the solution
An integer programming formulation of Set Cover
Integer program for Set Cover:
min wj xj
subjectto xj ≥1, forevery1≤i≤n
xj ∈{0,1}, forevery1≤j≤m
An integer programming formulation of Set Cover
Integer program for Set Cover:
min wj xj
subjectto xj ≥1, forevery1≤i≤n
xj ∈{0,1}, forevery1≤j≤m
be the optimum value of this integer program; OPT be the value of the optimum solution to Set Cover.
Z∗ =OPT. IP
△ We cannot solve this integer program efficiently (why?).
LP relaxation: a bound for the value of the IP
LP relaxation for Set Cover: m
min wj xj
subjectto xj ≥1, forevery1≤i≤n j :ei ∈Sj
LP relaxation: a bound for the value of the IP
LP relaxation for Set Cover: m
min wj xj
subjectto xj ≥1, forevery1≤i≤n j :ei ∈Sj
Every feasible solution to the original IP is a feasible solution to the LP relaxation.
The value of any feasible solution to the original IP is the same in the LP (the objectives are the same).
Let Z∗ be the optimum value of the LP relaxation. LP
Z∗ ≤Z∗ =OPT LP IP
1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover
Rounding the solution to the LP
LP relaxation for Set Cover: n
min wj xj
subjectto xj ≥1, forevery1≤i≤n j :ei ∈Sj
Let x∗ be an optimal solution to the LP relaxation.
Let fi = # subsets Sj where element ei appears.
Let f = max fi. 1≤i≤n
1, ifx∗j≥1/f xˆj= 0,ifx∗j<1/f
Rounding yields a feasible solution to the original IP
The collection of sets Sj with xˆj = 1 cover all the elements.
Consider the optimal solution x∗ for the LP relaxation.
Fix any element ei; recall that ei appears in fi subsets.
Forsimplicity,relabelthesesubsetsasS1,S2,...,Sfi.Then the optimal solution satisfies the constraint
x∗1 +x∗2 +...+x∗fi ≥1
Let x∗m be the maximum of x∗1,x∗2,...,x∗fi. Then
x∗m ≥ f1 ≥ f1 i
⇒ Our rounding procedure guarantees that, for every element ei, at least one set Sj that covers ei is chosen.
An f-approximation algorithm for Set Cover
How far is the solution obtained by the rounding procedure above from to the optimal solution to Set Cover?
We do not know OPT!
But we have a bound for it: the value Z∗ of the LP
relaxation!
Recall that we set xˆj = 1 if and only if x∗j ≥ 1/f. Then
wjxˆj ≤ wj(fx∗j) = f wjx∗j
= f·Z∗ ≤f·OPT LP
Approximation algorithms
Definition 8.
An α-approximation algorithm for an optimization problem is a polynomial-time algorithm that, for all instances of the problem, produces a solution whose value is within a factor of α of the value of the optimal solution.
α is the approximation ratio or approximation factor For minimization problems, α > 1.
For maximization problems, α < 1.
Example 1: the rounding procedure described on slide 53 gives an f-approximation algorithm for Set Cover:
it can be completed in polynomial-time
it always returns a solution whose value is at most f times
the value of the optimal solution.
Remark: if an element appears in too many sets (e.g., f = Ω(n)), this algorithm does not provide a good approximation guarantee.
Example 2: a 2-approximation algorithm for VC is a polynomial-time algorithm that always returns a solution whose value is at most twice the value of the optimal solution.
A 2-approximation algorithm for V C
Let E = {e1,...,em} be the set of edges in the graph. Let Sj be the set of edges (ground elements) that are
covered by vertex j.
For every edge ei, fi = 2: ei appears in exactly two subsets
Hence f = max fi = 2. 1≤i≤m
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com