CS代写 Analysis of Algorithms, I

Analysis of Algorithms, I
CSOR W4231

Computer Science Department

Copyright By PowCoder代写 加微信 powcoder

Columbia University
Representative NP-complete problems: TSP, Set Cover

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

Complexity classes P, NP and NP-complete
Definition 1.
We define P to be the set of problems that can be solved by polynomial-time algorithms.
Definition 2.
We define NP to be the set of decision problems that have an efficient certifier.
Definition 4.
A problem X(D) is NP-complete if 1. X(D)∈NP and
2. forallY ∈NP,Y ≤P X.

How do we show that a problem is NP-complete?
Suppose we had an NP-complete problem X.
To show that another problem Y is NP-complete, we use
transitivity of reductions. So we “only” need show that 1. Y ∈NP
The first NP-complete problem Theorem 5 (Cook-Levin). Circuit SAT is NP-complete.

Satisfiability of boolean functions
SAT: Given a formula φ in CNF with n variables and m clauses, is φ satisfiable?
3SAT: Given a formula φ in CNF with n variables and m clauses such that each clause has exactly 3 literals, is φ satisfiable?
Circuit-SAT: Given a boolean combinatorial circuit C, is there an assignment of truth values to its inputs that causes the output to evaluate to 1?
Circuit-SAT ≤P SAT, SAT ≤P 3SAT and 3SAT ≤P IS(D)

Common pitfalls when showing NP-completeness
1. Carry out the reduction in the wrong direction
2. Reduce from a problem not known to be NP-complete
3. Exponential-time transformations
􏰈 Subsets, permutations
4. Neglect to carefully prove both directions of equivalence of the original and the derived instances; that is, x is a yes instance of X if and only if y = R(x) is a yes instance of Y
5. Neglect to show that the problem is in N P
Suggestions
􏰈 You should think carefully which problem is most suitable to reduce from
􏰈 In absence of other ideas, reduce from 3SAT

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

The Traveling Salesman Problem (TSP)
Tour: a simple cycle that visits every vertex exactly once. Definition 7 (TSP(D)).
Given n cities {1, . . . , n}, a set of non-negative distances dij between every pair of cities and a budget B, is there a tour of length ≤ B?
Equivalently, is there a permutation π such that
1. π(1)=π(n+1)=1; thatis,westartandendatcity1 2. the total distance travelled satisfies
􏰌 dπ(i)π(i+1) ≤ B
Application: Google street view car

Example instance of TSP
Depending on the distances, TSP instances may be
􏰈 Asymmetric: dij ̸= dji
􏰈 Symmetric: dij = dji
􏰈 Metric: satisfy the triangle inequality dij ≤ dik + dkj
􏰈 Euclidean: e.g., cities are in R2 hence city i corresponds to point (xi,yi);thendij =􏰗(xi−xj)2+(yi−yj)2

A related problem and hardness of TSP(D)
Hamiltonian Cycle: Given a graph G = (V, E), is there a simple cycle that visits every vertex exactly once?
Hamiltonian Cycle is NP-complete.
Proof: Reduction from 3SAT (e.g., see your textbook).
Symmetric TSP(D) is NP-complete.
Proof: reduction from undirected Hamiltonian Cycle.

Proof of Claim 2 (Hamiltonian Cycle ≤P TSP(D))
1. Start from an arbitrary instance of undirected Hamiltonian Cycle, that is, an undirected graph G = (V, E).
2. Construct the following instance (G′ = (V ′, E′, w), B) of TSP(D): G′ is a complete weighted graph with V ′ = V such that for every edge e ∈ E′,
􏰕1, ife∈E we = 2, otherwise
3. Set the budget B = n.
This completes the reduction transformation.
Equivalence of the instances is straightforward:
􏰈 If G has a hamiltonian cycle, that cycle is a tour of length n in
􏰈 If G′ has a tour of length n, it must consist of edges of weight 1 (why?); thus all these edges appear in G.

Concluding remarks on TSP
􏰈 Claim 1 also holds for directed Hamiltonian cycle. An exact analog of the proof of Claim 2 then shows that asymmetric TSP is NP-complete.
􏰈 It is possible to reduce Hamiltonian cycle to Euclidean TSP, thus showing that even Euclidean TSP is
N P -complete.
􏰈 However, these problems are not similar in terms of how well they can be approximated: it is possible to provide very good approximate solutions to Euclidean TSP, which is not the case for Symmetric TSP.

Packing and partitioning problems
􏰈 Set Packing: given a set U of a elements, a collection S1,S2,…,Sb of subsets of U, and a number k, is there a collection of at least k subsets such that no two of them intersect?
􏰈 3D-Matching:GivendisjointsetsB,G,H,eachofsizen, andasetoftriplesT ⊆B×G×H,isthereasetofn triples in T , no two of which have an element in common?
Reduction from 3SAT.

Numerical problems
􏰈 Subset sum: Given natural numbers w1, . . . , wn and a (large) target weight W , is there a subset of w1, . . . , wn that adds up exactly to W?
Applications: cryptography, scheduling
􏰈 Minimum-weight solution to linear equations: Given a system of linear equations in n variables with integer constants, and an integer B ≤ n, does it have a rational solution with at most B non-zero entries?
Applications: coding theory, signal processing

Similar problems with very different complexities
N P -complete
longest path
shortest path
3D matching
Hamiltonian cycle
Euler cycle
3-colorability
2-colorability
LCS of n sequences
LCS of 2 sequences
More on NP-completeness:
􏰈 Computers and Intractability: A guide to the theory of
NP-completeness, by Garey and Johnson
􏰈 Computational Complexity, by C. Papadimitriou

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

Integer Programming
Integer programming (IP(D)): Given a system of linear inequalities in n variables and m constraints with integer coefficients and a integer target value k, does it have an integer solution of value k?
􏰈 Applications: production planning, scheduling trains, etc.
max cTx subject to Ax ≤ b
Here A is an m × n matrix, b ∈ Rm, c ∈ Rn, x is an integer
vector with n components.
What does the set of feasible solutions look like?

Rounding the LP is often insufficient
max 1.00×1 + 0.64×2 x1 ≥0,×2 ≥0
subject to 50×1 + 31×2 ≤ 250 3×1 − 2×2 ≥ −4
x1, x2 integer
5o o oo o o o 4oooooo 3oooooo 2oooooo 1oooooo
From Integer Programming by L. Wolsey
The optimal linear programming solution is far from the optimal integer solution $(5,0)$.
Optimum LP solution (376/193, 950/193)
Optimum IP solution (5,0)

Is IP(D) hard?
􏰈 IP(D) is in NP.
􏰈 We can quickly solve LPs with several thousands of variables and constraints but there exist integer programs with 10 variables and 10 constraints that are very hard to solve.

Is IP(D) hard?
􏰈 IP(D) is in NP.
􏰈 We can quickly solve LPs with several thousands of variables and constraints but there exist integer programs with 10 variables and 10 constraints that are very hard to solve.
􏰈 This is not too surprising: integer programs restricted to solutions x ∈ {0, 1}n model yes/no decisions, which are generally hard.
􏰈 To formalize this intuition, we will reduce an NP-complete problem to IP(D).

Integer Programs for Vertex Cover and IS
First we formulate integer programs for two NP-hard problems. IP for Independent Set:
subjectto xi+xj ≤1, forevery(i,j)∈E
xi ∈ {0,1}, for every i ∈ V
IP for Vertex Cover:
subjectto xi+xj ≥1, forevery(i,j)∈E
xi ∈ {0,1}, for every i ∈ V

IP(D) is NP-complete
VC(D) ≤P IP(D) Proof.
Reduction from arbitrary instance (G = (V, E), k) of VC(D) to the following integer program with target value k:
subjectto xi+xj ≥1, forevery(i,j)∈E n
􏰌 xi ≤ k i=1
xi ∈ {0,1}, for every i ∈ V Equivalence of the instances is straightforward.

Similar problems with very different complexities (new)
N P -complete
longest path
shortest path
3D matching
Hamiltonian cycle
Euler cycle
3-colorability
2-colorability
LCS of n sequences
LCS of 2 sequences
integer programming
linear programming
The theory of integer and linear programming and duality can guide the design of approximation algorithms,
and exact solutions, for hard problems.

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

Minimum-weight Set Cover
􏰈 a set E = {e1,e2,…,en} of n elements
􏰈 acollectionofsubsetsoftheseelementsS1,S2,…,Sm,
where each Sj ⊆ E
􏰈 a non-negative weight wj for every subset Sj
A minimum-weight collection of subsets that cover all of E. In symbols: find an I ⊆ {1,…,m} such that ∪i∈ISi = E and
􏰋 wi is minimum. i∈I
(Unweighted Set Cover: wj = 1 for all j)

Example instance of Set Cover
n = 8 ground elements, m = 6 subsets with weights w1 =w2 =w3 =w4 =1,w5 =w6 =1+ε.

Motivation: detect computer viruses
Motivation (IBM AntiVirus): detect features of boot sector viruses that do not occur in typical applications; then use them to discover more viruses
􏰈 Ground elements: known boot sector viruses (n ≈ 150)
􏰈 Sets: labelled by some three-byte sequence occurring in these viruses but not occurring in typical computer applications (m ≈ 21000); each set consisted of all the viruses that contained the three-byte sequence
􏰈 Output: a small number of such sequences—much smaller than 150—that cover all known viruses
=⇒ use the small set cover as features in a neural classifier to determine presence of a boot sector virus
=⇒ detect new viruses (many boot sector viruses are written by modifying existing ones)

Reduction via generalization
Set-Cover(D) is NP-complete.
Reduction from VC(D). Input instance: (G = (V, E), k).
􏰈 Set E = {e1,…,em} to be the set of ground elements we
want to cover.
􏰈 For every vertex j, set Sj to be the set of edges (ground
elements) that are incident to–hence covered by–vertex j.
􏰈 Setwj =1forall1≤j≤n.
Equivalence of instances: input graph has a vertex cover of size k if and only if E has a set cover of weight k.

Forming the integer program for Set Cover
Variables: we introduce one variable per set Sj; intuitively, 􏰕 1, if Sj is included in the solution
xj = 0, otherwise Constraints: ensure that every element is covered:
for every element ei, at least one of the sets Sj containing ei appears in the final solution
Objective function: minimize the sum of the weights of the sets included in the solution

An integer programming formulation of Set Cover
Integer program for Set Cover:
min 􏰌 wj xj
subjectto 􏰌 xj ≥1, forevery1≤i≤n
xj ∈{0,1}, forevery1≤j≤m

An integer programming formulation of Set Cover
Integer program for Set Cover:
min 􏰌 wj xj
subjectto 􏰌 xj ≥1, forevery1≤i≤n
xj ∈{0,1}, forevery1≤j≤m
be the optimum value of this integer program; OPT be the value of the optimum solution to Set Cover.
Z∗ =OPT. IP
△ We cannot solve this integer program efficiently (why?).

LP relaxation: a bound for the value of the IP
LP relaxation for Set Cover: m
min 􏰌 wj xj
subjectto 􏰌 xj ≥1, forevery1≤i≤n j :ei ∈Sj

LP relaxation: a bound for the value of the IP
LP relaxation for Set Cover: m
min 􏰌 wj xj
subjectto 􏰌 xj ≥1, forevery1≤i≤n j :ei ∈Sj
􏰈 Every feasible solution to the original IP is a feasible solution to the LP relaxation.
􏰈 The value of any feasible solution to the original IP is the same in the LP (the objectives are the same).
􏰈 Let Z∗ be the optimum value of the LP relaxation. LP
Z∗ ≤Z∗ =OPT LP IP

1 Review of last lecture
2 Representative N P -complete problems
3 Integer Programming
4 Minimum-weight Set Cover
An integer programming formulation of Set Cover The linear program relaxation
5 An approximation algorithm for Set Cover Rounding the LP solution
An f-approximation algorithm for Set Cover

Rounding the solution to the LP
LP relaxation for Set Cover: n
min 􏰌 wj xj
subjectto 􏰌 xj ≥1, forevery1≤i≤n j :ei ∈Sj
􏰈 Let x∗ be an optimal solution to the LP relaxation.
􏰈 Let fi = # subsets Sj where element ei appears.
􏰈 Let f = max fi. 1≤i≤n
􏰕1, ifx∗j≥1/f xˆj= 0,ifx∗j<1/f Rounding yields a feasible solution to the original IP The collection of sets Sj with xˆj = 1 cover all the elements. 􏰈 Consider the optimal solution x∗ for the LP relaxation. 􏰈 Fix any element ei; recall that ei appears in fi subsets. 􏰈 Forsimplicity,relabelthesesubsetsasS1,S2,...,Sfi.Then the optimal solution satisfies the constraint x∗1 +x∗2 +...+x∗fi ≥1 Let x∗m be the maximum of x∗1,x∗2,...,x∗fi. Then x∗m ≥ f1 ≥ f1 i ⇒ Our rounding procedure guarantees that, for every element ei, at least one set Sj that covers ei is chosen. An f-approximation algorithm for Set Cover How far is the solution obtained by the rounding procedure above from to the optimal solution to Set Cover? 􏰈 We do not know OPT! 􏰈 But we have a bound for it: the value Z∗ of the LP relaxation! Recall that we set xˆj = 1 if and only if x∗j ≥ 1/f. Then 􏰌wjxˆj ≤ 􏰌wj(fx∗j) = f 􏰌wjx∗j = f·Z∗ ≤f·OPT LP Approximation algorithms Definition 8. An α-approximation algorithm for an optimization problem is a polynomial-time algorithm that, for all instances of the problem, produces a solution whose value is within a factor of α of the value of the optimal solution. 􏰈 α is the approximation ratio or approximation factor 􏰈 For minimization problems, α > 1.
􏰈 For maximization problems, α < 1. Example 1: the rounding procedure described on slide 53 gives an f-approximation algorithm for Set Cover: 􏰈 it can be completed in polynomial-time 􏰈 it always returns a solution whose value is at most f times the value of the optimal solution. Remark: if an element appears in too many sets (e.g., f = Ω(n)), this algorithm does not provide a good approximation guarantee. Example 2: a 2-approximation algorithm for VC is a polynomial-time algorithm that always returns a solution whose value is at most twice the value of the optimal solution. A 2-approximation algorithm for V C 􏰈 Let E = {e1,...,em} be the set of edges in the graph. 􏰈 Let Sj be the set of edges (ground elements) that are covered by vertex j. 􏰈 For every edge ei, fi = 2: ei appears in exactly two subsets 􏰈 Hence f = max fi = 2. 1≤i≤m 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com