
Lecture 12:
Coping with hardness + Randomized Algorithms
The University of Sydney
Page 1

Algorithms and hardness
Algorithmic techniques:
– Greedy
– Divide & Conquer
– Dynamic programming
– Network flow
Hardness:
– NP-hardness: a polynomial-time algorithm is unlikely
Today:
– How can we cope with hard problems?

Coping With NP-Completeness
Question: What should I do if I need to solve an NP-complete problem?
Answer: Theory says you’re unlikely to find a poly-time algorithm. You must sacrifice one of three desired features:
– Solve the problem to optimality.
• Approximation algorithms
• Randomized algorithms
– Solve the problem in polynomial time.
• Exact exponential-time algorithms
– Solve arbitrary instances of the problem.
• Solve restricted classes of instances
• Parametrized algorithms

10.2 Solving NP-Hard Problems on restricted input instances
For example, special classes of graphs:
– trees,
– bipartite graphs,
– planar graphs,
– …

Independent Set on Trees
INDEPENDENT-SET: Given a graph G = (V, E) and an integer k, is there a subset of vertices S ⊆ V such that |S| ≥ k, and for each edge at most one of its endpoints is in S?

Problem: Given a tree, find a maximum independent set.

Key observation: If v is a leaf with neighbour u, there exists a maximum-size independent set containing v.

Proof: [exchange argument]
– Consider a maximum-cardinality independent set S.
– If v ∈ S, we’re done.
– If u ∉ S and v ∉ S, then S ∪ {v} is independent ⇒ S is not maximum.
– If u ∈ S and v ∉ S, then (S ∪ {v}) \ {u} is independent. (exchange) ▪

Independent Set on Trees: Greedy Algorithm
Theorem: The following greedy algorithm finds a maximum cardinality independent set in forests (and hence trees).
Independent-Set-In-A-Forest(F) {
   S ← ∅
   while (F has at least one edge) {
      Let e = (u,v) be an edge such that v is a leaf
      Add v to S
      Delete from F nodes u and v, and all edges incident to them
   }
   Add remaining vertices to S
   return S
}

Proof: Correctness follows from the previous key observation. ▪

Remark: Can implement in O(n) time by considering nodes in postorder.
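The greedy procedure above can be sketched in Python. This is an illustrative implementation, not from the slides: the function name and the adjacency-list representation are my own choices. It keeps a stack of current leaves (degree ≤ 1 vertices), which is what makes the linear-time behaviour of the remark achievable.

```python
def max_independent_set_forest(adj):
    """Greedy maximum independent set in a forest.

    adj: dict mapping each vertex to the set of its neighbours.
    """
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a local copy
    S = set()
    # Vertices of current degree <= 1: leaves or isolated vertices.
    stack = [v for v in adj if len(adj[v]) <= 1]
    while stack:
        v = stack.pop()
        if v not in adj:
            continue                  # already deleted
        if not adj[v]:                # isolated vertex: always take it
            S.add(v)
            del adj[v]
            continue
        # v is a leaf with unique neighbour u: take v, delete both u and v.
        u = next(iter(adj[v]))
        S.add(v)
        for x in (v, u):
            for w in adj[x]:          # remove x from its neighbours' lists
                adj[w].discard(x)
                if len(adj[w]) <= 1:  # w became a leaf (or isolated)
                    stack.append(w)
            del adj[x]
    return S
```

On the path a–b–c–d this returns an independent set of size 2, and on a star it returns all the leaves, matching the key observation.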

Independent Set on Trees
INDEPENDENT-SET: Given a graph G = (V, E) and an integer k, is there a subset of vertices S ⊆ V such that |S| ≥ k, and for each edge at most one of its endpoints is in S?

Problem: Given a tree, find a maximum independent set.

Theorem: INDEPENDENT-SET on trees can be solved in O(n) time.

Weighted Independent Set on Trees
Weighted independent set on trees: Given a tree and node weights wv > 0, find an independent set S that maximizes Σv∈S wv.

[Figure: tree rooted at r with internal node u; children(u) = { v, w, x }]


Weighted Independent Set on Trees
Weighted independent set on trees: Given a tree and node weights wv > 0, find an independent set S that maximizes Σv∈S wv.

Observation: If (u, v) is an edge such that v is a leaf node, then either OPT includes u, or it includes all leaf nodes incident to u.

Dynamic programming solution: Root the tree at some node, say r.
– OPTin(u) = max weight of an independent set rooted at u that contains u.
– OPTout(u) = max weight of an independent set rooted at u that does not contain u.

   OPTin(u) = wu + Σv ∈ children(u) OPTout(v)
   OPTout(u) = Σv ∈ children(u) max { OPTin(v), OPTout(v) }

[Figure: tree rooted at r; children(u) = { v, w, x }]

Weighted Independent Set on Trees: DP Algorithm
Theorem: The dynamic programming algorithm finds a maximum weight independent set in a tree in O(n) time.

Weighted-Independent-Set-In-A-Tree(T) {
   Root the tree at a node r
   foreach (node u of T in postorder) {
      if (u is a leaf) {
         Min[u]  = wu
         Mout[u] = 0
      } else {
         Min[u]  = wu + Σv∈children(u) Mout[v]
         Mout[u] = Σv∈children(u) max(Mout[v], Min[v])
      }
   }
   return max(Min[r], Mout[r])
}

Proof: Takes O(n) time since we visit nodes in postorder and examine each edge exactly once. ▪
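A compact Python sketch of this DP (the tree representation and names are my own, not from the slides): compute a postorder iteratively, then fill Min/Mout exactly as in the recurrences above.

```python
def max_weight_independent_set_tree(children, w, root):
    """children: dict node -> list of children; w: dict node -> weight."""
    # Compute an order with parents before descendants, iteratively
    # (avoids Python's recursion limit on deep trees).
    stack, order = [root], []
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, []))
    M_in, M_out = {}, {}
    for u in reversed(order):             # descendants are processed first
        kids = children.get(u, [])
        M_in[u] = w[u] + sum(M_out[v] for v in kids)              # u in S
        M_out[u] = sum(max(M_in[v], M_out[v]) for v in kids)      # u not in S
    return max(M_in[root], M_out[root])
```

On the path r–a–b with weights 2, 3, 2 the optimum is {r, b} with weight 4; on a star whose leaves outweigh the centre, the leaves are taken.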

10.1 Solving restricted instances
For example, when the size of the vertex cover is small.

10.1 Finding Small Vertex Covers
VERTEX COVER: Given a graph G = (V, E) and an integer k, is there a subset of vertices S ⊆ V such that |S| ≤ k, and for each edge (u, v) either u ∈ S, or v ∈ S, or both?

[Figure: example graph on vertices 1–10 with k = 4 and vertex cover S = { 3, 6, 7, 10 }]

Vertex Cover
VERTEX COVER: Given a graph G = (V, E) and an integer k, is there a subset of vertices S ⊆ V such that |S| ≤ k, and for each edge (u, v) either u ∈ S, or v ∈ S, or both?

Vertex Cover (or Independent Set) arises naturally in many applications:
– dynamic detection of race conditions (distributed systems)
– computational biology
– biochemistry
– pattern recognition
– computer vision
– …

What should we do if we need to solve it?


Finding Small Vertex Covers
Question: What if k is small?

Brute force: O(k·n^(k+1)).
– Try all (n choose k) ∈ O(n^k) subsets of size k.
– Takes O(kn) time to check whether a subset is a vertex cover.

Aim: Limit the exponential dependency on k, e.g., to O(2^k·kn).

Example: n = 1000, k = 10.
– Brute force: k·n^(k+1) = 10^34 ⇒ infeasible.
– Better: 2^k·kn ≈ 10^7 ⇒ feasible.

Remark: If k is a constant, the algorithm runs in polynomial time; if k is a small constant, it is also practical.


Finding Small Vertex Covers
Theorem: Let (u,v) be an edge of G. G has a vertex cover of size ≤ k iff at least one of G\{u} and G\{v} has a vertex cover of size ≤ k−1. (G\{u} denotes G with u and all its incident edges deleted.)

Proof:
(⇒)
– Suppose G has a vertex cover S of size ≤ k.
– S contains either u or v (or both). Assume it contains u.
– S\{u} is a vertex cover of G\{u}.
(⇐)
– Suppose S is a vertex cover of G\{u} of size ≤ k−1.
– Then S ∪ {u} is a vertex cover of G of size ≤ k. ▪

Observation: If G has a vertex cover of size k, it has ≤ k(n−1) edges.
Proof: Each vertex can cover at most n−1 edges. ▪


Finding Small Vertex Covers: Algorithm
Theorem: The following algorithm determines whether G has a vertex cover of size ≤ k in O(2^k·kn) time.

boolean Vertex-Cover(G, k) {
   if (G contains no edges)        return true
   if (G contains > k(n-1) edges)  return false
   let (u,v) be any edge of G
   a = Vertex-Cover(G\{u}, k-1)
   b = Vertex-Cover(G\{v}, k-1)
   return a or b
}

Proof:
– Correctness follows from the previous two claims.
– The recursion tree is binary with depth ≤ k, so it has ≤ 2^(k+1) nodes; each invocation takes O(kn) time. ▪
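The branching algorithm translates directly to Python. A sketch on an edge-list representation (the representation and function name are my own choices, not from the slides):

```python
def has_vertex_cover(edges, k):
    """True iff the graph with edge set `edges` has a vertex cover of size <= k."""
    if not edges:
        return True
    if k == 0:
        return False
    # Pruning rule from the slides: a graph with a size-k cover
    # has at most k*(n-1) edges.
    n = len({v for e in edges for v in e})
    if len(edges) > k * (n - 1):
        return False
    u, v = edges[0]                          # branch on an arbitrary edge
    edges_wo_u = [e for e in edges if u not in e]   # take u into the cover
    edges_wo_v = [e for e in edges if v not in e]   # take v into the cover
    return has_vertex_cover(edges_wo_u, k - 1) or has_vertex_cover(edges_wo_v, k - 1)
```

The recursion tree has depth at most k with two children per node, matching the O(2^k·kn) bound. A triangle needs two vertices; a path on three vertices needs only its middle one.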

Finding Small Vertex Covers: Algorithm
Theorem: Vertex Cover can be solved in O(2^k·kn) time. This is fine as long as k is a small constant.

What if k is not a small constant?

Approximation Algorithms
Sacrifice optimality, but still run in polynomial time.

Approximation ratio = (cost of approximate solution) / (cost of optimal solution)

An approximation algorithm for a minimization problem requires an approximation guarantee c:
• approximation ratio ≤ c, i.e.,
• cost of the approximate solution ≤ c · cost of an optimal solution

Vertex Cover

Vertex Cover
Optimization version: Given an undirected graph G = (V, E), find a smallest vertex cover C.

First try: Pick an uncovered edge e = (u,v), choose one of u and v to add to the cover, and repeat.

On a star graph the approximation factor can be Ω(n): if the rule always picks the leaf endpoint, it takes all n−1 leaves, while the optimal cover is just the centre. (Example: a star graph with 4 vertices.)

Vertex Cover
Weak duality: Let M be a matching, and let C be a vertex cover. Then |M| ≤ |C|.

Proof: Every edge of M must be covered by some vertex of C, and each vertex can cover at most one edge of any matching. ▪

[Figure: bipartite graph on vertices 1–5 and 1′–5′ with matching M = {1–2′, 3–1′, 4–5′}, |M| = 3]

Vertex Cover
Weak duality: Let M be a matching, and let C be a vertex cover. Then |M| ≤ |C|.

Greedy algorithm:
– Initialise S = ∅.
– While there exists an edge e = (u,v) not covered by S:
   – add both u and v to S.

Theorem: S is a 2-approximation.

Proof:
– Let M be the set of edges picked in the while loop; M is a matching (its edges share no endpoints).
– Let C* be a smallest vertex cover.
– |S| = 2|M| ≤ 2|C*|. ▪

It is not known how to do better! Approximating vertex cover within a factor better than 1.3606 is NP-hard.
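The greedy 2-approximation is only a few lines of Python (an illustrative sketch on an edge list; the names are mine):

```python
def vertex_cover_2approx(edges):
    """Matching-based 2-approximation for vertex cover."""
    S = set()
    for u, v in edges:                 # any fixed order of the edges
        if u not in S and v not in S:  # edge (u,v) is not yet covered:
            S.add(u)                   # take BOTH endpoints; the chosen
            S.add(v)                   # edges form a matching M
    return S
```

On a star with centre 0 it picks one edge and returns two vertices, twice the optimal cover of size 1, which is exactly the worst case allowed by the analysis.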

11.1 Approximation algorithms: Load Balancing


Load Balancing
Input: m identical machines; n jobs, where job j has processing time tj.
– Job j must run contiguously on one machine.
– A machine can process at most one job at a time.

Definition: Let J(i) be the subset of jobs assigned to machine i. The load of machine i is Li = Σj∈J(i) tj.

Example: J(1) = {A,E,H,I}, J(2) = {B,D,G}, J(3) = {C,F,J}.

Definition: The makespan is the maximum load on any machine: L = maxi Li.

Load balancing: Assign each job to a machine so as to minimize the makespan.

[Figure: Gantt chart of the example schedule on machines 1–3]

Load Balancing: List Scheduling
List-scheduling algorithm:
– Consider the n jobs in some fixed order.
– Assign job j to the machine whose current load is smallest.

List-Scheduling(m, n, t1,t2,…,tn) {
   for i = 1 to m {
      Li ← 0             // load on machine i
      J(i) ← ∅           // jobs assigned to machine i
   }
   for j = 1 to n {
      i = argmink Lk     // machine i has smallest load
      J(i) ← J(i) ∪ {j}  // assign job j to machine i
      Li ← Li + tj       // update load of machine i
   }
}

Implementation: O(n log n) time using a priority queue.
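A sketch of list scheduling using Python's heapq as the priority queue mentioned above (the function name and return format are my own choices):

```python
import heapq

def list_schedule(m, times):
    """Assign each job (in the given order) to the machine with smallest load."""
    heap = [(0, i) for i in range(m)]          # (load, machine index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)          # machine with smallest load
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))    # update its load
    makespan = max(load for load, _ in heap)
    return makespan, assignment
```

Each job costs O(log m) heap work, so the loop runs in O(n log m) ⊆ O(n log n) time. On the tight example from the analysis below the deck (m(m−1) unit jobs followed by one job of length m), it produces makespan 2m − 1.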

Load Balancing: List Scheduling

[Figure: step-by-step run of list scheduling on jobs A–J with three machines, ending with machine 1 = {A, E, H, I}, machine 2 = {B, D, G}, machine 3 = {C, F, J}]

Is this a good schedule?
– The schedule may not be optimal (minimum makespan).
– How do we prove that statement?
– We only need to provide a counterexample.


Load Balancing: List Scheduling Analysis
Theorem [Graham, 1966]: The greedy (list-scheduling) algorithm is a 2-approximation.
– First worst-case analysis of an approximation algorithm.
– Need to compare the resulting solution with the optimal makespan L*.

Lemma 1: The optimal makespan satisfies L* ≥ maxj tj.
Proof: Some machine must process the most time-consuming job. ▪

Lemma 2: The optimal makespan satisfies L* ≥ (1/m) Σj tj.
Proof:
– The total processing time is Σj tj.
– One of the m machines must do at least a 1/m fraction of the total work. ▪


Load Balancing: List Scheduling Analysis
Theorem: The greedy algorithm is a 2-approximation.
Proof: Consider the load Li of the bottleneck machine i (the machine whose load equals the makespan).
– Let j be the last job scheduled on machine i.
– When job j was assigned to machine i, i had the smallest load. Its load before the assignment was Li − tj ⇒ Li − tj ≤ Lk for all 1 ≤ k ≤ m.
– Sum these inequalities over all k and divide by m:

      Li − tj ≤ (1/m) Σk Lk = (1/m) Σk tk ≤ L*        (Lemma 2)

– Now, since tj ≤ maxj tj ≤ L* (Lemma 1):

      Li = (Li − tj) + tj ≤ L* + L* = 2L*. ▪

[Figure: machine i's schedule; the jobs before j end at Li − tj, and job j ends at L = Li]

Load Balancing: List Scheduling Analysis
Question: Is our analysis tight? I.e., is a smaller bound possible?
Answer: Yes… more or less.

Example: m machines, m(m−1) jobs of length 1, plus one job of length m.
– List scheduling spreads the unit jobs evenly, giving every machine load m−1, and then places the length-m job on one machine: makespan = 2m − 1.
– The optimum puts the length-m job on its own machine and m unit jobs on each of the other m−1 machines: makespan = m.
– The ratio (2m − 1)/m tends to 2 as m grows.

[Figure: m = 10; list-scheduling makespan = 19 (machines 2–10 idle at the end), optimal makespan = 10]

Load Balancing: LPT Rule
Longest processing time (LPT) rule: Sort the n jobs in descending order of processing time, then run the list-scheduling algorithm.

LPT-List-Scheduling(m, n, t1,t2,…,tn) {
   Sort jobs so that t1 ≥ t2 ≥ … ≥ tn
   for i = 1 to m {
      Li ← 0             // load on machine i
      J(i) ← ∅           // jobs assigned to machine i
   }
   for j = 1 to n {
      i = argmink Lk     // machine i has smallest load
      J(i) ← J(i) ∪ {j}  // assign job j to machine i
      Li ← Li + tj       // update load of machine i
   }
}
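LPT is just a sort followed by list scheduling; a minimal Python sketch with a heap (names are my own):

```python
import heapq

def lpt_schedule(m, times):
    """LPT rule: sort jobs in descending order, then list-schedule greedily."""
    heap = [(0, i) for i in range(m)]          # (load, machine index)
    heapq.heapify(heap)
    for t in sorted(times, reverse=True):      # longest processing time first
        load, i = heapq.heappop(heap)          # machine with smallest load
        heapq.heappush(heap, (load + t, i))
    return max(load for load, _ in heap)       # makespan
```

On the bad instance for plain list scheduling (m(m−1) unit jobs plus one job of length m), LPT places the long job first and achieves the optimal makespan m.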

Load Balancing: LPT Rule
Observation: If there are at most m jobs, then list scheduling is optimal.
Proof: Each job is put on its own machine. ▪

Lemma 3: If there are more than m jobs, then L* ≥ 2·tm+1.
Proof:
– Consider the first m+1 jobs t1, …, tm+1.
– Since the ti's are in descending order, each takes at least tm+1 time.
– There are m+1 jobs and m machines, so by the pigeonhole principle, at least one machine gets two of them, each of length ≥ tm+1. ▪

Theorem: The LPT rule is a 3/2-approximation algorithm.
Proof: Same basic approach as for list scheduling. By the observation we may assume the number of jobs exceeds m, so the last job j on the bottleneck machine satisfies tj ≤ tm+1 ≤ L*/2 (Lemma 3), and

      Li = (Li − tj) + tj ≤ L* + (1/2)L* = (3/2)L*. ▪

Load Balancing: LPT Rule
Observation: If there are at most m jobs, then list scheduling is optimal.
Lemma 3: If there are more than m jobs, then L* ≥ 2·tm+1.

Theorem: The LPT rule is a 3/2-approximation algorithm.
Proof: Same basic approach as for list scheduling. Let j be the last job scheduled on the bottleneck machine i.

Case 1: j ≥ m + 1. Then tj ≤ tm+1 ≤ L*/2 (Lemma 3), so
      Li = (Li − tj) + tj ≤ L* + (1/2)L* = (3/2)L*.

Case 2: j < m + 1. Then job j was placed on an empty machine, so
      Li = tj ≤ L*. ▪

Load Balancing: LPT Rule

Question: Is our 3/2 analysis tight?
Answer: No.

Theorem [Graham, 1969]: The LPT rule is a 4/3-approximation.
Proof: A more sophisticated analysis of the same algorithm.

Question: Is Graham's 4/3 analysis tight?
Answer: Essentially yes.

Example: m machines, n = 2m+1 jobs: two jobs each of lengths m, m+1, …, 2m−1, plus one more job of length m.

13.4 MAX 3-SAT

Maximum 3-Satisfiability

MAX-3SAT: Given a 3-SAT formula, find a truth assignment that satisfies as many clauses as possible.

[Example: a formula with five clauses C1, …, C5 over variables x1, …, x4; the negation bars were lost in extraction.]

Remark: This is an NP-hard search problem.

Simple idea: Flip a coin, and set each variable true with probability 1/2, independently for each variable.

Maximum 3-Satisfiability: Analysis

Theorem: Given a 3-SAT formula with k clauses, the expected number of clauses satisfied by a random assignment is 7k/8.

Proof: Consider the random variable
      Zj = 1 if clause Cj is satisfied, and 0 otherwise,
and let Z = Z1 + Z2 + … + Zk be the number of satisfied clauses. By linearity of expectation,
      E[Z] = Σj=1..k E[Zj] = Σj=1..k Pr[clause Cj is satisfied].
A clause (x ∨ y ∨ z) on three distinct variables fails only when all three literals are false, which happens with probability 1/2 · 1/2 · 1/2 = 1/8, so
      E[Zj] = 7/8   and   E[Z] = 7k/8. ▪

The Probabilistic Method

Corollary: For any instance of 3-SAT, there exists a truth assignment that satisfies at least a 7/8 fraction of all clauses.
Proof: There must be an assignment that performs at least as well as the expectation! ▪

Remark: Any 3-SAT instance with at most 7 clauses is always satisfiable: for k ≤ 7 we have 7k/8 > k − 1, so some assignment satisfies more than k − 1, i.e. all k, clauses.

Probabilistic method: We showed the existence of a non-obvious property of 3-SAT by showing that a random construction produces it with positive probability!

Hardness of approximation: It is NP-hard to approximate MAX-3SAT better than 7/8!
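The random-assignment idea yields a simple Las Vegas procedure: keep sampling assignments until one satisfies at least 7/8 of the clauses, which the probabilistic-method corollary guarantees exists. A Python sketch (the clause encoding and function name are my own; one can show a single round succeeds with probability at least 1/(8k), so the expected number of rounds is O(k)):

```python
import random

def satisfy_seven_eighths(clauses, n_vars, seed=0):
    """Sample random assignments until >= 7/8 of the clauses are satisfied.

    clauses: list of clauses, each a list of (variable index, is_positive)
    literals; a literal (v, True) means x_v, and (v, False) means NOT x_v.
    """
    rng = random.Random(seed)
    k = len(clauses)
    while True:
        # Set each variable True with probability 1/2, independently.
        assign = [rng.random() < 0.5 for _ in range(n_vars)]
        sat = sum(any(assign[v] == pos for v, pos in c) for c in clauses)
        if 8 * sat >= 7 * k:           # at least a 7/8 fraction satisfied
            return assign, sat
```

For k ≤ 7 clauses, the threshold forces a fully satisfying assignment, matching the remark above.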
Summary

NP-complete problems show up in many applications. There are different approaches to cope with them:
• Approximation algorithms
• Restricted cases (trees, bipartite graphs, small solutions, …)
• Randomized algorithms
• …

Each approach has its pros and cons.