CS计算机代考程序代写 DNA algorithm cache COMP2100/COMP6442

COMP2100/COMP6442
Algorithms PArt iii
Kin Chau [
Sid Chi
– Lecture 7]
1

What is Dynamic Programming
• Dynamic programming (DP) is a general technique
• Powerful algorithmic design technique using recursion and memorization
• A class of seemingly exponential-time problems may have a polynomial-time solution via DP
• Particularly for optimization (min/max) problems (e.g., shortest paths)
• “Programming” is not related to particular programming language
• Dynamic programming does not always guarantee efficiency • DP ≈ “controlled brute force”
• When designed properly, dynamic programming can be efficient • Memorization stores the results of expensive calls in the cache
• DP ≈ recursion + memorization
2

Goals of This Lecture
• Apply dynamic programming to solve the following problems • Fibonacci numbers
• Shortest paths
• Minimum edit distance • Tetris
• See how efficiency of dynamic programming in each of these problems
• From polynomial-time to exponential-time solutions
3

Fibonacci Numbers
• Fibonacci numbers = (1, 1, 2, 3, 5, 8, 13, 21, 34, …)
• Fibonacci numbers are often observed in nature
• Shell, plant
• They also give elegant patterns
• Architecture, art
• Recurrence equation of
Fibonacci numbers:
• Fn =Fn-1 +Fn-2 • F2 = F1 = 1
4

Fibonacci Numbers
• Recurrence equation of Fibonacci numbers: • Fn =Fn-1 +Fn-2
• F2 = F1 = 1
• Running time of direct computation:
• T(n) = T(n-1) + T(n-2) + O(1)
≥ 2 T(n-2) + O(1) ≥ O(2n/2)
• There are redundant computations
• Can be improved by memorization in dynamic
programming
• Then running time is T(n) = O(n) because of only n non-memorized calls
Fib[n]
// By direct computation If n ≤ 2 Then
Return f ← 1 Else
Return f ← Fib[n-1] + Fib[n−2]
Fib[n]
// By dynamic programming If memo[n] ≠ null Then
Return memo[n] // memorized call Else If n ≤ 2 Then
f←1
Else // non-memorized call
f ← Fib[n-1] + Fib[n-2] memo[n] ← f
Return f
5

Shortest Paths
• Given a directed graph (V, E)
• Each directed edge (u, v) has a weight w(u, v)
• Let δ(s, v) be the total weight of the shortest path from node s to node v in E
• Shortest paths are useful for many applications • Telematic navigation
• Communication networks • Logistic and transportation • Planning and scheduling
• Shortest path is the first dynamic programming problem by Bellman
v
w(u, v) =20
16 7
6
z
8
x
2y5 u 10
6

Shortest Paths
• Shortest path can be found iteratively from neighbours
• Let SP(u, v) be the shortest path from u to v • Then SP(s, v) is the lost-cost path
s w(s,u) u δ(u,v)
v
concatenating edge (s, u) and SP(u, v) • SP(s, v) can be found based on its total
weight that satisfies
δ(s, v) = min{w(s, u) + δ(u, v) | (s, u) ∈ E}
• Find shortest paths by memorized DP
• δ (s, v) ← min{w(s, u) + δ (u, v) | (s, u) ∈ E} k k−1
• δ0(s,v)←∞fors≠v(basecase)
7
ShortestPath[V,E,v]
// Shortest path by dynamic program δ0(v,v) ← 0 // Initialization δ0(s,v) ← ∞foralls ≠ v
// memorized DP in multiple iterations For k = 1 to |V|
δ (s,v) ← min{w(s,u)+δ (u,v)|(s,u)∈E} k k−1
foralls ≠ v
// Return all shortest paths {SP(s, v)}
Return {SP(s,v)|δ (s,v)=w(u,v)+δ (u,v), s∈V} kk

Shortest Paths:
Example
v
w(u, v) =20
16 7
6
z
10
v
w(u, v) =20
16 7
6
8
x
k
δk(v, v)
δk(x, v)
δk(y, v)
δk(z, v)
δk(u, v)
0
0




1
0
8
16

20
2
0
8
16 (15)
14
20 (18)
3
0
8
15 (19)
14
18 (24)
4
0
8
15
14
18 (17)
5
0
8
15
14
17
2y5
u
8
x
(new path)
2y5
u
10
z
8

Steps for Dynamic Programming
0. 1. 2. 3. 4.
Original problem
• e.g., find SP(s, v) or compute δ(s, v)
Define subproblems
• e.g., δ(u, v) where u is a neighbour of s
Guess (part of solution) • e.g., w(s, u) + δ(u, v)
Relate subproblem solutions
• e.g., δ(s, v) = min{w(s, u) + δ(u, v) | (s, u) ∈ E}
Recurse + memorize
• Build DP table bottom-up check subproblems acyclic/topological order
• e.g., δ (s, v) ← min{w(s, u) + δ (u, v) | (u, v) ∈ E} k k−1
9

Edit Distance
• Used for DNA comparison, plagiarism detection, etc. • Given two strings x and y, what is the cheapest possible
sequence of character edits to transform x into y? • Character edits:
• Insert a new character c into x
• Delete a character c from x
• Replaceacharactercinxbyc’:c→c’
• Cost of edit depends only on characters c, c’
• For example in DNA, common mutation C → G has low cost
• Edit distance is the total cost of a sequence of edits
H
H
H
E
O
O
L
L
L
L
P
P
H
O
P
E
O
P
P
H
E
10

Edit Distance: Example
• Edit distance = Total cost of edits
• No cost for the same character in both strings
• If insertion and deletion cost 0.5 and replacement costs 1, the minimum edit distance equivalent to finding the longest common subsequence
• Subsequence is sequential but not necessarily contiguous
• Example
• HIEROGLYPHOLOGYvs.MICHAELANGELO • The longest common subsequence is HELLO
• Edit distance
= insertion cost (1.5) + deletion cost (2.5) + replace cost (5) = 9
11

Edit Distance: Dynamic Programming
Start
• Construct a directed graph
• From start (leftmost top)
• To finish (rightmost bottom) • Deletion cost
= horizontal edge cost • Insertion cost
= vertical edge cost • Replacement cost
= diagonal edge cost O (zero cost for same character)
• Finding the minimum
edit distance is equivalent to finding shortest path
Delete
M
H
I
E
R
O
G
L
Y
P
H
O
L
O
G
Y
I
C
H
A
E
L
A
N
G
E
L
Zero cost
Finish
12
Insert

Edit Distance: Dynamic Programming
• More general problems for multiple strings/sequences • Suffix/prefix/substring subproblems
• Multiply state spaces
• Still polynomial for a constant number of strings
• Given strings x and y, let x[i] and y[j] be the i-th and j-th characters of x and y, respectively
• Guess whether, to turn x into y, following one of the 3 choices: • Deleting x[i] incurs a cost Costdel(x[i])
• Inserting y[j] incurs a cost Costins(y[j])
• Replacing x[i] by y[j] incurs a cost Costrep(x[i],y[j])
• Costrep(x[i],y[j]) = 0, when x[i] = y[j]
13

Edit Distance
• c(i, j) is min cost from (i, j) to (|x|,|y|)
c(i, j)
• Recurrence: c(i, j) = minimum of: • Costdel(x[i]) + c(i+1, j) if i < |x|, • Costins(y[j]) + c(i, j+1) if j < |y|, • Costrep(x[i],y[j]) + c(i+1, j+1) if i < |x| and j < |y| • Set c(|x|, |y|) = 0 • Directed graph of the table: • Top to bottom OR right to left • Linear space of states of table size • Total running time = O(|x| · |y|) c(0, 0) H I E R O G L Y P H O L O G Y M I C H A E L A N G E L O c(|x|, |y|) 14 Tetris • There is an empty board of small width w • Given a sequence of n Tetris pieces • For the t-th piece, decide its move mt • Orientation Ot (rotate by 90°, 180°, 270°) • X-coordinate xt (in {1,..., w}) • Then must drop piece till it hits something • Full rows do not clear • Goal: • Tosurvive,namely,staywithinheighth 15 mt =(Ot,xt) Tetris h w 16 Tetris 1. Subproblem: Survive in each column i: • The column occupancy heights ht = (ht1, ht2, ... ,htw) at time t • Define Height[t] to be the min height by adding the t-th piece 2. Recurrence: • At time t, the t-th piece is dropped 3. • Height[t] = min(Height[t-1] + cost of a valid move mt of the t-th piece in ht) • The number of moves of the t-th piece = O(4w) Construct a directed graph • Connect each valid move mt of the t-th piece to every valid move mt-1 of the (t-1)-th piece • The cost of each move is the additional height incurred by the t-th piece 17 mt-1 ...... mt ...... ......... 1 w ...... ...... ...... 1 w 1 w 1 w 1 w ... ...... ...... ... 18 Summary • Dynamic programming (DP) is a general technique • memorization stores the results of expensive calls in the cache • DP ≈ recursion + memorization • Problems: • Fibonacci numbers • Shortest paths • Edit distance • Tetris 19 Reference • Visualizations • https://www.cs.usfca.edu/~galles/visualization/DPFib.html • https://www.cs.usfca.edu/~galles/visualization/Dijkstra.html • https://www.cs.usfca.edu/~galles/visualization/DPLCS.html 20