CS计算机代考程序代写 DNA cache algorithm COMP2100/COMP6442

COMP2100/COMP6442
Algorithms Part III
– Lecture 7]
Kin Chau [
Sid Chi
1

What is Dynamic Programming
• Dynamic programming (DP) is a general technique
• Powerful algorithmic design technique using recursion and memorization
• A class of seemingly exponential-time problems may have a polynomial-time solution via DP
• Particularly for optimization (min/max) problems (e.g., shortest paths)
• “Programming” is not related to particular programming language
• Dynamic programming does not always guarantee efficiency • DP ≈ “controlled brute force”
• When designed properly, dynamic programming can be efficient • Memorization stores the results of expensive calls in the cache
• DP ≈ recursion + memorization
2

Goals of This Lecture
• Apply dynamic programming to solve the following problems • Fibonacci numbers
• Shortest paths
• Minimum edit distance • Tetris
• See how efficiency of dynamic programming in each of these problems
• From polynomial-time to exponential-time solutions
3

Fibonacci Numbers
• Fibonacci numbers = (1, 1, 2, 3, 5, 8, 13, 21, 34, …) • Fibonacci numbers are often observed in nature
• Shell, plant
• They also give elegant patterns
• Architecture, art
• Recurrence equation of
Fibonacci numbers:
• Fn = Fn-1 + Fn-2 • F2 = F1 = 1
4

Fibonacci Numbers
• Recurrence equation of Fibonacci numbers:
• Fn = Fn-1 + Fn-2
• F2 = F1 = 1
• Running time of direct computation:
• T(n) = T(n-1) + T(n-2) + O(1)
≥ 2 T(n-2) + O(1) ≥ O(2n/2)
• There are redundant computations
• Can be improved by memorization in dynamic
programming
• Then running time is T(n) = O(n) because of only n non-memorized calls
Fib[n]
// By direct computation If n ≤ 2 Then
Return f  1 Else
Return f  Fib[n-1] + Fib[n−2]
Fib[n]
// By dynamic programming If memo[n]  null Then
Return memo[n] // memorized call Else If n ≤ 2 Then
f1
Else // non-memorized call
f  Fib[n-1] + Fib[n-2] memo[n]  f
Return f
5

Shortest Paths
• Given a directed graph (V, E)
• Each directed edge (u, v) has a weight w(u, v)
• Let δ(s, v) be the total weight of the shortest path from node s to node v in E
• Shortest paths are useful for many applications • Telematic navigation
• Communication networks • Logistic and transportation • Planning and scheduling
• Shortest path is the first dynamic programming problem by Bellman
v
w(u, v) =20
16 7
6
2y5
z
10
8
x
u
6

Shortest Paths
• Shortest path can be found iteratively from neighbours
• Let SP(u, v) be the shortest path from u to v
• Then SP(s, v) is the lowest-cost path
concatenating edge (s, u) and SP(u, v)
• SP(s, v) can be found based on its total
weight that satisfies
δ(s, v) = min{w(s, u) + δ(u, v) | (s, u) ∈ E}
• Find shortest paths by memorized DP • δ0(s,v)∞forsv(basecase)
• δk(s, v)  min{w(s, u) + δk−1(u, v) | (s, u) ∈ E}
s w(s,u) u δ(u,v)
v
7
ShortestPath[V,E,v]
// Shortest path by dynamic program δ0(v,v)  0 // Initialization δ0(s,v)  ∞foralls  v
// memorized DP in multiple iterations For k = 1 to |V|
δk(s,v)  min{w(s,u)+δk−1(u,v)|(s,u)∈E} foralls  v
// Return all shortest paths {SP(s, v)} Return {SP(s,v)|δk(s,v)=w(u,v)+δk(u,v), s∈V}

Shortest Paths:
Example
v
w(u, v) =20
16 7
6
2y5
z
10
v
w(u, v) =20
16 7
6
z
8
x
k
δk(v, v)
δk(x, v)
δk(y, v)
δk(z, v)
δk(u, v)
0
0




1
0
8
16

20
2
0
8
16 (15)
14
20 (18)
3
0
8
15 (19)
14
18 (24)
4
0
8
15
14
18 (17)
5
0
8
15
14
17
u
8
x
(new path)
2y5
u
10
8

Steps for Dynamic Programming
0. 1. 2. 3. 4.
Original problem
• e.g., find SP(s, v) or compute δ(s, v)
Define subproblems
• e.g., δ(u, v) where u is a neighbour of s
Guess (part of solution) • e.g., w(s, u) + δ(u, v)
Relate subproblem solutions
• e.g., δ(s, v) = min{w(s, u) + δ(u, v) | (s, u) ∈ E}
Recurse + memorize
• Build DP table bottom-up check subproblems acyclic/topological order
• e.g., δk(s, v)  min{w(s, u) + δk−1(u, v) | (u, v) ∈ E}
9

Edit Distance
• Used for DNA comparison, plagiarism detection, etc. • Given two strings x and y, what is the cheapest possible
sequence of character edits to transform x into y? • Character edits:
• Insert a new character c into x
• Delete a character c from x
• Replace a character c in x by c’: c → c’
• Cost of edit depends only on characters c, c’
• For example in DNA, common mutation C → G has low cost
• Edit distance is the total cost of a sequence of edits
H
E
L
P
H
O
L
P
H
O
L
P
H
O
L
P
E
H
O
P
E
10

Edit Distance: Example
• Edit distance = Total cost of edits
• No cost for the same character in both strings
• If insertion and deletion cost 0.5 and replacement costs 1, the minimum edit distance equivalent to finding the longest common subsequence
• Subsequence is sequential but not necessarily contiguous
• Example
• H I E R O G L Y P H O L O G Y vs. M I C H A E L A N G E L O • The longest common subsequence is HELLO
• Edit distance
= insertion cost (1.5) + deletion cost (2.5) + replace cost (5) = 9
11

Edit Distance: Dynamic Programming
Start
• Construct a directed graph
• From start (leftmost top)
• To finish (rightmost bottom) • Deletion cost
= horizontal edge cost • Insertion cost
= vertical edge cost • Replacement cost
= diagonal edge cost O (zero cost for same character)
• Finding the minimum
edit distance is equivalent to finding shortest path
Delete
M
H
I
E
R
O
G
L
Y
P
H
O
L
O
G
Y
I
C
H
A
E
L
A
N
G
E
L
Zero cost
Finish
12
Insert

Edit Distance: Dynamic Programming
• More general problems for multiple strings/sequences • Suffix/prefix/substring subproblems
• Multiply state spaces
• Still polynomial for a constant number of strings
• Given strings x and y, let x[i] and y[j] be the i-th and j-th characters of x and y, respectively
• Guess whether, to turn x into y, following one of the 3 choices: • Deleting x[i] incurs a cost Costdel(x[i])
• Inserting y[j] incurs a cost Costins(y[j])
• Replacing x[i] by y[j] incurs a cost Costrep(x[i],y[j])
• Costrep(x[i],y[j]) = 0, when x[i] = y[j]
13

Edit Distance
• c(i, j) is min cost from (i, j) to (|x|,|y|) c(i, j)
• Recurrence: c(i, j) = minimum of:
• Costdel(x[i]) + c(i+1, j) if i < |x|, • Costins(y[j]) + c(i, j+1) if j < |y|, • Costrep(x[i],y[j]) + c(i+1, j+1) if i < |x| and j < |y| A c(0,0) HIEROGLYPHOLOGY M I C H • Set c(|x|,|y|) = 0 • Directed graph of the table: • Top to bottom OR right to left • Linear space of states of table size • Total running time = O(|x| · |y|) E L A N G E L O c(|x|, |y|) 14 Tetris • There is an empty board of small width w • Given a sequence of n Tetris pieces • For the t-th piece, decide its move mt • Orientation Ot (rotate by 90°, 180°, 270°) • X-coordinate xt (in {1,..., w}) • Then must drop piece till it hits something • Full rows do not clear • Goal: • To survive, namely, stay within height h 15 Tetris mt =(Ot,xt) h w 16 Tetris 1. Subproblem: Survive in each column i: • The column occupancy heights ht = (ht1, ht2, ... ,htw) at time t • Define Height[t] to be the min height by adding the t-th piece 2. Recurrence: • At time t, the t-th piece is dropped 3. • Height[t] = min(Height[t-1] + cost of a valid move mt of the t-th piece in ht) • The number of moves of the t-th piece = O(4w) Construct a directed graph • Connect each valid move mt of the t-th piece to every valid move mt-1 of the (t-1)-th piece • The cost of each move is the additional height incurred by the t-th piece 17 mt-1 ...... mt ...... ......... 1 w ...... ...... ...... 1 w 1 w 1 w 1 w ... ...... ...... ... 18 Summary • Dynamic programming (DP) is a general technique • memorization stores the results of expensive calls in the cache • DP ≈ recursion + memorization • Problems: • Fibonacci numbers • Shortest paths • Edit distance • Tetris 19 Reference • Visualizations • https://www.cs.usfca.edu/~galles/visualization/DPFib.html • https://www.cs.usfca.edu/~galles/visualization/Dijkstra.html • https://www.cs.usfca.edu/~galles/visualization/DPLCS.html 20