Dynamic Programming 2
David Weir (U of Sussex) Program Analysis Term 1, 2015 510 / 606
Sequence Alignment
David Weir (U of Sussex) Program Analysis Term 1, 2015 511 / 606
Sequence Alignment Problem
Measuring the distance between two sequences of characters Extent that one string needs to be edited to produce the other? Penalty for changing characters
Penalty for gaps
Examples
“itemised” very close to “itemized” “Sily mistake” close to “silly mustakes”
Alignment of long DNA sequences
— to determine whether they are similar
— similar DNA sequences suggests similar functionality
David Weir (U of Sussex) Program Analysis Term 1, 2015 512 / 606
Sequence Alignment
Sil-y mistake- ↕↕↕↕↕↕↕↕↕↕↕↕↕↕ silly mustakes ↑↑↑↑
Cost of this alignment
Cost of difference between S and s
— denoted δ(S, s)
Cost of difference between i and u
— denoted δ(i, u)
Cost of 2 gaps in top sequence:
— denoted 2 × γ
David Weir (U of Sussex) Program Analysis Term 1, 2015 513 / 606
Matches
Consider two sequences X = (x1,…,xn) and Y = (y1,…,ym)
An alignment M of X and Y is a set of pairs (i,j) where 1 ≤ i ≤ n and 1 ≤ j ≤ m
For each i there is at most one pair (i,j) in M
For each j there is at most one pair (i,j) in M
(i,j) ∈ M, (i′,j′) ∈ M and i < i′ implies j < j′
Note that:
No pairings cross over
The number of pairs in M can be at most min(n, m) — i.e. |M| ≤ min(n,m)
The number of unpaired positions gaps(M) = max(n, m) − |M|
David Weir (U of Sussex) Program Analysis Term 1, 2015 514 / 606
Minimum Cost Sequence Alignment
Given sequences X = (x1,...,xn) and Y = (y1,...,ym) the minimum cost sequence alignment of X and Y is the alignment M with least cost
The cost of match M is:
δ(xi,yj) (i ,j )∈M
+
γ · gaps(M)
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
515 / 606
Solving the Sequence Alignment Problem
Given sequences X = (x1,...,xn) and Y = (y1,...,ym)
Space of sub-problems:
A sub-problem for each pair (i,j) where 1 ≤ i ≤ n and 1 ≤ j ≤ m The cost of the best alignment of the subsequences
(x1,...,xi) and (y1,...,yj) A total of Θ(n · m) sub-problems
David Weir (U of Sussex) Program Analysis Term 1, 2015 516 / 606
Solving Problems Recursively
For each i, j the optimal solution B(i,j) is the best among:
δ(xi , yj ) + B(i − 1, j − 1) γ + B(i − 1, j)
γ + B(i, j − 1)
David Weir (U of Sussex) Program Analysis Term 1, 2015 517 / 606
Algorithm
SequenceAlignment(X , Y ):
let B(i, 0) ← i · γ for each 1 ≤ i ≤ n let B(0, j) ← j · γ for each 1 ≤ j ≤ m for i ← 1 to n
for j ← 1 to m
B(i, j) ← min[ δ(xi , yj ) + B(i − 1, j − 1),
γ + B(i, j − 1), γ + B(i − 1, j) ]
David Weir (U of Sussex) Program Analysis Term 1, 2015 518 / 606
A Graphical Representation of the Problem
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 519 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Ordering of Sub-Problems
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 520 / 606
Optimal Alignment Corresponds to Cheapest Path
y7
y6
y5
y4
y3
y2
y1
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
David Weir (U of Sussex) Program Analysis Term 1, 2015 521 / 606
Algorithm
SequenceAlignment(X , Y ):
let B(i, 0) ← i · γ for each 1 ≤ i ≤ n let B(0, j) ← j · γ for each 1 ≤ j ≤ m for i ← 1 to n
for j ← 1 to m
B(i, j) ← min[ δ(xi , yj ) + B(i − 1, j − 1),
γ + B(i, j − 1), γ + B(i − 1, j) ]
David Weir (U of Sussex) Program Analysis Term 1, 2015 522 / 606
Running Time of Algorithm
Solving B(i, j) for each (i, j) takes Θ(1) steps n · m different subproblems
Total running time is Θ(n · m)
David Weir (U of Sussex) Program Analysis Term 1, 2015 523 / 606
Example for you
Strings are made up of only the characters a, b and c Gap penalty γ is 5
Penalties δ are as given in the following table.
abc a5 b7 c570
Find optimal alignment of the strings abc and babc
0
8
8
0
David Weir (U of Sussex) Program Analysis Term 1, 2015 524 / 606
Complete the Table
20
15
15
10
10
5
5
8
5
10
15
c b a b
4 3 2 1 0
0123
abc
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
525 / 606
Complete the Table
20
15
10
5
15
10
5
10
10
5
10
10
5
8
5
10
5
10
15
c b a b
4 3 2 1 0
0123
abc
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
525 / 606
Dynamic Programming: Shortest Paths with Negative Weights
David Weir (U of Sussex) Program Analysis Term 1, 2015 526 / 606
Shortest Path Problem (again)
Limitation of Dijkstra’s Algorithm
Assumes there are no negative weights in the graph
Significance of negative weights
It can make sense for some weights to be negative
For example:
— edges correspond to activities and vertices to states
— weight of edge encodes “cost” associated with the activity — activities have negative cost when they generate income — shortest paths generate the most income
David Weir (U of Sussex) Program Analysis Term 1, 2015 527 / 606
Formulating the Problem
Shortest Path Problem:
Given a weighted directed graph G = (V , E ), a source vertex
s∈V andatargetvertext ∈V findtheshortestpathfroms to t
Assumption:
G does not contain a cycle with total weight that is negative There would be no bound on the lowest weighted path!
Note:
G can contain cycles
Cycles can contain edges with negative weights
David Weir (U of Sussex) Program Analysis Term 1, 2015 528 / 606
Important Observation
Since G has no negative cycles, there is a shortest path from s to t that contains no cycles
Justification:
If there was a cycle, it could be removed without increasing the cost of the path
A path without a cycle is called a simple path
Significance of this observation:
When looking for shortest paths, only consider simple paths Length no more than |V | − 1 = n − 1
David Weir (U of Sussex) Program Analysis Term 1, 2015 529 / 606
Space of Sub-Problems
Sub-problem space:
B(i,v)
Cost of least cost path from v to t containing no more than i edges
Number of Sub-problems:
v can be any of the n vertices in the graph 1≤i≤n−1
Hence, Θ(n2) sub-problems to solve
David Weir (U of Sussex) Program Analysis Term 1, 2015 530 / 606
Solving Sub-Problems
B(i,v) is the best of the following: B(i − 1,v)
—costofleastcostv tot pathwith≤i−1edges w(v,u)+B(i−1,u)where(v,u)∈E forsomeu∈V
— cost of edge to u plus cost from u to t with ≤ i − 1 edges There could be up to n − 1 edges (v, u) to consider!
Number of sub-problems to consider O(n) not O(1)
David Weir (U of Sussex) Program Analysis Term 1, 2015 531 / 606
Algorithm
BellmanFord(G, w , s, t ):
let B(0,t) ← 0 letB(0,v)←∞foreachv ∈V −{t} for i ← 1 to n − 1
foreachv ∈V
B(i,v) ← B(i − 1,v) for each (v,u) ∈ E
if B(i,v) > w(v,u) + B(i − 1,u) then B(i,v) ← w(v,u) + B(i − 1,u)
David Weir (U of Sussex) Program Analysis Term 1, 2015 532 / 606
Example Graph
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
David Weir (U of Sussex)
Program Analysis Term 1, 2015 533 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
Initialise with solutions to smallest sub-problems
David Weir (U of Sussex) Program Analysis Term 1, 2015 534 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
∞
0
∞
∞
∞
∞
∞
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
535 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
i=1
v =v1: lowestcostisw(v1,v3)+B(0,v3)=2
David Weir (U of Sussex) Program Analysis Term 1, 2015 536 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
2
∞
0
∞
∞
∞
∞
∞
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
537 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v =v2: lowestcostisw(v2,v3)+B(0,v3)=−4 v = v3: lowest cost is B(0,v3) = 0
David Weir (U of Sussex) Program Analysis Term 1, 2015 538 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v =v6: lowestcostisw(v6,v3)+B(0,v3)=−6 for all other v: B(1,v) = B(0,v) = ∞
David Weir (U of Sussex) Program Analysis Term 1, 2015 539 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
2
∞
−4
0
0
∞
∞
∞
∞
∞
−6
∞
∞
∞
∞
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
540 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
i=2
v =v1: lowestcostisw(v1,v2)+B(1,v2)=−1 v = v2: lowest cost is B(1,v2) = −4
David Weir (U of Sussex) Program Analysis Term 1, 2015 541 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v3: lowest cost is B(1,v3) = 0
v =v4: lowestcostisw(v4,v6)+B(1,v6)=1
David Weir (U of Sussex) Program Analysis Term 1, 2015 542 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v =v5: lowestcostisw(v5,v2)+B(1,v2)=5 v = v6: lowest cost is B(1,v6) = −6
David Weir (U of Sussex) Program Analysis Term 1, 2015 543 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v7: lowest cost is B(1,v7) = ∞
v =v8: lowestcostisw(v8,v6)+B(1,v6)=−2
David Weir (U of Sussex) Program Analysis Term 1, 2015 544 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
2
−1
∞
−4
−4
0
0
0
∞
∞
1
∞
∞
5
∞
−6
−6
∞
∞
∞
∞
∞
−2
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
545 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
i=3
v = v1: lowest cost is B(2,v1) = −1 v = v2: lowest cost is B(2,v2) = −4
David Weir (U of Sussex) Program Analysis Term 1, 2015 546 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v3: lowest cost is B(2,v3) = 0 v = v4: lowest cost is B(2,v4) = 1
David Weir (U of Sussex) Program Analysis Term 1, 2015 547 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v =v5: lowestcostisw(v5,v8)+B(2,v8)=−4 v = v6: lowest cost is B(2,v6) = −6
David Weir (U of Sussex) Program Analysis Term 1, 2015 548 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v =v7: lowestcostisw(v7,v4)+B(2,v4)=−2 v = v8: lowest cost is B(2,v8) = −2
David Weir (U of Sussex) Program Analysis Term 1, 2015 549 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
2
−1
−1
∞
−4
−4
−4
0
0
0
0
∞
∞
1
1
∞
∞
5
−4
∞
−6
−6
−6
∞
∞
∞
−2
∞
∞
−2
−2
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
550 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
i=4
v = v1: lowest cost is B(3,v1) = −1 v = v2: lowest cost is B(3,v2) = −4
David Weir (U of Sussex) Program Analysis Term 1, 2015 551 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v3: lowest cost is B(3,v3) = 0 v = v4: lowest cost is B(3,v4) = 1
David Weir (U of Sussex) Program Analysis Term 1, 2015 552 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v5: lowest cost is B(3,v5) = −4 v = v6: lowest cost is B(3,v6) = −6
David Weir (U of Sussex) Program Analysis Term 1, 2015 553 / 606
Illustration of Algorithm
v9v3v 521
15
11 2
-2
v v -3 v s v
-4 8473t
4
7 v6
-6
v = v7: lowest cost is B(3,v7) = −2 v = v8: lowest cost is B(3,v8) = −2
David Weir (U of Sussex) Program Analysis Term 1, 2015 554 / 606
Illustration of Algorithm
None of the values for i = 4 have changed Therefore, all values stable
David Weir (U of Sussex) Program Analysis Term 1, 2015 555 / 606
Illustration of Algorithm
i
01234567 v1
v2 v3 v4 v5 v6 v7 v8
∞
2
−1
−1
−1
−1
−1
−1
∞
−4
−4
−4
−4
−4
−4
−4
0
0
0
0
0
0
0
0
∞
∞
1
1
1
1
1
1
∞
∞
5
−4
−4
−4
−4
−4
∞
−6
−6
−6
−6
−6
−6
−6
∞
∞
∞
−2
−2
−2
−2
−2
∞
∞
−2
−2
−2
−2
−2
−2
David Weir (U of Sussex)
Program Analysis
Term 1, 2015
556 / 606
Algorithm
BellmanFord(G, w , s, t ):
let B(0,t) ← 0 letB(0,v)←∞foreachv ∈V −{t} for i ← 1 to n − 1
foreachv ∈V
B(i,v) ← B(i − 1,v) for each {v,u} ∈ E
if B(i,v) > w(v,u) + B(i − 1,u) then B(i,v) ← w(v,u) + B(i − 1,u)
David Weir (U of Sussex) Program Analysis Term 1, 2015 557 / 606
Efficiency of Algorithm
Θ(n2) sub-problems
Each sub-problem takes O(n) time to solve Total running time is O(n3)
David Weir (U of Sussex) Program Analysis Term 1, 2015 558 / 606