Assessment schedule
Quiz 2 During tutorial time today
Note: Some MCQ with checkboxes could have multiple or
single choices correct. No partial marks awarded. Closed book exam and no translation plugins allowed.
Assignment 2 11:59 PM Sun 29 September 2019
The University of Sydney
Page 1
Dynamic Programming II
The University of Sydney Page 2
General techniques in this course
Greedy algorithms
Divide Conquer algorithms
Dynamic programming algorithms Network flow algorithms
The University of Sydney
Page 3
Algorithmic Paradigms
Greedy. Build up a solution incrementally, optimizing some local criterion.
Divideandconquer. Break up a problem into subproblems, solve each subproblem independently, and combine solution to subproblems to form solution to original problem.
Dynamic programming. Break up a problem into a series of overlapping subproblems, and build up solutions to larger and larger subproblems.
The University of Sydney Page 4
Key steps: Dynamic programming
1. Define subproblems 2. Find recurrences
3. Solve the base cases
4. Transform recurrence into an efficient algorithm
The University of Sydney Page 5
Recap: Weighted Interval Scheduling
Job j starts at sj, finishes at fj, and has value vj.
Two jobs compatible if they dont overlap.
Goal: find maximum value subset of mutually compatible jobs.
b
a
c
d
e
f
g
h
0 1 2 3 4 5 6 7 8 9 10 11 The University of Sydney
Time
Page 6
Recap: Weighted Interval Scheduling
Notation. Label jobs by finishing time: f1 f2 . . . fn .
Def. pj largest index ij such that job i is compatible with j. Ex: p85,p73,p20.
1
2
3
4
5
6
7
8
0 1 2 3 4 5 6 7 8 9 10 11 The University of Sydney
Time
Page 7
Recap: Dynamic Programming Step 1 Step 1: Define subproblems DP states
OPTj value of optimal solution to the problem consisting of job requests 1, 2, …, j.
The University of Sydney
Page 8
Recap: Dynamic Programming Step 2 Step 2: Find recurrences
Case 1: OPT selects job j.
cantuseincompatiblejobspj1,pj2,…,j1
must include optimal solution to problem consisting of remaining compatible jobs 1, 2, …, pj
Case 2: OPT does not select job j.
must include optimal solution to problem consisting of
remaining compatible jobs 1, 2, …, j1
OPTj max vj OPT pj, OPTj1 Case 1 Case 2
The University of Sydney
Page 9
Recap: Dynamic Programming Step 3 Step 3: Solve the base cases
OPT0 0
OPTjii 0 if j0 imaxvj OPTpj, OPTj1 otherwise
The University of Sydney
Page 10
Done…more or less
Recap: Naive Recursion is Exponential
OPTjii 0 if j0 imaxvj OPTpj, OPTj1 otherwise
5 43
3221 211010
10
Could get an exponential number of subproblems!
The University of Sydney
Page 11
Recap: Memoization
OPTjii 0 if j0 imaxvj OPTpj, OPTj1 otherwise
5 43
3221 211010
10
Could get an exponential number of subproblems!
Memoization: Instead of recomputing every subproblem store the results of each subproblem.
The University of Sydney Page 12
Recap: Bottomup
Bottomup Dynamic Programming. Unwind recursion.
OPTjii 0 if j0 imaxvj OPTpj, OPTj1 otherwise
ComputeOpt
OPT0 0
for j 1 to n
OPTj maxvj OPTpj, OPTj1
The University of Sydney
Page 13
Time: On
Recap: Finding a Solution
Question. Dynamic programming algorithm computes optimal value.
What if we want the solution itself? Answer. Do some postprocessing.
Run ComputeOptn
Run FindSolutionn
FindSolutionj
if j 0
output nothing
else if vj Mpj Mj1
print j
FindSolutionpj
else
FindSolutionj1
picked job j
of recursive calls n On. The University of Sydney
Page 14
14
Recap: Knapsack Problem
Knapsackproblem.
Givennobjectsandaknapsack.
Item i weighs wi 0 kilograms and has value vi 0. KnapsackhascapacityofWkilograms.
Goal: fill knapsack so as to maximize total value.
W 11
Item
Value
Weight
1
1
1
2
6
2
3
18
5
4
22
6
5
28
7
The University of Sydney
Page 15
Recap: Dynamic Programming Step 1 Step 1: Define subproblems
OPTi, w max profit with subset of items 1, …, i with weight limit w.
The University of Sydney
Page 16
Recap: Dynamic Programming Step 2 Step 2: Find recurrences
Case 1: OPT does not select item i.
OPT selects best of 1, 2, …, i1 using weight limit w
Case 2: OPT selects item i.
newweightlimitwwi
OPT selects best of 1, 2, …, i1 using this new weight limit
OPTi,w max vi OPT i1,wwi, OPTi1,w
The University of Sydney
Page 17
Recap: Dynamic Programming Step 2 Step 2: Find recurrences
Case 1: OPT does not select item i.
OPT selects best of 1, 2, …, i1 using weight limit w
Case 2: OPT selects item i.
newweightlimitwwi
OPT selects best of 1, 2, …, i1 using this new weight limit
If wiw
OPTi,w OPT i1,w
otherwise
OPTi,w max vi OPT i1,wwi, OPTi1,w
The University of Sydney
Page 18
Recap: Dynamic Programming Step 3 Step 3: Solve the base cases
OPT0,w 0
i0 if i0 OPTi,wiOPTi1,w if i0andw w
i
i
imaxOPTi1,w,vOPTi1,ww otherwise iii
The University of Sydney
Page 19
Done…more or less
Knapsack Problem: BottomUp
Knapsack. Fill up an n1byW1 array.
Input: n, w1,…,wN, v1,…,vN for w 0 to W
M0, w 0
for i 1 to n
for w 1 to W
if wi w
Mi, w Mi1, w
else
Mi, w max Mi1, w, vi Mi1, wwi return Mn, W
The University of Sydney Page 20
Knapsack Algorithm Recap
W1
0
1
2
3
4
5
6
7
8
9
10
11
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
1
1
1
1
1
1
1
1
1
1
1, 2
0
1
6
7
7
7
7
7
7
7
7
7
1, 2, 3
0
1
6
7
7
18
19
24
25
25
25
25
1, 2, 3, 4
0
1
6
7
7
18
22
24
28
29
29
40
1, 2, 3, 4, 5
0
1
6
7
7
18
22
28
29
34
35
40
n1
Item
Value
Weight
1
1
1
2
6
2
3
18
5
4
22
6
5
28
OPT: 4,3
value 22 18 40
W 11
The University of Sydney
7
Page 21
Recap: Longest increasing subsequence
Given a sequence of numbers X1..n find the longest increasing subsequence i1, i2, …, ik, that is a subsequence where numbers in the sequence increase.
The University of Sydney
Page 22
52813697
Longest Common Subsequence
The University of Sydney Page 23
Longest Common Subsequence
Given two sequences X1..n and Y1..m, find the longest subsequences X of X and Y of Y such that X Y.
Example 1 X BANANAS Y KATANA
LCSX,Y AANA
Example 2 X DYNAMIC
Y PROGRAMMING
LCSX,Y AMI The University of Sydney
Page 24
Longest Common Subsequence: Matching Definition
Given two sequences X1..n and Y1..m, find the longest non crossing matching between elements of X and elements of Y
Example 1 X BANANAS Y KATANA
LCSX,Y AANA. Matching: X2 Y2, X4 Y4, X5 Y5, X6 Y6.
Example 2 X DYNAMIC
Y PROGRAMMING
LCSX,Y AMI. Matching: X4 Y6, X5 Y7, X6 Y9.
Note: if crossings allowed, then can add X3 Y10. The University of Sydney
Page 25
Longest Common Subsequence: Applications
Measures similarity of two strings Bioinformatics
Merging in version control
The University of Sydney Page 26
LCS Dynamic Programming: Step 1 Step 1: Define subproblems first try
OPTi length of LCSX1..i, Y.
Subproblems: finding LCS of Y and prefixes of X
Counterexample X AAA
YA
Need to know if optimal solution to LCSX1..2,Y matched
Y1.
The University of Sydney
Page 27
LCS Dynamic Programming: Step 1 Step 1: Define subproblems
OPTi,j length of LCSX1..i, Y1..j. Subproblems: finding LCS of prefixes of X and Y
Example
X BANANAS Y KATANA
OPT6, 4 length of LCSBANANA, KATA 2 The University of Sydney
Page 28
LCS Dynamic Programming: Step 2
Notations:
OPTi,j length of LCSX1..i, Y1..j. Step 2: Finding recurrences
Case 1: Xi Yj.
Leave Xi unmatched and take LCSX1..i1, Y1..j. OR Leave Yj unmatched and take LCSX1..i, Y1..j1.
OPTi,j maxOPTi1,j, OPTi,j1
Example
X1..5 BANAN Y1..6 KATANA
X5 unmatched: LCS LCSX1..4, Y1..6 ANA
Y6 unmatched: LCS LCSX1..5, Y1..5 AAN The University of Sydney
Page 29
LCS Dynamic Programming: Step 2
Notations:
OPTi,j length of LCSX1..i, Y1..j. Step 2: Finding recurrences
Case 1: Xi Yj.
Leave Xi unmatched and take LCSX1..i1, Y1..j. OR Leave Yj unmatched and take LCSX1..i, Y1..j1.
OPTi,j maxOPTi1,j, OPTi,j1
Example
X1..5 BANAN Y1..6 KATANA
X5 unmatched: LCS LCSX1..4, Y1..6 ANA
Y6 unmatched: LCS LCSX1..5, Y1..5 AAN
The University of Sydney
Page 30
LCS Dynamic Programming: Step 2
Notations:
OPTi,j length of LCSX1..i, Y1..j. Step 2: Finding recurrences
Case 1: Xi Yj.
Leave Xi unmatched and take LCSX1..i1, Y1..j. OR Leave Yj unmatched and take LCSX1..i, Y1..j1.
OPTi,j maxOPTi1,j, OPTi,j1
Example
X1..3 BAN Y1..6 KATANA
X3 unmatched: LCSX1..2, Y1..6 A
Y6 unmatched: LCSX1..3, Y1..5 AN The University of Sydney
Page 31
LCS Dynamic Programming: Step 2
Notations:
OPTi,j length of LCSX1..i, Y1..j. Step 2: Finding recurrences
Case 1: Xi Yj.
Leave Xi unmatched and take LCSX1..i1, Y1..j. OR Leave Yj unmatched and take LCSX1..i, Y1..j1.
OPTi,j maxOPTi1,j, OPTi,j1
Example
X1..3 BAN Y1..6 KATANA
X3 unmatched: LCSX1..2, Y1..6 A
Y6 unmatched: LCSX1..3, Y1..5 AN
The University of Sydney
Page 32
LCS Dynamic Programming: Step 2
Notations:
OPTi,j length of LCSX1..i, Y1..j. Step 2: Finding recurrences
Case 2: Xi Yj.
Match Xi to Yj and take LCSX1..i1, Y1..j1 Xi OR
Leave Xi unmatched or Yj unmatched
OPTi,j maxOPTi1, j1 1, OPTi1, j, OPTi, j1
Example
X1..6 BANANA Y1..6 KATANA
Match X6 Y6: LCS LCSX1..5, Y1..5 A AANA
X6 unmatched: LCS LCSX1..5, Y1..6 AAN
Y6 unmatched: LCS LCSX1..6, Y1..5 AAN The University of Sydney
Page 33
LCS Dynamic Programming: Step 3 Step 3: Solving the base cases
OPTi,0 0 for all i, OPT0,j 0 for all j
80 if i 0 or j 0
OPTi,j maxOPTi1,j11,OPTi,j1,OPTi1,j ifXiYj
maxOPTi, j 1, OPTi 1, j if Xi 6 Y j
:
The University of Sydney Page 34
LCS Dynamic Programming: Algorithm
INPUT: n, m, X1..n, Y1..m
for j 0 to m
M0,j 0
for i 0 to n
Mi,0 0
for i 1 to n
for j 1 to m
if Xi Yj
Mi,j maxMi1,j1 1, Mi1,j, Mi,j1
else
Mi,j maxMi1,j, Mi,j1
return Mn,m
80 if i 0 or j 0
OPTi,j maxOPTi1,j11,OPTi,j1,OPTi1,j ifXiYj
maxOPTi, j 1, OPTi 1, j if Xi 6 Y j
:
The University of Sydney Page 35
LCS Dynamic Programming: Analysis
INPUT: n, m, X1..n, Y1..m
for j 0 to m
M0,j 0
for i 0 to n
Mi,0 0
for i 1 to n
for j 1 to m
On m
Onm
if Xi Yj
Mi,j maxMi1,j1 1, Mi1,j, Mi,j1
else
Mi,j maxMi1,j, Mi,j1
return Mn,m
Running time: Onm Space: Onm
The University of Sydney
Page 36
Shortest Paths
The University of Sydney Page 37
Shortest Paths
Shortest path problem. Given a directed graph G V, E, with edge weights cvw, find shortest path from node s to node t.
allow negative weights
2 10 3
9
s
18
30
5
8
20
7
6
6
15
16 6
4 19
6 16
t
11
44
The University of Sydney
Page 38
Shortest Paths: Attempts by Dijkstra
Negative edge costs.
2
u
3
Dijkstra Failed
sv
6 t
1
Reweighting. Adding a constant to every edge weight can fail.
55
22
s6 6t
3
3
3
0
The University of Sydney
Page 39
Shortest Paths: Negative Cost Cycles
Negative cost cycle.
6
4 7
Observation. If some path from s to t contains a negative cost cycle, there does not exist a shortest st path; otherwise, there exists one that is simple.
st
W cW 0
The University of Sydney
Page 40
Shortest Paths: Dynamic Programming
Problem: Find shortest path from s to t Step 1: Define subproblems
OPTi, v length of shortest vt path P using at most i edges.
The University of Sydney
Page 41
s
vt
i edges
Shortest Paths: Dynamic Programming
Step 2: Find recurrences
Case 1: P uses at most i1 edges.
OPTi, v OPTi1, v Case 2: P uses exactly i edges.
t
if v, w is first edge, then OPT uses v, w, and then selects best wt path using at most i1 edges
w
t
v
i1 edges
v
i1 edges
OPTi,v
minOPTi1,v, min OPTi1,wcvw v,wIE
The University of Sydney
Page 42
Shortest Paths: Dynamic Programming Step 3: Solve the base cases
OPT0,t 0 and OPT0,vt
The University of Sydney
Page 43
Shortest Paths: Dynamic Programming
Step 1: OPTi, v length of shortest vt path P using at most i edges.
Step 2:
Case 1: P uses at most i1 edges. OPTi, v OPTi1, v
Case 2: P uses exactly i edges.
if v, w is first edge, then OPT uses v, w, and then selects
best wt path using at most i1 edges
Step 3: OPT0,t 0 and OPT0,vt
0 if i0 and vt OPTi,v if i0 and vt
minOPTi1,v, min OPTi1,wcvw otherwise
The University of Sydney
v,wIE
Page 44
BellmanFord Shortest Path Implementation
Analysis. Qmn time, Qn2 space.
Finding the shortest paths. Maintain a successor for each table
entry.
The University of Sydney Page 45
Shortest Paths: Dynamic Programming Table
Single row in the table corresponds to the shortest path from a particular node to t, as we allow the path to use an increasing number of edges.
The University of Sydney Page 46
BellmanFord: Efficient Implementation
PushBasedShortestPathG, s, t foreach node v I V
Mv successorv
Mt 0
for i 1 to n1
foreach node w I V
if Mw has been updated in previous iteration
foreach node v such that v, w I E if Mv Mw cvw
Mv Mw cvw
successorv w
If no Mw value changed in iteration i, stop.
The University of Sydney Analysis: Qmn time, Qmn space. Page 47
Not Examined
BellmanFord Shortest Paths
Practical improvements.
Maintain only one array Mv shortest vt path that we have
found so far.
No need to check edges of the form v, w unless Mw changed in previous iteration.
Theorem: Throughout the algorithm, Mv is length of some vt path, and after i rounds of updates, the value Mv is no larger than the length of shortest vt path using i edges.
Overall impact.
Memory: Om n.
Running time: Omn worst case, but substantially faster in practice.
The University of Sydney
Page 48
48
Segmented Least Squares
The University of Sydney Page 49
Segmented Least Squares
Least squares.
Foundational problem in statistic and numerical analysis.
Given n points in the plane: x1, y1, x2, y2 , . . . , xn, yn.
Find a line y ax b that minimizes the sum of the squared error:
y
S S E an y i a x i b 2 i1
Solution. Calculus min error is achieved when The University of Sydney
x
anaixiyi aixiai yi, bai yi aaixi
n a x 2 a x 2 n ii ii
Page 50
Segmented Least Squares
Segmented least squares.
Points lie roughly on a sequence of several line segments.
Givennpointsintheplanex1,y1,x2,y2,…,xn,ynwith
x1 x2 … xn, find a sequence of lines that minimizes fx.
Question. Whats a reasonable choice for fx to balance
accuracy and complexity?
number of lines
y
The University of Sydney
x Page 51
Segmented Least Squares
Segmented least squares.
Points lie roughly on a sequence of several line segments.
Givennpointsintheplanex1,y1,x2,y2,…,xn,ynwith
x1 x2 … xn, find a sequence of lines that minimizes:
the sum of the sums of the squared errors E in each segment the number of lines L
Tradeoff function: E cL, for some constant c 0.
y
The University of Sydney
x Page 52
Dynamic Programming: Multiway Choice Step 1 Step 1: Define subproblems
OPTj minimum cost for points p1, p2 , . . . , pj.
The University of Sydney Page 53
Dynamic Programming: Multiway Choice Step 2
Notations:
OPTjminimumcostforpointsp1,p2 ,…,pj.
ei,j minimumsumofsquaresforpointspi,pi1 ,…,pj.
Step 2: Finding recurrences
Lastsegmentusespointspi,pi1 ,…,pj forsomei. Cost ei, j c OPTi1.
OPTj min ei,j c OPT i1 1ij
The University of Sydney Page 54
Dynamic Programming: Multiway Choice Step 3
Step 3: Solving the base cases OPT0 0
If j0 then OPT0 0
otherwise
OPTj min ei,j c OPT i1 1ij
The University of Sydney
Page 55
Segmented Least Squares: Algorithm
On2 iterations
On iterations
INPUT: n, p1,…,pn, c
SegmentedLeastSquares
M0 0
for j 1 to n
for i 1 to j
compute the least square error eij for the segment pi,…, pj
for j 1 to n
Mjmin1 i jeij cMi1
return Mn
If j0 then
OPT0 0
otherwise
OPTj min ei,j c OPT i1 1ij
The University of Sydney
Page 56
Segmented Least Squares: Algorithm
On2 iterations
On iterations
INPUT: n, p1,…,pn, c
SegmentedLeastSquares
M0 0
for j 1 to n
for i 1 to j
compute the least square error eij for the segment pi,…, pj
On
On
for j 1 to n
Mjmin1 i jeij cMi1
return Mn
The University of Sydney
Page 57
Running time: On3 Space: On2
RNA Secondary Structure
The University of Sydney Page 58
Motivation: Future Biology
Biology of the future should only involve a biologist and his dog:
thebiologisttowatchthe biological experiments and understand the hypotheses that the dataanalysis algorithms produce and
thedogtobitehimifheever touches the experiments or the computers.
The University of Sydney Page 59
Not Examined
The Central Dogma
The University of Sydney Page 60
Not Examined
RNA Ribonucleic acid Secondary Structure
RNA. String B b1b2…bn over alphabet A, C, G, U .
Secondary structure. RNA is singlestranded so it tends to loop
back and form base pairs with itself. This structure is essential
for understanding behavior of molecule.
CA AA AU
GC CGUAA G
Ex: GUCGAUUGAGCGAAUGUAACAACGUGGCUACGGCGAGA UGAUUA
G
ACGCU CGCGAGC
G
G AU
The University of Sydney
complementary base pairs: AU, CG G
Page 61
RNA Secondary Structure
Secondary structure. A set of pairs S bi, bj that satisfy: WatsonCrick. S is a matching and each pair in S is a WatsonCrick
complement: AU, UA, CG, or GC.
No sharp turns. The ends of each pair are separated by at least 4
intervening bases. If bi, bj I S, then i j 4.
Noncrossing. If bi, bj and bk, bl are two pairs in S, then we cannot
have i k j l.
Free energy. Usual hypothesis is that an RNA molecule will form the secondary structure with the optimum total free energy.
approximate by number of base pairs
Goal. Given an RNA molecule B b1b2…bn, find a secondary structure S that maximizes the number of base pairs.
The University of Sydney
Page 62
62
RNA Secondary Structure: Examples
GGGGG CUGGCU CGCGCU
AUAUAG UAUAUA
base pair
AUGUGGCCAU AUGGGGCAU AGUUGGCCAU 4
ok sharp turn crossing
The University of Sydney
Page 63
RNA Secondary Structure: Subproblems
First attempt Step 1. OPTj maximum number of base pairs in a secondary structure of the substring b1b2…bj.
match bt and bn
1
tn
Difficulty. Results in two subproblems.
Findingsecondarystructurein:b1b2…bt1.
Findingsecondarystructurein:bt1bt2…bn1.
OPTt1
need more subproblems
The University of Sydney
Page 64
RNA Secondary Structure: Subproblems recurrence
Including the pair t, j results in two independent subproblems. Recurrence using one variable
Recurrence using two variable
The University of Sydney Page 65
Dynamic Programming Over Intervals Step 1 Step 1: Define subproblems
OPTi, j maximum number of base pairs in a secondary structure of the substring
bibi1…bj.
The University of Sydney
Page 66
Dynamic Programming Over Intervals Step 2 Notation. OPTi, j maximum number of base pairs in a
secondary structure of the substring bibi1…bj. Step 2: Find recurrences
Case 1. Base bj is not involved in a pair. OPTi, j OPTi, j1
Case2. Basebjpairswithbt forsomeitj4.
noncrossing constraint decouples resulting subproblems
OPTi, j 1 max OPTi, t1 OPTt1, j1 i t j4
The University of Sydney
Page 67
Dynamic Programming Over Intervals Step 3 Step 3: Solve the base cases
If i 3 j 4 then
OPTi, j 0 by nosharp turns condition
also if j5
The University of Sydney
Page 68
Bottom Up Dynamic Programming Over Intervals
Question: What order to solve the subproblems? Answer: Do shortest intervals first.
RNA1,n
for k 5, 6, …, n1
for i 1, 2, …, nk jik
Compute OPTi,j
return OPT1,n
using the recurrence
0
0
0
0
0
0
i
4 3 2 1
Runningtime:On3 The University of Sydney
Page 69
6789 j
Dynamic Programming Summary I
3 steps:
1. Definingsubproblems
The University of Sydney
Page 70
2. 3.
Finding recurrences Solving the base cases
70
Dynamic Programming Summary II
1D dynamic programming
Weighted interval scheduling
Segmented Least Squares
Maximumsum contiguous subarray Longest increasing subsequence
2D dynamic programming Knapsack
Longest common subsequence Shortest path
Dynamic programming over intervals
RNA Secondary Structure The University of Sydney
Not Examined
Page 71