CS 341 F20 Lecture 9
Dynamic Programming II

Recall the maximum common subsequence problem from last day:
      TARMAC
      CATAMARAN
More sophisticated: count # changes
e.g., You:    Pythagorus
      Google: Pythagoras ?    1 change
      You:    recurance
      Google: recurrence ?    2 changes
A change is:
– add a letter (gap)
– delete a letter (gap)
– replace a letter (mismatch)
The problem comes up in bioinformatics for DNA strings.
DNA is a sequence of nucleotides, i.e., a string over the alphabet A, C, T, G.
This is called edit distance.
Two strings can be aligned in different ways:
e.g.  AACAT-
      AA-AAG
      3 changes (2 gaps, 1 mismatch)
e.g.  AACAT
      AAAAG
      2 changes (2 mismatches)
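
To make the counting concrete, here is a minimal sketch that scores a given alignment of AACAT and AAAAG. The function name `count_changes`, the use of "-" for a gap, and the exact gap placement in the 3-change alignment are assumptions made for illustration.

```python
def count_changes(top: str, bottom: str, gap: str = "-") -> tuple[int, int]:
    """Return (gaps, mismatches) for two aligned rows of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    gaps = 0
    mismatches = 0
    for a, b in zip(top, bottom):
        if a == gap and b == gap:
            raise ValueError("a column cannot consist of two gaps")
        if a == gap or b == gap:
            gaps += 1          # a letter was added or deleted
        elif a != b:
            mismatches += 1    # a letter was replaced
    return gaps, mismatches


# One possible 3-change alignment (gap placement assumed) and the 2-change one.
print(count_changes("AACAT-", "AA-AAG"))  # (2, 1)
print(count_changes("AACAT", "AAAAG"))    # (0, 2)
```

The number of changes for an alignment is the sum of the two counts; edit distance is the minimum of that sum over all alignments.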
Problem: Given strings x1..xm and y1..yn, compute their edit distance.
I.e., find the alignment that gives the minimum number of changes.

Dynamic Programming Algorithm
Subproblem: M(i,j) = minimum number of changes to match x1..xi and y1..yj.
choices:
- match xi to yj, pay replacement cost if they differ
- match xi to blank (delete xi)
- match yj to blank (add yj)

\[
M(i,j) = \min
\begin{cases}
M(i-1,j-1) & \text{if } x_i = y_j \\
r + M(i-1,j-1) & \text{if } x_i \ne y_j \\
d + M(i-1,j) & \text{match } x_i \text{ to blank} \\
a + M(i,j-1) & \text{match } y_j \text{ to blank}
\end{cases}
\]
where: r = replacement cost, d = delete cost, a = add cost.

So far, we used r = d = a = 1 (i.e., count # changes).
More sophisticated: r(xi,yj) - replacement cost depends on the letters.
e.g., r(a,s) = 1 because these keys are close on typewriter
      r(a,c) = 2 because these keys are not too close
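
The recurrence can be transcribed almost directly as a memoized recursion. This is a sketch, not the notes' own code: the names `edit_distance` and `keyboard_cost` are hypothetical, the base cases M(i,0) = i·d and M(0,j) = j·a follow the table initialization used in these notes, and the default cost of 2 for every key pair other than a/s is an assumption.

```python
from functools import lru_cache


def edit_distance(x, y, r=lambda p, q: 1, d=1, a=1):
    """Edit distance with replacement cost r(p, q), delete cost d, add cost a."""

    @lru_cache(maxsize=None)
    def M(i, j):
        if i == 0:
            return j * a              # add the first j letters of y
        if j == 0:
            return i * d              # delete the first i letters of x
        xi, yj = x[i - 1], y[j - 1]   # strings are 0-indexed in Python
        match = M(i - 1, j - 1) + (0 if xi == yj else r(xi, yj))
        return min(match,
                   d + M(i - 1, j),   # match xi to blank
                   a + M(i, j - 1))   # match yj to blank

    return M(len(x), len(y))


def keyboard_cost(p, q):
    """Hypothetical r(p, q): 1 for the adjacent keys a/s, 2 otherwise (assumed)."""
    return 1 if {p, q} == {"a", "s"} else 2


print(edit_distance("Pythagorus", "Pythagoras"))  # 1
print(edit_distance("recurance", "recurrence"))   # 2
print(edit_distance("a", "s", r=keyboard_cost))   # 1
```

Plain recursion like this is only suitable for short strings, since Python limits recursion depth; filling a table bottom-up avoids that.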
In what order do we solve subproblems? Same as last day.

M[0..m, 0..n]    (an (m+1) x (n+1) matrix)
for i = 0..m: M(i,0) = i·d    (delete i letters)
for j = 0..n: M(0,j) = j·a    (add j letters)
    (need these base cases before filling the rest)
for i = 1..m
  for j = 1..n
    M(i,j) = ...
    (fills the matrix row by row; or could do columns first)

Analysis: O(nm) time and O(nm) space (nm subproblems, constant time each)

A different application: music pattern matching
[figure: two musical phrases, "match this to this"]
use replacement rules that allow [musical notation: one note → two notes]
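
The loop order above can be sketched as a bottom-up table fill. The function name `edit_distance_table` is hypothetical, and for brevity this version assumes a single constant replacement cost r rather than a per-letter r(xi,yj).

```python
def edit_distance_table(x, y, r=1, d=1, a=1):
    """Return the full (m+1) x (n+1) table M; M[m][n] is the edit distance."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]

    for i in range(m + 1):
        M[i][0] = i * d          # delete i letters
    for j in range(n + 1):
        M[0][j] = j * a          # add j letters

    # Row by row: M[i-1][j-1], M[i-1][j] and M[i][j-1] are already filled.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            replace = 0 if x[i - 1] == y[j - 1] else r
            M[i][j] = min(replace + M[i - 1][j - 1],
                          d + M[i - 1][j],
                          a + M[i][j - 1])
    return M


table = edit_distance_table("AACAT", "AAAAG")
print(table[5][5])  # 2
```

Each of the (m+1)(n+1) entries is computed in constant time, which matches the O(nm) time and space bound.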
Weighted Interval Scheduling

Recall Interval Scheduling aka Activity Selection: Given a set of intervals I, find a maximum size subset of disjoint intervals.

Weighted Interval Scheduling: Given I and weight w(i) for each i ∈ I, find set S ⊆ I such that no two intervals overlap and maximize Σ_{i∈S} w(i).
e.g., you have preferences for certain activities.

A more general problem:
• I is a set of elements ("items")
• w(i) = weight of item i
• some pairs (i, j) conflict
Find a maximum weight subset S ⊆ I with no conflicting pairs.
Can be modeled as a graph: vertex = item, edge = conflict
Problem is Max Weight Independent Set and we will see later that it is NP-complete.

A general approach to finding a max weight independent set:
Consider one item i. Either we choose it or not.
OPT(I) = max{OPT(I \ {i}), w(i) + OPT(I')} where I' = the intervals disjoint from interval i.

In general this recursive solution does not give polynomial time: T(n) = 2T(n-1) + ..., which is exponential.
Essentially, we may end up solving subproblems for exponentially many subsets of I.

When I = set of intervals, we can do better.
Order intervals 1..n by right endpoint.
Something nice happens: the intervals among 1..i-1 that are disjoint from interval i are 1..j for some j.
For each i, let p(i) = largest index j < i s.t. interval j is disjoint from interval i.
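
As an illustration of the two ideas above, here is a sketch of the general choose-it-or-not recursion and of computing p(i) after sorting by right endpoint. The names `opt_naive` and `compute_p` are hypothetical. It assumes intervals are (start, end) pairs and that two intervals are disjoint when one ends no later than the other starts (touching endpoints allowed); p(i) is taken to be 0 when no such j exists.

```python
from bisect import bisect_right


def disjoint(u, v):
    """Assumed convention: intervals that only touch at an endpoint are disjoint."""
    return u[1] <= v[0] or v[1] <= u[0]


def opt_naive(intervals, weights):
    """OPT(I) = max{OPT(I \\ {i}), w(i) + OPT(I')}; exponential time in general."""
    def opt(items):                      # items: tuple of indices into intervals
        if not items:
            return 0
        i, rest = items[-1], items[:-1]
        compatible = tuple(k for k in rest if disjoint(intervals[k], intervals[i]))
        return max(opt(rest), weights[i] + opt(compatible))

    return opt(tuple(range(len(intervals))))


def compute_p(intervals):
    """Sort by right endpoint; return (ordered, p) with p 1-indexed, p[0] unused."""
    ordered = sorted(intervals, key=lambda iv: iv[1])
    ends = [end for _, end in ordered]
    p = [0] * (len(ordered) + 1)
    for idx, (start, _) in enumerate(ordered):
        # Count intervals before position idx whose end is <= start.
        # Because ends is sorted, these are exactly intervals 1..p(i).
        p[idx + 1] = bisect_right(ends, start, 0, idx)
    return ordered, p


print(opt_naive([(0, 3), (2, 5), (4, 7)], [3, 4, 3]))  # 6
```

Restricting the binary search to positions before idx is what enforces j < i.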