Recall the maximum common subsequence problem from last day:
e.g., TARMAC
      CATAMARAN

More sophisticated: count # changes
e.g., You: Pythagorus
      Google: Pythagoras?   1 change
e.g., You: recurance
      Google: recurrence?   2 changes
A change is:
– add a letter (gap)
– delete a letter (gap)
– replace a letter (mismatch)

CS 341 F20 Lecture 9: Dynamic Programming II

The problem comes up in bioinformatics for DNA strings.
DNA is a sequence of nucleotides, i.e., a string over the alphabet A, C, T, G.
This is called edit distance.
Two strings can be aligned in different ways:
e.g. AACAT-
     AA-AAG
3 changes
(2 gaps, 1 mismatch)
e.g. AACAT
     AAAAG
2 changes
(2 mismatches)
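
To make the counting concrete, here is a small Python sketch that tallies gaps and mismatches for a given alignment. The function name `count_changes` and the use of `-` as the gap symbol are assumptions made for illustration; the gap placement in the first example is one placement consistent with the stated counts (2 gaps, 1 mismatch).

```python
def count_changes(top, bottom, gap="-"):
    """Return (gaps, mismatches) for an alignment given as two equal-length rows.

    Assumption: a gap is written as '-' and never appears in both rows
    of the same column.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    gaps = mismatches = 0
    for a, b in zip(top, bottom):
        if a == gap and b == gap:
            raise ValueError("a column cannot pair two gaps")
        if a == gap or b == gap:
            gaps += 1          # add or delete a letter
        elif a != b:
            mismatches += 1    # replace a letter
    return gaps, mismatches


print(count_changes("AACAT-", "AA-AAG"))  # (2, 1): 3 changes
print(count_changes("AACAT", "AAAAG"))    # (0, 2): 2 changes
```

The number of changes for an alignment is the sum of the two counts; the edit distance is the minimum of that sum over all alignments.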

I
X
Pr
I.e
oblem:
.,
choices:
M
find the
Dynamic
Subproblem:
(i
,j)
-m
-m
-m
Given
Prog
align
atch
atch
M
atch yj to
= min
2
xi to
s
m
(i,j)
xi to
tr
> >: a +
ent
8> >< M ( i r+ d+ ings x1..xm bla bla that ramming Algorithm = min yi, gives t pay replacement nk (delete nk (add M(i 1, j 1) 1, j 1,j) M (i, j 1) M(i yi j x ) i) CS and y1..yn he min imum number of 1) 341 F20 if ma , co cost if ma xi 6 tch tch Lecture = mpute imum numb changes yj x yj to 9 they i to their er of di↵er blan blan e dit changes. to match x1..xi1xi ka ↳ distance. d = delete = ad x, d Xu and y1..yj1yj. cost c ost 2 if xi = yj k wher r e: = r eplac emen t cost -- i. i So far, we used r = d= a = 1 (i. e., co unt # changes). Y - ya , -- - Tj Mor e sop histicated: r(a, c) = 2 beca .. . not use these ke too close ys are clos cost depend e on ty s on the pewriter , letters. - - . wy ← j+ - , - - r(xi,yj) - replacement gap e.g., r(a, s) = 1 In what order do we solve subproblems? Same as last day. M [0..m, 0..n] for i = 0..m: M(i,0) = id ofneed -' hese forj=0..n:M(0,j)=ja - - delete i letters add le tte rs 2 6 • j J 33 subproblems for i = 1..m } fill matrix in order 1¥) i64 o ca A di↵erent application: music pattern matching matrix m xn Analysis: O(nm) time and O(nm) space (nm subproblems, constant time each) CS 341 F20 Lecture 9 3 7 forj=1..n (or coulddo 75 M(i,j)=... columnsfirst) ← ← this match this to use replacement rules that allowd→•l•l Recall I size A subset Weighted I nterval Weighted Interval e.g., you more • I is • w(i) = • som Find a m Can Pr be oblem is e gene a set S have ral aximum mo o weight pairs deled as Max cheduling aka of disjoint nterval Sche probl f element (“ o a int Schedulin ervals: duling: rences em: f item i (i, j) conflict graph: g Activity items”) Gi ven I CS 341 F20 Selection: and Lecture wreairreangeght 9 Given w(i) a set of for each inter i valsI, 2 I, find a maximum find set S ✓ I su ch 4 that no two inte rvals pre overl fe ap and for Pi2S w(i). certain activities. maximize weight subset Weight Independent Set and S⇢ ver tex I with no = item we c onflicti e dge will see ng pairs. = conflict later that it is NP-complete. 
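
Returning to the edit distance table M(i, j) from earlier in this lecture, the following Python sketch fills the matrix row by row. The function name `edit_distance`, the keyword arguments, and the optional `repl_cost` callback (for letter-dependent replacement costs r(xi, yj)) are illustrative assumptions, not part of the lecture's pseudocode.

```python
def edit_distance(x, y, r=1, d=1, a=1, repl_cost=None):
    """Edit distance between strings x and y by dynamic programming.

    r, d, a: replacement, delete, and add costs (all 1 counts # changes).
    repl_cost: optional function (xi, yj) -> cost, used instead of r
               when the replacement cost depends on the letters (assumption).
    """
    m, n = len(x), len(y)
    if repl_cost is None:
        repl_cost = lambda p, q: r

    # M[i][j] = min cost to match x[0:i] and y[0:j]
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i * d              # delete i letters
    for j in range(n + 1):
        M[0][j] = j * a              # add j letters

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                diag = M[i - 1][j - 1]                      # match, no cost
            else:
                diag = repl_cost(x[i - 1], y[j - 1]) + M[i - 1][j - 1]
            M[i][j] = min(diag,
                          d + M[i - 1][j],                  # xi matched to blank
                          a + M[i][j - 1])                  # yj matched to blank
    return M[m][n]


print(edit_distance("Pythagorus", "Pythagoras"))  # 1
print(edit_distance("recurance", "recurrence"))   # 2
print(edit_distance("AACAT", "AAAAG"))            # 2
```

Each of the (m+1)(n+1) entries is computed in constant time, which matches the O(nm) time and space analysis. Python strings are 0-indexed, so xi corresponds to `x[i - 1]`.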
A general approach to finding a maximum weight independent set:
Consider one item i. Either we choose it or not.
\[
\mathrm{OPT}(I) = \max\{\mathrm{OPT}(I \setminus \{i\}),\; w(i) + \mathrm{OPT}(I')\}
\]
where I' = the intervals disjoint from i.
In general this recursive solution does T(n) = 2T(n-1) + ..., which is exponential.
Essentially, we may end up solving exponentially many subproblems.

When I is a set of intervals, we can do better.
Order intervals 1..n by right endpoint. Then something nice happens: the subproblems that arise are intervals 1..j for some j.
For each i, let p(i) = largest index j < i s.t. interval j is disjoint from interval i.
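
As an illustration of the definition of p(i), here is a Python sketch that sorts intervals by right endpoint and computes p with binary search. The function name `compute_p`, the (start, finish) tuple representation, and the convention that intervals touching at an endpoint count as disjoint are all assumptions, not taken from the lecture.

```python
from bisect import bisect_right


def compute_p(intervals):
    """Sort intervals by right endpoint and compute p(i) for each.

    intervals: list of (start, finish) pairs.
    Returns (ordered, p), where ordered is the sorted list and p is
    1-indexed: p[i] = largest j < i with finish_j <= start_i, or 0 if none.
    Assumption: intervals that only touch at an endpoint are disjoint.
    """
    ordered = sorted(intervals, key=lambda iv: iv[1])
    finishes = [f for _, f in ordered]
    p = [0] * (len(ordered) + 1)      # p[0] is unused padding
    for i, (start, _) in enumerate(ordered, start=1):
        # Count intervals among the first i-1 whose finish is <= start.
        # Because finishes are sorted, that count is exactly p(i).
        p[i] = bisect_right(finishes, start, 0, i - 1)
    return ordered, p


ordered, p = compute_p([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (6, 10)])
print(p[1:])  # [0, 0, 0, 2, 0, 3]
```

Sorting takes O(n log n) time and each binary search takes O(log n), so all values p(1..n) are available in O(n log n) time.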