计算机代考 Local alignment (Smith-

Local alignment (Smith-
Waterman), overlap alignments,
using non-linear gap penalties CompSci 369, 2022

Copyright By PowCoder代写 加微信 powcoder

School of Computer Science, University of Auckland

Last lecture
Principle of optimality
recurrence relation Needleman-Wunsch algorithm elements of an alignment algorithm

This lecture
Local alignment algorithm
Overlap alignment
Alignment using more complex gap penalties

Elements of alignment algorithm
a recurrence relation for the quantity we are trying to optimise boundary conditions
tabular computing to eciently calculate the recurrence traceback, includes rules for where to start and stop traceback.

Local alignment: algorithm
Want to nd best alignment for subsequences of Fix and .
Dene a Sux alignment: any alignment of
for some and .
j≤r≤1 i≤s≤1 jy…1+ryry
ix…1+sxsx

Let highest score of any sux alignment of and Since score of empty alignment is 0,
So instead of a letting a score for an alignment to go negative, we start a new alignment.
Traceback Starts at highest value in matrix, trace it back until we hit rst 0 with no predecesor.
. d − ) 1 − j , i ( F ⎪⎩⎪ ⎪
,d − )j ,1 − i( F ⎨ xam = )j ,i( F ,)iy,ix(s+)1−j,1−i(F⎪⎧
0 ≥ )j,i(F

Local alignment example
Find best local alignment of x = ATA and y = AGTTA with same gap penalty and score matrix as before.
Want to ll out

Highest score is at , so start traceback from there and stop when hit rst zero to get

Overlap alignment
Often have seqeunces x and y are related like:
Want a global alignment algorithm that doesn’t penalise the preceding or succeeding gaps.
————AGGGTCCTGGGTGCAGGGCACTG—– = x
GCTTAGCCGTTTTGGGTCCTGGGAGCAGGGCACTCACGAA = y
GCTTAGCCGTTTTGGGTCCTGGGAGCAGGGCACTCACGAA = x
———TTTTAGGTCCTGGGTGCAGGCCACT—— = y
GGCTGTTCTATGCATGAG———————- = x
—–TTCTATGCATGAGACAACTAGCCGTAAATTGGCGG = y
———-CGGGATCATATGAGCAATACGCCAAATGAA = x
ACATGGGTCACGGGGTCATA——————– = y

Overlap alignment
Use global aligment algorithm and just change boundary conditions and traceback:
Set for all and so that overlapping ends on the left are not penalised.
Use the same recurrence relation as for the Needleman-Wunsch algorithm Start traceback at maximum value on bottom/right boundaries, or
,some or .
Stop traceback when top or right boundary is reached.
j i 0 = )j,0(F = )0,i(F
j i )m,i(F

More complex gap penalities
Linear gap penalty results in an unrealisitc distribution of gaps. With an artibtrary gap penalty , get global recurrence relation:
.1 − j , … ,0 = k ,)k − j(γ + )k ,i( F ⎩
,1 − i , … ,0 = k ,)k − i(γ + )j ,k( F ⎨ xam = j,iF
,)jy,ix(s+)1−j,1−i(F⎧ )k(γ

More complex gap penalities
.1 − j , … ,0 = k ,)k − j(γ + )k ,i( F ⎩
,1 − i , … ,0 = k ,)k − i(γ + )j ,k( F ⎨ xam = j,iF
,)jy,ix(s+)1−j,1−i(F⎧

For need to consider other cells: diagonal + everything same row and column. Result is , not in speed
)2n(O )3n(O
1+j+i )j,i(F

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com