Local alignment (Smith-
Waterman), overlap alignments,
using non-linear gap penalties CompSci 369, 2022
Copyright By PowCoder代写 加微信 powcoder
School of Computer Science, University of Auckland
Last lecture
Principle of optimality
recurrence relation Needleman-Wunsch algorithm elements of an alignment algorithm
This lecture
Local alignment algorithm
Overlap alignment
Alignment using more complex gap penalties
Elements of alignment algorithm
a recurrence relation for the quantity we are trying to optimise boundary conditions
tabular computing to e ciently calculate the recurrence traceback, includes rules for where to start and stop traceback.
Local alignment: algorithm
Want to nd best alignment for subsequences of Fix and .
De ne a Su x alignment: any alignment of
for some and .
j≤r≤1 i≤s≤1 jy…1+ryry
ix…1+sxsx
Let highest score of any su x alignment of and Since score of empty alignment is 0,
So instead of a letting a score for an alignment to go negative, we start a new alignment.
Traceback Starts at highest value in matrix, trace it back until we hit rst 0 with no predecesor.
. d − ) 1 − j , i ( F ⎪⎩⎪ ⎪
,d − )j ,1 − i( F ⎨ xam = )j ,i( F ,)iy,ix(s+)1−j,1−i(F⎪⎧
0 ≥ )j,i(F
Local alignment example
Find best local alignment of x = ATA and y = AGTTA with same gap penalty and score matrix as before.
Want to ll out
Highest score is at , so start traceback from there and stop when hit rst zero to get
Overlap alignment
Often have seqeunces x and y are related like:
Want a global alignment algorithm that doesn’t penalise the preceding or succeeding gaps.
————AGGGTCCTGGGTGCAGGGCACTG—– = x
GCTTAGCCGTTTTGGGTCCTGGGAGCAGGGCACTCACGAA = y
GCTTAGCCGTTTTGGGTCCTGGGAGCAGGGCACTCACGAA = x
———TTTTAGGTCCTGGGTGCAGGCCACT—— = y
GGCTGTTCTATGCATGAG———————- = x
—–TTCTATGCATGAGACAACTAGCCGTAAATTGGCGG = y
———-CGGGATCATATGAGCAATACGCCAAATGAA = x
ACATGGGTCACGGGGTCATA——————– = y
Overlap alignment
Use global aligment algorithm and just change boundary conditions and traceback:
Set for all and so that overlapping ends on the left are not penalised.
Use the same recurrence relation as for the Needleman-Wunsch algorithm Start traceback at maximum value on bottom/right boundaries, or
,some or .
Stop traceback when top or right boundary is reached.
j i 0 = )j,0(F = )0,i(F
j i )m,i(F
More complex gap penalities
Linear gap penalty results in an unrealisitc distribution of gaps. With an artibtrary gap penalty , get global recurrence relation:
.1 − j , … ,0 = k ,)k − j(γ + )k ,i( F ⎩
,1 − i , … ,0 = k ,)k − i(γ + )j ,k( F ⎨ xam = j,iF
,)jy,ix(s+)1−j,1−i(F⎧ )k(γ
More complex gap penalities
.1 − j , … ,0 = k ,)k − j(γ + )k ,i( F ⎩
,1 − i , … ,0 = k ,)k − i(γ + )j ,k( F ⎨ xam = j,iF
,)jy,ix(s+)1−j,1−i(F⎧
For need to consider other cells: diagonal + everything same row and column. Result is , not in speed
)2n(O )3n(O
1+j+i )j,i(F
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com