Lecture 9: Reaction kinetics and deterministic ODE models
We have looked at stochastic models of biological systems. In the limit of large numbers of molecules, systems of coupled ODEs are the most common class of models in computational biology.
Law of mass action
• Consider the chemical reaction with molecules A, B, C, D and stoichiometric coefficients a, b, c, d:
aA + bB \underset{k_-}{\overset{k_+}{\rightleftharpoons}} cC + dD
• The reaction rate v(t) is given by (square brackets denote concentrations)
v(t) = -\frac{1}{a}\frac{d[A]}{dt} = -\frac{1}{b}\frac{d[B]}{dt} = \frac{1}{c}\frac{d[C]}{dt} = \frac{1}{d}\frac{d[D]}{dt}
• Law of mass action: In equilibrium the rates of the forward and backward reactions are equal:
k_+ [A]^a [B]^b = k_- [C]^c [D]^d
(Remark: This applies to simple, one-step reactions in solution or gas. For any more complex reactions the exponents can only be deduced experimentally.)
• Far from equilibrium we write:
\frac{d[A]}{dt} = -a k_+ [A]^a [B]^b + a k_- [C]^c [D]^d
Example 14. Lotka-Volterra equations (ecology, predator-prey model)
\dot{X} = k_1 X - k_2 X Y
\dot{Y} = k_2 X Y - k_3 Y
• The underlying reaction scheme could be formulated as:
2A + X \xrightarrow{k_1} 2X   (1)
X + Y \xrightarrow{k_2} 2Y   (2)
Y \xrightarrow{k_3} B
• [A] is assumed to be large, so we ignore any reverse reactions and any drop in [A]; [A] is implicitly incorporated into k_1.
• Reactions (1,2) are autocatalytic and result in oscillations of [X] and [Y] (see the simulation sketch below).
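As an illustration, a minimal Matlab sketch (not one of the course files) that integrates the Lotka-Volterra equations with ode45; the parameter values, initial conditions and time span are arbitrary choices for demonstration only.

% Sketch: numerical solution of the Lotka-Volterra equations with ode45.
% Parameter values and initial conditions are illustrative assumptions.
k1 = 1.0; k2 = 0.5; k3 = 0.8;
lv = @(t, z) [k1*z(1) - k2*z(1)*z(2);    % dX/dt
              k2*z(1)*z(2) - k3*z(2)];   % dY/dt
[t, z] = ode45(lv, [0 40], [2; 1]);      % integrate from t = 0 to 40, X0 = 2, Y0 = 1
plot(t, z(:,1), t, z(:,2));
xlabel('time'); ylabel('population'); legend('X (prey)', 'Y (predator)');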
Michaelis-Menten (MM) kinetics
• Consider the conversion of substrate S to product P by an enzyme E, which involves the formation of an intermediate enzyme-substrate complex ES.
E + S \underset{k_1^-}{\overset{k_1^+}{\rightleftharpoons}} ES \overset{k_2^+}{\longrightarrow} P + E
• Noting that [E] is small compared to [S] and does not change overall, we can derive:
\frac{d[S]}{dt} = -\frac{V_{max}[S]}{[S] + K_m}
• V_{max} = k_2^+ ([E] + [ES]) and Michaelis-Menten constant K_m = \frac{k_1^- + k_2^+}{k_1^+}
• K_m is the substrate concentration [S] at which the reaction runs at half-maximum speed.
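The notes state this rate law without derivation; a brief quasi-steady-state sketch (standard textbook reasoning, not from the lecture):

\frac{d[ES]}{dt} = k_1^+ [E][S] - (k_1^- + k_2^+)[ES] \approx 0 \;\Rightarrow\; [ES] = \frac{[E][S]}{K_m}
With the conserved total enzyme [E]_T = [E] + [ES]:
[ES] = \frac{[E]_T [S]}{K_m + [S]} \;\Rightarrow\; -\frac{d[S]}{dt} \approx k_2^+ [ES] = \frac{V_{max}[S]}{K_m + [S]}, \qquad V_{max} = k_2^+ [E]_T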
Other common enzyme kinetics
• MM is one of the most common rate laws in computational models
• A generalised version of MM kinetics is:
\frac{d[S]}{dt} = -V_{max}\left[1 - \exp(-[S]/K_m)\right]
• Logistic growth: \frac{d[S]}{dt} = [S](1 - [S]/K)
• The Hill equation (sigmoidal activation function) assumes cooperativity of binding (see GRN lecture)
\frac{d[Y]}{dt} = \frac{k[S]^n}{[S]^n + K_a^n}
• More abstract formulations exist that are based, for example, on power-law kinetics. (A short comparison of the rate laws above is sketched below.)
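A small Matlab sketch (not a course file) that plots the shapes of these rate laws; all parameter values are arbitrary illustrative choices.

% Sketch: compare the shapes of the Michaelis-Menten, generalised MM and Hill rate laws.
% All parameter values below are illustrative assumptions.
S     = linspace(0, 10, 200);
Vmax  = 1;  Km = 2;  k = 1;  Ka = 2;  n = 4;
vMM   = Vmax * S ./ (S + Km);            % Michaelis-Menten
vGen  = Vmax * (1 - exp(-S/Km));         % generalised MM
vHill = k * S.^n ./ (S.^n + Ka^n);       % Hill equation (cooperative, sigmoidal)
plot(S, vMM, S, vGen, S, vHill);
xlabel('[S]'); ylabel('rate'); legend('MM', 'generalised MM', 'Hill, n = 4');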
Three protein oscillator
• Temperature-compensated circadian rhythms can be created in vitro
• In bacteria the clock runs over several weeks, and is passed on to daughter cells
• KaiA, KaiB, KaiC: Three proteins isolated from a cyanobacterium, plus an energy source (ATP)
• KaiC: can exist in four forms (U: unphosphorylated, S: Serine-P, T: Threonine-P, ST: doubly phosphorylated)
• KaiA enhances autophosphorylation of KaiC; without KaiA, KaiC dephosphorylates
• KaiB inhibits KaiA
Figure 15: Kai clock model
• Rust et al., Science 318, 809-12, 2007.
With D for the doubly phosphorylated state ST we have:
\dot{T} = k_{UT}(S)\,U + k_{DT}(S)\,D - k_{TU}(S)\,T - k_{TD}(S)\,T,
\dot{D} = k_{TD}(S)\,T + k_{SD}(S)\,S - k_{DT}(S)\,D - k_{DS}(S)\,D,
\dot{S} = k_{US}(S)\,U + k_{DS}(S)\,D - k_{SU}(S)\,S - k_{SD}(S)\,S,
U = 3.4 - T - D - S,
k_{XY}(S) = k^0_{XY} + \frac{k^A_{XY}\,A(S)}{K + A(S)},
A(S) = \max\{0,\, [\mathrm{KaiA}] - 2S\},
with initial conditions T(0) = 0.68, D(0) = 1.36, S(0) = 0.34,
and parameters [KaiA] = 1.3, K = 0.43.
• The concentration of active KaiA monomers (A) is defined as the total concen- tration of KaiA monomers ([KaiA]) minus 2 times the concentration of S (which inhibits KaiA via KaiB), but cannot be less than zero.
• Remark: KaiB’s binding to S is not included. We assume KaiB is in excess at all times and that its interaction with S is fast.
Table 7: Kai model parameters (rate constants k^0_XY and k^A_XY for each transition XY)

  XY    k^0_XY    k^A_XY
  UT    0          0.479077
  TD    0          0.212923
  SD    0          0.505692
  US    0          0.0532308
  TU    0.21       0.0798462
  DT    0          0.173
  DS    0.31      -0.319385
  SU    0.11      -0.133077
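A minimal Matlab sketch (not the course code) that integrates this model with ode45, using the equations, initial conditions and Table 7 parameters above; the time span and axis labels are assumptions.

% Sketch: simulate the Kai clock model above with ode45.
KaiA = 1.3;  K = 0.43;
k0 = struct('UT',0,'TD',0,'SD',0,'US',0,'TU',0.21,'DT',0,'DS',0.31,'SU',0.11);
kA = struct('UT',0.479077,'TD',0.212923,'SD',0.505692,'US',0.0532308, ...
            'TU',0.0798462,'DT',0.173,'DS',-0.319385,'SU',-0.133077);
A   = @(S) max(0, KaiA - 2*S);                     % active KaiA monomers
kXY = @(nm,S) k0.(nm) + kA.(nm)*A(S)/(K + A(S));   % rate "constant" k_XY(S)
% y = [T; D; S]; U = 3.4 - T - D - S by conservation of total KaiC
kai = @(t,y) [ ...
  kXY('UT',y(3))*(3.4-sum(y)) + kXY('DT',y(3))*y(2) - kXY('TU',y(3))*y(1) - kXY('TD',y(3))*y(1); ...
  kXY('TD',y(3))*y(1) + kXY('SD',y(3))*y(3) - kXY('DT',y(3))*y(2) - kXY('DS',y(3))*y(2); ...
  kXY('US',y(3))*(3.4-sum(y)) + kXY('DS',y(3))*y(2) - kXY('SU',y(3))*y(3) - kXY('SD',y(3))*y(3) ];
[t, y] = ode45(kai, [0 120], [0.68; 1.36; 0.34]);  % time span assumed
plot(t, y); xlabel('time'); ylabel('phosphoform concentration'); legend('T','D','S');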
Numerical solutions of ODE models
• Generally, for non-linear systems solutions in closed form are not possible
• Efficient solvers are paramount if the forward problem has to be solved many times,
as required when performing parameter fitting (see next lecture).
• Runge-Kutta solvers of 4th order with adaptive timestep are usually a good choice.
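For illustration, a minimal fixed-step classical 4th-order Runge-Kutta integrator in Matlab (a sketch only; in practice adaptive solvers such as ode45 are preferred, as noted above).

% Sketch: a fixed-step classical RK4 integrator (save as rk4.m).
% f: function handle dy/dt = f(t,y); tspan = [t0 tend]; y0: initial value; N: number of steps.
function [t, y] = rk4(f, tspan, y0, N)
    t = linspace(tspan(1), tspan(2), N+1)';
    h = t(2) - t(1);
    y = zeros(N+1, numel(y0));  y(1,:) = y0(:)';
    for i = 1:N
        yi = y(i,:)';
        k1 = f(t(i),       yi);
        k2 = f(t(i) + h/2, yi + h/2*k1);
        k3 = f(t(i) + h/2, yi + h/2*k2);
        k4 = f(t(i) + h,   yi + h*k3);
        y(i+1,:) = (yi + h/6*(k1 + 2*k2 + 2*k3 + k4))';
    end
end
% Example use: [t, y] = rk4(@(t,y) -y, [0 5], 1, 100);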
Stiff differential equations
• “Stiff” problems contain solutions which rapidly change, so standard solvers must take extremely small steps to remain stable.
• These problems arise for solutions of the form e−λt , where λ is a positive constant.
• Stiff solvers (Matlab ode23s, for example) employ implicit-multistep methods and
achieve faster convergence.
Example 15. For the initial value problem y'(t) = -15 y(t), t ≥ 0, y(0) = 1,
the exact solution is y(t) = e^{-15t}, and clearly y(t) → 0 as t → ∞.
However, the magnitude of the truncation error (involving the n'th derivatives) is 15^n exp(-15t), which can be large, so a standard explicit solver must take very small steps to keep the error under control.
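A small Matlab sketch (an illustration, not from the notes) comparing a standard and a stiff solver on Example 15; the relative step counts depend on the requested tolerances.

% Sketch: solve y' = -15 y, y(0) = 1 with a standard and a stiff solver.
f = @(t, y) -15*y;
[t1, y1] = ode45(f,  [0 1], 1);   % explicit Runge-Kutta (standard)
[t2, y2] = ode23s(f, [0 1], 1);   % implicit stiff solver
fprintf('ode45: %d steps, ode23s: %d steps\n', numel(t1)-1, numel(t2)-1);
plot(t1, y1, 'o-', t2, y2, 's-', t1, exp(-15*t1), 'k--');
legend('ode45', 'ode23s', 'exact e^{-15t}');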
Lecture 10: Fitting models to data
• Consider the Lotka-Volterra (LV) equations, which describe oscillations of populations (or molecules) X and Y over time:
\dot{X} = k_1 X - k_2 X Y
\dot{Y} = k_2 X Y - k_3 Y
(the "Hello World" of dynamical systems), with initial conditions X_0 and Y_0, and parameters k_1, k_2 and k_3.
• Solutions for any time dependent ODE/SDE model can be written in the form
f(t, \Theta) = \begin{pmatrix} f_1(t,\Theta) \\ f_2(t,\Theta) \end{pmatrix}
• For LV, parameter vector Θ could either be θ = (k1, k2, k3) [fixed X0 and Y0 ], or θ∗ = (X0, Y0, k1, k2, k3) treating initial conditions as freely adjustable parameters.
• Goal: Fit the parameters Θ such that the model time series f(t_i, Θ) best fits the data D(t_i) at specific time points t_i.
Least squares fitting
• Consider an n-variable time dependent model with solutions f_{1,...,n}(t, Θ) that depend on parameters Θ. Data are given as g_{1,...,n}(t) at time points t = t_1, ..., t_T
• We formulate an objective function based on the error y(t, Θ) = f(t, Θ) - g(t) and find the minimum error by minimising the sum of squares:
\min_\Theta \|y(t, \Theta)\|^2 = \min_\Theta \left( y(t_1, \Theta)^2 + y(t_2, \Theta)^2 + \ldots + y(t_T, \Theta)^2 \right)
• Gradient/steepest descent:
(i) Assume an objective function y(Θ) with parameters Θ.
(ii) Begin with an initial approximation Θ_{n=1}. y(Θ) decreases most rapidly in the direction of the vector -\nabla y |_{Θ_{n=1}}.
(iii) Iteratively select better approximations using
\Theta_n = \Theta_{n-1} - t\, \nabla_\Theta y \,\big|_{\Theta_{n-1}}
Gradient descent has linear convergence
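A minimal Matlab sketch of the iteration; the objective, its gradient and the step size t are hypothetical choices for illustration only.

% Sketch: steepest descent with a fixed step size t on a toy objective.
y     = @(Th) (Th(1) - 2)^2 + 5*(Th(2) + 1)^2;   % illustrative objective (assumed)
grady = @(Th) [2*(Th(1) - 2); 10*(Th(2) + 1)];   % its gradient
Th = [0; 0];  t = 0.05;                          % initial approximation and step size
for n = 1:200
    Th = Th - t*grady(Th);                       % Theta_n = Theta_{n-1} - t*grad y
end
fprintf('y = %.3g at Theta = (%.3g, %.3g)\n', y(Th), Th(1), Th(2));  % approaches (2, -1)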
Newton’s method (Non-linear least squares, NLLS)
• Newton’s formula for finding the minimum of y(θ) achieves quadratic convergence by letting t = 1/y′′, but only if we start close to our true solution
\theta_n = \theta_{n-1} - \frac{y'(\theta_{n-1})}{y''(\theta_{n-1})}
(Remark: If the profile of y(θ) has a distinct local minimum (y′′ > 0) we need to take smaller steps, but have more confidence in determining θ)
• For Θ = (θ_1, ..., θ_N) Newton's method can be written in matrix form as
\Theta_n = \Theta_{n-1} - \nabla y(\Theta_{n-1}) \left( H_y(\Theta_{n-1})^{-1} \right)'
\nabla y(\Theta) = \mathrm{grad}(y(\Theta)) = \left( \frac{\partial y}{\partial \Theta_1}, \frac{\partial y}{\partial \Theta_2}, \ldots, \frac{\partial y}{\partial \Theta_N} \right)
The Hessian matrix H represents the second derivatives of the function:
H_y(\Theta) = \left[ \frac{\partial^2 y(\Theta)}{\partial \Theta_i \partial \Theta_j} \right]_{N \times N}
Matlab example: LVfit_leastsquares.m
Remarks:
• All NLLS (non-linear least squares) optimisation methods follow the idea of Newton's method. Example: lsqnonlin in Matlab
• NLLS methods can get stuck in local minima. Different start values are needed to explore other minima
• The Hessian can be used to directly estimate confidence intervals for the parameters. (A fitting sketch is given below.)
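An illustrative Matlab sketch of the LV fit with lsqnonlin. This is not the course file LVfit_leastsquares.m; the data here are simulated, the "true" parameters and start values are assumptions, and lsqnonlin requires the Optimization Toolbox.

% Sketch: fit the LV parameters (k1, k2, k3) to time-series data with lsqnonlin.
% The "data" are simulated; initial conditions X0 = 2, Y0 = 1 are held fixed.
tdata = 0:0.5:20;
lv    = @(k) @(t, z) [k(1)*z(1) - k(2)*z(1)*z(2); k(2)*z(1)*z(2) - k(3)*z(2)];
model = @(k) deval(ode45(lv(k), [0 max(tdata)], [2; 1]), tdata);   % 2-by-T solution
ktrue = [1.0 0.5 0.8];                                % assumed "true" parameters
data  = model(ktrue) + 0.05*randn(2, numel(tdata));   % noisy pseudo-data
res   = @(k) reshape(model(k) - data, [], 1);         % residual vector y(t, Theta)
kfit  = lsqnonlin(res, [0.5 0.5 0.5]);                % start values for (k1, k2, k3)
disp(kfit)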
Markov Chain Monte Carlo (MCMC) methods for fitting models
• Suppose we have data D(t_i) and want to find the model f(t_i|Θ) that best fits the data
• Remark: Following a Bayesian approach we use conditional probabilities (indicated by the "|" sign). As you will see, you won't require an in-depth understanding of Bayesian statistics; MCMC model fitting is rather straightforward.
• Again we define an error function and employ the sum of squared errors, writing
\chi^2 = \sum_i \frac{(D(t_i) - f(t_i|\Theta))^2}{2\sigma_\chi^2}
σ_χ can be guessed, unless you have an estimate of the expected error
• The likelihood P of our parameters Θ explaining the data is higher when the error is minimal. A simple choice is using
P(D|\Theta) = e^{-\chi^2}
Metropolis MCMC algorithm
(i) Start with a guess for Θ and call that Θ_current. Solve the model and compute χ²_current.
(ii) Propose a new guess Θ_proposed = Θ_current + σN(0, 1). Solve the model and compute χ²_proposed.
(iii) To check whether P improves or not, compute the likelihood ratio
ratio = \frac{P(D|\Theta_{proposed})}{P(D|\Theta_{current})} = e^{\chi^2_{current} - \chi^2_{proposed}}
(iv) Pick a uniformly distributed random number r ∈ [0, 1]. If r < ratio accept the proposed guess; otherwise keep the current guess.
(v) Proceed with (ii) until you see Θ converging.
• Because it can take a long time to achieve convergence, you only store values for Θ and χ² after some "burn-in" period
• Optimal parameters will be the ones where χ² is minimal, but usually you are interested in the stationary distribution of Θ. Matlab example: MetLV.m
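A compact Matlab sketch of the algorithm on a toy one-parameter model. This is not the course file MetLV.m; the data, the model f(t|θ) = θt, the error scale and the proposal width are illustrative assumptions.

% Sketch: Metropolis MCMC for a toy model f(t|theta) = theta*t.
tdata  = (1:20)';
D      = 2.5*tdata + randn(20, 1);        % pseudo-data, "true" theta = 2.5
sigchi = 1;                               % guessed error scale sigma_chi
chi2   = @(th) sum((D - th*tdata).^2) / (2*sigchi^2);

nsteps = 5000;  burnin = 1000;  sigma = 0.05;        % chain length, burn-in, proposal width
theta  = 1;  c2 = chi2(theta);                       % (i) initial guess
samples = zeros(nsteps - burnin, 1);
for n = 1:nsteps
    prop  = theta + sigma*randn;                     % (ii) propose Theta_current + sigma*N(0,1)
    c2p   = chi2(prop);
    ratio = exp(c2 - c2p);                           % (iii) likelihood ratio
    if rand < ratio                                  % (iv) accept the proposal
        theta = prop;  c2 = c2p;
    end
    if n > burnin, samples(n - burnin) = theta; end  % (v) store after burn-in
end
histogram(samples); xlabel('\theta');                % stationary distribution of theta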