Numerical Methods & Scientific Computing: lecture notes
Data fitting
Hybrid methods
Since we have
a method [steepest descent with line search] which guarantees global convergence but can stagnate near the minimum, and
a method [Gauss-Newton] which is not guaranteed to converge unless you're close to a minimum, but which typically converges fast near the minimum,
⇒ we should try a hybrid method that combines the best features of both,
just as fzero does for root-finding.
We present it as a trust region method [it could also be presented as a line search method].
Trust region methods
The idea of trust region (TR) methods is to locally minimize a model of the cost function within a trust region of radius Δ about the current position, a region in which you believe the model is good.
A pseudocode would be:
Initialize the trust region radius Δ
Do until termination criteria are satisfied
    solve the trust region subproblem
    [minimize the model inside the trust region, giving a trial point x_t]
    test the trial point x_t and the TR radius, and
    decide whether to accept x_t and/or adjust the radius
end
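A minimal MATLAB sketch of this loop, for illustration only: the helper names converged and tr_subproblem, the cost function C, the acceptance threshold and the update factors are assumptions, not part of these notes.

% x holds the current position (assumed initialized)
Delta = 1;                                  % initial trust region radius
while ~converged(x)                         % termination test (placeholder)
    [s, pred] = tr_subproblem(x, Delta);    % minimize the model subject to ||s|| <= Delta
    xt = x + s;                             % trial point
    actual = C(x) - C(xt);                  % actual reduction in the cost function
    if actual > 0.1*pred                    % model predicted the reduction well enough
        x = xt;                             % accept the trial point
        Delta = 2*Delta;                    % and perhaps expand the trust region
    else
        Delta = Delta/2;                    % reject the step and shrink the trust region
    end
end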
The trust region subproblem
At the current position x_c, we want to minimize the quadratic model

    min_s  m(s) = C(x_c) + s^T J(x_c)^T R(x_c) + (1/2) s^T J(x_c)^T J(x_c) s

using the Gauss-Newton approximation to ∇²C, subject to ||s|| ≤ Δ.
Either
the local minimizer lies inside the TR, in which case the TR constraint is inactive and the minimizer is found from

    J(x_c)^T J(x_c) s = -J(x_c)^T R(x_c),

the Gauss-Newton step, or
the unconstrained minimizer falls outside the TR, so the constraint must hold with equality:

    ||s|| = Δ.
We can solve this second case using Lagrange multipliers [from MAST20009].
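A minimal MATLAB sketch of the first (unconstrained) case, assuming J and R hold J(x_c) and R(x_c); the backslash operator returns the least squares solution of the rectangular system, which satisfies the normal equations above:

% Gauss-Newton step: least squares solution of J*s = -R,
% mathematically equivalent to solving J'*J*s = -J'*R
s_gn = -(J \ R);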
Lagrange multipliers
To minimize a function m(x) subject to an equality constraint c(x) = 0, we form the augmented cost function or Lagrangian

    L(x, λ) = m(x) - λ c(x).

Then the necessary conditions for a minimum are ∇_x L = 0 together with the constraint equation.
In the present context we have c = (1/2)(Δ² - s^T s) (the variable being the step s) and m is the quadratic model. Then

    ∇_s L = J(x_c)^T R(x_c) + J(x_c)^T J(x_c) s + λ s = 0

so that

    [J(x_c)^T J(x_c) + λ I] s = -J(x_c)^T R(x_c),

a weighted combination of Gauss-Newton (for small λ) and steepest descent (for large λ): as λ → 0 we recover the Gauss-Newton step, while for large λ we get s ≈ -(1/λ) J(x_c)^T R(x_c), a short step in the steepest descent direction.
Levenberg-Marquardt method [1944/1963]
This weighted combination is called the Levenberg-Marquardt method and is a common method for nonlinear least squares problems. The trust region derivation is due to Moré [1977].
The idea is to adjust the value of Δ so as to maintain a TR in which the quadratic model is good. In many implementations we never adjust Δ explicitly, only λ — the Levenberg-Marquardt parameter. We take a trial step; if it's good by some measure, we accept the step and perhaps alter λ; if not, we reject the step, alter λ and make a new trial step.
We must start with an initial value for λ — values of 10^-2 or 10^-3 are typical.
Choosing values of λ
Since ||s|| decreases as λ increases, if the step is successful we decrease λ [increase Δ] to make the next step more like Gauss-Newton; if not, we increase λ [decrease Δ] to make the next step more like steepest descent.
MATLAB defines a step as successful when the cost function is reduced; λ is then divided by 10. If the cost function increases at the trial position, the step is rejected and λ is multiplied by 10.
More sophisticated schemes are often used, based on the actual reduction of the cost function compared to the reduction predicted by the quadratic model, i.e. on how well the quadratic model is doing.
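A minimal MATLAB sketch of both strategies; the names Cc and Ct (current and trial values of the cost function), pred (the reduction predicted by the quadratic model) and the 0.25/0.75 thresholds are assumptions for illustration:

% Simple rule, as described above:
if Ct < Cc                  % cost reduced: accept the step
    lambda = lambda/10;     % next step more like Gauss-Newton
else                        % cost increased: reject the step
    lambda = lambda*10;     % next step more like steepest descent
end

% A more sophisticated rule based on the gain ratio
rho = (Cc - Ct) / pred;     % actual reduction / predicted reduction
if rho > 0.75, lambda = lambda/10; end   % model is doing well: damp less
if rho < 0.25, lambda = lambda*10; end   % model is doing poorly: damp more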
Properties
At each step, we don't solve the normal equations

    [J(x_c)^T J(x_c) + λ I] s = -J(x_c)^T R(x_c).

Instead, we find the (linear) least squares solution of the overdetermined system

    [ J(x_c) ]         [ R(x_c) ]
    [  √λ I  ]  s  = - [    0   ]
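A minimal MATLAB sketch of the two solves, assuming J, R and lambda are given; the backslash operator applied to the tall augmented system uses an orthogonal (QR) factorization and avoids forming J'*J:

n = size(J, 2);
% Normal equations form (not used in practice):
s_ne = -(J'*J + lambda*eye(n)) \ (J'*R);
% Equivalent augmented linear least squares problem:
s_ls = -[J; sqrt(lambda)*eye(n)] \ [R; zeros(n,1)];
% s_ne and s_ls agree up to rounding error; the augmented form is better conditioned.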
Under mild assumptions, the Levenberg-Marquardt method
converges globally to a local minimum
converges quadratically in the final iterations
and is an efficient and robust method for nonlinear least squares.
MATLAB by default uses a different trust region method, with Levenberg-Marquardt as an optional method.
Example
In MATLAB
the commands for solving NLSQ problems are in the Optimization App/Toolbox and are used in the Curve Fitting App/Toolbox.
They include:
lsqnonlin
lsqcurvefit
lsqcurvefit is mainly a more convenient wrapper to lsqnonlin for curve fitting applications. It calls lsqnonlin but requires different inputs from the user.
lsqnonlin requires the user to provide a function that returns the residual vector R at the current value of x, given the data {T_i, Y_i}.
lsqcurvefit requires the user to provide a function F that describes the nonlinear model

    y ≈ F(x, {T_i})

at the current value of x, given the data {T_i}. The data {T_i, Y_i} are also explicit input arguments.
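As an illustration, a minimal sketch of both interfaces for fitting a two-parameter exponential model y ≈ x(1) exp(x(2) t); the synthetic data, model and starting guess are assumptions chosen for the example, not part of the notes:

% Synthetic data {T_i, Y_i} for illustration
T = (0:0.5:5)';
Y = 2*exp(-0.8*T) + 0.05*randn(size(T));
F = @(x, T) x(1)*exp(x(2)*T);     % nonlinear model F(x, {T_i})
x0 = [1; -1];                     % starting guess

% lsqcurvefit: user supplies the model; the data are explicit input arguments
x1 = lsqcurvefit(F, x0, T, Y);

% lsqnonlin: user supplies the residual vector R(x) directly
Rfun = @(x) F(x, T) - Y;
x2 = lsqnonlin(Rfun, x0);

% Levenberg-Marquardt can be selected instead of the default trust region method
opts = optimoptions('lsqnonlin', 'Algorithm', 'levenberg-marquardt');
x3 = lsqnonlin(Rfun, x0, [], [], opts);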
End of Week 9