2. Next, we will consider the case where the multivariate function $f : \mathbb{R}^n \to \mathbb{R}^m$ is linear. Switching to vector notation $x = (x_1, \dots, x_n)$, $f$ being linear implies
(0.5)  $f(a_1 x_1 + a_2 x_2) = a_1 f(x_1) + a_2 f(x_2)$.
Linearity is a simplifying assumption: when $f$ was nonlinear, its values at two points $f(x_1)$, $f(x_2)$ did not automatically determine its values at the linear combinations $a_1 x_1 + a_2 x_2$.
In this case, we can rewrite the equation $f(x) = 0$ as
(0.6)  $f(x) = \big[\, f(e_1) \mid \cdots \mid f(e_n) \,\big]\, x = 0.$
Writing the matrix on the RHS as $A$, we arrive at the equation
(0.7)  $Ax = 0.$
We will also consider a related eigenvalue problem
(0.8)  $Ax = \lambda x.$
We will derive numerical algorithms for solving these problems.
3. Next, we will consider methods for polynomial approximation, especially for differentiation and integration. Integration of functions $f : [a, b] \to \mathbb{R}$ is, in general, a difficult problem. One way to compute integrals of arbitrary functions is to find a polynomial approximation $p_n$ of the given function $f$, then compute the integral $\int_a^b p_n(x)\,dx$, since we know how to integrate polynomials.
We will derive numerical methods for computing such approximations $p_n$, and for computing the derivatives and the integrals. We will also derive error bounds for how accurate $p_n'$ is in approximating $f'$, and $\int_a^b p_n\,dx$ in approximating $\int_a^b f\,dx$.
1. Solving nonlinear equations on the single-dimensional interval.
1.1. The two problems (1.1-2, 1.6). We start by considering two problems:
(P1)  Given a continuous function $f : [a, b] \to \mathbb{R}$, find $\xi \in [a, b]$ such that $f(\xi) = 0$.
(P2)  Given a continuous function $g : [a, b] \to \mathbb{R}$, find $\xi \in [a, b]$ such that $g(\xi) = \xi$.
It is not difficult to see that the two problems are equivalent if one lets, for example, $g(x) := f(x) + x$ (or $f(x) := g(x) - x$).
First, we answer the question of existence: when does a solution to problem (P1) exist? The following theorem answers that question.
Theorem 1.1. Given a continuous $f : [a, b] \to \mathbb{R}$ with $f(a)f(b) \le 0$, there exists $\xi \in [a, b]$ such that $f(\xi) = 0$.
Proof. If $f(a) = 0$ (or $f(b) = 0$) then $\xi = a$ (or $\xi = b$, resp.). Supposing otherwise, if $f(a)f(b) < 0$ then such a $\xi$ exists by the Intermediate Value Theorem (IVT).
Instead of simply invoking the IVT, let us prove the latter statement in a constructive way. Without loss of generality, let $f(a) < f(b)$, $f(a) \ne 0$, $f(b) \ne 0$ (so that $f(a) < 0 < f(b)$), and we will show that there is a $c \in (a, b)$ such that $f(c) = 0$. Consider the sequences of numbers $(a_i)$, $(b_i)$ given by the following program:
Let $a_0 := a$, $b_0 := b$.
For $i = 0, 1, 2, \dots$
    assign $c_i := (a_i + b_i)/2$;
    if $f(c_i) = 0$, assign $a_{i+1} := c_i$, $b_{i+1} := c_i$, break;
    else if $f(c_i) > 0$, assign $a_{i+1} := a_i$, $b_{i+1} := c_i$;
    else if $f(c_i) < 0$, assign $a_{i+1} := c_i$, $b_{i+1} := b_i$.
Initialized by $a_0 := a$, $b_0 := b$, the above iteration yields sequences that satisfy the relations
(1.1)  $|a_{i+1} - a_i| \le |a - b|\, 2^{-i}$, $\quad |b_{i+1} - b_i| \le |a - b|\, 2^{-i}$, $\quad |a_i - b_i| \le |a - b|\, 2^{-i}$.
So the sequences $(a_i)$ and $(b_i)$ are each Cauchy (first two inequalities) and, further, they approach the same limit $c$ (last inequality).
Further, by construction $f(a_i) \le 0$ for all $i$, so that by continuity of $f$,
$f(c) = f\big(\lim_{i \to \infty} a_i\big) = \lim_{i \to \infty} f(a_i) \le 0.$
Similarly, since $f(b_i) \ge 0$ it must be that $f(c) \ge 0$. Therefore $f(c) = 0$.
The program above is called the bisection method, an algorithm for finding $\xi$.
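As a concrete sketch of the bisection program above (the example function and interval are illustrative, not from the notes), one might write:

```python
def bisection(f, a, b, tol=1e-12, max_iter=200):
    """Find a root of f in [a, b], assuming f(a) * f(b) <= 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    # Orient the endpoints so that f(a) < 0 < f(b), as in the proof.
    if fa > fb:
        a, b = b, a
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        if fc == 0 or abs(b - a) < tol:
            return c
        if fc > 0:
            b = c  # keep the sign change in [a, c]
        else:
            a = c  # keep the sign change in [c, b]
    return (a + b) / 2

# Example: the root of f(x) = x^2 - 2 on [0, 2] is sqrt(2).
root = bisection(lambda x: x * x - 2.0, 0.0, 2.0)
```

Each pass halves the bracketing interval, matching the $2^{-i}$ factors in (1.1).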
Translating Theorem 1.1 into an existence statement for (P2), we arrive at the following.
Theorem 1.2 (Brouwer’s Fixed Point Theorem). Given a continuous $g : [a, b] \to [a, b]$, there exists $\xi \in [a, b]$ such that $g(\xi) = \xi$. Such a $\xi$ is called a fixed point.
Proof. The statement follows from applying Theorem 1.1 to $f(x) = x - g(x)$: since $a \le g(x) \le b$ for all $x \in [a, b]$, in particular for $x = a$ and $x = b$, we have $f(a) = a - g(a) \le 0$ and $f(b) = b - g(b) \ge 0$, so $f(a)f(b) \le 0$.
Supposing that we have $g : [a, b] \to [a, b]$, the fact that $g$ maps the interval $[a, b]$ into itself suggests the iteration we call a simple iteration.
Definition 1.3. A simple iteration on a function $g : [a, b] \to \mathbb{R}$ is given by
$x_{k+1} = g(x_k), \quad k = 0, 1, 2, \dots.$
If a simple iteration converges to some $c \in [a, b]$, then we have by continuity of $g$,
(1.2)  $c = \lim_{k \to \infty} x_{k+1} = \lim_{k \to \infty} g(x_k) = g\big(\lim_{k \to \infty} x_k\big) = g(c).$
Hence it follows that $c \in [a, b]$ is a fixed point.
In a particular setting, the solution to (P2) is unique, and the simple iteration always converges to the unique fixed point: that is, when the continuous function $g$ is a contraction.
Definition 1.4 (Contraction). A function $g : [a, b] \to \mathbb{R}$ is a contraction if there exists an $L \in (0, 1)$ such that
$|g(x) - g(y)| \le L\,|x - y|$
for all $x, y \in [a, b]$.
This is also a form of continuity called Lipschitz continuity.
Theorem 1.5 (Contraction Mapping Theorem). If $g : [a, b] \to [a, b]$ is a contraction, then
‚ $g$ has a unique fixed point $\xi \in [a, b]$,
‚ the simple iteration converges to $\xi$ for any $x_0 \in [a, b]$.
Proof. Uniqueness. Suppose $\xi_1, \xi_2 \in [a, b]$ are two fixed points. Then the distance between them satisfies
$|\xi_1 - \xi_2| = |g(\xi_1) - g(\xi_2)| \le L\,|\xi_1 - \xi_2|.$
Since $L < 1$, this forces $|\xi_1 - \xi_2| = 0$, which implies $\xi_1 = \xi_2$. So there can only be one fixed point.
Convergence. The distance between the $k$-th iterate $x_k$ of the simple iteration and the fixed point $\xi$ satisfies
$|x_k - \xi| = |g(x_{k-1}) - g(\xi)| \le L\,|x_{k-1} - \xi|.$
Repeating this, one sees that
(1.3)  $|x_k - \xi| \le L^k\,|x_0 - \xi|.$
Since $L \in (0, 1)$, the RHS converges to $0$ as $k \to \infty$. Hence $x_k \to \xi$.
Using the equation (1.3), one may attempt to estimate the number of iterations needed to ensure that $x_k$ is within some $\epsilon$ of $\xi$. That is,
(1.4)  $L^k\,|x_0 - \xi| < \epsilon \;\Rightarrow\; |x_k - \xi| < \epsilon.$
Thus if one chooses $k$ satisfying
(1.5)  $k \ge \log\big(|x_0 - \xi| / \epsilon\big) / \log(1/L),$
then $x_k$ will be within $\epsilon$ of $\xi$. However, this is not useful in practice, since the term $|x_0 - \xi|$ on the RHS of (1.5) implies the knowledge of $\xi$ already.
Instead, we can estimate the distance between $x_k$ and $x_{k+1}$, and show that the simple iteration $(x_k)$ is Cauchy, in order to (i) show the convergence of the sequence, and (ii) estimate the number of iterations necessary to be within $\epsilon$ of the fixed point without knowledge of $\xi$.
Let us first assume the conditions of Theorem 1.5 are satisfied. Let $\delta_k := x_{k+1} - x_k$. Then for any $n > m \ge 0$,
(1.6)  $|x_n - x_m| \le \sum_{k=m}^{n-1} |\delta_k| \le \sum_{k=m}^{n-1} L^{k-m}\,|\delta_m| \le \frac{|\delta_m|}{1 - L}.$
Now, using the relation $|\delta_{k+1}| \le L\,|\delta_k|$, which gives $|\delta_m| \le L^m\,|\delta_0|$, we have
(1.7)  $|x_n - x_m| \le \frac{L^m}{1 - L}\,|x_1 - x_0|.$
Thus, if we want $|x_n - x_m| < \epsilon$, we can simply choose $n > m \ge N$ where
(1.8)  $N \ge \log\big((1 - L)\,\epsilon / |x_1 - x_0|\big) / \log L.$
Thus the simple iteration $(x_k)$ in this setting is a Cauchy sequence and therefore is convergent. Since a convergent simple iteration always converges to the fixed point $\xi$, the second part of the conclusion in Theorem 1.5 holds. Furthermore, letting $n \to \infty$ in the inequality (1.7) replaces $x_n$ with the limit $\xi$, so choosing $k$ greater than the RHS of (1.8) guarantees that $x_k$ is within $\epsilon$ of $\xi$.
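The a-priori iteration count derived from the Cauchy estimate can be sketched in code; the contraction $g$ and constant $L$ below are made-up examples, not from the notes:

```python
import math

def simple_iteration(g, x0, L, eps=1e-10):
    """Iterate x_{k+1} = g(x_k) for a contraction g with constant L < 1.

    The iteration count is chosen a priori from the Cauchy estimate
    |xi - x_m| <= L^m |x_1 - x_0| / (1 - L), which needs no knowledge
    of the fixed point xi.
    """
    x1 = g(x0)
    d0 = abs(x1 - x0)
    if d0 == 0.0:
        return x0  # x0 is already a fixed point
    # Smallest N with L^N * d0 / (1 - L) < eps.
    N = max(1, math.ceil(math.log((1 - L) * eps / d0) / math.log(L)))
    x = x1
    for _ in range(N):
        x = g(x)
    return x

# Example: g(x) = cos(x) on [0, 1]; |g'(x)| = |sin(x)| <= sin(1) < 0.85 there.
fp = simple_iteration(math.cos, 1.0, L=0.85)
```

The point of the sketch is that `N` is computed before iterating, using only the first step length $|x_1 - x_0|$ and the contraction constant.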
Now, let us consider the case where $g : [a, b] \to [a, b]$ is a continuously differentiable function. Under what conditions is $g$ a contraction in that case?
‚ Suppose $g : [a, b] \to [a, b]$ is continuously differentiable and $g' : (a, b) \to \mathbb{R}$ satisfies $|g'(x)| \le L < 1$ for some $L$.
‚ Then one has, for all $x, y \in [a, b]$, $g(x) - g(y) = g'(\eta)(x - y)$ for some $\eta \in (a, b)$ by the Mean Value Theorem, so that
$|g(x) - g(y)| \le |g'(\eta)|\,|x - y| \le L\,|x - y|,$
so that $g$ is a contraction.
Such a condition is a global condition on $g'$, that is, $|g'(x)| \le L$ for all values of $x \in [a, b]$. With a more local condition, the convergence statement becomes a bit more local as well.
Theorem 1.6. Let $g : [a, b] \to [a, b]$ be continuous, and suppose $g'$ is continuous in a neighborhood of a fixed point $\xi \in [a, b]$ with $|g'(\xi)| < 1$. Then the simple iteration $(x_k)$ converges to $\xi$ if the initial iterate $x_0$ is sufficiently close to $\xi$.
The proof relies on the fact that when one restricts the function $g$ to a small enough neighborhood of $\xi$, the restricted function satisfies the conditions of Theorem 1.5.
Proof. Since $g'$ is continuous in the neighborhood of $\xi$, let us choose an interval $I_\delta = (\xi - \delta, \xi + \delta)$ small enough that $g'$ is continuous in $I_\delta$ and, further, that $x \in I_\delta$ implies
$|g'(x) - g'(\xi)| \le \frac{1 - |g'(\xi)|}{2}.$
Then, furthermore, for $x \in I_\delta$ we have
$|g'(x)| = |g'(x) - g'(\xi) + g'(\xi)| \le |g'(x) - g'(\xi)| + |g'(\xi)| \le \frac{1 + |g'(\xi)|}{2} =: L < 1.$
Then let us restrict the function $g$ to the interval $I_\delta$, that is, consider the function $g : I_\delta \to I_\delta$ (why must it be that $g(I_\delta) \subset I_\delta$?). As above, we can use the MVT to show that for any $x, y \in I_\delta$,
$|g(x) - g(y)| \le L\,|x - y|.$
Therefore Theorem 1.5 applies to the restricted $g$, so if $x_0 \in I_\delta$ then the simple iteration converges to the unique fixed point $\xi \in I_\delta$.
1.2. Newton’s method (1.3-4). A particular instance of simple iteration is the relaxation iteration.
Definition 1.7. Given $f : [a, b] \to \mathbb{R}$, a relaxation iteration is given by
$x_{k+1} = x_k - \lambda f(x_k), \quad k = 0, 1, 2, \dots,$
with some parameter $\lambda$.
The relaxation iteration can also be viewed as a simple iteration with $g$ given by $g(x) := x - \lambda f(x)$. As such, the criterion for convergence of this iteration is provided by conditions that make $g$ a contraction, that is, $|g'| \le L < 1$ in some neighborhood of the fixed point $\xi$.
Example 1.8 (Why the relaxation?). Consider the two functions $f_1(x) = \frac{1}{5}\sin(x)$ and $f_2(x) = 5\sin(x)$. The two functions have the same zeros, $f(n\pi) = 0$ for $n \in \{0, 1, 2, \dots\}$. If we were to construct a simple iteration
(1.9)  $x_{k+1} = g(x_k)$ with $g(x) := x - f(x)$
for both of these functions, and attempt to apply Theorem 1.6, then we would have
(1.10)  $g_1'(x) = 1 - \tfrac{1}{5}\cos(x), \qquad g_2'(x) = 1 - 5\cos(x).$
Near the fixed points $2n\pi$, $|g_1'(2n\pi)| = \frac{4}{5}$, so the conditions of Theorem 1.6 are satisfied for the iteration for $f_1$, whereas for $f_2$ you have $|g_2'(2n\pi)| = 4$, so the conditions are not satisfied. Thus the iteration on $g(x) = x - f(x)$ can fail to converge to the fixed point simply due to a different scaling of the function (different constants multiplying the function, $\frac{1}{5}$ versus $5$).
To construct a sequence that works more generally (regardless of scaling, for example), we add a parameter $\lambda$,
(1.11)  $g(x) = x - \lambda f(x),$
that you can adjust depending on $f$. Note also that the fixed point of the simple iteration for this $g$ is still the solution to the nonlinear problem of finding $\xi$ for which $f(\xi) = 0$.
The parameter $\lambda$ plays a role here, since $g' = 1 - \lambda f'$. Thus we explore how to set $\lambda$ in order to ensure that $|g'| \le L < 1$ for some $L$.
Theorem 1.9. Given $f : [a, b] \to \mathbb{R}$, suppose $f'$ is continuous about $\xi$, at which $f(\xi) = 0$ and $f'(\xi) \ne 0$. Then there are $\lambda, \delta > 0$ such that the relaxation iteration converges to $\xi$ if $x_0 \in [\xi - \delta, \xi + \delta]$.
Proof. Without loss of generality let $f'(\xi) > 0$. Since $f'$ is continuous at $\xi$, for $f'(\xi) > m > 0$ one can choose $\delta$ so that $f'(x) > m$ for $x \in [\xi - \delta, \xi + \delta]$. Now let us define $M := \sup_{x \in [\xi - \delta, \xi + \delta]} f'(x)$, so that
$m \le f'(x) \le M$ for $x \in [\xi - \delta, \xi + \delta].$
Thus the lower and upper bounds for $g' = 1 - \lambda f'$ there become
$1 - \lambda M \le 1 - \lambda f'(x) \le 1 - \lambda m.$
Now we choose $L$ so that $|1 - \lambda f'(x)| \le L$. One choice is to set $\lambda$ so that
$1 - \lambda M = -L, \qquad 1 - \lambda m = L,$
from which $\lambda = 2/(M + m)$ and $L = (M - m)/(M + m)$. One sees that $1/\lambda$ is the average of $M$ and $m$, so that $0 \le L < 1$. (Note that this is a valid but not the only choice for $\lambda$.)
Now, with such a choice of $\lambda$, the relaxation iteration is a simple iteration whose $g$ is a contraction. By Theorem 1.5, the iteration converges.
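The choice $\lambda = 2/(M + m)$ from the proof can be illustrated in code; the function, interval, and derivative bounds below are made-up examples:

```python
def relaxation(f, x0, lam, n_iter=100):
    """Relaxation iteration x_{k+1} = x_k - lam * f(x_k)."""
    x = x0
    for _ in range(n_iter):
        x = x - lam * f(x)
    return x

# Example: f(x) = x^3 - 2 with root 2**(1/3) ~ 1.26.  On [1, 1.5] the
# derivative f'(x) = 3x^2 lies between m = 3 and M = 6.75, so the proof
# suggests lam = 2 / (M + m), giving the contraction constant
# L = (M - m) / (M + m) ~ 0.38.
lam = 2.0 / (6.75 + 3.0)
root = relaxation(lambda x: x ** 3 - 2.0, 1.2, lam)
```

Any $\lambda$ keeping $|1 - \lambda f'|$ below $1$ on the interval would also work; this choice balances the two endpoint bounds.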
Newton’s method employs a generalized version of the relaxation iteration above. Instead of a fixed parameter $\lambda$, one allows $\lambda$ to depend on the current iterate, so the iterate takes the form
$x_{k+1} = x_k - \lambda_k f(x_k),$
or, writing out the RHS as a function,
$x_{k+1} = g_k(x_k), \qquad g_k(x) = x - \lambda_k f(x).$
It would be advantageous if we could choose $\lambda_k$ so that $|g_k'| = |1 - \lambda_k f'|$ is as small as possible near $x_k$. One such choice is $\lambda_k := 1/f'(x_k)$, which leads to Newton’s method.
Fig. 1.1: An illustration of Newton’s method (left) and the secant method (right). The next iterate $x_{k+1}$ in Newton’s method is the intersection of the tangent line at $x_k$ with the x-axis. The next iterate $x_{k+1}$ in the secant method is the intersection of the line passing through the pair of points $(x_{k-1}, f(x_{k-1}))$ and $(x_k, f(x_k))$ with the x-axis.
Definition 1.10. Newton’s method for solving $f(x) = 0$ uses the iteration
(1.12)  $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}, \quad k = 0, 1, 2, \dots,$
where it is assumed that $f'(x_k) \ne 0$ for all $k$. The sequence is initialized by setting $x_0$.
Each iteration of Newton’s method can intuitively be thought of as the solution to a linear problem: find the point where the line
$\{(x, y) \in \mathbb{R}^2 : y = f'(x_k)(x - x_k) + f(x_k)\}$
intersects the x-axis $\{(x, 0) \in \mathbb{R}^2 : x \in \mathbb{R}\}$. The solution is the next iterate $x_{k+1}$. See Figure 1.1 for an illustration.
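A minimal sketch of the iteration (1.12); the stopping rule and the example function are illustrative choices, not from the notes:

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        dfx = fprime(x)
        if dfx == 0.0:
            raise ZeroDivisionError("f'(x_k) vanished")
        step = f(x) / dfx
        x -= step
        if abs(step) < tol:  # a simple stopping rule
            break
    return x

# Example: solving x^2 - 2 = 0 starting from x0 = 1.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
```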
Theorem 1.11 (Convergence of Newton’s method). Suppose that $f : [a, b] \to \mathbb{R}$ satisfies
‚ $f''$ is continuous in $[\xi - \delta, \xi + \delta]$ for some $\delta > 0$,
‚ $f(\xi) = 0$ and $f'(\xi) \ne 0$,
‚ there is an $A > 0$ such that
(1.13)  $\left|\frac{f''(x)}{f'(y)}\right| \le A$ for all $x, y \in [\xi - \delta, \xi + \delta].$
Then Newton’s method $(x_k)$ converges to $\xi$ if $|x_0 - \xi| \le \min\{\delta, 1/A\}$.
Proof. We will show that if $|x_k - \xi| \le \min\{\delta, 1/A\}$ then $|x_{k+1} - \xi| \le \min\{\delta, 1/A\}$ and
(1.14)  $|x_{k+1} - \xi| \le \frac{1}{2}\,|x_k - \xi|.$
We first consider a Taylor expansion around a point $x$ where $f''$ is continuous,
(1.15)  $f(\xi) = f(x) + (\xi - x)f'(x) + \frac{(\xi - x)^2}{2}\, f''(\eta)$
for some $\eta$ that lies between $x$ and $\xi$. Letting $x = x_k$ and denoting the corresponding $\eta$ by $\eta_k$,
(1.16)  $f(\xi) = f(x_k) + (\xi - x_k)f'(x_k) + \frac{(\xi - x_k)^2}{2}\, f''(\eta_k).$
Dividing through by $f'(x_k)$ and using that $f(\xi) = 0$,
(1.17)  $x_k - \frac{f(x_k)}{f'(x_k)} = \xi + \frac{f''(\eta_k)}{2 f'(x_k)}\,(x_k - \xi)^2.$
One sees that the LHS is equal to $x_{k+1}$ in the Newton step (1.12), so that
(1.18)  $x_{k+1} - \xi = \frac{f''(\eta_k)}{2 f'(x_k)}\,(x_k - \xi)^2.$
Taking the absolute value,
(1.19)  $|x_{k+1} - \xi| = \left|\frac{f''(\eta_k)}{2 f'(x_k)}\right| |x_k - \xi| \cdot |x_k - \xi|.$
By the condition (1.13), along with the fact that we assumed $|x_k - \xi| \le 1/A$, we have
(1.20)  $\left|\frac{f''(\eta_k)}{2 f'(x_k)}\right| |x_k - \xi| \le \frac{A}{2} \cdot \frac{1}{A} = \frac{1}{2}.$
Inserting this into (1.19), we get (1.14). This shows convergence, since $|x_k - \xi| \le 2^{-k} |x_0 - \xi|$ whenever we choose $|x_0 - \xi| \le \min\{\delta, 1/A\}$.
Newton’s method requires the evaluation of $f'$. If $f'$ is unavailable, then one can try to approximate $f'$ using $f$ alone. One such idea leads to the secant method.
Definition 1.12. The secant method uses the iteration
(1.21)  $x_{k+1} = x_k - \frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}\, f(x_k) \quad \text{for } k = 1, 2, \dots,$
assuming that $f(x_k) - f(x_{k-1})$ does not vanish for any $k$. The sequence is initialized by setting $x_0$ and $x_1$.
The secant method substitutes the slope $f'(x_k)$ that appears in Newton’s method by the approximate slope $\big(f(x_k) - f(x_{k-1})\big)/(x_k - x_{k-1})$.
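The substitution can be sketched in code; the stopping rule and example are illustrative choices, not from the notes:

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton's method with f'(x_k) replaced by the
    difference quotient (f(x_k) - f(x_{k-1})) / (x_k - x_{k-1})."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        denom = f1 - f0
        if denom == 0.0:
            raise ZeroDivisionError("f(x_k) - f(x_{k-1}) vanished")
        x2 = x1 - (x1 - x0) / denom * f1
        if abs(x2 - x1) < tol:  # a simple stopping rule
            return x2
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

# Example: x^2 - 2 = 0 with two starting points.
root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
```

Note that each step reuses the previous function value, so only one new evaluation of $f$ is needed per iteration, and no derivative at all.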
The secant method also converges, as we will show next.
Theorem 1.13 (Convergence of the secant method). Suppose $f : [a, b] \to \mathbb{R}$ satisfies
‚ $f'$ is continuous in a neighborhood of $\xi$,
‚ $f(\xi) = 0$ and $f'(\xi) \ne 0$.
Then the secant method $(x_k)$ converges to $\xi$ if $x_0$ and $x_1$ are sufficiently close to $\xi$.
Proof. Let $f'$ be continuous in $[\xi - h, \xi + h]$ for some $h > 0$. Let $\alpha := f'(\xi)$ and, without loss of generality, let $\alpha > 0$. Then let us choose $\delta$ small enough (smaller than $h$), so that for $\epsilon = \frac{1}{4}\alpha$ (other choices are possible),
(1.22)  $|x - \xi| < \delta \;\Rightarrow\; |f'(x) - f'(\xi)| < \epsilon.$
Now suppose $x_k, x_{k-1}$ are within $[\xi - \delta, \xi + \delta]$. Using the Mean Value Theorem,
(1.23)  $f(x_k) - f(x_{k-1}) = (x_k - x_{k-1})\, f'(\varphi_k),$
where $\varphi_k$ is between $x_k$ and $x_{k-1}$, and
(1.24)  $f(x_k) - f(\xi) = (x_k - \xi)\, f'(\theta_k),$
where $\theta_k$ is between $x_k$ and $\xi$. Inserting these two into (1.21), we have
(1.25)  $x_{k+1} = x_k - (x_k - \xi)\,\frac{f'(\theta_k)}{f'(\varphi_k)}.$
Subtracting $\xi$ from both sides and taking the absolute value,
(1.26)  $|x_{k+1} - \xi| = \left|1 - \frac{f'(\theta_k)}{f'(\varphi_k)}\right| |x_k - \xi|.$
Since $f'$ is positive in this neighborhood, and further restricted by (1.22) (so that $|f'(\varphi_k) - f'(\theta_k)| < 2\epsilon = \alpha/2$ and $f'(\varphi_k) > \alpha - \epsilon = 3\alpha/4$), we obtain
(1.27)  $\left|1 - \frac{f'(\theta_k)}{f'(\varphi_k)}\right| = \frac{|f'(\varphi_k) - f'(\theta_k)|}{|f'(\varphi_k)|} \le \frac{\alpha/2}{3\alpha/4} = \frac{2}{3}.$
Hence we arrive at
(1.28)  $|x_{k+1} - \xi| \le \frac{2}{3}\,|x_k - \xi|.$
So, as long as $x_0, x_1$ are set to be within $\delta$ of $\xi$,
(1.29)  $|x_k - \xi| \le \left(\frac{2}{3}\right)^{k-1} |x_1 - \xi|,$
showing convergence.
2. Solving nonlinear equations in a multivariate domain.
2.1. Vector and matrix norms (2.7). In order to generalize the ideas and algorithms we have covered so far to the multivariate setting, we will introduce some basic notions regarding vectors in $\mathbb{R}^n$.
Definition 2.1 (Linear space over $\mathbb{R}$). We say $V$ is a linear space (or vector space) over $\mathbb{R}$ if it has two associated binary operations,
1. scalar multiplication $\cdot : \mathbb{R} \times V \to V$,
2. vector addition $+ : V \times V \to V$,
that satisfy
‚ $a(b\mathbf{u}) = (ab)\mathbf{u}$ for all $a, b \in \mathbb{R}$, $\mathbf{u} \in V$,
‚ $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ for all $\mathbf{u}, \mathbf{v} \in V$,
‚ $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$ for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$,
‚ $a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}$ for all $a \in \mathbb{R}$, $\mathbf{u}, \mathbf{v} \in V$,
‚ $(a + b)\mathbf{u} = a\mathbf{u} + b\mathbf{u}$ for all $a, b \in \mathbb{R}$, $\mathbf{u} \in V$,
‚ there is $\mathbf{0} \in V$ such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u} \in V$,
‚ $1\mathbf{u} = \mathbf{u}$ for all $\mathbf{u} \in V$,
‚ for each $\mathbf{u} \in V$ there is $(-\mathbf{u}) \in V$ such that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.
We will define the notion of norms over any linear space $V$.
Definition 2.2 (Norm). Let $V$ be a linear space. A norm on $V$ is a function $\|\cdot\| : V \to \mathbb{R}_+$ that satisfies
‚ $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$,
‚ $\|\lambda\mathbf{u}\| = |\lambda|\,\|\mathbf{u}\|$ for $\lambda \in \mathbb{R}$ and $\mathbf{u} \in V$,
‚ $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ for $\mathbf{u}, \mathbf{v} \in V$.
In particular, if $V = \mathbb{R}^n$, a norm measures the size of a vector. There are various norms that can be defined on $\mathbb{R}^n$.
Definition 2.3 (1-norm, 2-norm and $\infty$-norm). Let $\mathbf{v} = (v_1, v_2, \dots, v_n) \in \mathbb{R}^n$.
‚ The 1-norm is given by
(2.1)  $\|\mathbf{v}\|_1 := \sum_{i=1}^n |v_i|.$
‚ The 2-norm is given by
(2.2)  $\|\mathbf{v}\|_2 := \Big(\sum_{i=1}^n |v_i|^2\Big)^{1/2}.$
‚ The $\infty$-norm is given by
(2.3)  $\|\mathbf{v}\|_\infty := \max_{1 \le i \le n} |v_i|.$
The unit circles in $\mathbb{R}^2$ measured in the 1-, 2-, and $\infty$-norms are shown in Figure 2.1. These are special cases of $p$-norms.
Fig. 2.1: The unit circles $\{\|\mathbf{v}\|_1 = 1\}$ (left), $\{\|\mathbf{v}\|_2 = 1\}$ (middle) and $\{\|\mathbf{v}\|_\infty = 1\}$ (right). Note that the circle in the 2-norm is invariant with respect to rotation.
Definition 2.4 ($p$-norm). For $\mathbf{v} \in \mathbb{R}^n$, the $p$-norm for $p \ge 1$ is given by
(2.4)  $\|\mathbf{v}\|_p := \Big(\sum_{i=1}^n |v_i|^p\Big)^{1/p}.$
It is straightforward to show that the $p$-norm satisfies the first two requirements to be a norm in Definition 2.2:
$\|\mathbf{v}\|_p = 0 \;\Leftrightarrow\; \|\mathbf{v}\|_p^p = 0 \;\Leftrightarrow\; \sum_{i=1}^n |v_i|^p = 0 \;\Leftrightarrow\; v_i = 0$ for all $i$,
$\|\lambda\mathbf{v}\|_p = \Big(\sum_{i=1}^n |\lambda v_i|^p\Big)^{1/p} = |\lambda|\,\|\mathbf{v}\|_p.$
What remains is to show that the $p$-norm satisfies the triangle inequality (the last requirement). This is straightforward to show for the 2-norm, using the following inequality.
Theorem 2.5 (Cauchy-Schwarz inequality). For $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$, it holds that
(2.5)  $\Big|\sum_{i=1}^n u_i v_i\Big| \le \|\mathbf{u}\|_2\,\|\mathbf{v}\|_2.$
Proof. This is a special case of Hölder’s inequality, which we will prove shortly.
Note that $\lim_{p \to \infty} \|\mathbf{v}\|_p = \|\mathbf{v}\|_\infty$, since $1 \le \|\mathbf{v}\|_p / \|\mathbf{v}\|_\infty \le n^{1/p} \to 1$ as $p \to \infty$.
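These norms are straightforward to compute directly from the definitions; a small numerical sketch (the vector is a made-up example) also illustrates the limit $p \to \infty$:

```python
def p_norm(v, p):
    """The p-norm ||v||_p = (sum_i |v_i|^p)^(1/p)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def inf_norm(v):
    """The infinity norm ||v||_inf = max_i |v_i|."""
    return max(abs(x) for x in v)

v = [3.0, -4.0, 1.0]
n1 = p_norm(v, 1)         # 1-norm: 3 + 4 + 1 = 8
n2 = p_norm(v, 2)         # 2-norm: sqrt(26)
ninf = inf_norm(v)        # infinity norm: 4
n_large = p_norm(v, 100)  # already very close to the infinity norm
```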
Theorem 2.6 (Young’s inequality). For $a, b \ge 0$ and $p, q > 1$ which satisfy $1/p + 1/q = 1$, it holds that
(2.7)  $ab \le \frac{a^p}{p} + \frac{b^q}{q}.$
Proof. For the case either $a$ or $b$ is zero, the inequality follows directly by substitution. In the case $a, b > 0$, we have
(2.8)  $ab = \exp[\ln(ab)] = \exp\Big[\frac{1}{p}\ln a^p + \frac{1}{q}\ln b^q\Big] \le \frac{1}{p}\exp[\ln a^p] + \frac{1}{q}\exp[\ln b^q] = \frac{a^p}{p} + \frac{b^q}{q},$
by the convexity of the function $\exp(\cdot)$.
The pairs $p, q \ge 1$ satisfying $1/p + 1/q = 1$ are called conjugate pairs. When $p = 1$ the corresponding $q = \infty$, and vice versa.
Theorem 2.7 (Hölder’s inequality). For $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ and $p, q \ge 1$ with $1/p + 1/q = 1$, it holds that
(2.9)  $\sum_{i=1}^n |u_i v_i| \le \|\mathbf{u}\|_p\,\|\mathbf{v}\|_q.$
Proof. First, let us show the inequality for the case $p, q > 1$, assuming $\mathbf{u}, \mathbf{v} \ne \mathbf{0}$ (otherwise the inequality is trivial). Applying Young’s inequality to $a = |u_i| / \|\mathbf{u}\|_p$ and $b = |v_i| / \|\mathbf{v}\|_q$ and summing over $i$, it follows that
$\sum_{i=1}^n \frac{|u_i v_i|}{\|\mathbf{u}\|_p \|\mathbf{v}\|_q} \le \sum_{i=1}^n \Big(\frac{|u_i|^p}{p\,\|\mathbf{u}\|_p^p} + \frac{|v_i|^q}{q\,\|\mathbf{v}\|_q^q}\Big) = \frac{1}{p} + \frac{1}{q} = 1.$
For the case either $p = 1$ or $q = 1$ (so that the other exponent is $\infty$), the inequality follows directly from $|u_i v_i| \le |u_i| \max_j |v_j|$.
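Hölder's inequality is easy to spot-check numerically; the conjugate pair and the vectors below are made-up examples:

```python
# Spot check of Hölder's inequality for the conjugate pair p = 3, q = 3/2.
p, q = 3.0, 1.5                                   # 1/p + 1/q = 1
u = [1.0, -2.0, 3.0]
v = [0.5, 4.0, -1.0]

lhs = sum(abs(ui * vi) for ui, vi in zip(u, v))   # sum_i |u_i v_i|
rhs = (sum(abs(ui) ** p for ui in u) ** (1 / p)   # ||u||_p
       * sum(abs(vi) ** q for vi in v) ** (1 / q))  # ||v||_q
```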