MATH50003 Numerical Analysis: Problem 3¶
This problem sheet explores implementation of triangular solves,
supporting a matrix with two super-diagonals, as well as
Copyright By PowCoder代写 加微信 powcoder
permutation and Householder reflections that can be applied to a vector in
$O(n)$ complexity.
Questions marked with a ⋆ are meant to be completed without using a computer.
Problems are denoted A/B/C to indicate their difficulty.
using LinearAlgebra, Test
# We will override these functions below
import Base: getindex, setindex!, size, *, \
1. Dense Matrices¶
Problem 1.1 (C) Show that A*x is not
implemented as mul(A, x) from the lecture notes
by finding a Float64 example where the bits do not match.
First we have to define mul(A, x) as in the lecture notes:
function mul(A, x)
m,n = size(A)
c = zeros(eltype(x), m) # eltype is the type of the elements of a vector/matrix
for j = 1:n, k = 1:m
c[k] += A[k, j] * x[j]
mul (generic function with 1 method)
Then we can easily find examples, in fact we can write a function that searches for examples:
using ColorBitstring
function findblasmuldifference(n,l)
for j = 1:n
A = randn(l,l)
x = rand(l)
if A*x != mul(A,x)
return (A,x)
n = 100 # number of attempts
l = 10 # size of objects
(A,x) = findblasmuldifference(n,l) # find a difference
println(“Bits of obtained A*x”)
printlnbits.(A*x);
println(“Bits of obtained mul(A,x)”)
printlnbits.(mul(A,x));
println(“Difference vector between the two solutions:”)
println(A*x-mul(A,x))
Bits of obtained A*x
1011111111101011101100000111101000111101010100101011000010011101
1011111111111100011110010101110111100101001111101101110001110100
1100000000000111010110000101110100000000101011000000101000000110
1011111111100110011000011010000000101010101010001010000100010100
1100000000011110010110000000010101010011010111011101000000011101
0011111111110001001101000000001110001101001110111010011001101010
1100000000011010111111100010000111110011110111100001101000110100
0100000000000110001110010111000110100011010001110101001111111110
0011111111010011001100000100111011111010101000111110101010001010
0011111111011001110010110101001001110011111111010011011001100011
Bits of obtained mul(A,x)
1011111111101011101100000111101000111101010100101011000010011100
1011111111111100011110010101110111100101001111101101110001110100
1100000000000111010110000101110100000000101011000000101000000101
1011111111100110011000011010000000101010101010001010000100010010
1100000000011110010110000000010101010011010111011101000000011101
0011111111110001001101000000001110001101001110111010011001101100
1100000000011010111111100010000111110011110111100001101000110010
0100000000000110001110010111000110100011010001110101001111111110
0011111111010011001100000100111011111010101000111110101010001010
0011111111011001110010110101001001110011111111010011011001100011
Difference vector between the two solutions:
[-1.1102230246251565e-16, 0.0, -4.440892098500626e-16, -2.220446049250313e-16, 0.0, -4.440892098500626e-16, -1.7763568394002505e-15, 0.0, 0.0, 0.0]
2. Triangular Matrices¶
Problem 2.1 (B) Complete the following functions for solving linear systems with
triangular systems by implementing back and forward-substitution:
function ldiv(U::UpperTriangular, b)
n = size(U,1)
if length(b) != n
error(“The system is not compatible”)
x = zeros(n) # the solution vector
## TODO: populate x using back-substitution
function ldiv(U::LowerTriangular, b)
n = size(U,1)
if length(b) != n
error(“The system is not compatible”)
x = zeros(n) # the solution vector
## TODO: populate x using forward-substitution
ldiv (generic function with 2 methods)
function ldiv(U::UpperTriangular, b)
n = size(U,1)
if length(b) != n
error(“The system is not compatible”)
x = zeros(n) # the solution vector
for k = n:-1:1 # start with k=n, then k=n-1, …
r = b[k] # dummy variable
for j = k+1:n
r -= U[k,j]*x[j] # equivalent to r = r-U[k,j]*x[j]
x[k] = r/U[k,k]
function ldiv(U::LowerTriangular, b)
n = size(U,1)
if length(b) != n
error(“The system is not compatible”)
x = zeros(n) # the solution vector
for k = 1:n # start with k=1
r = b[k] # dummy variable
for j = 1:k-1
r -= U[k,j]*x[j]
x[k] = r/U[k,k]
ldiv (generic function with 2 methods)
Here is an example:
x = [1,2,3,4]
Ldense = [1 0 0 0; 2 3 0 0; 4 5 6 0; 7 8 9 10]
Ltriang = LowerTriangular(Ldense)
4×4 LowerTriangular{Int64, Matrix{Int64}}:
1 ⋅ ⋅ ⋅
2 3 ⋅ ⋅
4 5 6 ⋅
7 8 9 10
Ldense\x-ldiv(Ltriang,x)
4-element Vector{Float64}:
Problem 2.2⋆ (B) Given $𝐱 \in \mathbb{R}^n$, find a lower triangular matrix of the form
L = I – 2 𝐯 𝐞_1^⊤
such that:
L 𝐱 = x_1 𝐞_1.
What does $L𝐲$ equal if $𝐲 ∈ ℝ^n$ satisfies $y_1 = 𝐞_1^⊤ 𝐲 = 0$?
By straightforward computation we find
$$Lx = x – 2 𝐯 𝐞_1^⊤x = x – 2 𝐯 x_1$$and thus we find such a lower triangular $L$ by choosing $v_1 = 0$ and $v_k = \frac{x_k}{2 x_1}$ for $k=2..n$ and $x_1 \neq 0$.
3. Banded matrices¶
Problem 3.1 (C) Complete the implementation of UpperTridiagonal which represents a banded matrix with
bandwidths $(l,u) = (0,2)$:
struct UpperTridiagonal{T} <: AbstractMatrix{T}
d::Vector{T} # diagonal entries
du::Vector{T} # super-diagonal enries
du2::Vector{T} # second-super-diagonal entries
size(U::UpperTridiagonal) = (length(U.d),length(U.d))
function getindex(U::UpperTridiagonal, k::Int, j::Int)
d,du,du2 = U.d,U.du,U.du2
# TODO: return U[k,j]
function setindex!(U::UpperTridiagonal, v, k::Int, j::Int)
d,du,du2 = U.d,U.du,U.du2
if j > k+2
error(“Cannot modify off-band”)
# TODO: modify d,du,du2 so that U[k,j] == v
U # by convention we return the matrix
setindex! (generic function with 90 methods)
struct UpperTridiagonal{T} <: AbstractMatrix{T}
d::Vector{T} # diagonal entries
du::Vector{T} # super-diagonal enries
du2::Vector{T} # second-super-diagonal entries
size(U::UpperTridiagonal) = (length(U.d),length(U.d))
function getindex(U::UpperTridiagonal, k::Int, j::Int)
d,du,du2 = U.d,U.du,U.du2
if j == k+2
return U.du2[k]
elseif j == k+1
return U.du[k]
elseif j == k
return U.d[k]
else # off band entries are zero
return zero(eltype(U))
function setindex!(U::UpperTridiagonal, v, k::Int, j::Int)
d,du,du2 = U.d,U.du,U.du2
if (j > k+2)||(j
$$\begin{split}
1\oplus\sqrt{1\oplus h^2}^{fp}=&(1+\delta_2)(1+\sqrt{1\oplus h^2}^{fp})\\
=&(1+\delta_2)(1+\sqrt{1+h^2+\delta_1}(1+\delta_3))
\end{split}$$
where $|\delta_2|,|\delta_3|\le \frac{\epsilon_m}{2}$. Then
$$\begin{split}
\frac{1\oplus\sqrt{1\oplus h^2}^{fp}}{1+\sqrt{1+h^2}}=&(1+\delta_2)\left(1+\frac{\sqrt{1+h^2+\delta_1}(1+\delta_3)-\sqrt{1+h^2}}{1+\sqrt{1+h^2}}\right)\\
=&(1+\delta_2)\left(1+\frac{(1+\delta_3)(\sqrt{1+h^2+\delta_1}-\sqrt{1+h^2})+\delta_3\sqrt{1+h^2}}{1+\sqrt{1+h^2}}\right)\\
\approx &(1+\delta_2)\left(1+\frac{\delta_1}{2(1+\sqrt{1+h^2})\sqrt{1+h^2}}+\delta_3\frac{\sqrt{1+h^2}}{1+\sqrt{1+h^2}}\right)
\end{split}$$
and we can bound the relative error by
$$|\delta_2|+|\delta_1|\frac{1}{2(1+\sqrt{1+h^2})\sqrt{1+h^2}}+|\delta_3|\frac{\sqrt{1+h^2}}{1+\sqrt{1+h^2}}\le |\delta_2|+\frac{|\delta_1|}{4}+\frac{3|\delta_3|}{4}\le \epsilon_m.$$
In conclusion, it’s very accurate to compute $1+\sqrt{1+h^2}$. Let us verify this:
using Plots
S = precision(Float64)-1;
relative_error = zeros(S)
for n = 1:S
h = 2.0^(-n)
exact = 1+sqrt(1+BigFloat(h)^2)
numerical = 1+sqrt(1+h^2)
relative_error[n] = abs(numerical-exact)/exact
println(eps())
maximum(relative_error)
2.220446049250313e-16
1.137500569478947e-16
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com