MATH50003 Problem Sheet 1¶
This problem sheet tests the representation of numbers on the computer, using
modular arithmetic. We also use floating point rounding modes to implement
Copyright By PowCoder代写 加微信 powcoder
interval arithmetic, and thereby
produce rigorous bounds on the exponential.
using ColorBitstring, SetRounding
Questions marked with a ⋆ are meant to be completed without using a computer.
1. Binary representation¶
Problem 1.1 What is the binary representations of $1/5$? (Hint: use printbits to derive a proposed form.)
Problem 1.2⋆ What is $\pi$ to 5 binary places? Hint: recall that $\pi \approx 3.14$.
2. Integers¶
Problem 2.1⋆ With 8-bit signed integers, find the bits for the following: $10, 120, -10$.
Problem 2.2⋆ What will Int8(120) + Int8(10) return?
3. Floating point numbers¶
Problem 3.1⋆ What are the single precision $F_{32}$ (Float32) floating point representations for the following:
2, 31, 32, 23/4, (23/4)\times 2^{100}
Check your answers using printbits.
Problem 3.2 Let $m(y) = \min\{x \in F_{32} : x > y \}$ be the smallest single precision number
greater than $y$. What is $m(2) – 2$ and $m(1024) – 1024$? Check your answer using the nextfloat
4. Arithmetic¶
Problem 4.1⋆ Suppose $x = 1.25$ and consider 16-bit floating point arithmetic (Float16).
What is the error in approximating $x$ by the nearest float point number ${\rm fl}(x)$?
What is the error in approximating $2x$, $x/2$, $x + 2$ and $x – 2$ by $2 \otimes x$, $x \oslash 2$, $x ⊕ 2$ and $x \ominus 2$?
Problem 4.2⋆ For what floating point numbers is $x \oslash 2 \neq x/2$ and $x ⊕ 2 \neq x + 2$?
Problem 4.3⋆ Explain why the following return true. What is the largest floating point number y such that y + 1 ≠ y?
x = 10.0^100
x + 1 == x
Problem 4.4⋆ What are the exact bits for $1/5$, $1/5 + 1$ computed
using half-precision arithmetic (Float16) (using default rounding)?
Problem 4.5⋆ Explain why the following does not return 1. Can you compute the bits explicitly?
Float16(0.1) / (Float16(1.1) – 1)
Float16(1.004)
Problem 4.6⋆ Find a bound on the absolute error in terms of a constant times
$ϵ_{\rm m}$ for the following computations
\begin{align*}
(1.1 * 1.2) &+ 1.3 \\
(1.1 – 1) &/ 0.1
\end{align*}
implemented using floating point arithmetic (with any precision).
5. Interval arithmetic¶
The following problems consider implementation of interval arithmetic for
proving precise bounds on arithmetic operations. That is recall the set operations
A + B = \{x + y : x \in A, y \in B\}, AB = \{xy : x \in A, y \in B\}.
Problem 5.1⋆ For intervals $A = [a,b]$ and $B = [c,d]$ such that $0 \notin A,B$
and integer $n \neq 0$,
deduce formulas for the minimum and maximum of $A/n$, $A+B$ and $AB$.
Problem 5.2
We want to implement floating point variants such that, for $S = [a,b] + [c,d]$
$P = [a,b] * [c,d]$, and $D = [a,b]/n$ for an integer $n$,
\begin{align*}
[a,b] ⊕ [c,d] &:= [{\rm fl}^{\rm down}(\min S), {\rm fl}^{\rm up}(\max S)] \\
[a,b] ⊗ [c,d] &:= [{\rm fl}^{\rm down}(\min P), {\rm fl}^{\rm up}(\max P)] \\
[a,b] ⊘ n &:= [{\rm fl}^{\rm down}(\min D), {\rm fl}^{\rm up}(\max D)]
\end{align*}
This guarantees $S ⊆ [a,b] ⊕ [c,d]$, $P ⊆ [a,b] ⊗ [c,d]$, and
$D ⊆ [a,b] ⊘ n$.
In other words, if $x \in [a,b]$ and
$y \in [c,d]$ then $x +y \in [a,b] ⊕ [c,d]$, and we thereby have bounds on $x + y$.
Use the formulae from Problem 5.1 to complete (by replacing the # TODO: … comments with code)
the following implementation of an
so that +, -, and / implement $⊕$, $⊖$, and $⊘$ as defined above.
# Interval(a,b) represents the closed interval [a,b]
struct Interval{T}
import Base: *, +, -, /, one, in
# create an interval corresponding to [1,1]
one(x::Interval) = Interval(one(x.a), one(x.b))
# Support x in Interval(a,b)
in(x, y::Interval) = y.a ≤ x ≤ y.b
# Following should implement ⊕
function +(x::Interval, y::Interval)
T = promote_type(typeof(x.a), typeof(x.b))
a = setrounding(T, RoundDown) do
# TODO: upper bound
b = setrounding(T, RoundUp) do
# TODO: upper bound
Interval(a, b)
# Following should implement ⊘
function /(x::Interval, n::Integer)
T = typeof(x.a)
if iszero(n)
error(“Dividing by zero not support”)
a = setrounding(T, RoundDown) do
# TODO: lower bound
b = setrounding(T, RoundUp) do
# TODO: upper bound
Interval(a, b)
# Following should implement ⊗
function *(x::Interval, y::Interval)
T = promote_type(typeof(x.a), typeof(x.b))
if 0 in x || 0 in y
error(“Multiplying with intervals containing 0 not supported.”)
a = setrounding(T, RoundDown) do
# TODO: lower bound
b = setrounding(T, RoundUp) do
# TODO: upper bound
Interval(a, b)
* (generic function with 405 methods)
Problem 5.3 The following function computes the first n+1 terms of the Taylor series of $\exp(x)$:
\sum_{k=0}^n {x^k \over k!}
function exp_t(x, n)
ret = one(x) # 1 of same type as x
s = one(x)
for k = 1:n
s = s/k * x
ret = ret + s
exp_t (generic function with 1 method)
Bound the tail of the Taylor series for ${\rm e}^x$ assuming $|x| \leq 1$.
(Hint: ${\rm e}^x \leq 3$ for $x \leq 1$.)
Use the bound
to complete the function exp_bound which computes ${\rm e}^x$ with rigorous error bounds, that is
so that when applied to an interval $[a,b]$ it returns an interval that is
guaranteed to contain the interval $[{\rm e}^a, {\rm e}^b]$.
function exp_bound(x::Interval, n)
if abs(x.a) > 1 || abs(x.b) > 1
error(“Interval must be a subset of [-1, 1]”)
ret = exp_t(x, n) # the code for Taylor series should work on Interval unmodified
f = factorial(min(20, n + 1)) # avoid overflow in computing factorial
T = typeof(ret.a)
# TODO: modify ret so that exp(x) is guaranteed to lie in it
exp_bound (generic function with 1 method)
Check your result for computing ${\rm ℯ}$ and ${\rm e}^{-1}$ by assuring that
the following returns true:
exp(big(1)) in exp_bound(Interval(1.0,1.0), 20) && exp(big(-1)) in exp_bound(Interval(-1.0,-1.0), 20)
MethodError: no method matching isless(::Nothing, ::Int64)
Closest candidates are:
isless(::AbstractFloat, ::Real) at ~/Projects/julia-1.7/usr/share/julia/base/operators.jl:186
isless(::Real, ::Real) at ~/Projects/julia-1.7/usr/share/julia/base/operators.jl:430
isless(::Any, ::Missing) at ~/Projects/julia-1.7/usr/share/julia/base/missing.jl:88
Stacktrace:
[1] <(x::Nothing, y::Int64)
@ Base ./operators.jl:352
[2] <=(x::Nothing, y::Int64)
@ Base ./operators.jl:401
[3] in(x::Int64, y::Interval{Nothing})
@ Main ./In[4]:13
[4] *(x::Interval{Nothing}, y::Interval{Float64})
@ Main ./In[4]:45
@ ./In[5]:5 [inlined]
[6] exp_bound(x::Interval{Float64}, n::Int64)
@ Main ./In[6]:5
[7] top-level scope
@ ./boot.jl:373 [inlined]
[9] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base ./loading.jl:1196
Further, ensure that the width of each returned interval is less than $10^{-14}$.
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com