Instructions
CS/SE 4X03 — Assignment 1
January 25, 2021 Due date: February 4
• If you write your solutions by hand, please ensure your handwriting is legible and easy to read. Otherwise, please type your solutions. We may subtract marks for messy solutions.
• Submit to Avenue a PDF file containing your solutions and the required Matlab files. Assignments in other formats e.g. IMG, PNG will be not be marked.
• Name your PDF file LastName-FirstName-StudentNumber.pdf. Files with other names will not be marked.
• Submit only what is required.
Problem 1 [3 points] Consider a floating-point (FP) system with 5 decimal digits and
rounding to the nearest. Give an example for each of the following: • (a+b)/2∈/[a,b]
• a+(b+c)̸=(a+b)+c
• a∗(b∗c)̸=(a∗b)∗c
where a, b, c are FP numbers in this system. Problem 2 [8 points] Consider the Matlab code
1
x = 10864;
y = 18817;
f = @(x,y) 9*x^4-y^4+2*y^2;
u = f(x,y);
v = f(single(x),single(y));
z = f(vpa(x), vpa(y));
u, v, z
(a) [2 points] What is the hexadecimal representation of 9*x^4 in double precision, i.e. what is stored in the IEEE double format? Is 9*x^4 exact in double precision? Explain.
(b) [6 points] Explain the values for u, v, and z.
Problem 3 [6 points] Consider three methods for computing the sum of n floating-floating
point numbers:
(1) summation is in decreasing order of magnitude, e.g. 1.2+(2.3+(3.4+5.1))
(2) summation is in increasing order of magnitude e.g. ((1.2+2.3)+3.4)+5.1
(3) usingKahan’ssummationalgorithmhttps://en.wikipedia.org/wiki/Kahan_summation_ algorithm
Apply these three methods to compute the sum n=10000 i−2. How would you compute an i=1
accurate result for this sum? Using such a result, report the errors in (1), (2), (3). Report in the format format short e of Matlab.
Submit your Matlab code to Avenue. Name the main program problem3.m. When executed it should produce output in the form
decreasing order increasing order Kahan’s sum
error error error
Note: ∞n=1 1/n2 = π2/6.
Problem 4 [3 points] Consider the Taylor series
x3 x5 x7 arctan(x)=x− 3 + 5 − 7 +···.
Derive how many terms in this series are needed to approximate arctan(x) for any x ∈ [−0.2,0.2] with absolute error of at most 10−8.
Describe how you can verify numerically that the error in your approximation is indeed within 10−8. You can either describe in words or use Matlab code.
Problem 5 [3 points] Consider solving the scalar equation ax = b, for given a and b and assume that you have computed x. To measure the quality of x, we can compute the residual r = b − ax. Derive the error in fl (r), that is the relative error in the floating point representation of r. Can it be large, and why?
2
Problem 6 [2 points]
Show all steps. Problem 7 [10 points]
Apply Gaussian elimination with partial pivoting to the matrix
10001 −1 1 0 0 1 −1 1 1 0 1 − 1 − 1 − 1 1 1 −1 −1 −1 −1 −1
Consider solving the linear system Ax = b, where A ∈ Rn×n and b ∈ Rn. Suppose that we have computed a numerical solution x1 ≈ x. The residual is
r1 =b−Ax1 =Ax−Ax1 =A(x−x1). Denote d1 = x − x1 and solve Ad1 = r1. Then x2 = x1 + d and
Ax2 =Ax1 +Ad=Ax1 +Ax−Ax1 =Ax=b. That is, x2 solves Ax2 = b. However, in computer arithmetic,
r2 =b−Ax2
is unlikely the zero vector, and we can repeat the above with x2 to compute x3 and so on. This is called iterative refinement. When computing the residuals in the same precision, there will be cancellations, and we have to evaluate them in higher precision.
Let Aij = 1/(i + j − 1), i,j = 1,…,n. This is a Hilbert matrix. Let u be the n-vector of ones, u = (1,1,…1)T, and let b = Au. Then Ax = b has the exact solution u.
For a given n, store this A in single precision, and denote this single precision matrix by As. Compue b = Asu in double precision and round it to single precision.
Then compute the residual in double precision and iterate:
%file iterrefin.m
clear all; close all;
%turn this on to see warning messages
w = warning (’off’,’all’);
n = 8; iterations = 100;
As = hilb(n,’single’);
u = ones(n,1,’single’);
b = double(As)*double(u);
bs = single(b);
tic;
x(:,1) = As\bs;
for i=2:iterations
r = b – double(As)*double(x(:,i-1));
r = single(r);
d = As\r;
3
x(:,i) = x(:,i-1)+d;
end
toc;
error = vecnorm(x-u)./norm(u); semilogy(error,’–o’)
legend(’$\|x-u\| /\|u\|$’,’interpreter’,’latex’) set(gca, ’FontSize’, 14)
xlabel(’iteration’)
ylabel(’error’)
To see the importance of evaluating the residual in higher precision, try this code with r = b – double(As)*double(x(:,i-1)) replaced by r = bs-As*x(:,i-1).
(a) [2 points] Try this code with n = 5, 6, 7, 8. Why does the error in the solution not go below 10−8? Why does the error increase for n ≥ 9?
(b) [2 points] Modify this code so it solves the systems using the LU factorization of As, where this factorization is computed by Matlab’s lu.
Include your code in the PDF file and highlight the places in the code where the LU factorization is used.
Run your modified code with n = 8. If the plots are different, can you explain why. Include the resulting plots in the PDF.
(c) [4 points] Implement in Matlab the following functions
function [L, U, p] = lupp(A)
% [L,U,p] = lupp(A) computes the LU factorization of A
% using scaled partial pivoting.
% p is a permutation vector storing the row permulations
% during the pivoting.
function x = lusolve(L,U,b,p)
% Solves LUx=b(p) by solving Ly=b(p) and then Ux=y.
% L is unit lower triangular, U is upper triangular
% p is a permutation vector obtained from lupp.
(d) [2 points] Modify the above code so it does the linear solves with these functions. Name it problem7.m. Submit to Avenue problem7.m, lupp.m, and lusolve.
Include the produced plots for n = 7, 8, 9 in the PDF file.
4