CSC 336 Midterm Test 19 October 2018. This is a closed-book test: no books, no notes, no calculators, no phones, no tablets, no
computers (of any kind) allowed.
Duration of the test: 50 minutes (11:10 AM to noon).
Do NOT turn this page over until you are TOLD to start.
Answer ALL Questions.
Write your answers in the test booklets provided.
Please fill in ALL the information requested on the front cover of EACH test booklet that you use.
The test consists of 4 pages, including this one. Make sure you have all 4 pages.
The test consists of 4 questions. Answer all 4 questions. The mark for each question is listed at the start of the question.
The test was written with the intention that you would have ample time to complete it. You will be rewarded for concise, well-thought-out answers, rather than long rambling ones. We seek quality rather than quantity.
Moreover, an answer that contains relevant and correct information as well as irrelevant or incorrect information will be awarded fewer marks than one that contains the same relevant and correct information only.
Write legibly. Unreadable answers are worthless.
Page 1 of 4 pages.
1.
[5 marks: 1 mark for each answer]
Consider a floating-point number system with parameters $\beta = 10$, $p = 3$, $L = -10$ and $U = +10$ that uses the round-to-nearest rounding rule and allows gradual underflow to subnormal numbers as well as underflow to zero. That is, the numbers in the system include zero and nonzero numbers of the form $\pm d_1.d_2d_3 \cdot 10^n$ where $d_i \in \{0, 1, 2, \ldots, 9\}$ for $i = 1, 2, 3$ and $n \in \{-10, -9, -8, \ldots, 10\}$. The normalized floating-point numbers in this system include 0 and the nonzero numbers of the form $\pm d_1.d_2d_3 \cdot 10^n$ with $d_1 \neq 0$. The subnormal numbers have $n = -10$, $d_1 = 0$ and $d_i \neq 0$ for at least one of $i = 2$ or $i = 3$. Like the IEEE floating-point number system, this number system also has the two special numbers +Infty and −Infty, which stand for numbers that are too large in magnitude (either positive or negative, respectively) to represent in this floating-point system. The system also has a NaN, which stands for “not-a-number”.
In the floating-point number system described above, what is the result of each of the floating-point arithmetic operations (a)–(e) below? Write your answer as
• a normalized number in this floating-point system, if possible,
• a subnormal number in this floating-point system in the case of gradual underflow,
• zero in the case that the true answer is zero or there is an underflow to zero,
• +Infty or −Infty in the case of overflow,
• NaN if the result of the computation is not any of the above.
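(a) $(5.26 \cdot 10^{2}) + (2.57 \cdot 10^{1})$
(b) $(5.03 \cdot 10^{5}) \times (2.02 \cdot 10^{-2})$
(c) $(4.04 \cdot 10^{-7}) \times (-3.03 \cdot 10^{-5})$
(d) $(-3.03 \cdot 10^{6}) \times (5.05 \cdot 10^{4})$
(e) $(7.06 \cdot 10^{6}) \times (2.03 \cdot 10^{4}) - (4.02 \cdot 10^{5}) \times (3.01 \cdot 10^{5})$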
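The number system described in this question can be imitated with Python's `decimal` module. The following is a minimal sketch, not a definitive model: it assumes ties round to even (the question says only "round-to-nearest"), the name `TOY` and the helper `fl_mul` are hypothetical, and the operands are made up for illustration rather than taken from (a)–(e).

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Toy system: beta = 10, p = 3 digits, exponent range L = -10 to U = +10.
# traps=[] disables all exceptions, so overflow yields Infinity and an
# invalid operation yields NaN instead of raising.
# ROUND_HALF_EVEN is an assumption about how ties are broken.
TOY = Context(prec=3, rounding=ROUND_HALF_EVEN, Emin=-10, Emax=10, traps=[])


def fl_mul(a: str, b: str) -> Decimal:
    """Multiply two numbers of the toy system and round the exact product."""
    return TOY.multiply(Decimal(a), Decimal(b))


# Exact product 5.6088E+7 is rounded to three digits: 5.61E+7.
normal = fl_mul("1.23E5", "4.56E2")

# Exact product 1.23E-11 lies below 10^-10, so it becomes subnormal and
# keeps only two significant digits: 0.12 * 10^-10.
subnormal = fl_mul("1.00E-6", "1.23E-5")

# Exact product 1E-16 is smaller than the smallest subnormal: underflow to zero.
zero = fl_mul("1.00E-8", "1.00E-8")

# Exact product is about 9.98E+19, far above the largest number: overflow.
overflow = fl_mul("9.99E9", "9.99E9")

# Infinity minus Infinity has no meaningful value in the system.
not_a_number = TOY.subtract(Decimal("Infinity"), Decimal("Infinity"))

for name, value in [("normal", normal), ("subnormal", subnormal),
                    ("zero", zero), ("overflow", overflow),
                    ("not_a_number", not_a_number)]:
    print(name, value)
```

In `decimal`, `Emin` is the smallest exponent of a normalized number, and subnormals extend `prec - 1` digits below it, which matches the gradual underflow described above.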
2.
[5 marks]
Assume both $x$ and $y$ are positive real numbers and $x \approx y$. In this case, we would expect some cancellation in computing $\log_e(x) - \log_e(y)$. On the other hand,
$$\log_e(x) - \log_e(y) = \log_e(x/y)$$
and $\log_e(x/y)$ involves no subtractions (hence no cancellation). Does this mean that computing $\log_e(x/y)$ is likely to give a more accurate result than computing $\log_e(x) - \log_e(y)$?
Hint: for what values is the function $\log_e$ poorly conditioned?
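To experiment with the hint, the conditioning of the logarithm can be evaluated numerically. This is a minimal sketch that assumes the usual relative condition number of a differentiable function, $|x f'(x) / f(x)|$, a definition not stated in the question itself. The function name `cond_log` is hypothetical.

```python
import math


def cond_log(x: float) -> float:
    """Relative condition number of the natural log at x.

    With f(x) = ln(x) and f'(x) = 1/x, |x * f'(x) / f(x)| = 1 / |ln(x)|.
    Undefined at x = 1, where ln(x) = 0.
    """
    return 1.0 / abs(math.log(x))


# Evaluate at a few sample arguments (chosen only for illustration).
for x in (1.001, 1.5, math.e, 100.0):
    print(x, cond_log(x))
```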
3.
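[5 marks]
Assume
$$x = \begin{pmatrix} 1 \\ -3 \\ 2 \end{pmatrix} \quad \text{and} \quad A = \begin{pmatrix} 2 & 1 & -3 \\ -2 & 3 & -2 \\ 3 & -1 & 4 \end{pmatrix}$$
Give the value of each of the following norms.
(a) $\|x\|_1$
(b) $\|x\|_2$
(c) $\|x\|_\infty$
(d) $\|A\|_1$
(e) $\|A\|_\infty$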
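The norms named in this question can be computed from their standard definitions. The sketch below is illustrative only: the function names are hypothetical, it uses the standard characterizations of the matrix 1-norm as the largest absolute column sum and the matrix infinity-norm as the largest absolute row sum, and the data `v` and `M` are made up rather than taken from the question.

```python
import math


def vec_norm(v, p):
    """p-norm of a vector given as a list; p may be 1, 2, or math.inf."""
    if p == math.inf:
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def mat_norm_1(M):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))


def mat_norm_inf(M):
    """Matrix infinity-norm: the largest absolute row sum."""
    return max(sum(abs(t) for t in row) for row in M)


# Hypothetical data, not the x and A of the question.
v = [3, -4]
M = [[1, -2],
     [3, 4]]

print(vec_norm(v, 1), vec_norm(v, 2), vec_norm(v, math.inf))
print(mat_norm_1(M), mat_norm_inf(M))
```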
4.
[5 marks]
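Recall that, for any real number $p \geq 1$, the $p$-norm of a vector $x \in \mathbb{R}^n$ is
$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$
For any real $n \times n$ matrix $A$, the matrix $p$-norm subordinate to this vector $p$-norm is
$$\|A\|_p = \max_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|x\|_p = 1} \|Ax\|_p$$
Assuming $A$ is nonsingular,
$$\mathrm{cond}_p(A) = \|A\|_p \, \|A^{-1}\|_p$$
Let $D$ be a real $n \times n$ diagonal matrix. That is,
$$D = \begin{pmatrix} d_1 & 0 & 0 & \cdots & 0 \\ 0 & d_2 & 0 & \cdots & 0 \\ 0 & 0 & d_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d_n \end{pmatrix}$$
where $d_i \in \mathbb{R}$ for $i = 1, 2, \ldots, n$. In addition, assume that $d_i \neq 0$ for $i = 1, 2, \ldots, n$. I mentioned in class that
$$\mathrm{cond}_p(D) = \frac{\max\{|d_i| : i = 1, 2, \ldots, n\}}{\min\{|d_i| : i = 1, 2, \ldots, n\}} \tag{1}$$
Explain why (1) is true.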
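Formula (1) can be spot-checked numerically before attempting an explanation. The following is a minimal sketch under stated assumptions: it checks only the case $p = 1$, using the largest absolute column sum as the matrix 1-norm, and the diagonal entries in `d` are hypothetical. A numerical check of one case is not a proof.

```python
def mat_norm_1(M):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))


def diag(d):
    """Build a dense diagonal matrix (list of rows) from its diagonal entries."""
    n = len(d)
    return [[d[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical nonzero diagonal entries.
d = [2.0, -0.5, 4.0]

D = diag(d)
D_inv = diag([1.0 / t for t in d])  # the inverse of a diagonal matrix is diagonal

# Left-hand side of (1) from the definition cond(D) = ||D|| * ||D^-1||, with p = 1.
cond_1 = mat_norm_1(D) * mat_norm_1(D_inv)

# Right-hand side of (1).
ratio = max(abs(t) for t in d) / min(abs(t) for t in d)

print(cond_1, ratio)
```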