Name: ___________________ ID: ____________
COMP326/5261 Assignment 4 Fall 2021
Issued: November 29, 2021 Due: December 7, 2021
Submit electronically to Moodle. No extension will be granted.
1. Vector Performance (40 marks)
Consider the following code fragment:
vld v1,r5 // load vector x
vld v2,r6 // load vector y
vmul v3,v1,f0 // z1 = a * x
vadd v4,v2,v3 // z2 = y + z1
vmul v5,v2,f2 // z3 = b * y
vst v4,r6 // store z2 as vector y
vst v5,r5 // store z3 as vector x
l1 --> m3                  // flow-dependence graph
         \                 // labeled in program order
          a4 --> s6
l2 --> m5 --> s7
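As a reading aid, the code fragment above can be modelled as plain element-wise arithmetic. The sketch below is only an illustration of the data flow, not of the timing: the function name `fragment` and the use of Python lists are assumptions made here, and `a` and `b` stand for the scalars held in f0 and f2.

```python
def fragment(x, y, a, b):
    """Element-wise model of the seven vector instructions (hypothetical helper)."""
    v1 = list(x)                            # vld  v1,r5    load vector x
    v2 = list(y)                            # vld  v2,r6    load vector y
    v3 = [a * e for e in v1]                # vmul v3,v1,f0 z1 = a * x
    v4 = [p + q for p, q in zip(v2, v3)]    # vadd v4,v2,v3 z2 = y + z1
    v5 = [b * e for e in v2]                # vmul v5,v2,f2 z3 = b * y (old y)
    new_y = v4                              # vst  v4,r6    store z2 as vector y
    new_x = v5                              # vst  v5,r5    store z3 as vector x
    return new_x, new_y
```

Note that z3 is computed from the value of y loaded by the second instruction, not from the value stored back by the first vst.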
Make the following assumptions:
– this is a four-lane vector processor—this affects the running time of
vector instructions
– there are two copies of each arithmetic vector functional unit, and two
copies of the vector load-store functional unit—this affects the number
of vector instructions of the same type that can run in parallel in a gang
– start-up penalties of functional units are: l/s = 12, add = 6, mul = 7
– the vector register length is 256
a) In the absence of vector chaining, determine the execution time of this
code fragment by direct summation with start-up penalties and the value of
‘n’ (i.e., after the lanes have been factored in). Sum the gangs to get
the total time. Show the gang members.
b) In the presence of vector chaining, determine the execution time of this
code fragment by direct summation with start-up penalties and the value of
‘n’. Sum the gangs to get the total time. Show the gang members.
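The assumptions above can be turned into a small calculation of 'n' and of the time for one isolated vector instruction. This is a minimal sketch, assuming that a single unchained instruction costs its start-up penalty plus 'n' cycles and that 'n' is the vector length divided by the number of lanes; the names `STARTUP`, `elements_per_lane`, and `single_instruction_time` are made up for illustration. It does not form gangs or model chaining.

```python
import math

# Start-up penalties from the assumptions above, in cycles.
STARTUP = {"ls": 12, "add": 6, "mul": 7}
LANES = 4      # four-lane vector processor
VLEN = 256     # vector register length


def elements_per_lane(vlen=VLEN, lanes=LANES):
    """The value of 'n' once the lanes have been factored in."""
    return math.ceil(vlen / lanes)


def single_instruction_time(unit, vlen=VLEN, lanes=LANES):
    """Assumed cost of one isolated instruction: start-up penalty plus n."""
    return STARTUP[unit] + elements_per_lane(vlen, lanes)
```

For example, `single_instruction_time("mul")` adds the multiply start-up penalty of 7 to n = 256 / 4.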
2. Vectorizing Compiler (30 marks)
Consider this familiar loop:
loop: fld f4,0(r1) l1
fld f6,0(r2) l2
fmul f4,f4,f0 m1
fmul f6,f6,f2 m2
fadd f4,f4,f6 a1
fsd f4,0(r1) s1
subi r1,r1,8 sub1
subi r2,r2,8 sub2
bnez r1,loop br
Rewrite the code using vector instructions. Draw the flow-dependence graph
of vector instructions. Using vector chaining, calculate the running time of
the program, using the data in question 1. Note: startup time is _added_ to
vector-instruction time. Show gang members.
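For reference, the scalar loop above can be read as the following element-wise update. This is a sketch under assumptions: r1 and r2 are taken to address arrays x and y of 8-byte elements, f0 and f2 are taken to hold scalars a and b, and the function name `scalar_loop` is hypothetical. The loop walks from the highest index down because both pointers are decremented.

```python
def scalar_loop(x, y, a, b):
    """Model of the scalar loop: x[i] = a * x[i] + b * y[i], highest index first."""
    for i in range(len(x) - 1, -1, -1):
        f4 = x[i]          # fld  f4,0(r1)
        f6 = y[i]          # fld  f6,0(r2)
        f4 = f4 * a        # fmul f4,f4,f0
        f6 = f6 * b        # fmul f6,f6,f2
        f4 = f4 + f6       # fadd f4,f4,f6
        x[i] = f4          # fsd  f4,0(r1)
    return x
```

Each iteration touches only element i of x and y, so there is no dependence carried from one iteration to the next.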
3. Vectorizing Loop Nests (30 marks)
Consider the following loop nest that computes the lengths of many strings,
anchored in an array.
for( i=0; i