程序代写代做代考 CS314 Fall 2018

CS314 Fall 2018

Assignment 8

Problem 1 – Dependencies

Suppose we have the following program (a sequence of instructions), with each instruction labeled as
S:

S1: a := 4;

S2: b := 2;

S3: c := 5;

S4: read(d);

S5: a := b + 3;

S6: b := a – 3;

S7: c := d * b;

S8: e := a + 6;

S9: print(c);

S10: print(e);

(a) Give the statement-level dependence graph for the above program. A node in the statement-level
dependence graph represents a statement, an edge represents dependence between the statements
(i.e. the nodes). Label each edge as a true data dependence, an anti data dependence, or an
output data dependence.

(b) Assume that each statement takes 1 cycle to execute. What is the execution time of the sequen-
tial code? What is the fastest parallel execution time of the program (i.e. the critical path)?
You may assume that I/O operations (read and print) can be done in parallel.

Problem 2 – Dependence Analysis

Give the source and sink references, the type (whether a dependence is true, anti, or output), and
the distance vectors for all dependences in the following loops.

(a) do i = 3, 100
a(i) = a(i – 1) + a(i + 1) – a(i – 2)

enddo

(b) do i = 2, 100
a(3 * i) = a(3 * i – 3) + a(3 * i + 3)

enddo

(c) do i = 1, 10
a(i) = a(5) + a(i)

enddo

Use aW (i) and aR(i) to annotate the write access to a(i) and the read access to a(i) respectively.

Problem 3 – Loop Parallelization

Given the following nested loop:

1

do i = 2, 100

do j = 2, 100

S1: a(i, j) = b(i – 1, j – 1) + 2

S2: b(i, j) = i + j – 1

enddo

enddo

(a) Give the statement-level dependence graph. Show the dependence graph for statement instances
in a part of the iteration space: i = 2 … 5, j = 2 … 5.

(b) In its current form, can any loop be parallelized? If so, which loop(s)? If not, justify your
answer.

(c) Provide one valid affine schedule for statements S1 and S2 such that p(S1)= C11*i + C12*j +
d1 and p(S2)= C21*i + C22*j + d2 in order to achieve synchronization-free parallelism. There
could be many possible solutions for {C11, C12, C21, C22, d1, d2}. You only need to provide one
feasible solution. (Hint: You can let d1 = d2 = 0.)

(d) Generate two-level loop code for the affine schedule you provided. Please use p as outermost
loop and i as innermost loop in the transformed loop. Calculate the loop bounds for p and i
using Fourier-Motzkin elimination. You might need to calculate the overlapping polyhedron for
S1 and S2 in order to eliminate the j loop. Please refer to the techniques for code generation in
lecture 20 and lecture 21.

2