并行计算代写 COMP90025

The University of Melbourne COMP90025 Parallel and Multicore Computing Semester 2, 2016 Final Examination

Department of Computing and Information Systems COMP90025 Parallel and Multicore Computing Reading Time: 15 minutes
Writing Time: 3 hours

Open Book Status: Closed Book
This paper has 3 pages including this page

Identical Examination Papers: none Common Content: none

Authorized Materials:

No materials are authorized.

Instructions to invigilators:

No papers may be taken from the exam room.

Instructions to students:

All answers are to be written in the script book(s) provided. Attempt all questions – partial credit is available.
The examination is worth 60% of the subject assessment.

Paper to be held by Baillieu Library: yes

1 of 3

continued on next page

COMP90025 Parallel and Multicore Computing

Q.1. (a) (b)

Q.2. (a) (b)

Q.3. (a) (b)

(c)

Q.4. (a)

(b)

(c)

[4 marks] Given a sequential algorithm for a problem that runs in T (n) time, and a parallel algorithm using p(n) processors that runs in t(n) time, define what is meant by the parallel algorithm being optimal.

[6 marks] Brent’s Principle states that if an algorithm involving a total of

x operations can be performed in t time on a PRAM with sufficiently many

processors, then it can be performed in time t + x−t on a PRAM with p p

processors.

  1. Show how to derive Brent’s Principle.
  2. Consider an algorithm that involves O(n) operations in total and O(log n) time. Using Brent’s Principle or otherwise, what value of p provides op- timality?

[5 marks] Show how to optimally compute the prefix sum of n numbers on an EREW PRAM. Make sure to show that your approach is optimal.

[5 marks] Explain the ring termination algorithm that can handle restarting processes.

[3 marks] Consider a hypercube of size 2t nodes, where each node contains a number. Write a hypercube algorithm that adds up all of the numbers with the result being stored at all nodes, and taking O(t) steps.

[3 marks] Show by structural induction that a linear array of length 2t is contained in a hypercube of size 2t.

[4 marks] Define the following terms with respect to graph embeddings:

i. dilation ii. expansion

iii. load
iv. congestion

[8 marks] Consider a sorted array of n unique integers. For a given integer x, the search problem is to determine if the array contains x. This can be done sequentially using O(log n) time (binary search). Show how search can be done in O(logp n) time using p processors on a CREW PRAM.

[8 marks] Consider a mesh of size √n × √n nodes, where each node of the mesh contains an integer. The uniqueness problem is to determine whether all of the integers are unique. Consider a parallel algorithm that results in every node of the mesh knowing the result of the uniqueness problem. Show how this can be done in O(√n log n) steps.

[6 marks] A comparator is like a switch, however a comparator always routes the smallest of its input values to the upper output. A comparator is shown in detail in the following figure for both possibilities where the numbers 1 and 3 are provided as example input. The figure also shows a 3 stage sorting network

2 of 3 continued on next page

COMP90025 Parallel and Multicore Computing

(using comparators) that takes 4 inputs and sorts them, producing the sorted result at the 4 outputs. Explain how to construct an n-input sorting network that uses O(log2 n) stages and O(n log2 n) comparators in total, and draw an example of such a sorting network on 8 inputs.

5

3111 1333

2

1

252 1

5

1

1 335

(d) [8 marks] Consider the Sieve of Eratosthenes algorithm, shown below, that returns an array of length n in which the i-th position is set to true if i is a prime and to false otherwise.

PRIMES(n ) :

let A be an array of length n

set all but the first element of A to TRUE

3

for i from 2 to begin

n

if A[i] is TRUE
then set all multiples of i up to n to FALSE

end

The PRIMES algorithm above runs in O(nloglogn) sequential time. Show how to parallelize the PRIMES algorithm using O(n) processors on a PRAM, that takes O(log log n) time. The following fact may help you:

􏰀n 1 = O(log log n) p∈prime p

END OF EXAMINATION

3 of 3