CS计算机代考程序代写 algorithm Quiz

Quiz

49

1

Question 1
If 99.9% of your sequential program execution time is spent inside a
loop that can be parallelized, what is the maximum speedup you can
achieve via parallelizing this loop?

Quiz

50

1

Question 1
If 99.9% of your sequential program execution time is spent inside a
loop that can be parallelized, what is the maximum speedup you can
achieve via parallelizing this loop?

Solution: Maximum speedup = 1/(1-99.9%) = 1/0.1% = 1000

Quiz

51

1

Question 2
If a two-hour parallel program running on one thousand cores spends
one hour in its parallel portion, what is the scaled speedup of the
program on one thousand cores?

Quiz

52

1

Question 2
If a two-hour parallel program running on one thousand cores spends
one hour in its parallel portion, what is the scaled speedup of the
program on one thousand cores?

Solution: Scaled speedup = (1+1*1000)/2 = 500.5

Quiz

53

1

Question 3
A parallel program achieves a speedup of 2 on 4 cores. Is it likely to
achieve a speedup of 4 on 8 cores?

Quiz

54

1

Question 3
A parallel program achieves a speedup of 2 on 4 cores. Is it likely to
achieve a speedup of 4 on 8 cores?

Speedup = 1/[f+(1-f)/4]=2 Serial fraction f = 1/3 Speedup =
1/[1/3+(1-1/3)/8] = 2.4 < 4 Impossible to achieve a speedup of 4 on 8 cores Quiz 55 1 Question 4 A computer animation program generates feature movies frame-by- frame. Each frame can be generated and written to its own file independently. If it takes 90 seconds to render a frame and 10 second to write a frame into a file, what is the maximum speedup you can achieve if you have 100 single-processor computers to use? Quiz 56 1 Question 4 A computer animation program generates feature movies frame-by- frame. Each frame can be generated and written to its own file independently. If it takes 90 seconds to render a frame and 10 second to write a frame into a file, what is the maximum speedup you can achieve if you have 100 single-processor computers to use? Solution: Maximum speedup = 100 because each processor can handle separate frames independently. Quiz 57 1 Question 5 Let n>f(p) denote the Isoefficiency relation of a parallel system and
M(n) denote the amount of memory for a problem of size n. Use the
scalability function to rank the following parallel systems from most to
least scalable:
(a). f(p)=p and M(n)=n;
(b). f(p)=p and M(n)=n^2;
(c). f(p)=p*log(p) and M(n)=n;
(d). f(p)=p*log(p) and M(n)=n^2.

Quiz

58

1

Question 5
Let n>f(p) denote the Isoefficiency relation of a parallel system and
M(n) denote the amount of memory for a problem of size n. Use the
scalability function to rank the following parallel systems from most to
least scalable:
(a). f(p)=p and M(n)=n;
(b). f(p)=p and M(n)=n^2;
(c). f(p)=p*log(p) and M(n)=n;
(d). f(p)=p*log(p) and M(n)=n^2.
Solution:
a. Scalability function = M(f(p))/p = (Cp)/p = C
b. Scalability function = M(f(p))/p = (Cp)^2/p = C^2p
c. Scalability function = M(f(p))/p = (Cp*log(p))/p = C*log(p)
d. Scalability function = M(f(p))/p = (Cp*log(p))^2/p = C^2*p*log^2(p)
From most scalable to least scalable is a > c > b > d.

Quiz

59

1

Question 6
Your program has an execution time of n^3 and uses n^2 bytes space
when the problem size is n. You have worked very hard to make it fully
parallel and reduced the communication time to 24𝑛2𝑙𝑜𝑔2

𝑝
on p cores.

If you want to achieve a speed up of 1000 with 1024 cores, what is the
minimum memory each core must have? If you have 1024 cores with 1
GB (2^30 bytes) memory per core, what is the maximum speedup you
can achieve?

Quiz

60

1

Question 6
Your program has an execution time of n^3 and uses n^2 bytes space
when the problem size is n. You have worked very hard to make it fully
parallel and reduced the communication time to 24𝑛2𝑙𝑜𝑔2

𝑝
on p cores.

If you want to achieve a speed up of 1000 with 1024 cores, what is the
minimum memory each core must have? If you have 1024 cores with 1
GB (2^30 bytes) memory per core, what is the maximum speedup you
can achieve?

𝑆𝑝𝑒𝑒𝑑𝑢𝑝 =
𝑛3

𝑛3/𝑝 + 24𝑛2𝑙𝑜𝑔2
𝑝 =

1

1
1024

+ 24𝑙𝑜𝑔2
1024/𝑛

=
1

1
1024

+ 240/𝑛

= 1000 ⇒ 𝑛 = 10240000
So the minimum memory each core must have is n^2/p = 1024*10^8
bytes = 10^8 KB = 100GB
If each core only has 1GB, n^2 /p=n^2 /1024≤2^30 ⇒n≤2^20 ⇒

𝑠𝑝𝑒𝑒𝑑𝑢𝑝 ≤
220

220

210
+240

= 830 So the maximum speedup is 830.

Quiz

61

1

Question 7
Assume a parallel program takes 1000 seconds to finish when using 1
core and 600 seconds to finish when using 2 cores. Is it possible to
finish the program within 400 seconds when using 4 cores? If your
experiment actually takes 500 seconds when using 4 cores, what is
the minimum execution time when using 6 cores?

Quiz

62

1

Question 7
Assume a parallel program takes 1000 seconds to finish when using 1
core and 600 seconds to finish when using 2 cores. Is it possible to
finish the program within 400 seconds when using 4 cores? If your
experiment actually takes 500 seconds when using 4 cores, what is
the minimum execution time when using 6 cores?

Solution(1/2): The speedup is 5/3 when using 2 cores, so using Karp-

Flatt Metric we can get serial fraction 𝑒 =

1

Ψ

1

𝑝

1−
1

𝑝

=
3

5

1

2

1−
1

2

=
1

10
1

2

=
1

5
When

using 4 cores, the speedup = 1/[1/5+(1-1/5)/4] = 2.5 and execution
time = 1000/2.5 = 400 seconds. So, it’s possible to finish the program
within 400 seconds when using 4 cores.

Quiz

63

1

Question 7
Assume a parallel program takes 1000 seconds to finish when using 1
core and 600 seconds to finish when using 2 cores. Is it possible to
finish the program within 400 seconds when using 4 cores? If your
experiment actually takes 500 seconds when using 4 cores, what is
the minimum execution time when using 6 cores?

Solution(2/2): Actually the speedup is 2 when using 4 cores, so using

Karp-Flatt Metric we can get serial fraction 𝑒 =

1

Ψ

1

𝑝

1−
1

𝑝

=
1

2

1

4

1−
1

4

=
1

4
3

4

=
1

3

Therefore, when using 6 cores, serial fraction e =

1

Ψ

1

𝑝

1−
1

𝑝

=
1

Ψ

1

6

1−
1

6

=
1

Ψ

1

6
5

6

1

3
⇒ Ψ ≤

9

4
, so the minimum execution time is 444.4 seconds.

p 2 4 6

e 1/5 1/3 >=1/3

◼ A parallel program executing on 32 processors spends 5%
of its time in sequential code. What is the scaled speedup of
this program?

Pop Quiz

◼ Sequential algorithm complexity
T(n,1) = (n)

◼ Parallel algorithm

◆Computational complexity = (n/p)
◆Communication complexity = (log p)

◼ Parallel overhead
T0(n,p) = (p log p)

Example 1: Reduction

◼ The system has good scalability

◼ Isoefficiency relation: n  C p log p

◼ We ask: To maintain same level of efficiency, how must

n increase when p increases?

◼ M(n) = n

M (Cplogp) / p = Cplogp / p = Clogp

Reduction (continued)

◼ Sequential time complexity: T(n,1)=(n3)
◼ Parallel computation time: (n3/p)
◼ Parallel communication time: (n2log p)
◼ Parallel overhead: T0(n,p) = (pn2log p)

Example 2: Floyd’s Algorithm

◼ Isoefficiency relation
n3  C(p n2 log p)  n  C p log p

◼ M(n) = n2

◼ The parallel system has poor scalability

M (Cplogp) / p =C2 p2 log2 p / p =C2 p log2 p

Floyd’s Algorithm (continued)

◼ Sequential time complexity per iteration: T(N,1)=(n2)
◼ Parallel communication complexity per iteration: (n/p)
◼ Parallel overhead: T_o(n,p)=(n p)
◼ Space Complexity: M(n) = n2

◼ What is the scalability function?

Example 3: Finite Difference

◼ Isoefficiency relation
n2  Cnp  n  C p

◼ M(n) = n2

◼ This algorithm is perfectly scalable

M (C p) / p =C2 p / p =C2

Finite Difference (continued)