MPI代写

Prefix Scan

• Toolchain:
• Visual studio (express edition freely available) • MicrosoftMPI

• First install Visual Studio, then Open MPI • Compile with: cl
• Launch with: mpiexec

Prefix Scan

• Given a sequence of numbers, e.g.: (𝑎0, 𝑎1, 𝑎2, 𝑎3,…, 𝑎𝑛−1)

• Compute the output sequence: (𝑏 , 𝑏 , 𝑏 , 𝑏 ,…, 𝑏 )

• Where 𝑏=𝑖𝑎

𝑖 𝑗=0𝑗 • Example: 𝑛 = 4

0123 𝑛−1

(𝑎 )3 = 1,3,2,2 𝑖 𝑖=0

(𝑏 )3 = (1,4,6,8) 𝑖 𝑖=0

Prefix Scan

• Serial Algorithm:

0: <initialise a and b> 1: b[0] = a[0];

for(int i=1; i<n; ++i) 2: b[i] = b[i-1] + a[i];

• Variable values:

  1. 0:  a = (1,3,2,2)
  2. 1:  a = (1,3,2,2)
  1. 2(i=1):  a = (1,3,2,2)
  2. 2(i=2):  a = (1,3,2,2)
  3. 2(i=3):  a = (1,3,2,2)
b = (0,0,0,0)
b = (1,0,0,0)
b = (1,4,0,0)
b = (1,4,6,0)
b = (1,4,6,8)

Prefix Scan

• Alternative algorithm:

𝒂𝟎

𝒂𝟏

𝒂𝟐

𝒂𝟑

𝒂𝟒

𝒂𝟓

𝒂𝟔

[0]

[0,1]

[2]

[2,3]

[4]

[4,5]

[6]

[0]

[0,1]

[2]

[0,3]

[4]

[4,5]

[6]

[0]

[0,1]

[2]

[0,3]

[4]

[0,5]

[6]

[0]

[0,1]

[0,2]

[0,3]

[0,4]

[0,5]

[0,6]

up

down

1210

Prefix Scan

• Example: a = (1,2,2,3,1,4,2)
• Expected output: b = (1,3,5,8,9,13,15)

1

𝟐

𝟐

𝟑

1

4

2

1

3

2

5

1

5

2

1

3

2

8

1

5

2

1

3

2

8

1

13

2

1

3

5

8

9

13

15

up

down

1210

Prefix Scan

  • Number of rounds in both up and down phase: R= log𝑛,

    where the brackets denote ‘floor’ and log is the logarithm base 2.

  • In the up-phase, in round 𝑟 (𝑟 = 0, … , 𝑅 − 1), processes whose rank is a (multiple of 2𝑟+1) minus 1 receive their data from 2𝑟processes to the left, and add it to the local result.
  • In the down-phase, in round 𝑟 (𝑟 = 𝑅, … , 1), processes whose rank is a (multiple of 2𝑟) minus 1 send their data 2𝑟−1 processes to the right, where it is added to the local result.
  • Sending and receiving only takes place when both partners are within the range [0,n-1].

Prefix Scan

• Sequential algorithm: one process performs n-1 additions.

• Parallel algorithm: n processes perform a maximum of 𝑅 + 1 = log 𝑛 computations each.

Prefix Scan

• Implement the prefix scan algorithm in c/mpi:

  • Input: the number n of elements to be added
  • Process 0 will generate n random integers and broadcast these
  • Output: process 0 prints the input and output sequences

    • Deliverables:

  • A single source file prefix.c that can be compiled
  • The source file will contain all necessary c/mpi code
  • Your code == Your report:
    • Document your code
    • Show you understand

      • DO NOT USE MPI_Scan!