
COMP528 HAL11 – MPI datatypes, rules, motivation for collectives

Dr Michael K Bane, G14, Computer Science, University of Liverpool
m.k. .uk https://cgi.csc.liv.ac.uk/~mkbane/COMP528

COMP528: Multi-core and
Multi-Processor Programming

11-HAL

RECAP

• MPI: Message Passing Interface
• One standard

• Various implementations

• SPMD

• Point-to-point
• Pair-wise comms

• One sends: MPI_Send

• One receives: MPI_Recv

• Blocking & non-blocking variants

• A send specifies the number of elements held in contiguous memory (minimal sketch below)
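A minimal point-to-point sketch, assuming at least two ranks; the buffer size, tag value, and variable names are illustrative, not from the slides:

#include <mpi.h>
#include <stdio.h>

#define NUM 4

int main(int argc, char** argv) {
    int myRank;
    double data[NUM] = {1.0, 2.0, 3.0, 4.0};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    if (myRank == 0) {
        // blocking send of NUM contiguous doubles to rank 1, with tag 0
        MPI_Send(data, NUM, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (myRank == 1) {
        MPI_Status stat;
        // blocking receive: waits until the message has arrived
        MPI_Recv(data, NUM, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &stat);
        printf("rank 1 received first element %f\n", data[0]);
    }
    MPI_Finalize();
    return 0;
}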


Final Words for Point-to-Point

• How to time using MPI functions?

• Wildcards

• Fairness

• Message Sizes / Datatypes / How much data received?

• There is more detail…
• see the MPI Forum documentation if interested

• You now have most of the point-to-point functionality you will usually use


MPI Timers

• MPI_Wtime
• returns the wall-clock time, in seconds, since some arbitrary point in the past

• Thus the difference between 2 calls is the elapsed (WALL CLOCK) time (sketch below)

• double MPI_Wtime(void)

• MPI_Wtick
• Gives resolution of MPI_Wtime in seconds

• double MPI_Wtick(void)
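A minimal timing sketch using both calls; the region being timed is left as a placeholder:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    double t0 = MPI_Wtime();
    /* ... code region being timed (placeholder) ... */
    double t1 = MPI_Wtime();
    // t1 - t0 is the elapsed wall-clock time;
    // MPI_Wtick reports the resolution of the underlying clock
    printf("elapsed: %f s (resolution: %g s)\n", t1 - t0, MPI_Wtick());
    MPI_Finalize();
    return 0;
}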



Datatypes

• For each native C type, there is a corresponding MPI datatype

• This is what you use.

• E.g. if passing floats, you would use MPI_FLOAT as the MPI_Datatype

• Convention: UPPERCASE names are MPI constants (a short sketch follows)
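A short sketch of the correspondence in practice, assuming two ranks; the counts and tags are illustrative:

#include <mpi.h>

int main(int argc, char** argv) {
    int myRank;
    float  f[8] = {0};   // C float  <-> MPI_FLOAT
    double d[8] = {0};   // C double <-> MPI_DOUBLE
    int    n[8] = {0};   // C int    <-> MPI_INT
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    if (myRank == 0) {
        MPI_Send(f, 8, MPI_FLOAT,  1, 0, MPI_COMM_WORLD);
        MPI_Send(d, 8, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);
        MPI_Send(n, 8, MPI_INT,    1, 2, MPI_COMM_WORLD);
    } else if (myRank == 1) {
        MPI_Recv(f, 8, MPI_FLOAT,  0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(d, 8, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(n, 8, MPI_INT,    0, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}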

LIFE IS SO MUCH MORE THAN
A SINGLE PAIRWISE SEND/RECV…


Complexity, Quickly

• One sends, one receives
– blocking: the function waits until it is safe to continue

– non-blocking: the operation starts but the call returns immediately, so the program continues with the next statement

– https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node57.htm#Node57

• Common to have many sends and receives within a real life
simulation
– e.g. for n-body

– Could initialise on rank-0 then send masses & positions & velocities to all
other ranks

– Send new position of each body to all other ranks – for every time step

• Real life…
– Likely to have many messages

– Want to reduce unnecessary delays

• e.g. say rank 0 is receiving updates from all other ranks
– Rank 0: loop over ranks 1 to #PEs-1 to receive its data (sketch below)

– Ranks >0: each send data to rank 0
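A minimal sketch of this fixed-order pattern; mySum here is a stand-in for a locally computed partial result, and the tag 999 matches the code on the later slides:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    int myRank, numPEs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Comm_size(MPI_COMM_WORLD, &numPEs);

    float mySum = (float) myRank;   // stand-in for a locally computed value

    if (myRank == 0) {
        float sum = mySum, inputSum;
        MPI_Status stat;
        for (int i = 1; i < numPEs; i++) {
            // fixed order: rank 0 waits for rank i even if rank i+1 is ready first
            MPI_Recv(&inputSum, 1, MPI_FLOAT, i, 999, MPI_COMM_WORLD, &stat);
            sum += inputSum;
        }
        printf("total = %f\n", sum);
    } else {
        MPI_Send(&mySum, 1, MPI_FLOAT, 0, 999, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}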


Do we need to do this in a fixed order?

• e.g. say rank 0 is receiving updates from all other ranks
– Rank 0: loop over ranks 1 to #PEs-1 & receive any available data

– Ranks >0: each send data to rank 0

As long as the order received does not matter


• This would be achieved by the MPI_Recv(…) function calls accepting from any source
– Replace the given source rank by the wildcard MPI_ANY_SOURCE

MPI_Recv(&inputSum, 1, MPI_FLOAT, MPI_ANY_SOURCE, 999, MPI_COMM_WORLD, &stat);
sum += inputSum;

if (myRank != 0) {
    MPI_Send(&mySum, 1, MPI_FLOAT, 0, 999, MPI_COMM_WORLD);
}
else { // myRank is 0
    float inputSum;
    for (int i=1; i<numPEs; i++) {   // numPEs: total number of ranks (assumed variable name)
        MPI_Recv(&inputSum, 1, MPI_FLOAT, MPI_ANY_SOURCE, 999, MPI_COMM_WORLD, &stat);
        sum += inputSum;
    }
}
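With MPI_ANY_SOURCE the sender is no longer fixed at the call site, but it is recorded in the status object. A sketch of querying it inside the receive loop above; stat.MPI_SOURCE and MPI_Get_count are standard MPI facilities:

// inside the loop body, after the wildcard receive:
MPI_Recv(&inputSum, 1, MPI_FLOAT, MPI_ANY_SOURCE, 999, MPI_COMM_WORLD, &stat);
int actualSource = stat.MPI_SOURCE;       // which rank this message came from
int count;
MPI_Get_count(&stat, MPI_FLOAT, &count);  // how many elements were received
sum += inputSum;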