COMP528 HAL11: MPI datatypes, rules, motivation for collectives
Dr Michael K Bane, G14, Computer Science, University of Liverpool
https://cgi.csc.liv.ac.uk/~mkbane/COMP528
COMP528: Multi-core and
Multi-Processor Programming
11-HAL
RECAP
• MPI: Message Passing Interface
• One standard
• Various implementations
• SPMD
• Point-to-point
• Pair-wise comms
• One sends: MPI_Send
• One recvs: MPI_Recv
• Blocking & non-blocking
• Send NUM elements from contiguous memory (minimal sketch below)
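As a reminder of those calls, a minimal sketch of one blocking pairwise exchange (ranks, tag and buffer contents are illustrative; run with at least two processes):

  #include <mpi.h>

  int main(int argc, char *argv[]) {
      double data[4] = {0.0, 1.0, 2.0, 3.0};
      int rank;
      MPI_Status stat;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 1) {
          // blocking send of 4 contiguous doubles to rank 0, tag 0
          MPI_Send(data, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
      } else if (rank == 0) {
          // blocking receive of 4 doubles from rank 1
          MPI_Recv(data, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &stat);
      }

      MPI_Finalize();
      return 0;
  }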
Final Words for Point-to-Point
• How to time using MPI function?
• Wildcards
• Fairness
• Message Sizes / Datatypes / How much data received? (see the sketch after this list)
• There is more detail…
• see MPI FORUM if interested
• You now have most of (pt-to-pt) that you will usually use
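On the "how much data received?" point, a hedged sketch in the style of the later listings: it assumes a matching MPI_Send of floats from some other rank, and uses MPI_Get_count on the receive status:

  MPI_Status stat;
  float buf[100];
  int count;
  // receive up to 100 floats, from any rank and with any tag
  MPI_Recv(buf, 100, MPI_FLOAT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
  // how many MPI_FLOATs actually arrived in the matched message?
  MPI_Get_count(&stat, MPI_FLOAT, &count);
  // stat.MPI_SOURCE and stat.MPI_TAG report the actual sender and tag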
MPI Timers
• MPI_Wtime
• returns the wall-clock time, in seconds, since some arbitrary point in the past
• Thus the difference between 2 calls is the elapsed (WALL CLOCK) time (usage sketch below)
• double MPI_Wtime(void)
• MPI_Wtick
• Gives resolution of MPI_Wtime in seconds
• double MPI_Wtick(void)
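A minimal usage sketch of the two calls:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[]) {
      MPI_Init(&argc, &argv);

      double t0 = MPI_Wtime();
      /* ... work being timed ... */
      double t1 = MPI_Wtime();

      // difference of two MPI_Wtime calls = elapsed wall-clock seconds
      printf("elapsed: %f s (timer resolution: %g s)\n", t1 - t0, MPI_Wtick());

      MPI_Finalize();
      return 0;
  }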
Datatypes
• For each native C type, there is a corresponding MPI datatype
• This is the type you pass as the MPI_Datatype argument
• E.g. if passing floats, you would use MPI_FLOAT as the MPI_Datatype (illustrated below)
• Convention: UPPERCASE means a constant
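A selection of the standard mappings, with an illustrative send (destination rank 1 and tag 0 are arbitrary choices; vals is a hypothetical buffer):

  /* Native C type -> corresponding MPI datatype constant (a selection):
       char   -> MPI_CHAR
       int    -> MPI_INT
       float  -> MPI_FLOAT
       double -> MPI_DOUBLE
  */
  float vals[8];
  // the MPI_Datatype argument matches the C type of the buffer being sent
  MPI_Send(vals, 8, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);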
LIFE IS SO MUCH MORE THAN
A SINGLE PAIRWISE SEND/RECV…
Complexity, Quickly
• One sends, one receives
– blocking: function waits until safe to continue
– non-blocking: the operation starts but the program returns immediately from the function call to execute the next statement
– https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node57.htm#Node57
• Common to have many sends and receives within a real-life simulation
– e.g. for n-body
– Could initialise on rank 0 then send masses & positions & velocities to all other ranks
– Send new position of each body to all other ranks, for every time step
• Real life…
– Likely to have many messages
– Want to reduce unnecessary delays
• e.g. say rank 0 is receiving updates from all other ranks
– Rank 0: loop over ranks 1 to #PEs-1 to receive its data, in rank order
– Ranks >0: each send data to rank 0 (a sketch of this fixed-order pattern follows)
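A sketch of that fixed-order pattern (variable names such as myRank, mySum, sum, numProcs and stat are illustrative, mirroring the wildcard listing later in these slides):

  // rank 0 collects partial sums from ranks 1..numProcs-1, strictly in rank order
  if (myRank != 0) {
      MPI_Send(&mySum, 1, MPI_FLOAT, 0, 999, MPI_COMM_WORLD);
  } else {
      float inputSum;
      for (int i = 1; i < numProcs; i++) {
          // blocks waiting for rank i specifically, even if another rank's message is already waiting
          MPI_Recv(&inputSum, 1, MPI_FLOAT, i, 999, MPI_COMM_WORLD, &stat);
          sum += inputSum;
      }
  }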
Do we need to do this in a fixed order?
• No, as long as the order received does not matter:
– Rank 0: loop over ranks 1 to #PEs-1 & receive any available data
– Ranks >0: each send data to rank 0
• This would be achieved by the MPI_Recv(…) function calls accepting from any source
– Replace the given process rank by MPI_ANY_SOURCE
MPI_Recv(&inputSum, 1, MPI_FLOAT, MPI_ANY_SOURCE, 999, MPI_COMM_WORLD, &stat);
sum += inputSum;
Wildcard: MPI_ANY_SOURCE
if (myRank != 0) {
  MPI_Send(&mySum, 1, MPI_FLOAT, 0, 999, MPI_COMM_WORLD);
}
else { // myRank is 0
  float inputSum;
  // numProcs (total number of ranks) and sum are assumed to be set earlier
  for (int i=1; i<numProcs; i++) {
    // accept each partial sum in whatever order it arrives
    MPI_Recv(&inputSum, 1, MPI_FLOAT, MPI_ANY_SOURCE, 999, MPI_COMM_WORLD, &stat);
    sum += inputSum;
  }
}
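Note: with MPI_ANY_SOURCE the receives simply complete in arrival order; if rank 0 does need to know which rank a particular message came from, the MPI_Status field stat.MPI_SOURCE reports the actual source rank.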