
Microsoft PowerPoint – COMP528 HAL13 MPI collective comms – detail.pptx

Dr Michael K Bane, G14, Computer Science, University of Liverpool
m.k. .uk https://cgi.csc.liv.ac.uk/~mkbane/COMP528

COMP528: Multi-core and
Multi-Processor Programming

13 – HAL

NAME
MPI_Send – Performs a blocking send

SYNOPSIS
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

INPUT PARAMETERS
buf – initial address of send buffer (choice)
count – number of elements in send buffer (non-negative integer)
datatype – datatype of each send buffer element (handle)
dest – rank of destination (integer)
tag – message tag (integer)
comm – communicator (handle)

NAME
MPI_Recv – Blocking receive for a message

SYNOPSIS
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm,
             MPI_Status *status)

OUTPUT PARAMETERS
buf – initial address of receive buffer (choice)
status – status object (Status)

INPUT PARAMETERS
count – maximum number of elements in receive buffer (integer)
datatype – datatype of each receive buffer element (handle)
source – rank of source (integer)
tag – message tag (integer)
comm – communicator (handle)

Same “message data” syntax: what to send|recv, how many, what data type

Requires a matching “message envelope” for the message to “complete”

Nothing changes on the sender, BUT the receiver gets new data in “buf” and “status”
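To make the send/recv pairing concrete, here is a minimal sketch (not from the slides; the value 528 and tag 0 are arbitrary choices) in which rank 0 sends one int to rank 1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int myRank, data = 0;
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    if (myRank == 0) {
        data = 528;
        /* message envelope: dest=1, tag=0, MPI_COMM_WORLD */
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (myRank == 1) {
        /* matching envelope: source=0, tag=0, same communicator */
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", data);
    }
    MPI_Finalize();
    return 0;
}

Run with at least two processes (e.g. mpirun -np 2); only the receiver’s “data” and “status” change.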

Re-visit syntax
NAME

MPI_Send – Performs a blocking send

SYNOPSIS
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

INPUT PARAMETERS
buf – initial address of send buffer (choice)
count – number of elements in send buffer (non-negative integer)
datatype – datatype of each send buffer element (handle)
dest – rank of destination (integer)
tag – message tag (integer)
comm – communicator (handle)

• MPI_Datatype
• Pre-defined (via mpi.h)

C type   MPI_Datatype
int      MPI_INT
float    MPI_FLOAT
double   MPI_DOUBLE
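For instance, a fragment (assuming dest and tag are already set) pairing the C type double with MPI_DOUBLE:

double y[4] = {1.0, 2.0, 3.0, 4.0};
/* C type double pairs with MPI_DOUBLE; count is in elements, not bytes */
MPI_Send(y, 4, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);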

Re-visit syntax
NAME

MPI_Send – Performs a blocking send

SYNOPSIS
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

INPUT PARAMETERS
buf – initial address of send buffer (choice)
count – number of elements in send buffer (non-negative integer)
datatype – datatype of each send buffer element (handle)

• “Pass by reference” – the buffer will be copied (fits the general MPI model)
• Pass the ADDRESS…
• Scalars: &x, &xyzzy – pass the variable’s address (x alone would be the value of the scalar)
• Vectors: by definition, if variable y is an array then “y” is already a pointer

• &y is thus the address of a pointer: wrong
• &y[0] – pass the address of the variable’s first element to be copied: okay
• y alone is also okay
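A sketch of the above (hypothetical dest and tag variables assumed in scope):

float x;        /* scalar */
float y[10];    /* array */

MPI_Send(&x, 1, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);      /* scalar: pass &x */
MPI_Send(y, 10, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);      /* array: y is already a pointer */
MPI_Send(&y[0], 10, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);  /* equivalent to passing y */
MPI_Send(&y[2], 3, MPI_FLOAT, dest, tag, MPI_COMM_WORLD);   /* can also start mid-array */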

Re-visit syntax
NAME

MPI_Send – Performs a blocking send

SYNOPSIS
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

INPUT PARAMETERS
buf – initial address of send buffer (choice)
count – number of elements in send buffer (non-negative integer)
datatype – datatype of each send buffer element (handle)
dest – rank of destination (integer)
tag – message tag (integer)
comm – communicator (handle)

• “comm” is the variable name of the required “communicator”
• “comm” is of type MPI_Comm (i.e. defined via mpi.h)
• We use the MPI_COMM_WORLD communicator, which is defined to be all
  processes launched (via mpirun)
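A common fragment querying that communicator:

int myRank, numProcs;
MPI_Comm_rank(MPI_COMM_WORLD, &myRank);   /* this process's rank, 0..numProcs-1 */
MPI_Comm_size(MPI_COMM_WORLD, &numProcs); /* number of processes started by mpirun */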

Syntax of Collectives similar to Syntax of Point to Point

NAME
MPI_Bcast – Broadcasts a message from the process with rank “root” to all other processes of the communicator

SYNOPSIS
int MPI_Bcast( void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm )

INPUT/OUTPUT PARAMETER
buffer – starting address of buffer (choice)

INPUT PARAMETERS
count – number of entries in buffer (integer)
datatype – data type of buffer (handle)
root – rank of broadcast root (integer)
comm – communicator (handle)

• An MPI implementation may implement MPI_Bcast as a series of MPI_Send & MPI_Recv calls;
  if so, it has to ensure there is no possible mix-up with the user’s point-to-point comms. There is no need for “dest”
  since ALL processes are now involved (and “src” can be considered to be “root”)

Same “message data” syntax: what to send|recv, how many, what data type

“message envelope”
> no tags (but… “root”)

Upon completion, “buffer” on each process has a copy of the “buffer” values on “root”
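A typical usage sketch (assuming myRank was obtained via MPI_Comm_rank): rank 0 sets a value, and afterwards every rank holds it:

int n;                                         /* value known only on root initially */
if (myRank == 0) n = 1000;                     /* e.g. read from input on rank 0 */
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* every process makes this call */
/* after the call, n == 1000 on every process in MPI_COMM_WORLD */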

Scatter: syntax like send+recv

NAME
MPI_Scatter – Sends data from one process to all other processes in a communicator

SYNOPSIS
int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype,
                void *recvbuf, int recvcount, MPI_Datatype recvtype,
                int root, MPI_Comm comm)

INPUT PARAMETERS
sendbuf – address of send buffer (choice, significant only at root)
sendcount – number of elements sent to each process (integer, significant only at root)
sendtype – data type of send buffer elements (significant only at root) (handle)
recvcount – number of elements in receive buffer (integer)
recvtype – data type of receive buffer elements (handle)
root – rank of sending process (integer)
comm – communicator (handle)

OUTPUT PARAMETER
recvbuf – address of receive buffer (choice)

Same “message data” syntax: what to send, how many, what data type

“message envelope”
> no tags (but… “root”)

Upon completion, “recvbuf” on each process has its share of the “sendbuf” values on “root”

Same “message data” syntax: what to recv, how many, what data type
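A sketch assuming exactly 4 processes, so root’s 16 elements are dealt out 4 per rank:

float sendbuf[16];               /* significant only at root (4 procs x 4 elems) */
float recvbuf[4];
if (myRank == 0)
    for (int i = 0; i < 16; i++) sendbuf[i] = (float)i;
/* rank r receives elements r*4 .. r*4+3 of root's sendbuf */
MPI_Scatter(sendbuf, 4, MPI_FLOAT,
            recvbuf, 4, MPI_FLOAT,
            0, MPI_COMM_WORLD);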

Gather (inverse of the scatter function):
syntax like send+recv

NAME
MPI_Gather – Gathers together values from a group of processes

SYNOPSIS
int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
               void *recvbuf, int recvcount, MPI_Datatype recvtype,
               int root, MPI_Comm comm)

INPUT PARAMETERS
sendbuf – starting address of send buffer (choice)
sendcount – number of elements in send buffer (integer)
sendtype – data type of send buffer elements (handle)
recvcount – number of elements for any single receive (integer, significant only at root)
recvtype – data type of receive buffer elements (significant only at root) (handle)
root – rank of receiving process (integer)
comm – communicator (handle)

OUTPUT PARAMETER
recvbuf – address of receive buffer (choice, significant only at root)

Same “message data” syntax: what to send, how many, what data type

“message envelope”
> no tags (but… “root”)

Upon completion, “recvbuf” on “root” has a rank-ordered collection of copies of “sendbuf” from all processes

Same “message data” syntax: what to recv, how many, what data type
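A matching sketch (again assuming 4 processes), the inverse of the scatter above:

float myPart[4];                 /* filled locally on every rank */
float all[16];                   /* significant only at root (4 procs x 4 elems) */
for (int i = 0; i < 4; i++) myPart[i] = (float)(myRank * 4 + i);
/* root's "all" is rank-ordered: rank 0's 4 elements first, then rank 1's, ... */
MPI_Gather(myPart, 4, MPI_FLOAT,
           all, 4, MPI_FLOAT,
           0, MPI_COMM_WORLD);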

for (int i=myStartIter; i<=myFinishIter; i++) {
   x = a + i*stepsize;
   mySum += 0.5*stepsize*(func(x) + func(x+stepsize));
}

[Diagram: 6 trapezoidals split across ranks — Rank 0: 0 1 | Rank 1: 2 3 | Rank 2: 4 5]

So each MPI process sums its 2 trapezoidals into “mySum”.
Need to form globalSum on rank 0.

Every process calls the MPI_Reduce collective:

for (int i=myStartIter; i<=myFinishIter; i++) {
   x = a + i*stepsize;
   mySum += 0.5*stepsize*(func(x) + func(x+stepsize));
}
globalSum = 0.0;
MPI_Reduce(&mySum, &globalSum, 1, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
if (myRank==0) {
   printf("TOTAL SUM: %f\n", globalSum);
}

Reminder why we use a collective:
• Simpler to write
• Clean code is good code
• Expectation that an MPI collective is more efficient than user-written functionality
• A good MPI implementation will make use of system knowledge

Compare the hand-written point-to-point equivalent:

for (int i=myStartIter; i<=myFinishIter; i++) {
   x = a + i*stepsize;
   mySum += 0.5*stepsize*(func(x) + func(x+stepsize));
}
/* each pass back sum to root for global sum */
if (myRank != 0) {
   MPI_Send(&mySum, 1, MPI_FLOAT, 0, 999, MPI_COMM_WORLD);
} else { // myRank is 0
   float inputSum;
   MPI_Status status;
   for (int i=1; i<numProcs; i++) {
      MPI_Recv(&inputSum, 1, MPI_FLOAT, i, 999, MPI_COMM_WORLD, &status);
      mySum += inputSum;
   }
}
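Putting the collective version together, a complete compilable sketch of the trapezoidal example (the integrand, limits and block distribution are illustrative assumptions, and numTraps is assumed divisible by numProcs):

#include <stdio.h>
#include <mpi.h>

float func(float x) { return x * x; }    /* example integrand (assumption) */

int main(int argc, char **argv) {
    int myRank, numProcs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);

    const float a = 0.0f, b = 1.0f;      /* integration limits (assumption) */
    const int numTraps = 6;              /* 6 trapezoidals, as in the diagram */
    float stepsize = (b - a) / numTraps;

    /* block distribution of iterations, as on the slide */
    int perProc = numTraps / numProcs;
    int myStartIter = myRank * perProc;
    int myFinishIter = myStartIter + perProc - 1;

    float mySum = 0.0f, globalSum = 0.0f;
    for (int i = myStartIter; i <= myFinishIter; i++) {
        float x = a + i * stepsize;
        mySum += 0.5f * stepsize * (func(x) + func(x + stepsize));
    }

    /* every process calls the collective; result lands on rank 0 */
    MPI_Reduce(&mySum, &globalSum, 1, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (myRank == 0) printf("TOTAL SUM: %f\n", globalSum);

    MPI_Finalize();
    return 0;
}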

If count > 1 then the Reduction is “element-wise”

Element-wise Reduction

mySum on each rank before the call (five elements each):
Rank 0: A1 A2 A3 A4 A5
Rank 1: B1 B2 B3 B4 B5
Rank 2: C1 C2 C3 C4 C5
Rank 3: D1 D2 D3 D4 D5

MPI_Reduce(mySum, globalSum, 3, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD)

Before the call, globalSum holds undefined values (“?? ?? ??”) everywhere.
Upon completion, on rank 0 (the root):
globalSum: A1+B1+C1+D1   A2+B2+C2+D2   A3+B3+C3+D3
On all other ranks globalSum is untouched, and mySum is unchanged on every rank.
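A runnable fragment of the element-wise case (note the array buffers are passed without &):

float mySum[5], globalSum[3];
for (int i = 0; i < 5; i++) mySum[i] = (float)(myRank * 10 + i);  /* per-rank values */
/* count=3: only the first three elements take part, combined element by element */
MPI_Reduce(mySum, globalSum, 3, MPI_FLOAT, MPI_SUM, 0, MPI_COMM_WORLD);
/* on root: globalSum[k] = sum over all ranks of mySum[k], k = 0,1,2;
   on every other rank globalSum is left untouched (the "??" above) */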

MPI_Reduce
NAME

MPI_Reduce – Reduces values on all processes to a single value

SYNOPSIS
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype,
               MPI_Op op, int root, MPI_Comm comm)

INPUT PARAMETERS
sendbuf – address of send buffer (choice)
count – number of elements in send buffer (integer)
datatype – data type of elements of send buffer (handle)
op – reduce operation (handle)
root – rank of root process (integer)
comm – communicator (handle)

OUTPUT PARAMETER
recvbuf – address of receive buffer (choice, significant only at root)

• Reduce operation
• Commutative: x op y = y op x
• Presumed associative: x op y op z = (x op y) op z = x op (y op z), within rounding errors
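The rounding-error caveat is real for floating point, where grouping the same operands differently changes the answer; a small illustration:

#include <stdio.h>

int main(void) {
    float x = 1.0e8f, y = -1.0e8f, z = 1.0f;
    /* same operands, different grouping, different float results */
    printf("(x + y) + z = %f\n", (x + y) + z);   /* 1.000000 */
    printf("x + (y + z) = %f\n", x + (y + z));   /* 0.000000: z is absorbed by the large magnitude */
    return 0;
}

So the order in which an MPI implementation combines partial results can change the reduced value slightly from run to run.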

MPI Reduction Operations

• Requirements
• commutative: x op y = y op x
• assumed associative: x op y op z = (x op y) op z = x op (y op z), within rounding errors
• Any more potential reduction operators…?

• A number are pre-defined
• Can also define your own (see the MPI_Op_create sketch after the tables below)
• Needs to meet the commutative & associative rules

YES: +   *   Max   Min   MinLOC   MaxLOC
NO:  /   –
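MinLOC/MaxLOC reduce (value, location) pairs using pre-defined pair datatypes such as MPI_FLOAT_INT; a sketch (mySum and myRank assumed already set):

struct { float val; int rank; } in, out;
in.val  = mySum;    /* local value */
in.rank = myRank;   /* where it came from */
/* MPI_MAXLOC finds the maximum value AND the rank that holds it */
MPI_Reduce(&in, &out, 1, MPI_FLOAT_INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);
if (myRank == 0)
    printf("max value %f found on rank %d\n", out.val, out.rank);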

Operation   MPI constant
+           MPI_SUM
*           MPI_PROD
Min         MPI_MIN
Max         MPI_MAX
MinLOC      MPI_MINLOC
MaxLOC      MPI_MAXLOC
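A sketch of defining your own operation with MPI_Op_create (the abs-max operation and the names here are illustrative, not from the slides; myVal and result are assumed floats):

#include <math.h>
#include <mpi.h>

/* user function: must match MPI_User_function and combine "len" elements of
   invec into inoutvec; here: element-wise maximum of absolute values, which
   is both commutative and associative as required */
void abs_max(void *invec, void *inoutvec, int *len, MPI_Datatype *dtype) {
    float *in = (float *)invec, *inout = (float *)inoutvec;
    for (int i = 0; i < *len; i++) {
        float a = fabsf(in[i]), b = fabsf(inout[i]);
        inout[i] = (a > b) ? a : b;
    }
}

/* ... later, between MPI_Init and MPI_Finalize: */
MPI_Op absmaxOp;
MPI_Op_create(abs_max, 1 /* commutative */, &absmaxOp);
MPI_Reduce(&myVal, &result, 1, MPI_FLOAT, absmaxOp, 0, MPI_COMM_WORLD);
MPI_Op_free(&absmaxOp);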

Questions via MS Teams / email
Dr Michael K Bane, Computer Science, University of Liverpool
m.k. .uk https://cgi.csc.liv.ac.uk/~mkbane