Collective Communications
Reusing this material
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_US
This means you are free to copy and redistribute the material and adapt and build on the material under the following terms: you must give appropriate credit, provide a link to the license and indicate if changes were made. If you adapt or build on the material you must distribute your work under the same license as the original.
Acknowledge EPCC as follows: EPCC, The University of Edinburgh, www.epcc.ed.ac.uk
Note that this presentation contains images owned by others. Please seek their permission before reusing these images.
Collective Communication
• Communications involving a group of processes.
• Called by all processes in a communicator.
• Examples:
– Barrier synchronisation.
– Broadcast, scatter, gather.
– Global sum, global maximum, etc.
Characteristics of Collective Comms
• Synchronisation may or may not occur.
• Collective action over a communicator.
• All processes must communicate.
• Standard collective operations are blocking.
– non-blocking versions recently introduced in MPI 3.0
– may be useful in some situations but not yet commonly employed
– an obvious extension of the blocking version: one extra request parameter
• No tags.
• Receive buffers must be exactly the right size.
Barrier Synchronisation
• C:
int MPI_Barrier (MPI_Comm comm)
• Fortran:
MPI_BARRIER (COMM, IERROR)
INTEGER COMM, IERROR
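• A minimal C sketch (assuming MPI has been initialised; the timed work is a placeholder) using barriers to obtain a consistent timing of a section of code:

double t0, t1;
MPI_Barrier(MPI_COMM_WORLD);   /* all processes start the clock together */
t0 = MPI_Wtime();
/* ... work to be timed ... */
MPI_Barrier(MPI_COMM_WORLD);   /* all processes have finished the work */
t1 = MPI_Wtime();
/* t1 - t0 is now comparable across processes */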
Broadcast
• C:
int MPI_Bcast (void *buffer, int count,
MPI_Datatype datatype, int root,
MPI_Comm comm)
• Fortran:
MPI_BCAST (BUFFER, COUNT, DATATYPE, ROOT,
COMM, IERROR)
INTEGER COUNT, DATATYPE, ROOT, COMM, IERROR
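• A minimal C sketch (assuming MPI has been initialised; the buffer name and contents are illustrative): rank 0 fills a buffer and broadcasts it to every process in the communicator:

int params[4];
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
    /* only the root fills the buffer, e.g. from an input file */
    params[0] = 10; params[1] = 20; params[2] = 30; params[3] = 40;
}
MPI_Bcast(params, 4, MPI_INT, 0, MPI_COMM_WORLD);
/* all processes now hold the same four values in params */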
Scatter
[Figure: before the scatter, the root holds the array A B C D E; afterwards each of the five processes holds one element, A to E respectively.]
Scatter
• C:
int MPI_Scatter(void *sendbuf, int sendcount,
MPI_Datatype sendtype, void *recvbuf,
int recvcount, MPI_Datatype recvtype,
int root, MPI_Comm comm)
• Fortran:
MPI_SCATTER(SENDBUF, SENDCOUNT, SENDTYPE,
RECVBUF, RECVCOUNT, RECVTYPE, ROOT, COMM, IERROR)
INTEGER SENDCOUNT, SENDTYPE, RECVCOUNT
INTEGER RECVTYPE, ROOT, COMM, IERROR
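• A minimal C sketch (assuming MPI has been initialised and stdlib.h is included; buffer names and sizes are illustrative): the root splits an array into equal chunks, one per process:

int size, rank;
double *sendbuf = NULL;
double chunk[100];                /* each process receives 100 values */
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
    sendbuf = malloc(size * 100 * sizeof(double));
    /* ... root fills sendbuf with size*100 values ... */
}
/* sendcount is the number of elements sent to EACH process,
   not the total size of sendbuf */
MPI_Scatter(sendbuf, 100, MPI_DOUBLE,
            chunk, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);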
Gather
[Figure: before the gather, each of the five processes holds one element, A to E; afterwards the root holds the full array A B C D E.]
Gather
• C:
int MPI_Gather(void *sendbuf, int sendcount,
MPI_Datatype sendtype, void *recvbuf,
int recvcount, MPI_Datatype recvtype,
int root, MPI_Comm comm)
• Fortran:
MPI_GATHER(SENDBUF, SENDCOUNT, SENDTYPE,
RECVBUF, RECVCOUNT, RECVTYPE,
ROOT, COMM, IERROR)
INTEGER SENDCOUNT, SENDTYPE, RECVCOUNT
INTEGER RECVTYPE, ROOT, COMM, IERROR
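• The inverse sketch (same assumptions, with rank and size obtained as in the scatter example above): each process contributes a local array and the root collects them in rank order:

double mypart[100];               /* local contribution, filled earlier */
double *recvbuf = NULL;
if (rank == 0)
    recvbuf = malloc(size * 100 * sizeof(double));
/* recvcount is the number of elements received FROM each process */
MPI_Gather(mypart, 100, MPI_DOUBLE,
           recvbuf, 100, MPI_DOUBLE, 0, MPI_COMM_WORLD);
/* on the root, recvbuf holds rank 0's chunk, then rank 1's, ... */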
Global Reduction Operations
• Used to compute a result involving data distributed over a group of processes.
• Examples:
– global sum or product
– global maximum or minimum
– global user-defined operation
Predefined Reduction Operations
MPI Name
MPI_MAX
Function Maximum
MPI_MIN
Minimum
MPI_SUM
Sum
MPI_PROD
Product
MPI_LAND
Logical AND
MPI_BAND
Bitwise AND
MPI_LOR
Logical OR
MPI_BOR
Bitwise OR
MPI_LXOR
Logical Exclusive OR
MPI_BXOR
Bitwise Exclusive OR
MPI_MAXLOC
Maximum and location
MPI_MINLOC
Minimum and location
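• MPI_MAXLOC and MPI_MINLOC reduce value/index pairs. A minimal C sketch (assuming MPI has been initialised; variable names are illustrative, and local_value is a double defined earlier) using the paired C datatype MPI_DOUBLE_INT to find the global maximum and the rank that holds it:

struct { double value; int rank; } in, out;
in.value = local_value;                  /* this process's candidate */
MPI_Comm_rank(MPI_COMM_WORLD, &in.rank); /* index carried with the value */
MPI_Reduce(&in, &out, 1, MPI_DOUBLE_INT, MPI_MAXLOC,
           0, MPI_COMM_WORLD);
/* on rank 0: out.value is the global maximum,
   out.rank the rank that supplied it */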
MPI_Reduce
• C:
int MPI_Reduce(void *sendbuf, void *recvbuf,
int count, MPI_Datatype datatype,
MPI_Op op, int root, MPI_Comm comm)
• Fortran:
MPI_REDUCE(SENDBUF, RECVBUF, COUNT,
DATATYPE, OP, ROOT, COMM, IERROR)
INTEGER COUNT, DATATYPE, OP, ROOT, COMM, IERROR
MPI_REDUCE
[Figure: MPI_REDUCE over ranks 0-3, with rank 1 as root. Rank 0 holds A B C D, rank 1 holds E F G H, rank 2 holds I J K L, rank 3 holds M N O P; after the call the root holds A o E o I o M in the first slot (and the corresponding reductions in the remaining slots).]
Example of Global Reduction
Integer global sum
• C:
MPI_Reduce(&x, &result, 1, MPI_INT,
MPI_SUM, 0, MPI_COMM_WORLD);
• Sum of all the x values is placed in result.
• Fortran:
CALL MPI_REDUCE(x, result, 1, MPI_INTEGER,
MPI_SUM, 0,
MPI_COMM_WORLD, IERROR)
• The result is only placed in result on processor 0.
User-Defined Reduction Operators
• Reducing using an arbitrary operator, o
• C – function of type MPI_User_function:
void my_op (void *invec, void *inoutvec, int *len,
MPI_Datatype *datatype)
• Fortran – external subprogram of type
SUBROUTINE MY_OP(INVEC(*), INOUTVEC(*), LEN,
DATATYPE)
INTEGER LEN, DATATYPE
Reduction Operator Functions
• Operator function for o must act as
for (i = 1 to len)
    inoutvec(i) = inoutvec(i) o invec(i)
• Operator o need not commute, but must be associative
Registering User-Defined Operator
• Operator handles have type MPI_Op (C) or INTEGER (Fortran)
• C:
int MPI_Op_create(MPI_User_function *my_op,
int commute, MPI_Op *op)
• Fortran:
MPI_OP_CREATE (MY_OP, COMMUTE, OP, IERROR)
EXTERNAL MY_OP
LOGICAL COMMUTE
INTEGER OP, IERROR
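• Putting the pieces together: a minimal C sketch (the function name abs_max, the variables x and result, and the choice of operation are illustrative) that registers and uses a commutative, associative operator taking the elementwise maximum of absolute values:

#include <math.h>   /* for fabs */

void abs_max(void *invec, void *inoutvec, int *len,
             MPI_Datatype *datatype)
{
    double *in = (double *) invec;
    double *inout = (double *) inoutvec;
    for (int i = 0; i < *len; i++) {
        /* inoutvec(i) = inoutvec(i) o invec(i) */
        double a = fabs(in[i]), b = fabs(inout[i]);
        inout[i] = (a > b) ? a : b;
    }
}

/* elsewhere, after MPI_Init: */
double x = 0.0;      /* local value (illustrative) */
double result;       /* receives the answer on the root */
MPI_Op op;
MPI_Op_create(abs_max, 1, &op);   /* 1: the operator commutes */
MPI_Reduce(&x, &result, 1, MPI_DOUBLE, op, 0, MPI_COMM_WORLD);
MPI_Op_free(&op);                 /* release the handle when done */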
Variants of MPI_REDUCE
• MPI_Allreduce       no root process
• MPI_Reduce_scatter  result is scattered
• MPI_Scan            parallel prefix
MPI_ALLREDUCE
[Figure: MPI_ALLREDUCE over ranks 0-3 holding the same data as before; after the call every rank holds the result A o E o I o M (and the corresponding reductions of the remaining slots); there is no root.]
MPI_ALLREDUCE
Integer global sum
• C:
int MPI_Allreduce(void* sendbuf,
void* recvbuf, int count,
MPI_Datatype datatype,
MPI_Op op, MPI_Comm comm)
• Fortran:
MPI_ALLREDUCE(SENDBUF, RECVBUF, COUNT,
DATATYPE, OP, COMM, IERROR)
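• A minimal C sketch (assuming MPI has been initialised): every process needs the global sum, e.g. to normalise its local data, so MPI_Allreduce avoids a reduce followed by a broadcast:

double localsum = 0.0, globalsum;
/* ... accumulate localsum over this process's data ... */
MPI_Allreduce(&localsum, &globalsum, 1, MPI_DOUBLE,
              MPI_SUM, MPI_COMM_WORLD);
/* every rank now holds the same globalsum */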
MPI_SCAN
[Figure: MPI_SCAN over ranks 0-3 holding the same data as before; after the call rank 0 holds A, rank 1 holds A o E, rank 2 holds A o E o I, and rank 3 holds A o E o I o M: each rank receives the reduction over ranks 0 up to and including itself.]
MPI_SCAN
Integer partial sum
• C:
int MPI_Scan(void* sendbuf, void* recvbuf,
int count, MPI_Datatype datatype,
MPI_Op op, MPI_Comm comm)
• Fortran:
MPI_SCAN(SENDBUF, RECVBUF, COUNT,
DATATYPE, OP, COMM, IERROR)
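• A minimal C sketch (assuming MPI has been initialised): an inclusive prefix sum over the ranks, so rank r receives 0 + 1 + ... + r:

int rank, prefix;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Scan(&rank, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
/* rank 0 gets 0, rank 1 gets 0+1, rank 2 gets 0+1+2, ... */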
Exercise
• See Exercise 5 on the sheet
• Rewrite the pass-around-the-ring program to use MPI
global reduction to perform its global sums.
• Then rewrite it so that each process computes a partial sum.