XJCO3221 Parallel Computation
University of Leeds
Lecture 9: Point-to-point communication
Previous lectures
Last lecture we started looking at distributed memory systems:
Each processing unit can only see a fraction of the total memory.
e.g. clusters of machines for HPC (High Performance Computing), where each node has its own memory.
The standard API for low-level programming is MPI (Message Passing Interface).
Processing units are processes rather than threads.
We saw a 'Hello World' program for MPI.
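For reference, a minimal MPI 'Hello World' along these lines (a sketch; not necessarily identical to last lecture's program):

#include <stdio.h>
#include <mpi.h>

int main( int argc, char **argv )
{
    int rank, numProcs;

    MPI_Init( &argc, &argv );                     // Start up MPI.
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );       // This process's rank.
    MPI_Comm_size( MPI_COMM_WORLD, &numProcs );   // Total number of processes.

    printf( "Hello from process %d of %d.\n", rank, numProcs );

    MPI_Finalize();                               // Shut down MPI.
    return 0;
}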
mpiexec or mpirun
Launches multiple executables simultaneously, possibly on different machines/nodes, which are identical in every way except their rank:
mpiexec -n 4 ./helloWorld

[Diagram: mpiexec launches 4 identical processes, each of which calls MPI_Comm_rank() to determine its own rank.]

All processes exist for the duration of the program run. Creation or destruction of processes is expensive (compared to threads in e.g. shared memory systems).
Today's lecture
Today we will start looking at using MPI to solve real problems.
The simplest form of communication: point-to-point.
Implemented in the MPI standard with MPI_Send() and MPI_Recv() (plus variations).
Vector addition, the same problem we looked at for shared memory systems in Lecture 3.
How exceeding the buffer size for some communication patterns can lead to deadlock.
Vector addition
Recall vector addition can be written mathematically as
c = a + b,  or  c_i = a_i + b_i for i = 1…N,

and in serial code as

for( i=0; i<N; i++ ) c[i] = a[i] + b[i];

Vector addition in MPI

In MPI, rank 0 distributes a and b in chunks of size localSize = N/numProcs. Each rank p>0 receives its chunk of a into a local array local_a:

MPI_Recv( local_a ,         // Pointer to the data
          localSize ,       // The size being sent
          MPI_FLOAT ,       // The data type
          0,                // Source process rank
          0,                // Tag (can set to 0)
          MPI_COMM_WORLD ,  // Communicator
          &status );        // MPI_Status object
And similarly for local_b.
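The matching sends on rank 0 are not reproduced in this extract. A minimal sketch of the distribution step, assuming the names used above (a, b, local_a, local_b, localSize, numProcs, status) and that N is an exact multiple of numProcs:

if( rank == 0 )
{
    // Send each rank p>0 its chunk of a and b; rank 0 keeps the first chunk.
    int p;
    for( p=1; p<numProcs; p++ )
    {
        MPI_Send( &a[p*localSize], localSize, MPI_FLOAT, p, 0, MPI_COMM_WORLD );
        MPI_Send( &b[p*localSize], localSize, MPI_FLOAT, p, 0, MPI_COMM_WORLD );
    }
}
else
{
    // The MPI_Recv() calls shown above, executed on every rank p>0.
    MPI_Recv( local_a, localSize, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status );
    MPI_Recv( local_b, localSize, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status );
}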
[Diagram: rank 0 holds the full array a; the chunk of length localSize starting at a[p*localSize] is passed to MPI_Send() and received into local_a[0] onwards on rank p via MPI_Recv(), for each rank p = 1, 2, …]
Completing the calculation
After rank 0 has distributed the full arrays a and b to the local arrays local_a and local_b on all other ranks:

They all perform vector addition using their local arrays.
The local_c arrays on each rank > 0 are sent to rank 0 using the same procedure as before (sketched below).

Note that, in the code, local arrays are given separate names to the full arrays, e.g. local_a rather than a. This is recommended (but not essential) to help keep track.
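A minimal sketch of these two steps, again assuming the variable names above (local_c holds each rank's partial result; on rank 0, the first localSize elements of c can be computed directly or copied from local_c):

// Every rank adds its local chunks.
for( i=0; i<localSize; i++ )
    local_c[i] = local_a[i] + local_b[i];

if( rank == 0 )
{
    // Collect the partial results from all other ranks into the full array c.
    for( p=1; p<numProcs; p++ )
        MPI_Recv( &c[p*localSize], localSize, MPI_FLOAT, p, 0,
                  MPI_COMM_WORLD, MPI_STATUS_IGNORE );
}
else
{
    // Send this rank's partial result back to rank 0.
    MPI_Send( local_c, localSize, MPI_FLOAT, 0, 0, MPI_COMM_WORLD );
}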
MPI_Send() and MPI_Recv()
The p-loop starts from 1, not zero: sending 'to self' (e.g. from rank 0 to rank 0) is undefined. It works in OpenMPI but not MPICH, so your code would not be portable.
The data type is one of MPI_FLOAT, MPI_INT, MPI_DOUBLE, MPI_CHAR, …
&a[p*localSize] is a pointer to a sub-array that starts at element p*localSize of a.
Most MPI calls return MPI_SUCCESS if successful; otherwise an error occurred.
Can probe the status object to determine errors, the rank of the sending process, etc.
Can also replace &status with MPI_STATUS_IGNORE.
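For example, a sketch of checking the return code and probing the status object (the variable names are illustrative):

MPI_Status status;
int err, receivedCount;

err = MPI_Recv( local_a, localSize, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status );
if( err != MPI_SUCCESS )
    fprintf( stderr, "MPI_Recv failed with error code %d.\n", err );    // Needs <stdio.h>.

// Where the message came from, its tag, and how many items were actually received.
MPI_Get_count( &status, MPI_FLOAT, &receivedCount );
printf( "Got %d floats from rank %d (tag %d).\n",
        receivedCount, status.MPI_SOURCE, status.MPI_TAG );

// If none of this is needed, pass MPI_STATUS_IGNORE in place of &status.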
How is the communication performed?
The MPI standard does not specify how the communication is actually performed.
If the nodes have IP addresses, could use standard internet protocols (e.g. sockets); the Network layer [cf. XJCO2221 Networks].
For HPC machines (where nodes do not have IP addresses), could use Link layer protocols or bespoke methods.
In this module we focus on general aspects of distributed system programming, not details of any MPI implementation.
Portable code that should run on any implementation.
Common communication features
Each data message has a header containing information such as the source and destination ranks [1].
If the message is small enough, it is placed on a buffer ready to send.
If it is too large, it will be sent directly to the destination process.
[Diagram: MPI_Send() copies a small message onto a buffer before it crosses the network interconnect; a large message is sent directly to the destination process.]
[1] The maximum header size is MPI_BSEND_OVERHEAD, defined in mpi.h.
Blocking communication
MPI_Send() and MPI_Recv() are examples of blocking routines.

Blocking routines do not return until all resources can be reused.

This means, e.g., that values in the data array can be altered without affecting the values sent.
Convenient from a programming perspective.
By contrast, non-blocking routines return 'immediately', even though the data may still be being copied over.
We will cover non-blocking communication in Lecture 12.
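For example (a fragment sketch; the destination rank 1 and the message size are arbitrary):

float data[100];
// ... fill data ...
MPI_Send( data, 100, MPI_FLOAT, 1, 0, MPI_COMM_WORLD );

// Safe: MPI_Send() has returned, so modifying data cannot affect the values
// already sent (whether they were buffered or delivered directly).
data[0] = -1.0f;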
Cyclic communication
Code on Minerva: cyclicSendAndReceive.c
Consider a problem where the communication pattern is cyclic: rank 0 → rank 1 → … → rank p-1 → back to rank 0.
Encode this concisely using the ternary operator '(a?b:c)' to handle the wrap-around:
// Send data 'to the right'.
MPI_Send( sendData , N, MPI_INT ,
          ( rank==numProcs-1 ? 0 : rank+1 ), ... );

// Receive data 'from the left'.
MPI_Recv( recvData , N, MPI_INT ,
          ( rank==0 ? numProcs-1 : rank-1 ), ... );
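The remaining arguments are elided ('…') above. A hedged completion, assuming a tag of 0, the MPI_COMM_WORLD communicator and no interest in the status object (not necessarily what cyclicSendAndReceive.c actually uses):

// Send data 'to the right'.
MPI_Send( sendData, N, MPI_INT,
          ( rank==numProcs-1 ? 0 : rank+1 ),     // Destination rank.
          0, MPI_COMM_WORLD );

// Receive data 'from the left'.
MPI_Recv( recvData, N, MPI_INT,
          ( rank==0 ? numProcs-1 : rank-1 ),     // Source rank.
          0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );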
Use of buffering
If the data is small enough to fit on the buffer:
1. Each process calls MPI_Send() to send data 'to its right'.
2. The data is copied to the buffer and MPI_Send() returns.
3. Each process calls MPI_Recv() and receives data from the process 'to its left'.

If the data is too large for the buffer, the application hangs:
1. MPI_Send() does not return until the destination process receives the data.
2. All processes are in the same situation: none of them reach their call to MPI_Recv().
3. As no data is received, no process returns from MPI_Send().
Buffers can lead to deadlock
This is another example of deadlock that we first saw in Lecture 7:
Each process is waiting for a synchronisation event that never occurs.
In this case the 'synchronisation event' is the completion of the blocking send, which required the destination process to receive the data.
We will say more about the relationship between blocking and synchronisation in Lecture 12.
Resolving communication deadlocks
The buffer size is not specified by the MPI standard and varies between implementations.
It is even allowed to be zero size!
We want to write code that works for any buffer size.
There are various ways to resolve this deadlock problem:
1. Change the program logic [here].
2. Use non-blocking communication [Lecture 12].
3. Allocate your own memory for a buffer and use the buffered send MPI_Bsend() (see the sketch below).
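A sketch of option 3 for the cyclic pattern above, assuming one outgoing message of N ints per process (the buffer size calculation and names are illustrative):

// Attach a user-allocated buffer large enough for one message plus its header.
int bufferSize = N*sizeof(int) + MPI_BSEND_OVERHEAD;
void *buffer = malloc( bufferSize );              // Needs <stdlib.h>.
MPI_Buffer_attach( buffer, bufferSize );

// Same arguments as MPI_Send(); returns once the message is in our own buffer.
MPI_Bsend( sendData, N, MPI_INT,
           ( rank==numProcs-1 ? 0 : rank+1 ), 0, MPI_COMM_WORLD );

MPI_Recv( recvData, N, MPI_INT,
          ( rank==0 ? numProcs-1 : rank-1 ), 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );

// Detach blocks until all buffered messages have been delivered; then free.
MPI_Buffer_detach( &buffer, &bufferSize );
free( buffer );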
Staggering the sends and receives
For this example, it is easiest to change the program logic to use staggered sends and receives [1]:
if( rank%2 ) {
    MPI_Send( sendData, N, MPI_INT, ... );
    MPI_Recv( recvData, N, MPI_INT, ... );
} else {
    MPI_Recv( recvData, N, MPI_INT, ... );
    MPI_Send( sendData, N, MPI_INT, ... );
}
[1] Recall i%2 == 0 if i is even, and 1 if i is odd.
Processes with even-numbered ranks receive first then send, breaking the deadlock:
[Diagram: ranks 0–3 exchanging messages; odd ranks send first while even ranks receive first, so every send is matched by a receive.]
Note that the arguments to each MPI_Send() and MPI_Recv(), including the source and destination ranks, have not been altered.
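Putting the pieces together, a self-contained sketch of the staggered cyclic exchange that should work for any buffer size (illustrative only, not the Minerva file cyclicSendAndReceive.c; assumes at least two processes):

#include <stdio.h>
#include <mpi.h>

#define N 1000                                     // Ints sent to each neighbour.

int main( int argc, char **argv )
{
    int rank, numProcs, i;
    int sendData[N], recvData[N];

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &numProcs );

    for( i=0; i<N; i++ ) sendData[i] = rank;       // Something recognisable to send.

    int right = ( rank==numProcs-1 ? 0 : rank+1 ); // Destination 'to the right'.
    int left  = ( rank==0 ? numProcs-1 : rank-1 ); // Source 'to the left'.

    if( rank%2 )    // Odd ranks: send first, then receive.
    {
        MPI_Send( sendData, N, MPI_INT, right, 0, MPI_COMM_WORLD );
        MPI_Recv( recvData, N, MPI_INT, left,  0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
    }
    else            // Even ranks: receive first, then send.
    {
        MPI_Recv( recvData, N, MPI_INT, left,  0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
        MPI_Send( sendData, N, MPI_INT, right, 0, MPI_COMM_WORLD );
    }

    printf( "Rank %d received data originating from rank %d.\n", rank, recvData[0] );

    MPI_Finalize();
    return 0;
}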
Summary and next lecture
Today we have looked at point-to-point communication in a distributed memory system:
How to implement a data parallel problem (or a map) using MPI_Send() and MPI_Recv().
These routines are blocking, a similar concept to synchronous communication.
Exceeding the buffer can lead to deadlock.
Next time we will look at some performance considerations, and how performance can be improved by using collective communication.