
Information Technology
FIT3143 – LECTURE WEEK 5
PARALLEL COMPUTING IN DISTRIBUTED MEMORY – MESSAGE PASSING LIBRARY


1. Message Passing Interface (MPI)
2. MPI Routines
Learning outcome(s) related to this topic
Explain the fundamental principles of parallel computing architectures and algorithms (LO1)
Design and develop parallel algorithms for various parallel computing architectures (LO3)
FIT3143 Parallel Computing 2

What is MPI
▪ MPI = Message Passing Interface
▪MPI is a specification for the developers and users of message passing libraries. By itself, it is NOT a library – but rather the specification of what such a library should be.
▪Simply stated, the goal of the Message Passing Interface is to provide a widely used standard for writing message passing programs. The interface attempts to be
– practical – portable – efficient – flexible

Reasons for Using MPI
▪ Standardization – MPI is the only message passing library which can be considered a standard. It is supported on virtually all major platforms and many specialised HPC systems. Practically, it has replaced all previous message passing libraries.
▪ Portability – There is no need to modify your source code when you port your application to a different platform that supports (and is compliant with) the MPI standard.
▪ Performance Opportunities – Vendor implementations should be able to exploit native hardware features to optimize performance.
▪ Functionality – Over 115 routines are defined in MPI-1 alone.
▪ Availability – A variety of implementations are available, both vendor and public domain.

Programming Model
▪ MPI lends itself to virtually any distributed memory parallel programming model. In addition, MPI is commonly used to implement (behind the scenes) some shared memory models, such as Data Parallel, on distributed memory architectures.
▪ Hardware platforms:
– Distributed Memory: Originally, MPI was targeted for distributed memory
– Shared Memory: As shared memory systems became more popular, particularly SMP / NUMA architectures, MPI implementations for these platforms appeared.
– Hybrid: MPI is now used on just about any common parallel architecture including massively parallel machines, SMP clusters, workstation clusters and heterogeneous networks.

Programming Model
▪ All parallelism is explicit: the programmer is responsible for correctly identifying parallelism and implementing parallel algorithms using MPI constructs.
▪ The number of tasks dedicated to running a parallel program is static. New tasks cannot be dynamically spawned during run time (MPI-2 addresses this issue).

Getting Started
▪ MPI is native to ANSI C
• C++ and Java bindings are available
• MPI C++ classes – www.mcs.anl.gov
• mpiJava API – www.hpjava.org
▪ MPI versions
▪ MPI – C
• Header File:
– Required for all programs/routines which make MPI library calls.
#include "mpi.h"
#include <stdio.h>
• Format of MPI Calls:
Format: rc = MPI_Xxxxx(parameter, ... )
Example: rc = MPI_Bsend(&buf, count, type, dest, tag, comm)
Error code: rc is set to MPI_SUCCESS if successful

General MPI Program Structure

MPI’s “Hello World”
1. Create Source Code File: hello.c
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    int numprocs, rank, namelen;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(processor_name, &namelen);

    printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);

    MPI_Finalize();
    return 0;
}
2. Compile: mpicc hello.c -o hello-mp
3. Execute: mpirun -np 2 hello-mp

Communicators and Groups
▪ MPI uses objects called communicators and groups to define which collection of processes may communicate with each other. Most MPI routines require you to specify a communicator as an argument.
▪ Communicators and groups will be covered in more detail later. For now, simply use MPI_COMM_WORLD whenever a communicator is required – it is the predefined communicator that includes all of your MPI processes.

Communicators and Groups
▪ Rank
– Within a communicator, every process has its own unique, integer identifier assigned by the system when the process initializes. A rank is sometimes also called a “task ID”. Ranks are contiguous and begin at zero.
– Used by the programmer to specify the source and destination of messages. Often used conditionally by the application to control program execution (if rank=0 do this / if rank=1 do that etc).

Environment Management Routines
▪ MPI_Init
– Initializes the MPI execution environment. This function must be called in every MPI program, must be called before any other MPI functions and must be called only once in an MPI program. For C programs, MPI_Init may be used to pass the command line arguments to all processes, although this is not required by the standard and is implementation dependent.
MPI_Init (&argc, &argv)
▪ MPI_Comm_size
– Determines the number of processes in the group associated with a communicator. Generally used within the communicator MPI_COMM_WORLD to determine the number of processes being used by your application.
MPI_Comm_size (comm, &size)

Environment Management Routines
▪ MPI_Comm_rank
– Determines the rank of the calling process within the communicator. Initially, each process is assigned a unique integer rank between 0 and (number of processes - 1) within the communicator MPI_COMM_WORLD. This rank is often referred to as a task ID.
MPI_Comm_rank (comm, &rank)
▪ MPI_Abort
– Terminates all MPI processes associated with the communicator. In most MPI implementations it terminates ALL processes regardless of the communicator specified.
MPI_Abort (comm, errorcode)

Environment Management Routines
▪ MPI_Get_processor_name
– Returns the processor name. Also returns the length of the name. The buffer for “name” must be at least MPI_MAX_PROCESSOR_NAME characters in size. What is returned into “name” is implementation dependent – may not be the same as the output of the “hostname” or “host” shell commands.
MPI_Get_processor_name (&name, &resultlength)
▪ MPI_Initialized
– Indicates whether MPI_Init has been called – returns flag as either logical true (1) or false(0). MPI requires that MPI_Init be called once and only once by each process. This may pose a problem for modules that want to use MPI and are prepared to call MPI_Init if necessary. MPI_Initialized solves this problem.
MPI_Initialized (&flag)

Environment Management Routines
▪ MPI_Wtime
– Returns an elapsed wall clock time in seconds (double precision) on the calling processor.
MPI_Wtime ()
▪ MPI_Wtick
– Returns the resolution in seconds (double precision) of MPI_Wtime.
MPI_Wtick ()
▪ MPI_Finalize
– Terminates the MPI execution environment. This function should be the last MPI routine called
in every MPI program – no other MPI routines may be called after it.
MPI_Finalize ()

Environment Management Routines Example
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    int numtasks, rank, rc;

    rc = MPI_Init(&argc, &argv);
    if (rc != MPI_SUCCESS) {
        printf("Error starting MPI program. Terminating.\n");
        MPI_Abort(MPI_COMM_WORLD, rc);
    }

    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("Number of tasks= %d My rank= %d\n", numtasks, rank);

    /******* do some work *******/

    MPI_Finalize();
    return 0;
}

Point to Point (P2P) Communication
General Concepts
▪ Types of Point-to-Point Operations:
– MPI point-to-point operations typically involve message passing between two, and only two, different MPI tasks. One task is performing a send operation and the other task is performing a matching receive operation.
– There are different types of send and receive routines used for different purposes. For example:
• Synchronous send
• Blocking send / blocking receive
• Non-blocking send / non-blocking receive
• Buffered send
• Combined send/receive
• “Ready” send
– Any type of send routine can be paired with any type of receive routine.
– MPI also provides several routines associated with send – receive operations, such as those used to wait for a message’s arrival or probe to find out if a message has arrived.

P2P general concepts
▪ Buffering:
– In a perfect world, every send operation would be perfectly synchronized with its matching receive. This is rarely the case. Somehow or other, the MPI implementation must be able to deal with storing data when the two tasks are out of sync.
– Consider the following two cases:
• A send operation occurs 5 seconds before the receive is ready – where is the message while the receive is pending?
• Multiple sends arrive at the same receiving task which can only accept one send at a time – what happens to the messages that are “backing up”?
– The MPI implementation (not the MPI standard) decides what happens to data in these types of cases. Typically, a system buffer area is reserved to hold data in transit.

P2P general concepts
▪ System buffer space is:
– Opaque to the programmer and managed entirely by the MPI library
– A finite resource that can be easy to exhaust
– Often mysterious and not well documented
– Able to exist on the sending side, the receiving side, or both
– Something that may improve program performance because it allows send – receive operations to be asynchronous.
– User managed address space (i.e. your program variables) is called the application buffer. MPI also provides for a user managed send buffer.

P2P general concepts
▪ Blocking vs. Non-blocking:
▪ Most of the MPI point-to-point routines can be used in either blocking or non- blocking mode.
▪ Blocking:
– A blocking send routine will only “return” after it is safe to modify the application buffer (your send data) for reuse. Safe means that modifications will not affect the data intended for the receive task. Safe does not imply that the data was actually received – it may very well be sitting in a system buffer.
– A blocking send can be synchronous which means there is a handshake occurring with the receive task to confirm a safe send.
– A blocking send can be asynchronous if a system buffer is used to hold the data for eventual delivery to the receive.
– A blocking receive only “returns” after the data has arrived and is ready for use by the program.

P2P general concepts
▪ Non-blocking:
– Non-blocking send and receive routines behave similarly – they will return almost immediately. They do not wait for any communication events to complete, such as message copying from user memory to system buffer space or the actual arrival of message.
– Non-blocking operations simply “request” the MPI library to perform the operation when it is able. The user cannot predict when that will happen.
– It is unsafe to modify the application buffer (your variable space) until you know for a fact the requested non-blocking operation was actually performed by the library. There are “wait” routines used to do this.
– Non-blocking communications are primarily used to overlap computation with communication and exploit possible performance gains.

P2P general concepts
▪ Order and Fairness:
▪ Order:
– MPI guarantees that messages will not overtake each other.
– If a sender sends two messages (Message 1 and Message 2) in succession to the same destination, and both match the same receive, the receive operation will receive Message 1 before Message 2.
– If a receiver posts two receives (Receive 1 and Receive 2), in succession, and both are looking for the same message, Receive 1 will receive the message before Receive 2.
– Order rules do not apply if there are multiple threads participating in the communication operations.
▪ Fairness:
– MPI does not guarantee fairness – it’s up to the programmer to prevent “operation starvation”.
– Example: task 0 sends a message to task 2. However, task 1 sends a competing message that matches task 2’s receive. Only one of the sends will complete.

Point to Point Communication Routines and Arguments
▪ MPI Message Passing Routine Arguments
Blocking send: MPI_Send(buffer, count, type, dest, tag, comm)
Blocking receive: MPI_Recv(buffer, count, type, source, tag, comm, status)
Non-blocking send: MPI_Isend(buffer, count, type, dest, tag, comm, request)
Non-blocking receive: MPI_Irecv(buffer, count, type, source, tag, comm, request)
▪ Buffer
– Program (application) address space that references the data that is to be sent or received. In most cases, this is simply the name of the variable that is to be sent/received. For C programs, this argument is passed by reference and usually must be prepended with an ampersand: &var1
▪ Data Count
– Indicates the number of data elements of a particular type to be sent.
▪ Data Type
– For reasons of portability, MPI predefines its elementary data types. The table on the next slide lists those required by the standard.

Data Types
MPI C Data Type        C Data Type
MPI_CHAR               signed char
MPI_SHORT              signed short int
MPI_INT                signed int
MPI_LONG               signed long int
MPI_UNSIGNED_CHAR      unsigned char
MPI_UNSIGNED_SHORT     unsigned short int
MPI_UNSIGNED           unsigned int
MPI_UNSIGNED_LONG      unsigned long int
MPI_FLOAT              float
MPI_DOUBLE             double
MPI_LONG_DOUBLE        long double
MPI_BYTE               8 binary digits
MPI_PACKED             data packed or unpacked with MPI_Pack()/MPI_Unpack()

MPI Message Passing Routine Arguments
▪ Destination
– An argument to send routines that indicates the process where a message should be delivered. Specified as the rank of the receiving process.
▪ Source
– An argument to receive routines that indicates the originating process of the message. Specified as the rank of the sending process. This may be set to the wild card MPI_ANY_SOURCE to receive a message from any task.
▪ Tag
– Arbitrary non-negative integer assigned by the programmer to uniquely identify a message. Send and receive operations should match message tags. For a receive operation, the wild card MPI_ANY_TAG can be used to receive any message regardless of its tag.
– The MPI standard guarantees that integers 0-32767 can be used as tags, but most implementations allow a much larger range than this.

MPI Message Passing Routine Arguments
▪ Communicator
– Indicates the communication context, or set of processes for which the source or destination fields are valid. Unless the programmer is explicitly creating new communicators, the predefined communicator MPI_COMM_WORLD is usually used.
▪ Status
– For a receive operation, indicates the source of the message and the tag of the message. In C, this argument is a pointer to a predefined structure MPI_Status (e.g. stat.MPI_SOURCE, stat.MPI_TAG). Additionally, the actual number of elements received is obtainable from the status via the MPI_Get_count routine.
▪ Request
– Used by non-blocking send and receive operations. Since non-blocking operations may return before the requested system buffer space is obtained, the system issues a unique “request number”. The programmer uses this system-assigned “handle” later (in a WAIT type routine) to determine completion of the non-blocking operation. In C, this argument is a pointer to a predefined structure MPI_Request.

Blocking Message Passing Routines
▪ MPI_Send
– Basic blocking send operation. Routine returns only after the application buffer in the sending task is free for reuse. Note that this routine may be implemented differently on different systems. The MPI standard permits the use of a system buffer but does not require it. Some implementations may actually use a synchronous send (discussed below) to implement the basic blocking send.
MPI_Send (&buf,count,datatype,dest,tag,comm)
▪ MPI_Recv
– Receive a message and block until the requested data is available in the application
buffer in the receiving task.
MPI_Recv (&buf,count,datatype,source,tag,comm,&status)

Blocking Message Passing Routines
▪ MPI_Ssend
– Synchronous blocking send: Send a message and block until the application buffer in the
sending task is free for reuse and the destination process has started to receive the message.
MPI_Ssend (&buf,count,datatype,dest,tag,comm)
▪ MPI_Bsend
– Buffered blocking send: permits the programmer to allocate the required amount of buffer space into which data can be copied until it is delivered. Insulates against the problems associated with insufficient system buffer space. Routine returns after the data has been copied from application buffer space to the allocated send buffer. Must be used with the MPI_Buffer_attach routine.
MPI_Bsend (&buf,count,datatype,dest,tag,comm)

Blocking Message Passing Routines
▪ MPI_Buffer_attach
– Used by programmer to allocate/deallocate message buffer space to be used by the MPI_Bsend routine. The size argument is specified in actual data bytes – not a count of data elements. Only one buffer can be attached to a process at a time. Note that the IBM implementation uses MPI_BSEND_OVERHEAD bytes of the allocated buffer for overhead.
MPI_Buffer_attach (&buffer, size)
MPI_Buffer_detach (&buffer, size)
▪ MPI_Rsend
– Blocking ready send. Should only be used if the programmer is certain that the matching receive
has already been posted.
MPI_Rsend (&buf,count,datatype,dest,tag,comm)

Blocking Message Passing Routines
▪ MPI_Sendrecv
– Send a message and post a receive before blocking. Will block until the sending application buffer is free for reuse and until the receiving application buffer contains the received message.
MPI_Sendrecv (&sendbuf,sendcount,sendtype,dest,sendtag, &recvbuf,recvcount,recvtype,source,recvtag, comm,&status)

Blocking Message Passing Routines
▪ MPI_Wait / MPI_Waitany / MPI_Waitall / MPI_Waitsome
– MPI_Wait blocks until a specified non-blocking send or receive operation has completed. For multiple non-blocking operations, the programmer can specify any, all or some completions.
MPI_Wait (&request, &status)
MPI_Waitany (count, &array_of_requests, &index, &status)
MPI_Waitall (count, &array_of_requests, &array_of_statuses)
MPI_Waitsome (incount, &array_of_requests, &outcount, &array_of_indices, &array_of_statuses)

Blocking Message Passing Routines
▪ MPI_Probe
– Performs a blocking test for a message. The “wildcards” MPI_ANY_SOURCE and MPI_ANY_TAG may be used to test for a message from any source or with any tag. For the C routine, the actual source and tag will be returned in the status structure as status.MPI_SOURCE and status.MPI_TAG. For the Fortran routine, they will be returned in the integer array status(MPI_SOURCE) and status(MPI_TAG).
MPI_Probe (source,tag,comm,&status)

Example: Blocking Message Passing Routines
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[]) {
    int numtasks, rank, dest, source, rc, count, tag = 1;
    char inmsg, outmsg = 'x';
    MPI_Status Stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        dest = 1; source = 1;
        rc = MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        rc = MPI_Recv(&inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
    } else if (rank == 1) {
        dest = 0; source = 0;
        rc = MPI_Recv(&inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);
        rc = MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
    }

    rc = MPI_Get_count(&Stat, MPI_CHAR, &count);
    printf("Task %d: Received %d char(s) from task %d with tag %d\n",
           rank, count, Stat.MPI_SOURCE, Stat.MPI_TAG);

    MPI_Finalize();
}