COMP5426 Distributed Computing
Programming Distributed Memory Platforms
Blocked Gaussian Elimination with Partial Pivoting (GEPP)
for ib = 1 to n-1 step b      … Process matrix b columns at a time
    end = ib + b-1            … Point to end of block of b columns
    apply BLAS2 version of GEPP to get A(ib:n, ib:end) = P' * L' * U'
    … let LL denote the strict lower triangular part of A(ib:end, ib:end) + I
    A(ib:end, end+1:n) = LL^-1 * A(ib:end, end+1:n)
                              … update next b rows of U
    A(end+1:n, end+1:n) = A(end+1:n, end+1:n)
                          - A(end+1:n, ib:end) * A(ib:end, end+1:n)
                              … apply delayed updates with single matrix-multiply
                              … with inner dimension b
end for
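The three block steps above can be sketched in C. This is a minimal illustration without pivoting (the real GEPP panel step also permutes rows); the fixed size N, block size B, and the test matrix are illustrative assumptions, not part of the lecture:

```c
#include <math.h>

#define N 6   /* matrix order (hypothetical small example) */
#define B 2   /* block size b (assumed to divide N for simplicity) */

/* Blocked LU without pivoting: overwrites A with L (unit lower triangular,
   strict part stored) and U (upper triangular), following the three steps
   above: panel factorization, triangular solve, delayed trailing update. */
void blocked_lu(double A[N][N]) {
    for (int ib = 0; ib < N; ib += B) {
        int end = ib + B;                  /* one past the column block */
        /* 1) unblocked (BLAS2-style) LU of the panel A(ib:N, ib:end) */
        for (int k = ib; k < end; k++)
            for (int i = k + 1; i < N; i++) {
                A[i][k] /= A[k][k];
                for (int j = k + 1; j < end; j++)
                    A[i][j] -= A[i][k] * A[k][j];
            }
        /* 2) A(ib:end, end:N) = LL^-1 * A(ib:end, end:N), LL unit lower tri */
        for (int k = ib; k < end; k++)
            for (int i = k + 1; i < end; i++)
                for (int j = end; j < N; j++)
                    A[i][j] -= A[i][k] * A[k][j];
        /* 3) delayed update: single matrix multiply with inner dimension B */
        for (int i = end; i < N; i++)
            for (int j = end; j < N; j++)
                for (int k = ib; k < end; k++)
                    A[i][j] -= A[i][k] * A[k][j];
    }
}

/* Factor a diagonally dominant test matrix (so no pivoting is needed)
   and return the maximum |(L*U - A)_ij| reconstruction error. */
double blocked_lu_max_error(void) {
    double A[N][N], F[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            F[i][j] = A[i][j] = (i == j) ? N + 1.0 : 1.0 / (1.0 + i + j);
    blocked_lu(F);
    double err = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double lu = 0.0;
            for (int k = 0; k < N; k++) {
                double l = (k < i) ? F[i][k] : (k == i ? 1.0 : 0.0);
                double u = (k <= j) ? F[k][j] : 0.0;
                lu += l * u;
            }
            double d = fabs(lu - A[i][j]);
            if (d > err) err = d;
        }
    return err;
}
```

The payoff of blocking is step 3: most of the arithmetic lands in one matrix multiply, which runs at BLAS3 speed instead of the BLAS2 rate of column-by-column elimination.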
For shared memory machines, the task assignment presented in the previous lecture provides great flexibility. This is simply because data in shared memory can be accessed by any thread, rather than being owned by one.
For distributed memory machines, data must be distributed across the processes, and the processes exchange data using message passing. We must consider how to distribute the data as well as the tasks; specifically, the data distribution limits the flexibility for task assignment and determines the cost of communication.
Data assignment for distributed LU: 2D block cyclic distribution, where the processes are also organized as a 2D mesh/torus.
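As a small sketch of how ownership works under a 2D block cyclic distribution (the function name and the Pr x Pc grid parameters here are illustrative, not from the lecture): block indices are wrapped cyclically onto the process mesh in each dimension independently.

```c
/* Owner of global matrix entry (i, j) under a 2D block cyclic distribution
   with square blocks of size nb on a Pr x Pc process mesh. */
typedef struct { int prow, pcol; } Owner;

Owner owner_2d_block_cyclic(int i, int j, int nb, int Pr, int Pc) {
    Owner o;
    o.prow = (i / nb) % Pr;   /* block row, wrapped cyclically over grid rows */
    o.pcol = (j / nb) % Pc;   /* block column, wrapped over grid columns */
    return o;
}
```

The cyclic wrap is what keeps the load balanced as elimination proceeds: even after the leading rows and columns are finished, every process still owns blocks of the shrinking trailing submatrix.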
The trailing submatrix update is a matrix multiply: green = green - blue * pink.
Communication-avoiding pivoting (tournament pivoting) for a tall-skinny panel W partitioned into row blocks W1, W2, W3, W4:
- Factor each block with GEPP: Wi = Pi·Li·Ui; choose the b pivot rows of Wi, call them Wi'
- Stack W1' on W2' and factor: P12·L12·U12; choose b pivot rows, call them W12'
- Likewise stack W3' on W4' and factor: P34·L34·U34; choose b pivot rows, call them W34'
- Stack W12' on W34' and factor: P1234·L1234·U1234; choose the final b pivot rows
- Use these b pivot rows for the whole panel (i.e., move them to the top) and factor the panel without further pivoting
- Not the same pivot rows chosen as for GEPP
- Need to show numerically stable (D., Grigori, Xiang, '11)
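For the special case b = 1, tournament pivoting reduces to a max reduction tree over the pivot column, so it picks the same pivot as GEPP. A toy sketch of that case (the function names and 4-way split are illustrative assumptions):

```c
#include <math.h>

/* Toy tournament pivoting with b = 1 on one column of length n (n >= 4):
   each of four row blocks W1..W4 proposes its locally largest |entry|,
   winners meet pairwise (W12, W34), and the root (W1234) selects the
   final pivot row. With b > 1, each step would instead keep the b rows
   chosen by GEPP on the stacked candidates. */
static int local_best(const double *col, int lo, int hi) {
    int best = lo;
    for (int i = lo + 1; i < hi; i++)
        if (fabs(col[i]) > fabs(col[best])) best = i;
    return best;
}

int tournament_pivot(const double *col, int n) {
    int w[4];                                     /* leaf winners W1'..W4' */
    for (int k = 0; k < 4; k++)
        w[k] = local_best(col, k * n / 4, (k + 1) * n / 4);
    int w12 = fabs(col[w[0]]) >= fabs(col[w[1]]) ? w[0] : w[1];
    int w34 = fabs(col[w[2]]) >= fabs(col[w[3]]) ? w[2] : w[3];
    return fabs(col[w12]) >= fabs(col[w34]) ? w12 : w34;   /* W1234 */
}
```

For b > 1 the rows that survive the tournament generally differ from GEPP's choices, which is why the stability result cited above is needed.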
MPI example (blocking):

    if (rank == 0) {
        for (int i = 0; i < steps; i++) {
            ... /* do some computation here */
            MPI_Send(&buf, n, MPI_DOUBLE, 1, 42, comm);
        }
    } else {
        for (int i = 0; i < steps; i++) {
            MPI_Recv(&buf, n, MPI_DOUBLE, 0, 42, comm, &status);
            ... /* do some computation here */
        }
    }
MPI example: with blocking calls, computation and communication alternate and cannot overlap, so the total time is roughly steps x (computation time + communication time).
MPI example (non-blocking, overlapping computation with communication):

    MPI_Request req[2]; /* each Isend/Irecv needs a request handle */
    int idx = 0;
    if (rank == 0) {
        ... /* do some computation here */
        MPI_Isend(&buf, n, MPI_DOUBLE, 1, 42, comm, &req[idx]);
        for (int i = 0; i < steps; i++) {
            ... /* do some computation here */
            MPI_Wait(&req[idx], &status);
            idx = (idx + 1) % 2;
            MPI_Isend(&buf, n, MPI_DOUBLE, 1, 42, comm, &req[idx]);
        }
    } else {
        MPI_Irecv(&buf, n, MPI_DOUBLE, 0, 42, comm, &req[idx]);
        for (int i = 0; i < steps; i++) {
            MPI_Wait(&req[idx], &status);
            idx = (idx + 1) % 2;
            MPI_Irecv(&buf, n, MPI_DOUBLE, 0, 42, comm, &req[idx]);
            ... /* do some computation here */
        }
    }

Note: with a single buffer, the computation cannot safely use buf while a pending Isend/Irecv still touches it; true overlap requires double buffering, e.g. two buffers indexed by idx.
In distributed memory machines, data are distributed across the processes, so we need to consider data distribution and task assignment simultaneously. This makes the design more restrictive than for shared memory: we must seriously consider data locality, load balancing, and the efficiency of resource utilization. Since data communication incurs additional cost, we must also seriously consider how to minimize it.