Distributed Processing / Network Topology
Problem:
What is the bisection bandwidth of a 5x5x6 3D mesh topology in which each link has a bandwidth of 8 GB/s?
Solution:
Cutting the mesh across the dimension of length 6 splits it into two equal halves of 5x5x3 nodes and severs 5 * 5 = 25 links. Bisection bandwidth = 25 * 8 GB/s = 200 GB/s.
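The count can be checked with a short Python sketch. It only considers flat cuts perpendicular to an even-length dimension, which is a simplifying assumption that fits this mesh; the function name planar_bisection_links is made up for illustration.

```python
from math import prod

def planar_bisection_links(dims):
    """Fewest links severed by a flat cut that splits a mesh into two equal halves.

    Assumption: only cuts perpendicular to an even-length dimension are
    considered, since those leave the same number of nodes on both sides.
    """
    candidates = [
        prod(dims) // d  # links crossing the cut = product of the other dimensions
        for d in dims
        if d % 2 == 0
    ]
    if not candidates:
        raise ValueError("no even-length dimension; a flat cut cannot halve this mesh")
    return min(candidates)

links = planar_bisection_links((5, 5, 6))
link_bandwidth_gb_s = 8
print(links, links * link_bandwidth_gb_s)  # 25 200
```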
Problem:
Show graphically the working of an "All Reduce" collective, assuming a binary tree based virtual network topology with 7 processes, and an initial data distribution of 1 integer per process.
Solution:
Very similar to Allgather. However, the output size is also one integer per process: every process ends up holding the same reduced value instead of a copy of all seven inputs.
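As a stand-in for the drawing, the data movement can be traced with a small Python simulation. This is only a sketch under assumptions that the problem does not state: the reduction operator is a sum, the processes hold the values 1 to 7, and the tree is numbered like a heap (the children of process i are 2i+1 and 2i+2). The function name tree_allreduce is hypothetical.

```python
def tree_allreduce(values):
    """Simulate All Reduce (sum assumed) on a heap-numbered binary tree.

    Returns the per-process data at three points: the initial state,
    after the reduce phase, and after the broadcast phase.
    """
    n = len(values)
    data = list(values)
    history = [list(data)]

    # Reduce phase: visit parents from the deepest one up to the root, so
    # each child's partial sum is final before its parent adds it in.
    for parent in range((n - 2) // 2, -1, -1):
        for child in (2 * parent + 1, 2 * parent + 2):
            if child < n:
                data[parent] += data[child]
    history.append(list(data))

    # Broadcast phase: visit parents from the root down, copying the
    # result to both children.
    for parent in range(n):
        for child in (2 * parent + 1, 2 * parent + 2):
            if child < n:
                data[child] = data[parent]
    history.append(list(data))
    return history

initial, after_reduce, after_broadcast = tree_allreduce([1, 2, 3, 4, 5, 6, 7])
print(initial)          # [1, 2, 3, 4, 5, 6, 7]
print(after_reduce)     # [28, 11, 16, 4, 5, 6, 7]
print(after_broadcast)  # [28, 28, 28, 28, 28, 28, 28]
```

The middle line shows the partial sums held by the root (28) and the two inner nodes (11 and 16) once the values have flowed up the tree; the last line shows one integer per process after the result has flowed back down.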
Problem:
Determine the time necessary to broadcast 1 MB of data across a binary tree network topology of 127 nodes. Assume a latency of 1 microsecond and a bandwidth of 10 GB/s.
Solution:
A binary tree topology of 127 nodes is 7 levels deep (2^7 - 1 = 127). Using the alpha-beta model for communication estimation, the time for a node to send the data to both of its children is 2 * (latency + data_size/bandwidth); the factor of 2 is due to transmitting to two nodes, one after the other. Taking 1 MB as 10^6 bytes and 10 GB/s as 10^10 bytes/s, the transmit time from the root node to the two nodes in level 2 is 2 * [ 10^-6 + 10^6 / 10^10 ] = 0.000202 seconds. This is performed 6 times to broadcast the data down to the last level, so the final time is 6 * 0.000202 = 0.001212 seconds.
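The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch of the same alpha-beta estimate, assuming a complete binary tree and decimal units (1 MB = 10^6 bytes, 10 GB/s = 10^10 bytes/s); the function and parameter names are illustrative only.

```python
def tree_broadcast_time(nodes, data_bytes, latency_s, bandwidth_bytes_per_s):
    """Alpha-beta estimate for a store-and-forward broadcast down a binary tree."""
    # A complete binary tree with 2^k - 1 nodes has k levels.
    levels = (nodes + 1).bit_length() - 1
    one_send = latency_s + data_bytes / bandwidth_bytes_per_s
    per_level = 2 * one_send  # each parent sends to its two children in turn
    return (levels - 1) * per_level

t_broadcast = tree_broadcast_time(
    nodes=127, data_bytes=10**6, latency_s=1e-6, bandwidth_bytes_per_s=10**10
)
print(f"{t_broadcast:.6f} s")  # 0.001212 s
```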
If we use pipelining and assume a 1 KB packet size, the time for one packet transmission is (10^-6 + 10^3 / 10^10) or 0.0000011 seconds. We will have to send 1000 packets to send all of the data. These propagate in waves between levels: 1-2, 2-3, 3-4, 4-5, 5-6, 6-7. The second packet cannot be sent from 1-2 until 2-3 is complete, so the transfers require a tick-tock timing. With one factor of 2 for transmitting to two nodes and another for the tick-tock alternation, the time will be 1000 * 2 * 2 * 0.0000011 seconds or 0.0044 seconds.
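The pipelined estimate can be checked the same way. The sketch below follows the formula exactly as given in the solution (packets * 2 * 2 * packet time) and, like the solution, does not add any extra time for the first packet to reach the last level. Reading the two factors of 2 as "two children" and "tick-tock" is an interpretation, and the names are illustrative.

```python
def pipelined_broadcast_time(data_bytes, packet_bytes, latency_s, bandwidth_bytes_per_s):
    """Pipelined binary-tree broadcast estimate, following the solution's formula."""
    packets = -(-data_bytes // packet_bytes)  # ceiling division
    packet_time = latency_s + packet_bytes / bandwidth_bytes_per_s
    children = 2   # each packet is sent to two child nodes in turn
    tick_tock = 2  # a link alternates between receiving and forwarding
    return packets * children * tick_tock * packet_time

t_pipelined = pipelined_broadcast_time(
    data_bytes=10**6, packet_bytes=10**3, latency_s=1e-6, bandwidth_bytes_per_s=10**10
)
print(f"{t_pipelined:.4f} s")  # 0.0044 s
```

Under these assumptions the per-packet latency dominates: 1000 packets each pay the 1 microsecond latency four times, which is why this estimate comes out larger than the 0.001212 seconds obtained without pipelining.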