
HPC ARCHITECTURES

Interconnects and Networks

Introduction

• Interconnects and networks move data from one place to another.

• Vital part of all computer systems, and especially important in parallel systems.

• Interconnects occur at many scales with a variety of differing requirements:

• Connecting components within a processor.

• Connect processors to external memory. (Memory Interconnect)

• Connect parallel compute elements. (Processor Interconnect)

• Connect processors to disks (e.g. SANs: Storage Area Networks).

• LANs (Local Area Networks).

• WANs (Wide Area Networks).

• The internet (Global Network).


OSI Model
• This is a theoretical model of networking developed by the ISO standards organisation.

• Divides networking into 7 layers:

1. Physical Layer

2. Data Link Layer

3. Network Layer

4. Transport Layer

5. Session Layer

6. Presentation Layer

7. Application Layer

• This division into layers makes it easier to define standards.

• A Network Layer standard can be independent of the underlying Data Link Layer it is implemented on.


OSI and HPC

• Higher levels of the OSI model may not make sense in some contexts:

• E.g. Memory interconnects

• Lower OSI levels still useful.

• Some layers may be merged/omitted for performance reasons.

• Especially in proprietary HPC interconnects where performance is critical and inter-operation largely irrelevant.

• Message passing libraries such as MPI may provide functions from both level-5 and level-6, though for performance reasons their implementations may use levels as low as level-2.


Physical layer

• Level-1 of the OSI model is the physical layer.

• There are many different ways of moving bits of information around:

• Voltages applied to wires.

• Optical pulses/waves travelling along fibre-optic cable.

• Electromagnetic waves travelling along an electrical transmission line.

• Radio waves propagating in air.

• All of these can be used to construct computer interconnects.


Electrical Signalling lines
• Simplest form of signalling is to apply a voltage to a wire.

• Essentially the same approach as used in the Victorian telegraph.

• Underlying physics described by the “Telegrapher’s Equations”, developed by Heaviside in the 1870s and 1880s, building on Kelvin’s 1855 cable theory.

• At low frequencies the wave nature can be neglected and the wire acts like a simple capacitor.

• At high frequencies (where the wavelength approaches the wire length) signalling lines need to be constructed as “transmission lines”.
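The comparison between wavelength and wire length can be made concrete with a short calculation. The sketch below is illustrative only: the velocity factor of 0.6 and the "one tenth of a wavelength" threshold are assumed rule-of-thumb values, not figures from these slides.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz, velocity_factor=0.6):
    """Wavelength of a signal on a wire.

    velocity_factor is the signal speed as a fraction of c; 0.6 is an
    assumed typical value, real lines vary.
    """
    return velocity_factor * C / freq_hz


def needs_transmission_line(length_m, freq_hz, fraction=0.1, velocity_factor=0.6):
    """Rule-of-thumb check (assumption): wave effects matter once the wire
    is longer than `fraction` of a wavelength."""
    return length_m > fraction * wavelength_m(freq_hz, velocity_factor)


if __name__ == "__main__":
    for freq in (1e3, 1e6, 5e9):
        print(f"{freq:8.0e} Hz: wavelength {wavelength_m(freq):.3g} m, "
              f"10 cm wire needs transmission line: "
              f"{needs_transmission_line(0.1, freq)}")
```

At telegraph-like frequencies the wavelength is far longer than any practical wire, while at GHz frequencies even a 10 cm connection is comparable to the wavelength.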


Point-to-point/Multi-station

• Signalling lines can be used as a broadcast interconnect (many recipients can read the same signal).

• Original Ethernet was a multi-station broadcast network.

• Maximum segment length needed to be restricted so collisions could be detected reliably.

• Fast networks are easier to build out of routers connected by point-to-point links.

• Full-duplex (simultaneous communications in both directions).

• Half-duplex (only one direction at a time).


Serial vs Parallel
• Data rates can be increased by either

• Increasing the signalling frequency

• Using multiple wires in parallel

• Parallel connections

• Generally more expensive as more wires and more off-chip pin connections are needed.

• Number of available chip pins is proportional to chip perimeter, so pins are a scarce resource on large modern chips.

• Operating frequency limited by clock-skew between wires.

• Serial connections

• Can operate at higher frequencies.

• Throughput can be increased by using multiple (independent/non-synchronised) serial connections.

• Most modern networks based on High-Speed serial connections.


High speed serial connections

• Many modern technologies use high speed serial connections.

• Complex encoding of binary data onto analogue signals.

• Currently running at GHz signal frequencies (cm wavelengths).

• Used in a variety of different types of interconnect:

• USB, InfiniBand, SATA, PCIe, HDMI.

• Underlying SerDes (Serializer/Deserializer) technology is essentially the same.

• Electrical signalling consumes large amounts of power

• Increases with distance and frequency

• Power is a major limiting factor in modern HPC system design.

• General trend towards optical signalling


Optical networks

• Signalling with flashing lights is as old as fire.

• Modern optical networks use lasers confined within optical fibre.

• Still using electromagnetic waves, but the light frequency is much higher than the signalling frequency so the signal can be treated as binary pulses.

• Energy loss in optical fibre is very low (energy costs are essentially independent of distance at scales less than many kilometres) but optical transceivers take a lot of power.

• Optical fibres are point-to-point links.

• Signals need to be converted back to electrical to perform routing.

• Recent advances in silicon photonics allow all necessary components for optical networking to be built on-chip.

• Chip to chip optical links now a possibility.


Packet switching
• Network bits are usually organised into packets.

• Currently HPC networks are all packet-switched networks.

• As well as the payload, each packet usually carries a header containing additional information, e.g.

• Source address

• Destination address

• Checksums for error detection

• Packet size.

• Routers read the header to determine the next destination of the packet.

• Routers are always electronic, so optical signals need to be converted to the electrical domain to be routed.

• Packet overhead reduces effective bandwidth (especially for small messages).
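The effect of header overhead on small messages can be illustrated with a simple ratio of payload to total packet size. This is a minimal sketch: the 64-byte header and the 100 Gbit/s raw link rate are assumed values chosen for illustration, and the model ignores everything except the header.

```python
def effective_bandwidth(raw_bandwidth, payload_bytes, header_bytes):
    """Fraction of the raw link bandwidth that carries payload.

    Only the header is counted as overhead; gaps between packets,
    acknowledgements and encoding costs are ignored in this sketch.
    """
    total = payload_bytes + header_bytes
    return raw_bandwidth * payload_bytes / total


if __name__ == "__main__":
    raw = 100.0   # assumed raw link rate in Gbit/s
    header = 64   # assumed header size in bytes
    for payload in (8, 64, 512, 4096):
        eff = effective_bandwidth(raw, payload, header)
        print(f"payload {payload:5d} B -> effective {eff:6.2f} Gbit/s")
```

With these assumed numbers, a payload equal in size to the header already halves the usable bandwidth, while large payloads approach the raw rate.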


Optical switching
• In the future we expect silicon photonics to make optical switching more important.

• Optical switches are circuit-switches not packet-switches.

• However each fibre can carry multiple wavelengths which can be switched independently.

• Can embed complex circuit topology in simpler physical one.

28/10/2019 HPC Architectures 12

Networks
• More complex networks are built by linking network segments using routers.


• Topology of the network can affect cost/performance
– Topology described as a graph

– Routers as graph nodes

– Links as graph edges

– Properties of the graph related to properties of the network

Diameter

• The distance between 2 graph nodes is the length of the shortest path connecting them.

• This is proportional to the network latency between the routers in the corresponding network (assuming the network latency of each link is the same).

• The diameter of a graph is the largest distance between any 2 nodes contained in the graph.

• This is proportional to the worst case network latency of the corresponding network.
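The definitions of distance and diameter above can be sketched with a breadth-first search over an adjacency list. The example graphs (a 6-node ring and a 6-node 1-D array) and the function names are illustrative choices; the code assumes the graph is connected.

```python
from collections import deque


def distances_from(graph, start):
    """Shortest-path distance (in hops) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def diameter(graph):
    """Largest shortest-path distance between any two nodes.

    Assumes the graph is connected.
    """
    return max(max(distances_from(graph, n).values()) for n in graph)


# Each node is a router, each adjacency entry a link.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
array_1d = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

if __name__ == "__main__":
    print("ring diameter:", diameter(ring))
    print("1-D array diameter:", diameter(array_1d))
```

Closing the array into a ring adds only one link but roughly halves the worst-case hop count.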


Degree

• The degree of a graph node is the number of edges attached to that node.

• This corresponds to the number of ports needed on the router in the corresponding network.

• In practice many high port count routers are implemented internally as networks of smaller routers.


Bisection

• A bisection of a graph divides it into two equal parts; the number of links connecting the two parts is the width of that bisection.

• The minimum bisection width of a graph is the smallest width over all possible bisections of the graph.

• This is proportional to the worst case bandwidth between two halves of the corresponding network (if all links can support the same bandwidth).

• This is usually referred to as the bisection bandwidth of the network.
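For very small graphs the minimum bisection width can be found by trying every way of splitting the nodes into two equal halves. The brute-force sketch below is for illustration only: its cost grows exponentially with the number of nodes, and the function name and example graphs are not from the slides.

```python
from itertools import combinations


def min_bisection_width(nodes, edges):
    """Smallest number of edges crossing any split into two equal halves.

    Brute force over all halves, so only usable for small graphs.
    Assumes an even number of nodes.
    """
    nodes = list(nodes)
    half = len(nodes) // 2
    best = None
    for part in combinations(nodes, half):
        part = set(part)
        # An edge crosses the cut if exactly one endpoint is in `part`.
        width = sum(1 for a, b in edges if (a in part) != (b in part))
        if best is None or width < best:
            best = width
    return best


n = 8
ring_edges = [(i, (i + 1) % n) for i in range(n)]
array_edges = [(i, i + 1) for i in range(n - 1)]

if __name__ == "__main__":
    print("ring:", min_bisection_width(range(n), ring_edges))
    print("1-D array:", min_bisection_width(range(n), array_edges))
```

Cutting a ring in half always severs at least two links, whereas a 1-D array can be split by cutting a single link.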


Example Topologies

• Simple topologies

[Figure: 1-D array; ring]

More example topologies

[Figure: 2-D mesh; 2-D torus; 4-cube. Natural extensions to higher dimensions.]
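The example topologies can all be generated from one construction: nodes are coordinate tuples, and each node links to its neighbours along every axis, with or without wrap-around. The sketch below uses that idea to compare degree and diameter; the function names and the 4x4 sizes are illustrative choices, and a 4-cube is modelled as a grid with side length 2 in each of 4 dimensions.

```python
from collections import deque
from itertools import product


def grid(dims, wrap):
    """Adjacency dict for a mesh (wrap=False) or torus (wrap=True)."""
    graph = {}
    for node in product(*(range(d) for d in dims)):
        nbrs = set()
        for axis, size in enumerate(dims):
            for step in (-1, 1):
                c = node[axis] + step
                if wrap:
                    c %= size
                elif not 0 <= c < size:
                    continue  # no link off the edge of a mesh
                nbr = node[:axis] + (c,) + node[axis + 1:]
                if nbr != node:
                    nbrs.add(nbr)
        graph[node] = nbrs
    return graph


def max_degree(graph):
    """Largest number of links attached to any node (router port count)."""
    return max(len(nbrs) for nbrs in graph.values())


def diameter(graph):
    """Largest hop count between any two nodes, by BFS from every node."""
    worst = 0
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    examples = {
        "1-D array (8)": grid((8,), wrap=False),
        "ring (8)": grid((8,), wrap=True),
        "2-D mesh (4x4)": grid((4, 4), wrap=False),
        "2-D torus (4x4)": grid((4, 4), wrap=True),
        "4-cube": grid((2, 2, 2, 2), wrap=False),
    }
    for name, g in examples.items():
        print(f"{name:16s} nodes={len(g):2d} "
              f"degree={max_degree(g)} diameter={diameter(g)}")
```

The same `grid` call with a longer `dims` tuple gives the higher-dimensional extensions.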

Multi stage networks

• Common approach to building scalable networks

• Routers arranged in layers

• Only the outermost layers have connections to network end-points

• Inner layers connect router to router.

• Larger networks need more intermediate layers

• Many different variants

• Can scale as: N log N
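One way to see where an N log N cost can come from is to count routers in a butterfly-style multi-stage network. This is a sketch under stated assumptions: it assumes every stage contains N/k routers of radix k (k inputs, k outputs) and that log_k N stages are needed to reach N end-points. Other multi-stage variants have different constants, and the function name is hypothetical.

```python
def multistage_size(endpoints, radix=2):
    """Return (stages, total_routers) for an assumed butterfly-style network.

    Each stage multiplies the number of reachable end-points by `radix`,
    and each stage needs endpoints/radix routers (rounded up).
    """
    stages = 0
    reach = 1
    while reach < endpoints:
        reach *= radix
        stages += 1
    routers_per_stage = -(-endpoints // radix)  # ceiling division
    return stages, stages * routers_per_stage


if __name__ == "__main__":
    for n in (8, 64, 1024):
        stages, routers = multistage_size(n)
        print(f"N={n:5d}: {stages:2d} stages, {routers:5d} 2x2 routers")
```

Under these assumptions the router count is (N/k) log_k N, so hardware cost grows only slightly faster than the number of end-points.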


Tree topologies

• Outside of HPC many computer networks are built using a hierarchy of routers in a tree topology.

• Makes sense for client/server and task-farms where most communication is between leaves and root of the tree.

• For general communication patterns large volumes of traffic will need to pass through links/routers near the root of the tree.

• To maintain bisection bandwidth need higher performing routers/links near root of tree.


Fat tree

• Uses multiple root nodes to maintain bisection bandwidth


Recursive networks
• Many multi-stage networks can be built recursively

• E.g. the Benes network.


Dragonfly

• Hierarchy of “groups”

• Within a group all nodes are fully connected

• All groups are fully connected in the next layer

• Next layer connection usually “fatter” to maintain bisection bandwidth.


Routing and Addressing
• How do we get a message through a network?

• choice of multiple paths

• Message can specify route, or just the destination

• in latter case routers have to decide the route

• Can be deterministic or adaptive

• deterministic: every message between two given nodes always takes the same path

• Packet order preserved (no re-assembly required)

• May not use all available bandwidth

• adaptive: path can vary according to network conditions/randomly

• Easier to implement fault tolerance. Better use of available paths.

• Can be minimal or non-minimal

• minimal routing always takes shortest path

• not necessarily unique path


Routing latency
• Routing overheads add to the overall network latency.

• Store and forward routers

• Read entire packet into internal buffer before forwarding down next link.

• Full packet needs to be stored (always)

• Can add significantly to message latency if packets are large.

• Cut-through router

• Calculates next link from message header.

• Starts forwarding packet as soon as destination known (may be before the full packet has been received).

• May still have to buffer packet if output link is busy.

• Either way routing algorithm needs to be fast. E.g.

• Simple algorithm based on destination address.

• Deterministic algorithm with cached results.
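The latency difference between store-and-forward and cut-through routing can be sketched with a simple model. The model is an assumption made for illustration: it counts only the time to push bytes onto each link, and ignores router decision time, propagation delay and contention. The parameter names are hypothetical.

```python
def store_and_forward_latency(packet_bytes, bandwidth, links):
    """Every router receives the whole packet before sending it on,
    so the full transmission time is paid once per link."""
    return links * packet_bytes / bandwidth


def cut_through_latency(packet_bytes, header_bytes, bandwidth, links):
    """Each router forwards as soon as the header has arrived, so only the
    header time is paid at each intermediate router; the full packet time
    is paid once, on the final link."""
    return (links - 1) * header_bytes / bandwidth + packet_bytes / bandwidth


if __name__ == "__main__":
    bw = 10e9 / 8  # assumed 10 Gbit/s link, expressed in bytes per second
    for size in (128, 4096, 65536):
        sf = store_and_forward_latency(size, bw, links=4)
        ct = cut_through_latency(size, 64, bw, links=4)
        print(f"{size:6d} B packet: store-and-forward {sf * 1e6:8.2f} us, "
              f"cut-through {ct * 1e6:8.2f} us")
```

In this model the two schemes are identical over a single link, and the gap widens with both packet size and hop count.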


Dimension ordered routing

• Example: Dimension ordered routing in 2-D grid

• Simple algorithm

• go in X direction until X co-ordinate is correct

• then go in Y direction.

• Deterministic, minimal
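The two steps above translate directly into code. This is a minimal sketch for a 2-D mesh without wrap-around links, with routers identified by (x, y) coordinates; the function name is an illustrative choice.

```python
def dimension_ordered_route(src, dst):
    """List of (x, y) routers visited from src to dst, X first then Y.

    Assumes a 2-D mesh with no wrap-around links.
    """
    x, y = src
    path = [(x, y)]
    # Step 1: move along X until the X co-ordinate is correct.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Step 2: then move along Y.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(dimension_ordered_route((0, 0), (2, 1)))
    print(dimension_ordered_route((3, 2), (1, 0)))
```

The route depends only on the source and destination (deterministic), and its hop count equals the Manhattan distance between them (minimal).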


Error correction and Re-transmission
• Hardware is not perfect

• Packets may be lost (need to implement packet acknowledge/resend).

• Packets may be corrupted (need checksums, error-correcting codes, resend).

• Guaranteed delivery network

• Packet verification and resend implemented on each network hop.

• Minimises performance impact.

• More buffer space needed

• Have to make sure buffer space not exhausted (e.g. link flow control)

• Non guaranteed delivery

• Verification and resend implemented end-to-end at higher OSI layer

• Packets dropped when error detected (or insufficient buffer space)

• Larger performance impact from errors.

• Can recover from complete loss of router node

• Model used by Internet, TCP/IP, Ethernet.
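Checksum-based detection with resend can be sketched in a few lines. This toy example appends a CRC-32 (from Python's standard `zlib` module) to each packet and resends when the check fails. The `flaky_link` function is a hypothetical stand-in for a link that corrupts one bit of the first packet it carries; real interconnects implement this logic in hardware with their own packet formats.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_packet(packet: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, crc = packet[:-4], int.from_bytes(packet[-4:], "big")
    return payload if zlib.crc32(payload) == crc else None


def send_with_resend(payload: bytes, link, max_tries=5):
    """Send over `link`, resending until the receiver's check passes.

    Returns (received_payload, number_of_attempts).
    """
    for attempt in range(1, max_tries + 1):
        received = check_packet(link(make_packet(payload)))
        if received is not None:
            return received, attempt
    raise RuntimeError("too many corrupted packets")


def flaky_link():
    """Hypothetical link that flips one bit in the first packet only."""
    calls = {"n": 0}

    def link(packet: bytes) -> bytes:
        calls["n"] += 1
        if calls["n"] == 1:
            return bytes([packet[0] ^ 0x01]) + packet[1:]
        return packet

    return link


if __name__ == "__main__":
    data, tries = send_with_resend(b"hello", flaky_link())
    print(data, "delivered after", tries, "attempt(s)")
```

Running the check on every hop corresponds to the guaranteed delivery model above; running it only at the two end-points corresponds to the non-guaranteed model.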


OSI Layers again
• TCP is built on top of IP (Internet Protocol) which is carried by some data-link protocol. Data is sent as packets.

• Think Russian dolls.

• Protocol nesting adds overhead.

• Even when HPC interconnects support IP it can be faster to target lower levels.


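[Figure: protocol nesting. The data is wrapped in a TCP packet (HDR + data), which is wrapped in an IP packet (HDR + TCP packet), which is wrapped in a data-link layer packet (HDR + IP packet).]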
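The Russian-doll nesting can be mimicked by wrapping a payload in one header per layer. The header contents below are placeholder strings chosen for illustration, not real TCP, IP or data-link header formats.

```python
def encapsulate(data: bytes, headers):
    """Wrap data in each header in turn, innermost layer first."""
    packet = data
    for hdr in headers:
        packet = hdr + packet
    return packet


# Placeholder headers (assumptions, not real protocol formats).
TCP_HDR = b"[TCP]"
IP_HDR = b"[IP]"
LINK_HDR = b"[LNK]"

if __name__ == "__main__":
    data = b"data"
    frame = encapsulate(data, [TCP_HDR, IP_HDR, LINK_HDR])
    print(frame)
    print(f"payload fraction: {len(data) / len(frame):.2f}")
```

Every layer adds its own header to every packet, which is why targeting a lower level directly can reduce overhead.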