Caches

HARDWARE/RESEARCH

PAPER REVIEW
Adrian Jackson

• A Scalable Architecture for Ordered Parallelism – MIT

• Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip – Adapteva

• Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing – Rice University

• HpMC: An Energy-aware Management System of Multi-level Memory Architectures – LLNL, etc.

Questions….

• What problem is the proposed hardware/software trying to address?

• What applications would the hardware be good for?

• What didn’t you understand?

• What did you disagree with?

• Does it achieve what it sets out to do?

A Scalable Architecture for Ordered Parallelism

• SWARM

• Ordered irregular parallelism

• Supporting task based parallelism

• Not traditional regular computational simulation

• How to parallelise irregular task dependencies and generated tasks

• How to support small tasks

• Reduce the overheads of running tasks to make small-task parallelisation beneficial

• Pushing instruction level parallelism up the stack

• Considering many of the same issues seen when scheduling instruction pipelines

• Moving from instructions to tasks (groups of instructions)
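The ordered-task idea can be sketched in software. The following is a minimal, sequential Python toy — not the paper's hardware mechanism or programming model — where timestamped tasks run in timestamp order and may dynamically spawn new tasks; the `(timestamp, name, fn)` task format is invented for illustration:

```python
import heapq

def run_ordered_tasks(initial_tasks):
    """Execute (timestamp, name, fn) tasks in timestamp order.

    Tasks may enqueue new tasks, mirroring the ordered, dynamically
    generated parallelism SWARM targets -- but this runs sequentially
    and has none of the speculation the hardware adds.
    """
    queue = list(initial_tasks)
    heapq.heapify(queue)
    order = []
    while queue:
        ts, name, fn = heapq.heappop(queue)
        order.append(name)
        for child in fn(ts):      # a task may spawn child tasks
            heapq.heappush(queue, child)
    return order

# Hypothetical toy workload: task "a" spawns "c" at a later timestamp.
def task_a(ts):
    return [(ts + 2, "c", lambda t: [])]

result = run_ordered_tasks([(0, "a", task_a), (1, "b", lambda t: [])])
# result == ["a", "b", "c"]
```

Even this toy shows why small tasks are hard: the queue operations cost as much as a tiny task body, which is exactly the overhead SWARM moves into hardware.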

A Scalable Architecture for Ordered Parallelism

• Two levels of innovation

• Hardware support for task management and communication

• New algorithms for task scheduling

• Hard to separate the performance impact of these

• Performance analysis is also complicated

• Hardware simulator

• Speedup is calculated from the number of simulated cycles executed
• Benchmarks are ported to new software model

A Scalable Architecture for Ordered Parallelism

• Nice features

• Sensitivity analysis

• Estimating energy costs of conflict detection etc…

• Not so nice

• Requires new programming model

• Speed-up is measured in simulated cycles, not an actual comparison with the raw performance of a software-only implementation on another chip

• Would this processor actually perform better than a standard multicore where someone does software task dispatching?

Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip

• Energy efficient computing

• 75 GFlop/s per watt

• Targeting low power applications
• Autonomous cars, cryptography, etc…

• Simplify processor, remove costly hardware

• System on chip (SOC) approach

Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip

• Distributed shared memory processor

• RISC design, reduce manufacturing and design complexity

• Cacheless and distributed memory design

• The chip has 64 MB of memory distributed across the 1024 cores (64 KB per core)
• Every core can access the memory in any other core

• Network integrated into the cores

• Each core has its own network router
• Network for messages and memory operations

• Scalable by design

• Network capacity to scale up to 1 million chips (1 billion cores)
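The memory model can be pictured with a toy Python sketch — not Adapteva's API — where one flat address space is sliced per core and any core may write directly into another core's slice; the 4-core/8-word sizes below are stand-ins for the real 1024 cores × 64 KB:

```python
CORES = 4
LOCAL_WORDS = 8   # tiny stand-in for 64 KB of local memory per core

# One flat address space: core i owns words [i*LOCAL_WORDS, (i+1)*LOCAL_WORDS)
memory = [0] * (CORES * LOCAL_WORDS)

def remote_write(dst_core, offset, value):
    # Any core may write straight into another core's local memory.
    # With no caches there is no coherency traffic: the write just lands,
    # and it is the programmer's job to order it against any readers.
    memory[dst_core * LOCAL_WORDS + offset] = value

def local_read(core, offset):
    return memory[core * LOCAL_WORDS + offset]

remote_write(dst_core=2, offset=3, value=42)   # e.g. core 0 writes into core 2
# local_read(2, 3) now returns 42; no other core's slice was touched
```

The simplicity is the point: no tags, no coherency protocol, no miss handling — but, as the next slides note, every burden that hardware removed lands on the programmer.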

Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip

• Saving energy and design costs through RISC

• Saving energy through different memory technologies/provision

• No cache

• No memory coherency on remote writes

• Saving energy through simple networks

• Separating out network functionality

• Making routing simple and distributed

Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip

• Nice features

• Inherently scalable design

• Low energy solutions

• Floating point focus (64-bit)

• Distributed memory potentially provides very high bandwidth and low latency

• Not so nice

• Pushes all the problems on to the programmer

• Managing memory coherency is explicit

• Managing memory locality is explicit

• Very small available memory

• Everything has to be very parallel to use the memory effectively

• Memory/communication costs scale with system

Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing

• Addressing energy costs by allowing lower standards of correctness

• Performance vs accuracy trade-offs

Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing

• Aiming for applications that are naturally probabilistic

• Monte Carlo style simulations

• Machine learning

• Vision, graphics, audio, etc…

• Overclocking with aggressive voltage overscaling reduces the probability of correct results but improves performance-to-energy ratios

• Removing parts of the circuits that only affect low-order precision, or that are rarely used

• Designing in inaccuracy to the circuit

• Discarding data that will have little impact on the overall result
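The precision/accuracy trade-off can be imitated in software. This sketch — a software stand-in, not the Rice hardware — zeroes low mantissa bits of float64 values (mimicking circuits that drop low-order precision stages) and checks how little a mean over many samples degrades; the `keep_bits` choice of 12 is arbitrary for illustration:

```python
import struct

def truncate_mantissa(x, keep_bits):
    """Zero out the low mantissa bits of a float64 -- a software
    imitation of hardware that omits low-order precision circuitry."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    mask = ~((1 << (52 - keep_bits)) - 1) & 0xFFFFFFFFFFFFFFFF
    return struct.unpack(">d", struct.pack(">Q", bits & mask))[0]

values = [0.1 * i for i in range(1, 101)]
exact = sum(values) / len(values)
approx = sum(truncate_mantissa(v, 12) for v in values) / len(values)
rel_err = abs(approx - exact) / exact
# Keeping only 12 of 52 mantissa bits per value bounds the relative
# error of each term by 2**-12, so the mean stays within ~0.025%
```

This is the shape of argument the paper makes for probabilistic workloads: aggregate results tolerate per-operation error far larger than full IEEE precision provides, so the precision hardware is energy spent for nothing.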

Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing

• Good things

• Much reduced energy requirements for processing

• Interesting exploration of allowable error in applications

• Often people don’t know what errors they can accept

• Not so good things

• Application/area specific designs

• Can’t replace whole chips

• The programmer needs to understand the reduced guarantees the processor is providing

• New development infrastructures would be required

HpMC: An Energy-aware Management System of Multi-level Memory Architectures

• Software response to new memory systems that are coming out

• DRAM, NVRAM, MCDRAM, etc.

• Partly a response to high static energy cost of memory

• How do applications efficiently use these new memory models?

• Vendors offering two different approaches

HpMC: An Energy-aware Management System of Multi-level Memory Architectures

• This complicates the application landscape
• New functionality, new performance possible

• How to best use them in applications

• What data to put where

• How to work across heterogeneous systems

• Proposed hardware management of these systems

• More intelligent than cache mode

• More complex memory controller
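The kind of decision such a controller makes can be sketched with a hypothetical frequency-based placement policy — HpMC's actual policies (and its switching between cache-like and flat modes) are richer and energy-aware; the names and sizes here are invented for illustration:

```python
from collections import Counter

FAST_CAPACITY = 2   # pages that fit in the small fast tier (e.g. MCDRAM)

def place_pages(access_trace, fast_capacity=FAST_CAPACITY):
    """Hypothetical sketch: keep the most frequently accessed pages in
    the fast tier and everything else in the slow tier.  A real
    controller would also weigh migration and static energy costs."""
    counts = Counter(access_trace)
    hot = {p for p, _ in counts.most_common(fast_capacity)}
    return {page: ("fast" if page in hot else "slow") for page in counts}

trace = ["A", "B", "A", "C", "A", "B", "D"]
placement = place_pages(trace)
# A (3 accesses) and B (2) land in the fast tier; C and D go slow
```

Doing this in the memory controller is what hides the complexity from the programmer — and is also why performance now depends on the hardware guessing the access pattern correctly.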

HpMC: An Energy-aware Management System of Multi-level Memory Architectures

• Good things

• Reduces complexity for programmers

• Enables portability

• Reduces energy consumption/improves performance

• Not so good things

• Makes performance behaviour harder to predict and analyse
• Relies on sensible decisions in hardware

Work to do
• Review the paper:

• The Ecological Impact of High-performance Computing in
Astrophysics

• Critically analyse whether the paper is fair and correct, and what you
might change if you were doing similar research

• Choose one of the other papers from the list:
• Swarm Micro hardware

• Epiphany-V

• Emu Chick

• Ten Years of Building Broken Chips

• HpMC: An Energy-aware Management System of Multi-level Memory
Architectures

• MemCache

• Beating Floating Point at its Own Game: Posit Arithmetic

• Read and review