Caches
HARDWARE/RESEARCH
PAPER REVIEW
Adrian Jackson
a. .ac.uk
• A Scalable Architecture for Ordered Parallelism – MIT
• Epiphany-V: A 1024 processor 64-bit RISC System-
On-Chip – Adapteva
• Ten Years of Building Broken Chips: The Physics and
Engineering of Inexact Computing – Rice University
• HpMC: An Energy-aware Management System of
Multi-level Memory Architectures – LLNL, etc.
Questions…
• What problem is the proposed hardware/software trying to
address?
• What applications would the hardware be good for?
• What didn’t you understand?
• What did you disagree with?
• Does it achieve what it sets out to do?
A Scalable Architecture for Ordered Parallelism
• SWARM
• Ordered irregular parallelism
• Supporting task based parallelism
• Not traditional regular computational simulation
• How to parallelise irregular task dependencies and generated tasks
• How to support small tasks
• Reduce the overheads of running tasks to make small task
parallelisation beneficial
• Pushing instruction level parallelism up the stack
• Considering a lot of the same issues seen when scheduling instruction
pipelines
• Moving from instructions to tasks (groups of instructions)
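As a rough illustration of what "ordered" task parallelism means, the sketch below runs timestamped tasks strictly in timestamp order, and each task may generate further tasks. It is a serial reference model only, not the SWARM hardware or its programming interface; `run_ordered`, `shortest_paths` and the example graph are hypothetical names chosen for this sketch. The shortest-path example is used because its tasks are small, irregular and created at run time.

```python
import heapq
from itertools import count


def run_ordered(initial_tasks):
    """Run (timestamp, fn, arg) tasks in timestamp order.

    Each task function returns a list of child tasks in the same
    (timestamp, fn, arg) form. Children may not be earlier than the parent.
    """
    tie = count()  # tie-breaker so heapq never has to compare functions
    queue = [(ts, next(tie), fn, arg) for ts, fn, arg in initial_tasks]
    heapq.heapify(queue)
    trace = []
    while queue:
        ts, _, fn, arg = heapq.heappop(queue)
        trace.append((ts, arg))
        for child_ts, child_fn, child_arg in fn(ts, arg):
            if child_ts < ts:
                raise ValueError("child task scheduled before its parent")
            heapq.heappush(queue, (child_ts, next(tie), child_fn, child_arg))
    return trace


# Hypothetical example graph: node -> {neighbour: edge weight}
GRAPH = {"a": {"b": 1, "c": 4}, "b": {"c": 2, "d": 6}, "c": {"d": 3}, "d": {}}


def shortest_paths(graph, source):
    """Dijkstra-style search written as ordered tasks (timestamp = distance)."""
    dist = {}

    def visit(ts, node):
        if node in dist:          # an earlier task already settled this node
            return []
        dist[node] = ts
        return [(ts + w, visit, nbr) for nbr, w in graph[node].items()]

    run_ordered([(0, visit, source)])
    return dist


if __name__ == "__main__":
    print(shortest_paths(GRAPH, "a"))
```

Each `visit` task is only a few instructions long, which is why the per-task overhead matters so much for this style of parallelism.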
A Scalable Architecture for Ordered Parallelism
• Two levels of innovation
• Hardware support for task management and communication
• New algorithms for task scheduling
• Hard to separate the performance impact of these
• Performance analysis is also complicated
• Hardware simulator
• Speedup is calculated using number of cycles executed
• Benchmarks are ported to new software model
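The slide notes that speedup is calculated from the number of cycles executed in a simulator. A minimal sketch of that calculation follows, together with the conversion to wall-clock time that would be needed to compare against a real chip. The function names and all numbers are assumptions for illustration, not figures from the paper.

```python
def speedup(baseline_cycles, parallel_cycles):
    """Cycle-count speedup: baseline cycles divided by parallel cycles."""
    return baseline_cycles / parallel_cycles


def parallel_efficiency(baseline_cycles, parallel_cycles, cores):
    """Speedup per core; 1.0 would be perfect scaling."""
    return speedup(baseline_cycles, parallel_cycles) / cores


def wall_time_seconds(cycles, clock_hz):
    """Convert a cycle count to time for a given clock frequency."""
    return cycles / clock_hz


if __name__ == "__main__":
    # Made-up cycle counts for a 1-core and a 64-core simulated run.
    base, par = 1_000_000, 25_000
    print(speedup(base, par))                  # cycle-count speedup
    print(parallel_efficiency(base, par, 64))  # fraction of ideal scaling
    # Cycle counts only compare fairly when both designs run at the same clock.
    print(wall_time_seconds(par, 2e9))
```

A cycle-count speedup says nothing about two chips with different clock rates or different instruction counts, which is why converting to time matters for a real comparison.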
A Scalable Architecture for Ordered Parallelism
• Nice features
• Sensitivity analysis
• Estimating energy costs of conflict detection etc…
• Not so nice
• Requires new programming model
• Speed-up is measured in simulated cycles, not by an actual comparison
with the raw performance of a software-only implementation on another
chip
• Would this processor actually perform better than a standard
multicore where someone does software task dispatching?
Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip
• Energy efficient computing
• 75 GFlop/s per watt
• Targeting low power applications
• Autonomous cars, cryptography, etc…
• Simplify processor, remove costly hardware
• System on chip (SOC) approach
Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip
• Distributed shared memory processor
• RISC design, reduce manufacturing and design complexity
• Cacheless and distributed memory design
• Each chip has 64 MB of memory distributed across the 1024 cores (64 KB per core)
• Every core can access the memory in any other core
• Network integrated into the cores
• Each core has its own network router
• Network for messages and memory operations
• Scalable by design
• Network capacity to scale up to 1 million chips (1 billion cores)
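The figures on this slide can be checked with a little arithmetic, and the idea that every core can access memory in any other core can be pictured as one flat address space carved into per-core slices. The sketch below assumes a simple "core index times slice size plus offset" layout purely for illustration; it is not the real Epiphany address map, and all names are hypothetical.

```python
CHIP_MEMORY_BYTES = 64 * 1024 * 1024   # 64 MB per chip (from the slide)
CORES_PER_CHIP = 1024                  # from the slide
MAX_CHIPS = 1_000_000                  # "1 million chips" (from the slide)


def memory_per_core(chip_bytes=CHIP_MEMORY_BYTES, cores=CORES_PER_CHIP):
    """Bytes of local memory each core owns."""
    return chip_bytes // cores


def total_cores(chips=MAX_CHIPS, cores_per_chip=CORES_PER_CHIP):
    """Core count of the largest system; roughly the slide's 1 billion."""
    return chips * cores_per_chip


def owner_of(address, slice_bytes=None):
    """Map a flat address to (core index, offset within that core's memory).

    ASSUMPTION: a contiguous slice per core, used here only to show the idea
    of a distributed shared address space.
    """
    if slice_bytes is None:
        slice_bytes = memory_per_core()
    return divmod(address, slice_bytes)


if __name__ == "__main__":
    print(memory_per_core())      # bytes per core
    print(total_cores())          # cores at full scale
    print(owner_of(3 * 65536 + 10))
```

With only 64 KB per core, almost any realistic data set spans many cores, so most accesses are remote unless the programmer places data carefully.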
Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip
• Saving energy and design costs through RISC
• Saving energy through different memory
technologies/provision
• No cache
• No memory coherency on remote writes
• Saving energy through simple networks
• Separating out network functionality
• Making routing simple and distributed
Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip
• Nice features
• Inherently scalable design
• Low energy solutions
• Floating point focus (64-bit)
• Distributed memory potentially provides very high bandwidth and low
latency
• Not so nice
• Pushes all the problems on to the programmer
• Managing memory coherency is explicit
• Managing memory locality is explicit
• Very small available memory
• Everything has to be very parallel to use the memory effectively
• Memory/communication costs scale with system
Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing
• Addressing energy costs by allowing lower standards of
correctness
• Performance vs accuracy trade-offs
Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing
• Aiming for applications that are naturally probabilistic
• Monte Carlo style simulations
• Machine learning
• Vision, graphics, audio, etc…
• Overclocking combined with voltage overscaling (reducing the supply
voltage) lowers the probability of correct results but improves
performance-to-energy ratios
• Removing parts of the circuit that only affect low-order precision,
or that will rarely be used
• Designing inaccuracy into the circuit
• Discarding data that will have little impact on the overall result
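One way to get a feel for "designing in" inaccuracy is to model it in software. The sketch below imitates an adder whose lowest bits have been removed by ignoring those bits in both operands, then measures the average relative error over random inputs. This is an illustrative model only, not one of the circuits from the paper; `inexact_add` and `mean_relative_error` are hypothetical names.

```python
import random


def inexact_add(a, b, dropped_bits):
    """Add two non-negative ints, ignoring the lowest `dropped_bits` bits
    of each operand (a software stand-in for a pruned adder circuit)."""
    mask = ~((1 << dropped_bits) - 1)
    return (a & mask) + (b & mask)


def mean_relative_error(dropped_bits, trials=10_000, seed=0):
    """Average |exact - inexact| / exact over random 16-bit operands."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = rng.randrange(1, 1 << 16)
        b = rng.randrange(1, 1 << 16)
        exact = a + b
        total += abs(exact - inexact_add(a, b, dropped_bits)) / exact
    return total / trials


if __name__ == "__main__":
    for bits in (0, 2, 4, 8):
        print(bits, mean_relative_error(bits))
```

The error grows as more bits are dropped, so the question for an application is how many bits it can lose before the overall result changes in a way that matters.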
Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing
• Good things
• Much reduced energy requirements for processing
• Interesting exploration of allowable error in applications
• Often people don’t know what errors they can accept
• Not so good things
• Application/area specific designs
• Can’t replace whole chips
• Programmer needs to understand the reduced guarantees the
processor is providing
• New development infrastructures would be required
HpMC: An Energy-aware Management System of Multi-level Memory Architectures
• Software response to new memory systems that are
coming out
• DRAM, NVRAM, MCDRAM, etc.
• Partly a response to high static energy cost of memory
• How do applications efficiently use these new memory models?
• Vendors offering two different approaches
HpMC: An Energy-aware Management System of Multi-level Memory Architectures
• This complicates application landscape
• New functionality, new performance possible
• How to best use in applications
• What data to put where
• How to work across heterogeneous systems
• Proposed hardware management of these systems
• More intelligent than cache mode
• More complex memory controller
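The "what data to put where" question can be made concrete with a toy placement policy: given access counts per page and a limited amount of fast memory, put the most frequently accessed pages in the fast tier and everything else in the slow tier. This is a hypothetical policy written for illustration, not the algorithm HpMC uses; the function name, tier labels and access counts are all assumptions.

```python
def place_pages(access_counts, fast_capacity):
    """Assign each page to the "fast" or "slow" tier.

    access_counts: dict mapping page id -> number of accesses observed
    fast_capacity: how many pages fit in the fast tier
    """
    # Hottest pages first; ties are broken by page id so the result is stable.
    ranked = sorted(access_counts, key=lambda p: (-access_counts[p], p))
    fast = set(ranked[:fast_capacity])
    return {page: ("fast" if page in fast else "slow") for page in access_counts}


if __name__ == "__main__":
    # Made-up access counts for four pages and room for two in fast memory.
    counts = {"A": 100, "B": 5, "C": 50, "D": 1}
    print(place_pages(counts, fast_capacity=2))
```

A real manager would also have to weigh the cost of moving pages between tiers and the static energy of each memory type, which this sketch ignores.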
HpMC: An Energy-aware Management System of Multi-level Memory Architectures
• Good things
• Reduces complexity for programmers
• Enables portability
• Reduces energy consumption/improves performance
• Not so good things
• Makes performance more complex to understand and predict
• Relies on sensible decisions in hardware
Work to do
• Review the paper:
• The Ecological Impact of High-performance Computing in
Astrophysics
• Critically analyse whether the paper is fair and correct, and what you
might change if you were doing similar research
• Choose one of the other papers from the list:
• Swarm Micro hardware
• Epiphany-V
• Emu Chick
• Ten Years of Building Broken Chips
• HpMC: An Energy-aware Management System of Multi-level Memory
Architectures
• MemCache
• Beating Floating Point at its Own Game: Posit Arithmetic
• Read and review