
Dr Massoud Zolgharni
mzolgharni@lincoln.ac.uk

Room SLB1004, SLB

Dr Grzegorz Cielniak
gcielniak@lincoln.ac.uk

Room INB2221, INB

Dr Grzegorz Cielniak (module coordinator)
• lecturer, researcher in autonomous systems (robotics, machine vision)
• weeks 1-6

Dr Massoud Zolgharni
• lecturer, expertise in medical imaging
• weeks 7-13

Demonstrators
• Jacob Carse, Cheng Hu, Jacobus Lock

Lecture
• Theoretical aspects of parallel computing
• Friday, 10:00-11:00, AAD0W25

Workshops
• Tutorials & Assignment
• Targeted specifically at GPGPU & OpenCL
• 2 hours/week

• Group A: Monday, 9:00-11:00, MC3204

• Group B: Monday, 11:00-13:00, MC3204

• Group C: Friday, 15:00-17:00, MC3204

Assessment Item 1
• Coursework – programming assignment, 30%

• Released in week 6

• Code only submission (no report)

• In-class demonstration (weeks B11-B13)

Assessment Item 2
• Exam, 70%
• Paper-based, on theory (mock paper provided after Easter)

Check Blackboard for all hand-in dates!

Week  W/C    Lecture                           Workshop
1     23/01  Introduction                      –
2     30/01  Architectures                     Tutorial-1
3     06/02  Patterns 1
4     13/02  Patterns 2                        Tutorial-2
5     20/02  Patterns 3
6     27/02  Patterns 4                        Tutorial-3
7     06/03  Communication & Synchronisation
8     13/03  Algorithms 1                      Assessment support
9     20/03  Algorithms 2
10    27/03  Algorithms 3
11    03/04  Performance & Optimisation        Tutorial-4 & Demonstration
12    24/04  Parallel Libraries
13    01/05  –

Essential:
• Structured Parallel Programming: Patterns for Efficient Computation, McCool et al. (e-book)
• An Introduction to Parallel Programming, P. Pacheco

Recommended:
• Heterogeneous Computing with OpenCL: Revised OpenCL 1.2 Edition, B. Gaster (e-book)
• OpenCL in Action: How to Accelerate Graphics and Computation, M. Scarpino

Why parallelism? Motivation

Applications

Overview of module

Why do we parallelise at all?
• to solve bigger problems (more complex, more accurate, more realistic)
• to solve more problems (faster)
• to reduce the power consumption related to the computation (cheaper)

Parallelism = Optimisation!

Analogy: teamwork

Parallel machines already popular in the 70’s
• long development time

• expensive

Moore’s Law driving optimisation through increasing speed of serial processors
• much easier/cheaper to simply wait a few years for technology to catch up rather than invest in complex/expensive architectures

ILLIAC IV

Moore’s Law: the number of transistors on a chip doubles approximately every two years

Transistor clock speed (clock frequency, clock rate)
• how fast a transistor switches between on and off → speed of operation

Smaller transistor →
• shorter critical path, quicker charging/discharging of the capacitor → faster clock
• more transistors on the same chip

Transistor count (number of transistors on a chip)
• deeper instruction pipelining, superscalar processors
• more operations per time period
• more complicated instructions

More transistors + increase in frequency → decrease in runtime

Power consumption: P ≈ C · V² · F
• C – capacitance
• V – voltage
• F – frequency

The faster we switch transistors on and off, the more heat will be generated.
• In reality, increasing F requires increasing V for a given transistor size, and hence power grows much faster than linearly with clock rate.
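A rough back-of-the-envelope illustration (assuming the supply voltage has to scale roughly in proportion to the target frequency):

    P ≈ C · V² · F,  with V ∝ F  ⇒  P ∝ F³

so doubling the clock rate costs roughly 8× the dynamic power.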

Transistor count: still rising
Clock rate: flattening sharply

Frequency scaling: the dominant reason for improvements in computer performance until 2004

“Free lunch” of automatically faster serial applications through faster microprocessors has ended

Industry-wide shift to parallel computing in the form of multi-core processors

All computers nowadays are Parallel Computers!

Serial processing: increase clock rate ≃ simply make one person work faster

Parallel processing: multiple processing cores ≃ a workforce group (organising, strategy, communication)

Applications: Autonomous Cars, Augmented Reality, Video Games, Weather Forecasting

Source: nvidia.com

Different solutions and programmer support
• pipelines, vector instructions in the CPU
  • limited access by the programmer, built-in or through the compiler
• multi-core CPUs and multi-processors
  • OS level, multithreading libraries (e.g. Boost.Thread) – see the sketch below
• dedicated parallel processing units (e.g. GPU)
  • libraries with different levels of granularity (e.g. OpenCL, Boost.Compute)
• distributed systems
  • distributed parallel libraries (e.g. MPI)
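To make the "OS level, multithreading" option above concrete, here is a minimal sketch (an assumed example, not the module's own material) that adds two vectors using std::thread; Boost.Thread, mentioned above, offers a very similar interface:

    #include <algorithm>   // std::min, std::max
    #include <thread>
    #include <vector>

    // Split C = A + B across the available hardware threads.
    void parallel_add(const std::vector<float>& A,
                      const std::vector<float>& B,
                      std::vector<float>& C) {
        unsigned num_threads = std::max(1u, std::thread::hardware_concurrency());
        std::size_t chunk = (A.size() + num_threads - 1) / num_threads;

        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            std::size_t begin = t * chunk;
            std::size_t end   = std::min(A.size(), begin + chunk);
            workers.emplace_back([&, begin, end] {
                // elements are independent, so no synchronisation is needed here
                for (std::size_t i = begin; i < end; ++i)
                    C[i] = A[i] + B[i];
            });
        }
        for (auto& w : workers) w.join();   // wait for the whole "workforce"
    }

The workshops take the GPU/OpenCL route instead, where the same element-wise pattern is expressed as a kernel.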

Serial programming is straightforward
• and we know how to teach this subject

Parallel programming
• less intuitive than serial programming due to non-trivial communication & synchronisation
• many different hardware solutions and software libraries
• relatively recent adoption by the programming community

In this module
• focus on general programming patterns and algorithms
• practical programming aspects using a popular/open specification (OpenCL)

Task: C = A + B

A, B, C – vectors, N – number of elements

serial:

    for (int i = 0; i < N; i++)
        C[i] = A[i] + B[i];

parallel:

    par_for (int i = 0; i < N; i++)
        C[i] = A[i] + B[i];

No dependencies, so each element can be processed separately.

serial:

    for (int i = 0; i < N; i++)
        b = b + A[i];

parallel? Each iteration depends on the previous value of b, so this loop cannot be split up in the same way.

OpenCL
• Standard C++
  • no special compiler/extensions
• Library-based solution
  • no special build-system
• Vendor-neutral
  • open standard managed by the Khronos Group
• Multi-platform, portable performance
• Heterogeneous computing (CPU/GPU/FPGA)
(a minimal kernel sketch for the C = A + B task is given at the end of these notes)

Module schedule: see the week-by-week table earlier in these notes.

Parallelism is the future of computing
• Multi-core and many-core era is here to stay due to technology trends
• It is a rare skill and is likely to be in demand in your future jobs
• Understanding parallel techniques makes you a better programmer

Reading:
• Structured Parallel Programming: Patterns for Efficient Computation – Sections 1.1-1.3
• An Introduction to Parallel Programming – Chapter 1
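As a taster for the workshops, the C = A + B task above maps almost directly onto an OpenCL kernel: one work-item per element, which works precisely because the elements are independent. This is an illustrative sketch only (assumed names, not the module's tutorial code); the host-side setup (platform, context, queue, buffers) is omitted and the kernel source is simply held in a C++ raw string:

    // Minimal OpenCL C kernel for C = A + B, embedded as a C++ string.
    const char* vector_add_source = R"CLC(
    __kernel void vector_add(__global const float* A,
                             __global const float* B,
                             __global float* C) {
        int i = get_global_id(0);   // each work-item handles one element
        C[i] = A[i] + B[i];         // no dependency on other elements
    }
    )CLC";

In practice this source is built into a program, the buffers A, B and C are copied to the device, and the kernel is launched with N work-items – the kind of setup the workshop tutorials cover.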