
Microsoft PowerPoint – COMP528 SYNC02 intro.pptx

Dr Michael K Bane, G14, Computer Science, University of Liverpool
m.k. .uk https://cgi.csc.liv.ac.uk/~mkbane/COMP528

COMP528: Multi-core and
Multi-Processor Programming

2 – sync

This session will be
RECORDED…

This allows you to revise later

The session will be mainly me
presenting…

but it is good to have interaction

But if anybody wishes not to be
recorded, then I can pause the recording

Contact

• MS Teams
• channels for:

• general announcements & items of general interest
• lab sessions

• you can also chat directly to me

• Email: m.k. .uk

“Timetable” [updates]
• Intro

• Async lectures via CANVAS (i.e. online)
• loosely based on traditional 30 hours
• released in blocks

• Synchronous ==> now online
• 1 hr for each of weeks 3, 5, 8: to explain the assessed assignments (including feedback)
• Plus 1 hr in other weeks: lab solutions, checking the materials make sense, Q&A

• Labs
• synchronous, online

• mandatory
• start: summary of lectures + Q&A
• main: do the lab work; you are expected to finish the labs (using your own time if necessary)
• tutor: on hand to explain, answer Qs, give example solutions at the end

• but support is generally available at other times
• though not instantaneously (via MS Teams)

• Teaching under covid will be more asynchronous but we will get there!

Previously covered

Who, What, Where, When?

• Me…
• You…
• Course:

• Why => contents

Who am I?
• My first computer
• VIC 20 (https://aydinstone.wordpress.com/tag/commodore-vic-20/)

• My first parallel computer
• INMOS Transputer (http://www.brunel.ac.uk/~eesttti/papers/main.html)

• My first super computer
• Cray T3E

• Jobs
• Supporting HPC
• Modelling chemical weather
• Manager: Research Apps
• Energy Efficient Compute Research

• … and now, at the University of Liverpool
• COMP528: multi-core & multi-processor programming
• COMP328: UG 3rd year “High Performance Computing”
• (proposed: COMPxxx: “Advanced HPC”)
• COMP530: MSc group projects

• Academic Lead for Faculty’s Enhancing Education Group
• Member of

• PGT Course Programme Review
• Research Infrastructure subgroup, “Digital” Research Theme

• Manager, Centre for AI Solutions
• Industrial Placements (2019/20)

YOU…

• Comp Sci u/g
• Yes vs. No (raise hands within the MS Teams video conference)

• Language
• Type in to the chat for this video conference

• Previous experience with C (see poll on MS Teams)
• Any previous HPC experience

• e.g. p-threads, Java threads (raise hands)
• OpenMP? (raise hands)
• MPI? (raise hands)
• CUDA? (raise hands)
• COMP328 (raise hands)

Labs
• You are assigned to one OR other of these sessions.
• Those are the slots to attend

• “remote labs”
• You connect in to a machine on campus
• Me + demonstrator can view that machine’s desktop (so we can suggest steps to help you solve problems)

• The plan is then to split each session into two zones (details to follow)

• Next Week (i.e. 22nd or 23rd):
• ensuring you can log in to the local HPC facility, and edit + compile + run in batch

Motivation

Why do we want “more” or “better”…
• Improving products

• More efficient (i.e. longer range) electric cars
• More accurate weather forecasts (from when it is going to rain in my street, to what the weather will be like if I book a vacation next month; hurricanes’ strength & landfall)

• Urgent computing: modelling covid, computer testing of potential vaccines
• Nuclear arms stockpile: how to manage / monitor others
• Precision (personalised) medicine
• Deep(er) Learning & (better) Artificial Intelligence
• Deeper understanding of the universe and all the processes within

• Modelling ever finer details and joining yet more models together dynamically

Generally, we can call the means to achieve these:
HIGH PERFORMANCE COMPUTING (HPC)


Why?

• Not just faster

• Also
• solving ever bigger problems
• doing things ever more accurately
• bigger & more accurate

• Examples
• streaming data from IoT / social media / etc
• knowing more precisely the weather for more days in advance
• understanding details of combustion

These will all require
• faster processing
• more memory


• “high performance computers”
• “high performance computing”
• (aka “supercomputing”)

• So…

• want faster — how will we do this?

• want bigger — how will we do this?

• and to use them effectively
— how will we do this??!

Course elements

• [whiteboard – why HPC]
• [whiteboard – let’s design HPC]
• topics we will cover, in which order

Start with uni-core processor…

Solve problems quicker by…?

Solve larger problems by…?

Interconnect
• speed
• throughput
• topology
• (cost!)

Accelerators
• GPU
• Xeon Phi
• FPGA

The reasons for multi-core processors, I

• We want applications to execute faster
• Clock speeds no longer increasing exponentially

[Figure: CPU clock speed vs. year (’79 to ’11), log scale from 1 MHz to 10 GHz, showing clock speeds levelling off; picture is from Intel Software College materials]

The reasons for multi-core processors, II
• Clock speeds of CPUs are stabilizing:

• Excessive power consumption (power proportional to freq^3; see the sketch after this slide)
• Heat dissipation
• Overly complicated design
• Memory wall: the increasing gap between processor and memory speeds (cannot feed the CPU quickly enough)
• Strategy:

• Limit CPU speed and sophistication
• Put multiple CPU “cores” on a single chip in a socket

• Potential performance:
CPU freq * #ops per cycle * #cores per CPU * #CPUs per node

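A quick sketch of why the freq^3 point above bites, using the standard dynamic-power rule of thumb (P proportional to C·V²·f, with the supply voltage scaled roughly in step with frequency); this is an approximation for illustration, not a statement about any particular chip:

```latex
P \propto C V^{2} f, \qquad V \propto f
\;\Rightarrow\; P \propto f^{3},
\qquad\text{so } f \to 2f \;\Rightarrow\; P \to 2^{3}P = 8P .
```

Roughly 8x the power (and heat) for 2x the clock speed, which is one reason vendors stopped chasing frequency and added cores instead.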

… and then
• Strategy:

• Limit CPU speed and sophistication
• Put multiple CPUs (“cores”) on a single chip in a socket
• Several (2 or 4) sockets on a node (cf motherboard)
• Connect 10s or 100s or 1000s of nodes…

• Potential performance:
CPU freq * #ops per cycle * #cores per CPU * #CPUs per node * #nodes
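As a worked example of that formula, here is a minimal C sketch; every hardware number in it (clock, ops per cycle, core/socket/node counts) is an illustrative assumption, not a description of the university's machines:

```c
/* Sketch: theoretical peak performance from the formula on the slide.
 * All hardware numbers below are illustrative assumptions. */
#include <stdio.h>

int main(void) {
    double freq_ghz      = 2.5;   /* CPU clock frequency in GHz (assumed) */
    double ops_per_cycle = 16.0;  /* e.g. wide vector FMA units (assumed) */
    double cores_per_cpu = 20.0;  /* cores per CPU / socket (assumed) */
    double cpus_per_node = 2.0;   /* sockets per node (assumed) */
    double nodes         = 100.0; /* nodes in the cluster (assumed) */

    /* freq * ops/cycle * cores/CPU * CPUs/node * nodes, in GFLOP/s */
    double peak_gflops = freq_ghz * ops_per_cycle * cores_per_cpu
                       * cpus_per_node * nodes;

    printf("Theoretical peak: %.1f GFLOP/s (%.2f TFLOP/s)\n",
           peak_gflops, peak_gflops / 1000.0);
    return 0;
}
```

With these assumed numbers it reports 160000.0 GFLOP/s (160 TFLOP/s); sustained application performance is typically well below such theoretical peaks.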

Methodology

• Study problem, sequential program, or code segment
• Look for opportunities for parallelism

• Usually best to start by thinking abstractly about the problem to be solved,
not by any current program implementation of a given solution

• Try to keep all processors busy doing useful work
• Processors (cores) could be either placed locally (multicore processors), or connected by local/global networks => huge variety of approaches/methods

(after Intel Software College)

How to Program a Parallel Computer?

Sequential approach?
• Problem has inherent parallelism
• Programming language cannot express parallelism
• Compiler and/or hardware must find hidden parallelism
• Does not work well in general

Possible parallel approach
• Problem has inherent parallelism
• Programmer has a way to express parallelism
• Compiler translates program for multiple cores or multiple processors (etc)
• We will find out how well this works!

Low level vs High level parallelism

• Low-level constructions
• very flexible, but require much more care over details

• High-level constructions
• compiler takes care of many details, still reasonably flexible

• (a contrasting sketch of the two follows after this slide)

• You will practically explore both options
• University HPC (& maybe others…)
• CPU & GPU
• C (plus language extensions) & some CUDA
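As a contrasting sketch (illustrative only, not code from the course labs): the same array sum written first with low-level POSIX threads, where the programmer manages thread creation, work splitting and combining of partial results, and then with a high-level OpenMP directive, where the compiler and runtime handle those details:

```c
/* Illustrative sketch: the same parallel sum at two levels of abstraction. */
#include <stdio.h>
#include <pthread.h>

#define N 1000000
#define NTHREADS 4

static double a[N];
static double partial[NTHREADS];

/* Low level: each thread sums its own slice; the programmer chooses
 * the slice boundaries and stores a per-thread partial result. */
static void *worker(void *arg) {
    long id = (long) arg;                 /* thread id passed via the pointer value */
    long lo = id * (N / NTHREADS);
    long hi = (id == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++) s += a[i];
    partial[id] = s;
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++) a[i] = 1.0;

    /* --- low-level version: pthreads --- */
    pthread_t t[NTHREADS];
    for (long id = 0; id < NTHREADS; id++)
        pthread_create(&t[id], NULL, worker, (void *) id);
    double sum_low = 0.0;
    for (long id = 0; id < NTHREADS; id++) {
        pthread_join(t[id], NULL);
        sum_low += partial[id];           /* programmer combines the partials */
    }

    /* --- high-level version: OpenMP ---
     * One directive: the compiler/runtime create the threads, share out
     * the iterations and perform the reduction. */
    double sum_high = 0.0;
    #pragma omp parallel for reduction(+:sum_high)
    for (long i = 0; i < N; i++) sum_high += a[i];

    printf("pthreads sum = %.0f, OpenMP sum = %.0f\n", sum_low, sum_high);
    return 0;
}
```

Compile with something like `gcc -fopenmp -pthread sum_compare.c` (the file name is just for illustration).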

How Big?

• We will cover “Top500” next week

• But check out
#exascaleday

• Exascale = 10 to the power of 18 (10^18 floating-point operations per second)

• Exascale Day => Oct 18 (10-18 in American format)
• #exascaleday => lots of examples from ECP
• Liverpool link…
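To get a feel for that number (assuming, purely for illustration, a laptop sustaining about 100 GFLOP/s = 10^11 FLOP/s):

```latex
\frac{10^{18}\ \text{FLOP/s (exascale)}}{10^{11}\ \text{FLOP/s (laptop)}} = 10^{7}
\quad\Rightarrow\quad
1\ \text{second of exascale work} \approx 10^{7}\ \text{s} \approx 116\ \text{days on the laptop.}
```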

• want faster
• want bigger
• and to use them effectively

MODULE AIMS
• To provide students with a deep, critical and systematic understanding of key issues and effective solutions for parallel programming for systems with multi-core processors and parallel architectures

• To develop students’ appreciation of a variety of approaches to parallel programming, including using MPI and OpenMP

• To develop the students’ skills in parallel programming, in particular using MPI and OpenMP

• To develop the students’ skills in parallelization of ideas, algorithms and of existing serial code.

** what elements to consider
• hw: cpu, mem (NUMA), …
• sw: language extensions, new languages
• misc: compilers, DSLs, …

** options to get to HPC
• cpu: x86 / Arm / POWER, gpu, Xeon Phi, FPGA, custom ASIC
• mem: shared, distributed, (v-shared)
• lang: OpenMP, MPI, Java threads, CUDA, OpenCL
• data parallelism, task parallelism (see the sketch below)
• instruction level parallelism
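A small sketch of the difference between data parallelism and task parallelism, using OpenMP as the vehicle; the arrays and helper functions here are made up purely for illustration:

```c
/* Illustrative sketch: data parallelism vs task parallelism with OpenMP. */
#include <stdio.h>

#define N 8

/* Two different (made-up) operations, used for the task-parallel part. */
static void scale(double *x, int n)  { for (int i = 0; i < n; i++) x[i] *= 2.0; }
static void offset(double *x, int n) { for (int i = 0; i < n; i++) x[i] += 1.0; }

int main(void) {
    double a[N], b[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = i; }

    /* Data parallelism: the SAME operation applied to different elements
     * of one data set, with the iterations shared across threads. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = a[i] * a[i];

    /* Task parallelism: DIFFERENT operations running concurrently,
     * one per section (they touch different arrays, so no race). */
    #pragma omp parallel sections
    {
        #pragma omp section
        scale(b, N);
        #pragma omp section
        offset(a, N);
    }

    printf("a[3] = %g, b[3] = %g\n", a[3], b[3]);
    return 0;
}
```

Without -fopenmp the pragmas are simply ignored and the code runs serially, which is part of the appeal of the directive-based, high-level approach.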

• Improving a single chip
• Faster clock
• More ops/cycle

• … and improving memory
• Faster, bigger, better

• ONLY GOES SO FAR

• Using >>1 gets us a LOT further
• multi-core and multi-processor
• => parallel computing

Aims

• To provide students with a deep, critical and systematic understanding of key issues and effective solutions for parallel programming for systems with multi-core processors and parallel architectures

• To develop students’ appreciation of a variety of approaches to parallel programming, including using MPI and OpenMP

• To develop the students’ skills in parallel programming, in particular using MPI and OpenMP

• To develop the students’ skills in parallelization of ideas, algorithms and of existing serial code.

How it fits together…

• Lectures
• theory
• programming languages / ideas

• MPI
• OpenMP

• accelerators

• Labs
• hands-on exploration of topics
• support for assessments

• Assessments: (see previous lecture)

attendance is recorded (?)

Background Reading
• My web pages

https://cgi.csc.liv.ac.uk/~mkbane/facilities/index.html

• Reading List @ UoL Library

• MOOC from PRACE https://www.futurelearn.com/courses/supercomputing
• can view for free, but limited time
• complements COMP528

• Learning “C”
• just another programming language

• imperative, with options for OO
• I. Horton, “Beginning C”, Berkeley, CA: Apress (PDF via UoL Library)
• COMP327 (Dr. Terry Payne)

Today

• who I am
• who you are

• course structure
• motivation to HPC

• multi-core & multi-processor

Next

• Async lectures
• Terminology: cores, CPUs, processors; threads, processes
• How “high performing” is a supercomputer?
• ways of exploiting parallelism
• barriers
• … and what these mean

• labs:
• getting set up on the systems
• expect some teething problems

ANY QUESTIONS
Dr Michael K Bane, G14, Computer Science, University of Liverpool
m.k. .uk https://cgi.csc.liv.ac.uk/~mkbane/COMP528