CS402/922 High Performance Computing

OpenMP: An Implementation of Thread Level Parallelism
aka “Real-world Multithreading!!”
https://warwick.ac.uk/fac/sci/dcs/teaching/material/cs402/
18/01/2022



“Previously, on the HPC module…”
• Thread → a small section of code that is split into multiple copies within a single process
• Threads often share code, global information and other resources
• Multiprocessing vs. Multithreading
• Allocate 1 thread (or more, if hyperthreading) to each processor core

OpenMP
Parallelism made easy!
• OpenMP (shortened to OMP) is a pragma-based multithreading library
• The compiler manages the threads
• Programmers write specialised comments (pragma directives)
• Supports FORTRAN, C and C++
• Version 1.0 came out for FORTRAN in 1997, with C/C++ following the next year
• Version 3.0 (most widely used) came out in 2008


Fork-Join Model
Not to be confused with fork and spoons…
• Different ways threads can be managed
• Thread Pool → a collection of persistent threads that work can be allocated to
• Fork-Join → threads are created (forked) and destroyed (joined) when required
• OMP uses the fork-join model (see the sketch below)

[Diagram: a master thread forks into a parallel region, joins, then forks into a second parallel region]
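To make the fork-join picture concrete, here is a minimal sketch of my own (not code from the slides): the master thread runs serially, forks a team of threads for each parallel region, and joins back to serial execution in between.

  #include <stdio.h>
  #include <omp.h>

  int main(void) {
      printf("Serial: master thread only\n");

      #pragma omp parallel            /* fork: a team of threads is created */
      {
          printf("Parallel region 1, thread %d\n", omp_get_thread_num());
      }                               /* join: threads synchronise, master continues alone */

      printf("Serial again: back to the master thread\n");

      #pragma omp parallel            /* fork again for the second parallel region */
      {
          printf("Parallel region 2, thread %d\n", omp_get_thread_num());
      }                               /* join */

      return 0;
  }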

Building programs with OpenMP
• OMP has been built into many compilers
• Careful! → different compilers require different flags to enable OpenMP (see the build sketch below)
• Need to include the OpenMP header file (omp.h)

• GCC
  • OpenMP flag: -fopenmp
  • GCC 6 onwards supports OMP 4.5
  • GCC 9 has initial support for OMP 5
• Clang (LLVM)
  • OpenMP flag: -fopenmp
  • Fully supports OMP 4.5
  • Working on OMP 5.0 and 5.1
• Clang (Apple)
  • OpenMP flags: -Xpreprocessor -fopenmp -lomp
  • See Clang (LLVM); needs libomp to be installed separately (e.g. through Homebrew)
• Intel
  • OpenMP flag: -qopenmp
  • Intel 17.0 onwards supports OMP 4.5
  • Intel oneAPI supports part of OMP 5.1
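As a rough, illustrative sketch of my own (not from the slides): include omp.h, guard OpenMP-specific code with the standard _OPENMP macro, and build with the flag for your compiler from the table above, e.g. gcc -fopenmp build_check.c with GCC (the file name is made up for illustration). Without the flag, the pragmas are simply ignored and the program stays serial.

  #include <stdio.h>
  #ifdef _OPENMP                      /* defined by the compiler when OpenMP is enabled */
  #include <omp.h>
  #endif

  int main(void) {
  #ifdef _OPENMP
      printf("OpenMP enabled (_OPENMP = %d), up to %d threads\n",
             _OPENMP, omp_get_max_threads());
  #else
      printf("OpenMP not enabled; running serially\n");
  #endif
      return 0;
  }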

Parallelising Loops
Finally, let's parallelise!
• OMP is most often utilised through pragma directives
• #pragma omp parallel
  • Creates OMP threads and executes the following region in parallel

  int driver1(int N, int* a, int* b, int* c) {
      #pragma omp parallel
      {
          kernel1(N, a, b, c);
      }
  }

• #pragma omp parallel for
  • Specifies a for loop to be run in parallel over all OMP threads

  int kernel2(int N, int* a, int* b, int* c) {
      int i;
      #pragma omp parallel for
      for (i = 0; i < N; i++) {
          c[i] = a[i] + b[i];
      }
  }

• Other pragma commands:
  • #pragma omp parallel do
    • The Fortran equivalent of parallel for, used with Fortran do loops
  • #pragma omp parallel loop
    • Asserts that the loop iterations can be run concurrently, in any order
  • #pragma omp simd
    • Indicates a loop can be transformed into a SIMD loop (see the sketch below)
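For the simd directive listed above, a small sketch of my own (the scale function and its arguments are made up for illustration): the directive tells the compiler the loop is safe to vectorise, so several iterations can be processed per SIMD instruction.

  #include <stddef.h>

  /* Scale an array in place; the loop body has no dependencies between
     iterations, so it is a good candidate for SIMD execution. */
  void scale(size_t n, float factor, float* x) {
      #pragma omp simd
      for (size_t i = 0; i < n; i++) {
          x[i] = x[i] * factor;
      }
  }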
Private variables

• Specifies a list of variables that are local to each thread
• The variables can be set and reset in different ways:
  • private → the variable is not given an initial value
  • firstprivate → each thread's copy is initialised to the value of the original variable
  • lastprivate → the value from the last iteration is copied back to the variable in the primary thread

[Diagram: example values of a in each thread under private, firstprivate and lastprivate]

Scheduling
So who's taking what thing again?

• We can specify how the work is split up between threads
• The most commonly used schedules are:
  • static → workload is split evenly between threads before compute
  • dynamic → workload is split into equally sized chunks; threads request chunks when required
  • guided → same as dynamic, but successive chunks get smaller
• Great for load balancing and/or reducing overhead

Syncing OpenMP Threads
Even threads need to coordinate sometimes!

• Synchronisation is sometimes required as well
• #pragma omp critical
  • Runs the following block on one thread at a time
• #pragma omp atomic
  • Ensures a memory location is accessed without conflict
• Difference between these operations:
  • atomic has a lower overhead
  • critical allows for multi-line statements

  int factorialCritical(int N) {
      int i;
      int resultShared = 0;
      #pragma omp parallel
      {
          int resultLocal = 0;
          #pragma omp for
          for (i = 0; i < N; i++) {
              resultLocal += i;
          }
          #pragma omp critical
          resultShared += resultLocal;
      }
      return resultShared;
  }

  int factorialAtomic(int N) {
      int i;
      int resultShared = 0;
      #pragma omp parallel for
      for (i = 0; i < N; i++) {
          #pragma omp atomic
          resultShared += i;
      }
      return resultShared;
  }

Reductions
Making dependencies easier one step at a time!

• Allows for the same operation to be applied to the same variable over multiple threads
• Often faster than atomic or critical
• Limited to a certain number of operations:
  • Identifier functions/expressions: min, max
  • Operators: +, -, *, &, |, ^, &&, ||

  int factorialReduction(int N) {
      int i;
      int resultShared = 0;
      #pragma omp parallel for reduction(+:resultShared)
      for (i = 0; i < N; i++) {
          resultShared += i;
      }
      return resultShared;
  }

OpenMP functions
What's a library without functions!

• Some aspects of the OMP environment can be set or retrieved within the program
• Key examples include:
  • omp_get_num_threads() → gets the number of threads in the current parallel region
  • omp_get_thread_num() → gets the ID of the calling thread
  • omp_set_num_threads(int) → sets the number of threads that can be utilised
  • omp_get_wtime() → gets the wall clock time (thread safe)

Environment Variables
Why recompile when we can alter the environment‽

• Allows us to change key elements without changes to the code
• Often used examples:
  • OMP_NUM_THREADS → the number of threads to be utilised in the program
  • OMP_SCHEDULE → the ordering with which the threads should iterate through a loop
  • OMP_PROC_BIND → controls if and how threads can move between cores

What's next for OpenMP?
Onwards and upwards!

• OMP 4.5, 5.0 and 5.1 add:
  • Target offload → specify where the compute should occur (CPU/GPU/accelerator etc.)
  • Memory management → specify where the data should be stored and how
• A new version (5.2) is out soon

Interesting related reads
Some of this might even be fun...
• International Workshop on OpenMP (IWOMP) → https://www.iwomp.org/
• OpenMP Quick Reference Guide (Version 5.2) → https://www.openmp.org/wp-content/uploads/OpenMPRefCard-5-2-web.pdf
• OpenMP Examples (Version 5.1) → https://www.openmp.org/wp-content/uploads/openmp-examples-5.1.pdf

Next lecture: Intro to Coursework 1