Comprehensive OpenMP
Shuaiwen SOFT 3410 Week 10
Basics
• OpenMP is readily available in the GCC compiler; to use it, we just need to add the right #pragma omp directives and compile and link with the command line switch -fopenmp.
• How do you normally check how many CPU cores you have (in other words, how many threads you want to launch)?
lscpu
cat /proc/cpuinfo
You can always call omp_get_num_procs() to get the number of CPU cores available to the program (these are normally logical cores, so hyper-threading can make the count larger than the number of physical cores).
omp_get_max_threads() usually returns the same number.
The number of threads is automatically chosen by OpenMP; it is typically equal to the number of hardware threads that the CPU supports, which in the case of a low-end CPU is usually the number of CPU cores.
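The points above can be checked with a small sketch. The three helper names below (available_processors, default_team_size, count_region_threads) are hypothetical and chosen only for illustration; the omp_* calls are the standard OpenMP runtime functions. The _OPENMP guards are an assumption about how you might want the file to build: they let the same source compile with or without -fopenmp.

```c
/* _OPENMP is defined by the compiler only when OpenMP is enabled,
   e.g. when building with gcc -fopenmp. */
#ifdef _OPENMP
#include <omp.h>
#endif

/* Number of processors the OpenMP runtime reports (1 if built without OpenMP). */
int available_processors(void) {
#ifdef _OPENMP
    return omp_get_num_procs();
#else
    return 1;
#endif
}

/* Upper bound on the team size a new parallel region would get by default. */
int default_team_size(void) {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Counts how many threads actually execute a parallel region.
   Each thread adds 1 to its private copy of count; the reduction
   sums the private copies when the region ends. */
int count_region_threads(void) {
    int count = 0;
    #pragma omp parallel reduction(+:count)
    {
        count += 1;
    }
    return count;
}
```

Calling these from main and printing the results after building with gcc -fopenmp will usually show the same number three times on an otherwise idle machine. Built without -fopenmp, the pragma is ignored and count_region_threads() returns 1, which is a quick way to notice a forgotten switch.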
❑ Note that, when c(2) is placed inside a critical section, other threads are free to execute e.g. c(3) while another thread is running c(2). However, no other thread can start executing c(2) while one thread is running it.
❑ Critical sections also ensure that modifications to shared data in one thread will be visible to the other threads, as long as all references to the shared data are kept inside critical sections. Each thread synchronizes its local caches with global memory whenever it enters or exits a critical section.
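A minimal sketch of the two notes above, assuming a hypothetical function sum_with_critical that is not part of the original slides: the per-iteration work runs concurrently, while the update of the shared variable is kept inside a critical section so that only one thread performs it at a time and every thread sees the others' updates.

```c
/* Hypothetical example: adds 2*i for i = 0 .. n-1 into one shared variable. */
long sum_with_critical(int n) {
    long total = 0;  /* shared by all threads in the team */

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        /* Unprotected work: threads run this part at the same time. */
        long contribution = 2L * i;

        /* Protected work: at most one thread is in here at any moment,
           and total is only ever touched inside the critical section. */
        #pragma omp critical
        {
            total += contribution;
        }
    }
    return total;  /* equals n * (n - 1) for n >= 1 */
}
```

Removing the critical directive turns the update of total into a data race, and the result may then differ from run to run. For a simple sum like this one, a reduction clause would likely be the faster choice; the critical section is used here only to illustrate the mechanism.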
OpenMP Parallel for loops
Common Mistakes (1)
Common Mistakes (2)
Legit Split
nowait
Interaction with Critical Sections
Opportunities using nowait
Scheduling
Static Scheduling
Dynamic Scheduling
Parallelizing Nested Loops
Some Other Considerations on Nested Loops
In theory, can we always parallelize the innermost loop?
When we can, what are the performance consequences?
Common Mistakes (3)
Common Mistakes (4)
Hyper-Threading: Helping Keep CPU Busy
When Hyper-Threading Helps
Atomic Operations
Useful Tips for Writing Your OpenMP Program
Controlling the number of threads
https://stackoverflow.com/questions/11095309/openmp-set-num-threads-is-not-working
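Other methods: omp_set_num_threads(x)
But be careful! With dynamic adjustment enabled, the runtime may give you fewer threads than you asked for.
Use omp_set_dynamic(0); to disable dynamic adjustment before setting the thread count.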
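A minimal sketch of this tip, assuming a hypothetical helper team_size_with_request that is not part of the original slides. It disables dynamic adjustment, requests a team size, and then reports how many threads the next parallel region actually received. The _OPENMP guards are an assumption that lets the file also build without -fopenmp.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: asks for `requested` threads and returns the
   size of the team that the following parallel region really gets. */
int team_size_with_request(int requested) {
    int actual = 1;
#ifdef _OPENMP
    omp_set_dynamic(0);              /* do not let the runtime shrink the team */
    omp_set_num_threads(requested);  /* applies to subsequent parallel regions */

    #pragma omp parallel
    {
        /* One thread records the team size; the others wait at the
           implicit barrier at the end of the single construct. */
        #pragma omp single
        actual = omp_get_num_threads();
    }
#else
    (void)requested;                 /* built without OpenMP: always one thread */
#endif
    return actual;
}
```

Checking the returned value rather than assuming it matches the request is the safer habit, since the runtime may still deliver a smaller team if it runs into a thread limit.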