FIT3143 Tutorial Week 4
Lecturers: ABM Russel (MU Australia) and (MU Malaysia)
SHARED MEMORY (OPENMP)
OBJECTIVES
• The purpose of this tutorial is to introduce parallel computing on shared memory architectures.
• Understand the concept of OpenMP (OMP) threads.
Note: Tutorials are not assessed. Nevertheless, please attempt the questions to improve
your unit comprehension in preparation for the labs, assignments, and final assessments.
QUESTIONS/TOPICS
1. OpenMP provides each thread with a temporary view of memory. Why?
All OpenMP threads have access to a place to store and to retrieve variables, called the memory. In addition, each thread is allowed to have its own temporary view of the memory. The temporary view of memory for each thread is not a required part of the OpenMP memory model, but can represent any kind of intervening structure, such as machine registers, cache, or other local storage, between the thread and the memory. The temporary view of memory allows the thread to cache variables and thereby to avoid going to memory for every reference to a variable. Each thread also has access to another type of memory that must not be accessed by other threads, called thread private memory.
https://www.openmp.org/spec-html/5.0/openmpsu9.html
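To control when a thread's temporary view is made consistent with memory, OpenMP provides the flush operation. The sketch below (a minimal producer/consumer example in the style of the OpenMP specification's flush examples) shows one thread publishing a value through shared memory while another spins until the write becomes visible:

#include <stdio.h>
#include <omp.h>

int main() {
    int data = 0, flag = 0;

    #pragma omp parallel num_threads(2)
    {
        if (omp_get_thread_num() == 0) {
            /* Producer: write data, then publish it via the flag */
            data = 42;
            #pragma omp flush(data, flag)
            flag = 1;
            #pragma omp flush(flag)
        } else {
            /* Consumer: spin until the producer's write to flag is visible */
            int ready = 0;
            while (!ready) {
                #pragma omp flush(flag)
                ready = flag;
            }
            #pragma omp flush(data)  /* refresh the temporary view of data */
            printf("Consumer read data = %d\n", data);
        }
    }
    return 0;
}

Without the flush directives, each thread would be free to keep flag and data in its temporary view (e.g., in a register or cache), so the consumer might spin forever or read a stale value of data.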
2. With reference to the following code:
#include <stdio.h>
#include <omp.h>

#define N 20

int main() {
    int a[N] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
    int b[N] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
    int c[N] = {0};
    int i;

    #pragma omp parallel for shared(a,b,c) private(i)
    for (i = 0; i < N; i++) {
        c[i] = c[i-1] + (a[i] * b[i]);   /* loop-carried dependency on c[i-1] */
        c[i] = (a[i] * b[i]);            /* dependency-free alternative       */
    }

    printf("Values of C:\n");
    for (i = 0; i < N; i++) {
        printf("%d\n", c[i]);
    }
    return 0;
}

The first assignment, c[i] = c[i-1] + (a[i] * b[i]);, carries a dependency across iterations: iteration i reads the value written by iteration i-1 (and at i == 0 it even reads before the start of the array). Because #pragma omp parallel for executes the iterations in no guaranteed order, that form produces non-deterministic and generally wrong results in parallel. The second assignment, c[i] = (a[i] * b[i]);, involves no cross-iteration dependency, so the loop is safe to parallelize in that form.
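If the running-sum semantics of the dependent form are actually desired, OpenMP 5.0 can parallelize a prefix sum directly with a scan reduction. The following is a minimal sketch, assuming an OpenMP 5.0 compiler (e.g., GCC 10 or later); the variable sum is introduced here for illustration:

#include <stdio.h>
#include <omp.h>

#define N 20

int main() {
    int a[N], b[N], c[N];
    int i, sum = 0;
    for (i = 0; i < N; i++) {
        a[i] = b[i] = i + 1;
    }

    /* Inclusive prefix sum of a[i]*b[i], computed in parallel */
    #pragma omp parallel for reduction(inscan, +:sum)
    for (i = 0; i < N; i++) {
        sum += a[i] * b[i];            /* update the scan variable */
        #pragma omp scan inclusive(sum)
        c[i] = sum;                    /* read its inclusive value */
    }

    printf("Values of C:\n");
    for (i = 0; i < N; i++) {
        printf("%d\n", c[i]);
    }
    return 0;
}

Each iteration still sees the sum of all products up to and including its own index, but the runtime is free to compute the scan in parallel.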
3. With reference to the following code:

#include <stdio.h>
#include <unistd.h>
#include <omp.h>
#include <time.h>

#define THREADS 6
#define N 18

int main() {
    int i;
    struct timespec start, end;
    double time_taken;

    clock_gettime(CLOCK_MONOTONIC, &start);

    #pragma omp parallel for schedule(static) num_threads(THREADS)
    for (i = 0; i < N; i++) {
        sleep(i % 3);  /* hypothetical stand-in for the uneven per-iteration work described in part (a) */
        printf("Thread %d has completed iteration %d.\n", omp_get_thread_num(), i);
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    time_taken = (end.tv_sec - start.tv_sec) * 1e9;
    time_taken = (time_taken + (end.tv_nsec - start.tv_nsec)) * 1e-9;
    printf("Overall time (s): %lf\n", time_taken);

    /* all threads done */
    printf("All done!\n");
    return 0;
}
When schedule(static) is specified within the OpenMP construct, the following outcome is observed (code compiled and executed on a computer with six cores):
However, when schedule(dynamic) is specified within the OpenMP construct, the following outcome is observed (code compiled and executed on a computer with six cores):
Based on the aforementioned code and observations:
a) Explain why the schedule(dynamic) construct performs better (i.e., lower computational time) than the schedule(static) construct. Hint: refer to http://jakascorner.com/blog/2016/06/omp-for-scheduling.html for an explanation of the OpenMP schedule clause.
Dynamic scheduling is better when the iterations may take very different amounts of time, which is the case in the code example above. With schedule(static), each thread is assigned a fixed, contiguous block of iterations up front, so a thread that receives the cheap iterations finishes early and then sits idle while the threads holding expensive iterations are still working. With schedule(dynamic), each thread requests a new chunk as soon as it finishes its current one, so the work is balanced across the threads and the overall time is lower.
b) What would be the expected observation if schedule(guided) was used?
Answer from: http://jakascorner.com/blog/2016/06/omp-for-scheduling.html
The guided scheduling type is similar to the dynamic scheduling type. OpenMP again divides the iterations into chunks. Each thread executes a chunk of iterations and then requests another chunk until there are no more chunks available.
The difference from the dynamic scheduling type is in the size of the chunks. The size of a chunk is proportional to the number of unassigned iterations divided by the number of threads. Therefore, the chunk sizes decrease as the loop progresses.
In the context of the code above, if schedule(guided) was used, the performance would be expected to be similar to that of the dynamic scheduling type: threads start with larger chunks, receive progressively smaller ones, and still balance the load by requesting new chunks as they finish.
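A convenient way to compare the three schedules empirically is to compile the loop once with schedule(runtime) and select the schedule through the OMP_SCHEDULE environment variable. Below is a minimal sketch under the same assumptions as the code above (the sleep(i % 3) call is a hypothetical stand-in for uneven per-iteration work):

#include <stdio.h>
#include <unistd.h>
#include <omp.h>

#define THREADS 6
#define N 18

int main() {
    int i;
    double start = omp_get_wtime();  /* OpenMP's portable wall-clock timer */

    /* schedule(runtime): the actual schedule is read from OMP_SCHEDULE */
    #pragma omp parallel for schedule(runtime) num_threads(THREADS)
    for (i = 0; i < N; i++) {
        sleep(i % 3);  /* hypothetical uneven workload */
        printf("Thread %d has completed iteration %d.\n", omp_get_thread_num(), i);
    }

    printf("Overall time (s): %lf\n", omp_get_wtime() - start);
    return 0;
}

Running, for example, OMP_SCHEDULE="static" ./a.out, OMP_SCHEDULE="dynamic" ./a.out and OMP_SCHEDULE="guided" ./a.out then shows all three behaviours without recompiling.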