

More Concurrency
Exercises and OpenMP

Shuaiwen Leon Song

Concurrency and Synchronization Exercise


Q1 (fork)


How many lines of output will it produce?

Answer: 7

Q2 (semaphores)


Consider the following program using semaphores. Which of the following execution orders of sem_wait and sem_post are possible (W represents sem_wait, P represents sem_post)?

(A) WPWWWP

(B) WWPPWW

(C) PPWWWW

(D) WPWPWW

(E) WWWWPP

Q3 (Zombie Processes)


Will the following programs create zombie processes?

Q4 (Scalability)


Consider a multithreaded webserver running on a machine with N CPU cores. The server has M worker threads. Every incoming request is put in a request queue and served by one of the free worker threads. The server is fully saturated and has a certain throughput at saturation. Under which circumstances will increasing M lead to an approximately linear increase in the saturation throughput of the server?

(A) Under any circumstances

(B) Under no circumstances

(C) when M < N and the workload to the server is CPU-bound

(D) when M < N and the workload to the server is I/O-bound

Q5 (unintentional sharing or not)

Please choose all possible outputs of the following program.

A. 0123456789
B. 0234156789
C. 0987123456
D. 8905674321
E. 0114459876
F. 1124465789

Q6 (variable scope)

In the program below, which variables are shared among threads? Which are not?

Q6 (false sharing vs. true sharing)

Consider the program below: is there any potential performance issue with it? If there is, how do we fix it? Note that the two functions are executed concurrently on two different processor cores, each of which has its own cache hierarchy.

What about static struct foo f[3];?

False Sharing vs. True Sharing

True sharing: multiple processors deliberately access the same data; this requires programmatic synchronization constructs to ensure ordered data access.

False sharing: the frequent coordination required between processors when cache lines are marked 'Invalid' requires cache lines to be written to memory and subsequently loaded. False sharing increases this coordination and can significantly degrade application performance. OpenMP programs are known to be prone to false sharing.

Q7 (semaphore mechanism)

If a process performs a sem_post(sem_t *sem) operation on a counting semaphore whose value is zero, and there are one or more other processes waiting on the semaphore, how many of the processes could be unblocked?

A. Exactly one process
B. All processes
C. Zero
D. Could be any value from one to N (the number of blocked processes)

Q8 (comparison across different synchronizations)

Which of the following descriptions of semaphores and spinlocks are correct?

A. Spinlocks waste CPU time (busy waiting) while semaphores almost do not.
B. A semaphore has a counter and can be acquired by multiple threads, while a spinlock can only be acquired by one thread.
C. When threads hold locks for a short period of time and contention between threads is not significant, spinlocks are preferred because they incur no context-switch overhead.
D. When threads hold locks for a short period of time and contention between threads is not significant, semaphores are preferred since a spinlock's busy-waiting wastes CPU cycles.

Q9 (fork() and waitpid(-1))

Please write out all the possible printouts of the following programs A and B.

Q10 (thread detachment)

(I) pthread_detach(t); pthread_exit(NULL);
(II) pthread_detach(t); return 0;
(III) pthread_detach(t); exit(0);
(IV) pthread_join(t, NULL); return 0;

Q11 (unintentional sharing)

Please consider the following program. The program's original intention was to print "hello from thread 0", "hello from thread 1", all the way to "hello from thread 9" (note that the order of the printouts may be random). However, there are several bugs in the code below. Please list all the bugs that appear in this code and explain why they cause problems. Then provide a fix so the code runs correctly.

Q12 (synchronization using mutex)

› Door opening and locking

Q13 (writing a barrier)

OpenMP Basics

Introduction to OpenMP

› What is OpenMP?
- Open specification for Multi-Processing
- "Standard" API for defining multi-threaded shared-memory programs
- openmp.org – Talks, examples, forums, etc.
- computing.llnl.gov/tutorials/openMP/
- portal.xsede.org/online-training
- www.nersc.gov/assets/Uploads/XE62011OpenMP.pdf

› High-level API
- Preprocessor (compiler) directives (~80%)
- Library calls (~19%)
- Environment variables (~1%)

A Programmer's View of OpenMP

› OpenMP is a portable, threaded, shared-memory programming specification with "light" syntax
- Exact behavior depends on the OpenMP implementation!
- Requires compiler support (C, C++, or Fortran)

› OpenMP will:
- Allow a programmer to separate a program into serial regions and parallel regions, rather than T concurrently-executing threads
- Hide stack management
- Provide synchronization constructs

› OpenMP will not:
- Parallelize automatically
- Guarantee speedup
- Provide freedom from data races

Motivation – OpenMP

int main() {
    // Do this part in parallel
    printf( "Hello, World!\n" );
    return 0;
}

Motivation – OpenMP

int main() {
    omp_set_num_threads(16);

    // Do this part in parallel
    #pragma omp parallel
    {
        printf( "Hello, World!\n" );
    }
    return 0;
}

Programming Model

Because OpenMP is designed for shared-memory parallel programming, it is largely limited to single-node parallelism. Typically, the number of processing elements (cores) on a node determines how much parallelism can be implemented.

Programming Model – Concurrent Loops

› OpenMP easily parallelizes loops
- Requires: no data dependencies (read/write or write/write pairs) between iterations!

› Preprocessor calculates loop bounds for each thread directly from the serial source

#pragma omp parallel for
for( i=0; i < 25; i++ ) {
    printf( "Foo" );
}

Motivation of using OpenMP

LLNL OpenMP Tutorial

› There are no better materials than the LLNL OpenMP tutorial.

› We are now going to go through some of the important knowledge points using the tutorial:

› https://computing.llnl.gov/tutorials/openMP/