OpenMP 4.5 API C/C++ Syntax Reference Guide
OpenMP Application Program Interface (API) is a portable, scalable model that gives parallel programmers a simple and flexible interface for
® developing portable parallel applications. OpenMP
supports multi-platform shared-memory parallel programming in C/C++ and Fortran on all architectures, including Unix platforms and Windows platforms.
See www.openmp.org for specifications.
Functionality that is new or changed in the OpenMP API 4.5 specification is highlighted in color in the printed card.
In each construct heading, the first [n.n.n] refers to sections in the OpenMP API 4.5 specification; the second [n.n.n] refers to the corresponding sections in the OpenMP API 4.0 specification.
Directives and Constructs for C/C++
An OpenMP executable directive applies to the succeeding structured block or an OpenMP construct. Each directive starts with #pragma omp. The remainder of the directive follows the conventions of the C and C++ standards for compiler directives. A structured-block is a single statement or a compound statement with a single entry at the top and a single exit at the bottom.
parallel [2.5] [2.5]
Forms a team of threads and starts parallel execution.
#pragma omp parallel [clause[ [, ]clause] …]
structured-block
clause:
if([ parallel : ] scalar-expression)
num_threads(integer-expression)
default(shared | none)
private(list)
firstprivate(list)
shared(list)
copyin(list)
reduction(reduction-identifier : list)
proc_bind(master | close | spread)
for [2.7.1] [2.7.1]
Specifies that the iterations of associated loops will be executed in parallel by threads in the team in the context of their implicit tasks.
#pragma omp for [clause[ [, ]clause] …]
for-loops
clause:
private(list)
firstprivate(list)
lastprivate(list)
linear(list[ : linear-step])
reduction(reduction-identifier : list)
schedule([modifier [, modifier] : ] kind[, chunk_size])
collapse(n)
ordered[(n)]
nowait
kind:
• static: Iterations are divided into chunks of size chunk_size and assigned to threads in the team in round-robin fashion in order of thread number.
• dynamic: Each thread executes a chunk of iterations then requests another chunk until none remain.
• guided: Each thread executes a chunk of iterations then requests another chunk until no chunks remain to be assigned.
• auto: The decision regarding scheduling is delegated to the compiler and/or runtime system.
• runtime: The schedule and chunk size are taken from the run-sched-var ICV.
modifier:
• monotonic: Each thread executes the chunks that it is assigned in increasing logical iteration order.
• nonmonotonic: Chunks are assigned to threads in any order and the behavior of an application that depends on execution order of the chunks is unspecified.
• simd: Ignored when the loop is not associated with a SIMD construct; otherwise the new_chunk_size for all except the first and last chunks is chunk_size/simd_width * simd_width, where simd_width is an implementation-defined value.
sections [2.7.2] [2.7.2]
A noniterative worksharing construct that contains a set of structured blocks that are to be distributed among and executed by the threads in a team.
#pragma omp sections [clause[ [, ]clause] …]
{
[#pragma omp section]
structured-block
[#pragma omp section
structured-block]
…
}
clause:
private(list)
firstprivate(list)
lastprivate(list)
reduction(reduction-identifier : list)
nowait
single [2.7.3] [2.7.3]
Specifies that the associated structured block is executed by only one of the threads in the team.
#pragma omp single [clause[ [, ]clause] …]
structured-block
clause:
private(list)
firstprivate(list)
copyprivate(list)
nowait
simd [2.8.1] [2.8.1]
Applied to a loop to indicate that the loop can be transformed into a SIMD loop.
#pragma omp simd [clause[ [, ]clause] …]
for-loops
clause:
safelen(length)
simdlen(length)
linear(list[ : linear-step])
aligned(list[ : alignment])
private(list)
lastprivate(list)
reduction(reduction-identifier : list)
collapse(n)
declare simd [2.8.2] [2.8.2]
Enables the creation of one or more versions that can process multiple arguments using SIMD instructions from a single invocation from a SIMD loop.
#pragma omp declare simd [clause[ [, ]clause] …]
[#pragma omp declare simd [clause[ [, ]clause] …]]
[…]
function definition or declaration
clause:
simdlen(length)
linear(linear-list[ : linear-step])
aligned(argument-list[ : alignment])
uniform(argument-list)
inbranch
notinbranch
for simd [2.8.3] [2.8.3]
Specifies a loop that can be executed concurrently using SIMD instructions, and whose iterations will also be executed in parallel by threads in the team.
#pragma omp for simd [clause[ [, ]clause] …]
for-loops
clause: Any accepted by the simd or for directives with identical meanings and restrictions.
task [2.9.1] [2.11.1]
Defines an explicit task. The data environment of the task is created according to the data-sharing attribute clauses on the task construct and any defaults that apply.
#pragma omp task [clause[ [, ]clause] …]
structured-block
clause:
if([ task : ] scalar-expression)
final(scalar-expression)
untied
default(shared | none)
mergeable
private(list)
firstprivate(list)
shared(list)
depend(dependence-type : list)
priority(priority-value)
taskloop [2.9.2]
Specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks.
#pragma omp taskloop [clause[ [, ]clause] …]
for-loops
clause:
if([ taskloop : ] scalar-expression)
shared(list)
private(list)
firstprivate(list)
lastprivate(list)
default(shared | none)
grainsize(grain-size)
num_tasks(num-tasks)
collapse(n)
final(scalar-expression)
priority(priority-value)
untied
mergeable
nogroup
© 2015 OpenMP ARB OMP1115C
taskloop simd [2.9.3]
Specifies a loop that can be executed concurrently using SIMD instructions, and whose iterations will also be executed in parallel using OpenMP tasks.
#pragma omp taskloop simd [clause[ [, ]clause] …] for-loops
clause: Any accepted by the simd or taskloop directives with identical meanings and restrictions.
taskyield [2.9.4] [2.11.2]
Specifies that the current task can be suspended in favor of execution of a different task.
#pragma omp taskyield
target data [2.10.1] [2.9.1]
Creates a device data environment for the extent of the region.
#pragma omp target data clause[ [ [, ]clause] …]
structured-block
clause:
if([ target data : ] scalar-expression)
device(integer-expression)
map([[map-type-modifier[,]] map-type : ] list)
use_device_ptr(list)
target enter data [2.10.2]
Specifies that variables are mapped to a device data environment.
#pragma omp target enter data [clause[ [, ]clause] …]
clause:
if([ target enter data : ] scalar-expression)
device(integer-expression)
map([[map-type-modifier[,]] map-type : ] list)
depend(dependence-type : list)
target exit data [2.10.3]
Specifies that list items are unmapped from a device data environment.
#pragma omp target exit data [clause[ [, ]clause] …]
clause:
if([ target exit data : ] scalar-expression)
device(integer-expression)
map([[map-type-modifier[,]] map-type : ] list)
depend(dependence-type : list)
target [2.10.4] [2.9.2]
Maps variables to a device data environment and executes the construct on that device.
#pragma omp target [clause[ [, ]clause] …]
structured-block
clause:
if([ target : ] scalar-expression)
device(integer-expression)
private(list)
firstprivate(list)
map([[map-type-modifier[,]] map-type : ] list)
is_device_ptr(list)
defaultmap(tofrom : scalar)
nowait
depend(dependence-type : list)
target update [2.10.5] [2.9.3]
Makes the corresponding list items in the device data environment consistent with their original list items, according to the specified motion clauses.
#pragma omp target update clause[ [ [, ]clause] …]
clause is motion-clause or one of:
if([ target update : ] scalar-expression)
device(integer-expression)
nowait
depend(dependence-type : list)
motion-clause:
to(list)
from(list)
declare target [2.10.6] [2.9.4]
A declarative directive that specifies that variables and functions are mapped to a device.
#pragma omp declare target
declarations-definition-seq
#pragma omp end declare target
#pragma omp declare target (extended-list)
#pragma omp declare target clause[ [, ]clause …]
clause: to(extended-list)
link(list)
teams [2.10.7] [2.9.5]
Creates a league of thread teams where the master thread of each team executes the region.
#pragma omp teams [clause[ [, ]clause] …]
structured-block
clause:
num_teams(integer-expression)
thread_limit(integer-expression)
default(shared | none)
private(list)
firstprivate(list)
shared(list)
reduction(reduction-identifier : list)
distribute [2.10.8] [2.9.6]
Specifies loops which are executed by the thread teams.
#pragma omp distribute [clause[ [, ]clause] …] for-loops
clause: private(list)
firstprivate(list)
lastprivate(list)
collapse(n)
dist_schedule(kind[, chunk_size])
distribute simd [2.10.9] [2.9.7]
Specifies loops which are executed concurrently using SIMD instructions.
#pragma omp distribute simd [clause[ [, ]clause] …] for-loops
clause: Any accepted by the distribute or simd directives.
distribute parallel for [2.10.10] [2.9.8]
These constructs specify a loop that can be executed in parallel by multiple threads that are members of multiple teams.
#pragma omp distribute parallel for [clause[ [, ]clause] …] for-loops
clause: Any accepted by the distribute or parallel for directives
distribute parallel for simd [2.10.11] [2.9.9]
These constructs specify a loop that can be executed in parallel using SIMD semantics in the simd case by multiple threads that are members of multiple teams.
#pragma omp distribute parallel for simd [clause[ [, ]clause] …]
for-loops
clause: Any accepted by the distribute or parallel for simd directives.
parallel for [2.11.1] [2.10.1]
Shortcut for specifying a parallel construct containing one or more associated loops and no other statements.
#pragma omp parallel for [clause[ [, ]clause] …]
for-loops
clause: Any accepted by the parallel or for directives, except the nowait clause, with identical meanings and restrictions.
parallel sections [2.11.2] [2.10.2]
Shortcut for specifying a parallel construct containing one sections construct and no other statements.
#pragma omp parallel sections [clause[ [, ]clause] …]
{
[#pragma omp section]
structured-block
[#pragma omp section
structured-block]
…
}
clause: Any accepted by the parallel or sections directives, except the nowait clause, with identical meanings and restrictions.
parallel for simd [2.11.4] [2.10.4]
Shortcut for specifying a parallel construct containing one for simd construct and no other statements.
#pragma omp parallel for simd [clause[ [, ]clause] …] for-loops
clause: Any accepted by the parallel or for simd directives, except the nowait clause, with identical meanings and restrictions.
target parallel [2.11.5]
Shortcut for specifying a target construct containing a parallel construct and no other statements.
#pragma omp target parallel [clause[ [, ]clause] …] structured-block
clause: Any accepted by the target or parallel directives, except for copyin, with identical meanings and restrictions.
target parallel for [2.11.6]
Shortcut for specifying a target construct containing a parallel for construct and no other statements.
#pragma omp target parallel for [clause[ [, ]clause] …] for-loops
clause: Any accepted by the target or parallel for directives, except for copyin, with identical meanings and restrictions.
target parallel for simd [2.11.7]
Shortcut for specifying a target construct containing a parallel for simd construct and no other statements.
#pragma omp target parallel for simd [clause[ [, ]clause] …] for-loops
clause: Any accepted by the target or parallel for simd directives, except for copyin, with identical meanings and restrictions.
target simd [2.11.8]
Shortcut for specifying a target construct containing a simd construct and no other statements.
#pragma omp target simd [clause[ [, ]clause] …] for-loops
clause: Any accepted by the target or simd directives with identical meanings and restrictions.
target teams [2.11.9] [2.10.5]
Shortcut for specifying a target construct containing a teams construct and no other statements.
#pragma omp target teams [clause[ [, ]clause] …] structured-block
clause: Any accepted by the target or teams directives with identical meanings and restrictions.
teams distribute [2.11.10] [2.10.6]
Shortcut for specifying a teams construct containing a distribute construct and no other statements.
#pragma omp teams distribute [clause[ [, ]clause] …] for-loops
clause: Any clause used for teams or distribute, with identical meanings and restrictions.
teams distribute simd [2.11.11] [2.10.7]
Shortcut for specifying a teams construct containing a distribute simd construct and no other statements.
#pragma omp teams distribute simd [clause[ [, ]clause] …] for-loops
clause: Any clause used for teams or distribute simd, with identical meanings and restrictions.
target teams distribute [2.11.12] [2.10.8]
Shortcut for specifying a target construct containing a teams distribute construct and no other statements.
#pragma omp target teams distribute [clause[ [, ]clause] …] for-loops
clause: Any clause used for target or teams distribute
target teams distribute simd [2.11.13] [2.10.9]
Shortcut for specifying a target construct containing a teams distribute simd construct and no other statements.
#pragma omp target teams distribute simd [clause[ [, ]clause] …]
for-loops
clause: Any clause used for target or teams distribute simd
teams distribute parallel for [2.11.14] [2.10.10]
Shortcut for specifying a teams construct containing a distribute parallel for construct and no other statements.
#pragma omp teams distribute parallel for [clause[ [, ]clause] …]
for-loops
clause: Any clause used for teams or distribute parallel for
target teams distribute parallel for [2.11.15] [2.10.11]
Shortcut for specifying a target construct containing a teams distribute parallel for construct and no other statements.
#pragma omp target teams distribute parallel for
[clause[ [, ]clause] …] for-loops
clause: Any clause used for teams distribute parallel for or target
teams distribute parallel for simd [2.11.16] [2.10.12]
Shortcut for specifying a teams construct containing a distribute parallel for simd construct and no other statements.
#pragma omp teams distribute parallel for simd [clause[ [, ]clause] …]
for-loops
clause: Any clause used for teams or distribute parallel for simd
target teams distribute parallel for simd [2.11.17] [2.10.13]
Shortcut for specifying a target construct containing a teams distribute parallel for simd construct and no other statements.
#pragma omp target teams distribute parallel for simd
[clause[ [, ]clause] …] for-loops
clause: Any clause used for teams distribute parallel for simd or target
master [2.13.1] [2.12.1]
Specifies a structured block that is executed by the master thread of the team.
#pragma omp master
structured-block
critical [2.13.2] [2.12.2]
Restricts execution of the associated structured block to a single thread at a time.
#pragma omp critical [(name) [hint(hint-expression)]]
structured-block
barrier [2.13.3] [2.12.3]
Specifies an explicit barrier at the point at which the construct appears.
#pragma omp barrier
taskwait [2.13.4] [2.12.4]
Specifies a wait on the completion of child tasks of the current task.
#pragma omp taskwait
taskgroup [2.13.5] [2.12.5]
Specifies a wait on the completion of child tasks of the current task and their descendent tasks.
#pragma omp taskgroup
structured-block
atomic [2.13.6] [2.12.6]
Ensures that a specific storage location is accessed atomically. May take one of the following three forms:
#pragma omp atomic [seq_cst[,]] atomic-clause [[,]seq_cst]
expression-stmt
#pragma omp atomic [seq_cst]
expression-stmt
#pragma omp atomic [seq_cst[,]] capture [[,]seq_cst]
structured-block
atomic-clause: read, write, update, or capture
expression-stmt may take one of the following forms, depending on the atomic clause:
• read: v = x;
• write: x = expr;
• update (or no clause): x++; x--; ++x; --x;
x binop= expr; x = x binop expr; x = expr binop x;
• capture: v = x++; v = x--; v = ++x; v = --x;
v = x binop= expr; v = x = x binop expr; v = x = expr binop x;
structured-block (capture form) may take one of the following forms:
{v = x; x binop= expr;} {x binop= expr; v = x;}
{v = x; x = x binop expr;} {x = x binop expr; v = x;}
{v = x; x = expr binop x;} {x = expr binop x; v = x;}
{v = x; x = expr;}
{v = x; x++;} {x++; v = x;} {v = x; ++x;} {++x; v = x;}
{v = x; x--;} {x--; v = x;} {v = x; --x;} {--x; v = x;}
flush [2.13.7] [2.12.7]
Executes the OpenMP flush operation, which makes a thread's temporary view of memory consistent with memory, and enforces an order on the memory operations of the variables.
#pragma omp flush [(list)]
ordered [2.13.8] [2.12.8]
Specifies a structured block in a loop, simd, or loop SIMD region that will be executed in the order of the loop iterations.
#pragma omp ordered [clause[ [, ]clause] …]
structured-block
clause:
threads
simd
#pragma omp ordered clause[ [[, ]clause] …]
clause:
depend(source)
depend(sink : vec)
cancel [2.14.1] [2.13.1]
Requests cancellation of the innermost enclosing region of the type specified. The cancel directive may not be used in place of the statement following an if, while, do, switch, or label.
#pragma omp cancel construct-type-clause[ [, ]if-clause]
construct-type-clause:
parallel
sections
for
taskgroup
if-clause:
if(scalar-expression)
cancellation point [2.14.2] [2.13.2]
Introduces a user-defined cancellation point at which tasks check if cancellation of the innermost enclosing region of the type specified has been activated.
#pragma omp cancellation point construct-type-clause
construct-type-clause:
parallel
sections
for
taskgroup
threadprivate [2.15.2] [2.14.2]
Specifies that variables are replicated, with each thread having its own copy. Each copy of a threadprivate variable is initialized once prior to the first reference to that copy.
#pragma omp threadprivate(list)
list: A comma-separated list of file-scope, namespace- scope, or static block-scope variables that do not have incomplete types.
declare reduction [2.16] [2.15]
Declares a reduction-identifier that can be used in a reduction clause.
#pragma omp declare reduction(reduction-identifier : typename-list : combiner) [initializer-clause]
reduction-identifier: A base language identifier (for C), or an id-expression (for C++), or one of the following operators: +, -, *, &, |, ^, &&, and ||
typename-list: A list of type names
combiner: An expression
initializer-clause: initializer(initializer-expr), where initializer-expr is omp_priv = initializer or function-name(argument-list)
Runtime Library Routines for C/C++
Execution environment routines affect and monitor threads, processors, and the parallel environment. The library routines are external functions with “C” linkage.
Execution Environment Routines
omp_set_num_threads [3.2.1] [3.2.1]
Affects the number of threads used for subsequent parallel regions not specifying a num_threads clause, by setting the value of the first element of the nthreads-var ICV of the current task to num_threads.
void omp_set_num_threads(int num_threads);
omp_get_num_threads [3.2.2] [3.2.2]
Returns the number of threads in the current team. The binding region for an omp_get_num_threads region is the innermost enclosing parallel region.
int omp_get_num_threads(void);
omp_get_max_threads [3.2.3] [3.2.3]
Returns an upper bound on the number of threads that could be used to form a new team if a parallel construct without a num_threads clause were encountered after execution returns from this routine.
int omp_get_max_threads(void);