C H A P T E R 1
OpenMP API Summary
The OpenMP Application Program Interface is a portable, parallel programming model for shared memory multiprocessor architectures, developed in collaboration with a number of computer vendors. The specifications were created and are published by the OpenMP Architecture Review Board. For more information on OpenMP, including tutorials and other resources, see their web site at: http://www.openmp.org/.
The OpenMP API is the recommended parallel programming model for all Sun Studio compilers on Solaris OS platforms. See Chapter 6 for guidelines on converting legacy Fortran and C parallelization directives to OpenMP.
This chapter summarizes the directives, run-time library routines, and environment variables comprising the OpenMP Version 2.0 Application Program Interfaces, as implemented by the Sun Studio Fortran 95, C and C++ compilers.
The material presented in this chapter is only a summary with many details left out intentionally for the sake of brevity. In all cases, refer to the OpenMP specification documents for complete details.
The Fortran and C/C++ OpenMP 2.0 specifications can be found on the official OpenMP website, http://www.openmp.org/.
In the tables and examples that follow, Fortran directives and source code are shown in upper case, but are case-insensitive.
The term structured-block refers to a block of Fortran or C/C++ statements having no transfers into or out of the block.
Constructs within square brackets, [...], are optional.
Throughout this manual, "Fortran" refers to the Fortran 95 language and compiler, f95.
The terms "directive" and "pragma" are used interchangeably in this manual.
Only one directive-name can be specified per directive line, and the directive applies to the succeeding program statement.
Fortran fixed format accepts three directive "sentinels"; free format accepts only one. In the Fortran examples that follow, free format is used.
C and C++ use the standard preprocessing directive starting with #pragma omp.
The OpenMP API defines the preprocessor symbol _OPENMP to be used for conditional compilation. In addition, OpenMP Fortran API accepts a conditional compilation sentinel.
Compiling with OpenMP enabled defines the macro _OPENMP.
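For example, a program can use _OPENMP to guard OpenMP-specific code so that the same source also compiles serially. A minimal C sketch (illustrative only, not from the specification):

    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    int main(void)
    {
    #ifdef _OPENMP
        printf("OpenMP version macro: %d\n", _OPENMP);
        printf("processors available: %d\n", omp_get_num_procs());
    #else
        printf("compiled without OpenMP\n");
    #endif
        return 0;
    }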
The PARALLEL directive defines a parallel region, which is a region of the program that is to be executed by multiple threads in parallel.
There are many special conditions and restrictions. Programmers are urged to refer to the appropriate OpenMP specification document for the details.
TABLE 1-1 identifies the clauses that can appear with this construct.
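As an illustration (a minimal C sketch, not taken from the specification), each thread in the team executes the block enclosed by the parallel pragma once:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        #pragma omp parallel
        {
            /* every thread in the team executes this block */
            printf("hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }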
Work-sharing constructs divide the execution of the enclosed code region among the members of the team of threads that encounter it. A work-sharing construct must be enclosed within a parallel region for the construct to execute in parallel.
There are many special conditions and restrictions on these directives and the code they apply to. Programmers are urged to refer to the appropriate OpenMP specification document for the details.
Specifies that the iterations of the DO or for loop that follows should be executed in parallel.
TABLE 1-1 identifies the clauses that can appear with this construct.
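For example, a C sketch in which the loop iterations are divided among the threads of the enclosing parallel region (the function scale and its arguments are hypothetical):

    void scale(double *a, const double *b, int n)
    {
        int i;
        #pragma omp parallel shared(a, b, n)
        {
            #pragma omp for  /* the loop variable i is implicitly private */
            for (i = 0; i < n; i++)
                a[i] = 2.0 * b[i];
        }  /* implicit barrier at the end of the for construct */
    }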
The SECTIONS construct encloses a set of structured blocks of code to be divided among threads in the team. Each block is executed once by a thread in the team.
Each section is preceded by a SECTION directive, which is optional for the first section.
TABLE 1-1 identifies the clauses that can appear with this construct.
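For example, a C sketch in which two independent tasks run as separate sections (init_field and init_mesh are hypothetical):

    extern void init_field(void);  /* hypothetical independent tasks */
    extern void init_mesh(void);

    void setup(void)
    {
        #pragma omp parallel
        {
            #pragma omp sections
            {
                #pragma omp section
                init_field();  /* executed once, by some thread */
                #pragma omp section
                init_mesh();   /* executed once, by another thread if available */
            }
        }
    }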
The structured block enclosed by SINGLE is executed by only one thread in the team. Threads in the team that are not executing the SINGLE block wait at the end of the block unless NOWAIT is specified.
TABLE 1-1 identifies the clauses that can appear with this construct.
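A brief C sketch (do_work is hypothetical):

    #include <stdio.h>

    extern void do_work(void);  /* hypothetical */

    void demo(void)
    {
        #pragma omp parallel
        {
            #pragma omp single
            printf("printed by exactly one thread\n");
            /* the other threads wait here, since NOWAIT was not specified */
            do_work();  /* executed by every thread in the team */
        }
    }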
The WORKSHARE construct divides the work of executing the enclosed code block into separate units of work, and causes the threads of the team to share the work such that each unit is executed only once.
There is no C/C++ equivalent to the Fortran WORKSHARE construct.
The combined parallel work-sharing constructs are shortcuts for specifying a parallel region that contains one work-sharing construct.
There are many special conditions and restrictions on these directives and the code they apply to. Refer to the appropriate OpenMP specification document for the complete details. The description that follows is intended only as a summary and is not complete.
TABLE 1-1 identifies the clauses that can appear with these constructs.
Shortcut for specifying a parallel region that contains a single DO or for loop. Equivalent to a PARALLEL directive followed immediately by a DO or for directive. clause can be any of the clauses accepted by the PARALLEL and DO/for directives, except the NOWAIT modifier.
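For example, a C sketch of the combined form (the function axpy is hypothetical):

    void axpy(int n, double alpha, const double *x, double *y)
    {
        int i;
        /* shortcut for a parallel region containing a single for loop */
        #pragma omp parallel for
        for (i = 0; i < n; i++)
            y[i] = alpha * x[i] + y[i];
    }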
Shortcut for specifying a parallel region that contains a single SECTIONS directive. Equivalent to a PARALLEL directive followed by a SECTIONS directive. clause can be any of the clauses accepted by the PARALLEL and SECTIONS directives, except the NOWAIT modifier.
The Fortran PARALLEL WORKSHARE construct provides a shortcut for specifying a parallel region that contains a single WORKSHARE directive. clause can be one of the clauses accepted by the PARALLEL directive.
The following constructs specify thread synchronization. There are many special conditions and restrictions regarding these constructs that are too numerous to summarize here. Programmers are urged to refer to the appropriate OpenMP specification document for the complete details.
Only the master thread of the team executes the block enclosed by this directive. The other threads skip this block and continue. There is no implied barrier on entry to or exit from the master construct.
Restrict access to the structured block to only one thread at a time. The optional name argument identifies the critical region. All unnamed CRITICAL directives map to the same name. Critical section names are global entities of the program and must be unique. For Fortran, if name appears on the CRITICAL directive, it must also appear on the END CRITICAL directive. For C/C++, the identifier used to name a critical region has external linkage and is in a name space which is separate from the name spaces used by labels, tags, members, and ordinary identifiers.
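For example, a C sketch in which threads push results onto a shared stack one at a time (the names stack_update, stack, top, and push are hypothetical):

    double stack[1000];  /* shared state protected by the critical region */
    int top = 0;

    void push(double v)  /* called from within a parallel region */
    {
        #pragma omp critical (stack_update)
        {
            stack[top] = v;
            top++;
        }
    }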
Synchronizes all the threads in a team. Each thread waits until all the others in the team have reached this point.
After all threads in the team have encountered the barrier, each thread in the team begins executing the statements after the BARRIER directive in parallel.
Note that because the barrier pragma does not have a C/C++ statement as part of its syntax, there are restrictions on its placement within a program. See the C/C++ OpenMP specifications for details.
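A brief C sketch (phase1 and phase2 are hypothetical):

    #include <omp.h>

    extern void phase1(int id);  /* hypothetical */
    extern void phase2(int id);

    void demo(void)
    {
        #pragma omp parallel
        {
            phase1(omp_get_thread_num());
            #pragma omp barrier  /* no thread starts phase2 until all
                                    threads have finished phase1 */
            phase2(omp_get_thread_num());
        }
    }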
Ensures that a specific memory location is to be updated atomically, rather than exposing it to the possibility of multiple, simultaneous writing threads.
The directive applies only to the expression-statement immediately following the directive. For C/C++, the statement must have one of the forms x binop= expr, x++, ++x, x--, or --x, where x is an lvalue of scalar type and binop is one of +, *, -, /, &, ^, |, <<, or >>. For Fortran, the statement must have one of the forms x = x operator expr, x = expr operator x, x = intrinsic(x, expr), or x = intrinsic(expr, x).
This implementation replaces all ATOMIC directives by enclosing the expression-statement in a critical section.
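For example, a C sketch in which a shared counter is incremented atomically (the predicate test is hypothetical):

    extern int test(int i);  /* hypothetical predicate */

    int count(int n)
    {
        int i, hits = 0;
        #pragma omp parallel for
        for (i = 0; i < n; i++) {
            if (test(i)) {
                #pragma omp atomic
                hits++;  /* hits++ is one of the permitted forms */
            }
        }
        return hits;
    }

A REDUCTION clause would usually be the more efficient choice for this particular pattern; ATOMIC is shown here only to illustrate the directive.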
Thread-visible Fortran variables or C objects are written back to memory at the point at which this directive appears. The FLUSH directive only provides consistency between operations within the executing thread and global memory. The optional variable-list consists of a comma-separated list of variables or objects that need to be flushed. A FLUSH directive without a variable-list synchronizes all thread-visible shared variables or objects.
Note that because the flush pragma does not have a C/C++ statement as part of its syntax, there are restrictions on its placement within a program. See the C/C++ OpenMP specifications for details.
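A classic use is pairwise signaling through a shared flag. A C sketch (illustrative only; it assumes the team actually gets two threads):

    #include <stdio.h>
    #include <omp.h>

    int data, flag = 0;

    void signal_demo(void)
    {
        #pragma omp parallel num_threads(2) shared(data, flag)
        {
            if (omp_get_thread_num() == 0) {
                data = 42;
                #pragma omp flush(data)  /* publish data first */
                flag = 1;
                #pragma omp flush(flag)  /* then publish the flag */
            } else {
                do {
                    #pragma omp flush(flag)
                } while (flag == 0);     /* spin until the flag is seen */
                #pragma omp flush(data)  /* data is now guaranteed visible */
                printf("received %d\n", data);
            }
        }
    }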
The enclosed block is executed in the order that iterations would be executed in a sequential execution of the loop.
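For example, a C sketch in which results are printed in loop order even though the iterations execute in parallel (heavy_compute is hypothetical):

    #include <stdio.h>

    extern double heavy_compute(int i);  /* hypothetical */

    void demo(int n)
    {
        int i;
        #pragma omp parallel for ordered
        for (i = 0; i < n; i++) {
            double r = heavy_compute(i);  /* runs in parallel */
            #pragma omp ordered
            printf("%d: %g\n", i, r);     /* printed in sequential order */
        }
    }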
The following directives control the data environment during execution of parallel constructs.
Makes the list of objects (Fortran common blocks and named variables, C and C++ named variables) private to a thread but global within the thread.
See the OpenMP specifications for the complete details and restrictions.
Common block names must appear between slashes. To make a common block THREADPRIVATE, this directive must appear after every COMMON declaration of that block.
Each variable in list at file, namespace, or block scope must refer to a variable declaration at file, namespace, or block scope that lexically precedes the pragma.
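For example, a C sketch in which each thread keeps its own copy of a file-scope counter:

    #include <stdio.h>
    #include <omp.h>

    int counter = 0;  /* file-scope; one copy per thread */
    #pragma omp threadprivate(counter)

    int main(void)
    {
        #pragma omp parallel
        {
            counter++;  /* updates this thread's private copy */
            printf("thread %d: counter = %d\n",
                   omp_get_thread_num(), counter);
        }
        return 0;
    }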
This section summarizes the data scoping and scheduling clauses that can appear on OpenMP directives.
Several directives accept clauses that allow a user to control the scope attributes of variables within the extent of the construct. If no data scope clause is specified for a directive, the default scope for variables affected by the directive is SHARED.
Fortran: list is a comma-separated list of named variables or common blocks that are accessible in the scoping unit. Common block names must appear within slashes (for example, /ABLOCK/).
There are important restrictions on the use of these scoping clauses. Refer to the appropriate sections of the OpenMP specifications for complete details.
TABLE 1-1 identifies the directives on which these clauses can appear.
Declares the variables in the optional comma-separated list to be private to each thread in a team.
All the threads in the team share the variables that appear in list, and access the same storage area.
DEFAULT(PRIVATE | SHARED | NONE)
Specify scoping attributes for all variables within a parallel region. THREADPRIVATE variables are not affected by this clause. If not specified, DEFAULT(SHARED) is assumed. A variable's default data-sharing attribute can be overridden by using the private, firstprivate, lastprivate, reduction, and shared clauses.
The variables in list are PRIVATE. In addition, private copies of the variables are initialized from the original object existing before the construct.
The variables in the list are PRIVATE. In addition, when the LASTPRIVATE clause appears on a DO or for directive, the thread that executes the sequentially last iteration updates the original object. On a SECTIONS directive, the thread that executes the lexically last SECTION updates the original object.
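A C sketch combining the two clauses (process is hypothetical):

    extern void process(double v);  /* hypothetical */

    void demo(int n)
    {
        int i, last = -1;
        double x = 10.0;
        #pragma omp parallel for firstprivate(x) lastprivate(last)
        for (i = 0; i < n; i++) {
            last = i;        /* the thread that runs iteration n-1
                                writes its value back to the original */
            process(x + i);  /* each thread's x is initialized to 10.0 */
        }
        /* here last == n - 1 */
    }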
Fortran: The COPYIN clause applies only to variables, common blocks, and variables in common blocks that are declared as THREADPRIVATE. In a parallel region, COPYIN specifies that the data in the master thread of the team be copied to the threadprivate copies at the beginning of the parallel region.
C/C++: The COPYIN clause applies only to variables that are declared as THREADPRIVATE. In a parallel region, COPYIN specifies that the data in the master thread of the team be copied to the threadprivate copies at the beginning of the parallel region.
Fortran: Uses a private variable to broadcast a value, or a pointer to a shared object, from one member of a team to the other members. The COPYPRIVATE clause can only appear on the END SINGLE directive. The broadcast occurs after the execution of the structured block associated with the single construct, and before any threads in the team have left the barrier at the end of the construct. The variables in list must not appear in a PRIVATE or FIRSTPRIVATE clause of the SINGLE construct specifying COPYPRIVATE.
C/C++: Uses a private variable to broadcast a value from one member of a team to the other members. The copyprivate clause can only appear on the single directive. The broadcast occurs after the execution of the structured block associated with the single construct, and before any threads in the team have left the barrier at the end of the construct. The variables in list must not appear in a private or firstprivate clause for the same single directive.
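For example, a C sketch that broadcasts a value read by one thread to the private copies of all team members (read_input is hypothetical):

    extern float read_input(void);  /* hypothetical */

    void demo(void)
    {
        float x;
        #pragma omp parallel private(x)
        {
            #pragma omp single copyprivate(x)
            x = read_input();  /* one thread reads the value */
            /* after the single, every thread's private x holds it */
        }
    }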
REDUCTION(operator|intrinsic:list)
Fortran: operator is one of: +, *, -, .AND., .OR., .EQV., .NEQV.
intrinsic is one of: MAX, MIN, IAND, IOR, IEOR
Variables in list must be named variables of intrinsic type.
C/C++: operator is one of: +, *, -, &, ^, |, &&, ||
The REDUCTION clause is intended to be used on a region in which the reduction variable is used only in reduction statements. The variables in list must be SHARED in the enclosing context. A private copy of each variable is created for each thread as if it were PRIVATE. At the end of the reduction, the shared variable is updated by combining the original value with the final value of each of the private copies.
See the appropriate sections of the OpenMP specifications for complete details and restrictions on REDUCTION clauses and constructs.
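For example, a C sketch of a sum reduction; each thread accumulates into a private copy of sum, and the copies are combined into the shared variable at the end:

    double sum_array(const double *a, int n)
    {
        int i;
        double sum = 0.0;
        #pragma omp parallel for reduction(+:sum)
        for (i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }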
The SCHEDULE clause specifies how iterations in a Fortran DO loop or C/C++ for loop are divided among the threads in a team. TABLE 1-1 shows which directives allow the SCHEDULE clause.
There are important restrictions on the use of these scheduling clauses. Refer to section 2.3.1 in the Fortran specification, and section 2.4.1 in the C/C++ specification for complete details.
Specifies how iterations of the DO or for loop are divided among the threads of the team. type can be one of STATIC, DYNAMIC, GUIDED, or RUNTIME. In the absence of a SCHEDULE clause, Sun Studio compilers use STATIC scheduling. chunk must be an integer expression.
Iterations are divided into pieces of a size specified by chunk. The pieces are statically assigned to threads in the team in a round-robin fashion in the order of the thread number. If not specified, chunk is chosen so that the iterations divide into contiguous chunks nearly equal in size with one chunk assigned to each thread.
Iterations are divided into pieces of a size specified by chunk, and assigned to a waiting thread. As each thread finishes its piece of the iteration space, it dynamically obtains the next set of iterations. When no chunk is specified, it defaults to 1.
With GUIDED, the chunk size is reduced in an exponentially decreasing manner with each dispatched piece of the iterations. chunk specifies the minimum number of iterations to dispatch each time. (The size of the chunks is determined by an implementation-dependent formula; see GUIDED: Determination of Chunk Sizes.) When no chunk is specified, it defaults to 1.
Scheduling is deferred until runtime. The schedule type and chunk size are determined from the value of the OMP_SCHEDULE environment variable. (Default is SCHEDULE(STATIC).)
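For example, a C sketch where iteration costs vary widely, so DYNAMIC scheduling with a small chunk keeps the threads busy (expensive is hypothetical):

    extern double expensive(int i);  /* hypothetical; cost varies with i */

    void demo(double *result, int n)
    {
        int i;
        #pragma omp parallel for schedule(dynamic, 4)
        for (i = 0; i < n; i++)
            result[i] = expensive(i);
    }

Using schedule(runtime) instead would defer the choice to the OMP_SCHEDULE environment variable.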
The OpenMP API provides a NUM_THREADS clause on the PARALLEL, PARALLEL SECTIONS, PARALLEL DO, PARALLEL for, and PARALLEL WORKSHARE directives.
num_threads(scalar_integer_expression)
Specifies the number of threads in the team created when a thread enters a parallel region. scalar_integer_expression is the number of threads requested, and supersedes the number of threads defined by a prior call to the OMP_SET_NUM_THREADS library function, or the value of the OMP_NUM_THREADS environment variable. If dynamic thread management is enabled, the request is the maximum number of threads to use.
Note that num_threads does not apply to subsequent regions.
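A brief C sketch (work is hypothetical):

    #include <omp.h>

    extern void work(int id);  /* hypothetical */

    void demo(void)
    {
        #pragma omp parallel num_threads(4)  /* request a team of 4 threads
                                                for this region only */
        work(omp_get_thread_num());
    }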
TABLE 1-1 shows the clauses that can appear on these directives and pragmas:

TABLE 1-1 Directives and Pragmas Accepting Each Clause

IF: PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
PRIVATE: PARALLEL, DO/for, SECTIONS, SINGLE, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
SHARED: PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
FIRSTPRIVATE: PARALLEL, DO/for, SECTIONS, SINGLE, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
LASTPRIVATE: DO/for, SECTIONS, PARALLEL DO/for, PARALLEL SECTIONS
DEFAULT: PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
REDUCTION: PARALLEL, DO/for, SECTIONS, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
COPYIN: PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
COPYPRIVATE (1): SINGLE
ORDERED: DO/for, PARALLEL DO/for
SCHEDULE: DO/for, PARALLEL DO/for
NOWAIT (2): DO/for, SECTIONS, SINGLE, WORKSHARE (3)
NUM_THREADS: PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS, PARALLEL WORKSHARE (3)
1. Fortran only: COPYPRIVATE can appear on the END SINGLE directive.
2. For Fortran, a NOWAIT modifier can only appear on the END DO, END SECTIONS, END SINGLE, or END WORKSHARE directives.
3. Only Fortran supports WORKSHARE and PARALLEL WORKSHARE.
OpenMP provides a set of callable library routines to control and query the parallel execution environment, a set of general purpose lock routines, and two portable timer routines. Full details appear in the Fortran and C/C++ OpenMP specifications.
The Fortran run-time library routines are external procedures. In the following summary, int_expr is a scalar integer expression, and logical_expr is a scalar logical expression.
The OMP_ functions returning INTEGER(4) and LOGICAL(4) are not intrinsic and must be declared properly; otherwise, the compiler will assume REAL. Interface declarations for the OpenMP Fortran runtime library routines summarized below are provided by the Fortran include file omp_lib.h and a Fortran MODULE omp_lib, as described in the Fortran OpenMP specifications.
Supply an INCLUDE 'omp_lib.h' statement or #include "omp_lib.h" preprocessor directive, or a USE omp_lib statement in every program unit that references these library routines.
Compiling with -Xlist will report any type mismatches.
The integer parameter omp_lock_kind defines the KIND type parameters used for simple lock variables in the OMP_*_LOCK routines.
The integer parameter omp_nest_lock_kind defines the KIND type parameters used for the nestable lock variables in the OMP_*_NEST_LOCK routines.
The integer parameter openmp_version is defined with the value YYYYMM, where YYYY and MM are the year and month designations of the version of the OpenMP Fortran API; this matches the value of the preprocessor macro _OPENMP.
The C/C++ run-time library functions are external functions.
The header <omp.h> declares two types, several functions that can be used to control and query the parallel execution environment, and lock functions that can be used to synchronize access to data.
The type omp_lock_t is an object type capable of representing that a lock is available, or that a thread owns a lock. These locks are referred to as simple locks.
The type omp_nest_lock_t is an object type capable of representing that a lock is available, or that a thread owns a lock. These locks are referred to as nestable locks.
For details, refer to the appropriate OpenMP specifications.
Sets the number of threads to use for subsequent parallel regions not specified with a num_threads() clause. This call affects only the subsequent parallel regions encountered by the calling thread at the same or inner nesting level.
SUBROUTINE OMP_SET_NUM_THREADS(int_expr)
#include <omp.h>
void omp_set_num_threads(int num_threads);
Returns the number of threads currently in the team executing the parallel region from which it is called.
INTEGER(4) FUNCTION OMP_GET_NUM_THREADS()
#include <omp.h>
int omp_get_num_threads(void);
Returns the maximum number of threads that would be used to form a team if an active parallel region specified without a num_threads() clause were to be encountered at this point in the program.
INTEGER(4) FUNCTION OMP_GET_MAX_THREADS()
#include <omp.h>
int omp_get_max_threads(void);
Returns the thread number, within its team, of the thread executing the call to this function. This number lies between 0 and OMP_GET_NUM_THREADS()-1, with 0 being the master thread.
INTEGER(4) FUNCTION OMP_GET_THREAD_NUM()
#include <omp.h>
int omp_get_thread_num(void);
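These two functions are often used together to divide work by hand. A C sketch (handle is hypothetical):

    #include <omp.h>

    extern void handle(int i);  /* hypothetical */

    void divide(int n)
    {
        #pragma omp parallel
        {
            int nt = omp_get_num_threads();
            int id = omp_get_thread_num();  /* 0 .. nt-1 */
            int i;
            for (i = (n * id) / nt; i < (n * (id + 1)) / nt; i++)
                handle(i);  /* this thread's share of the iterations */
        }
    }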
Returns the number of processors available to the program.
INTEGER(4) FUNCTION OMP_GET_NUM_PROCS()
#include <omp.h>
int omp_get_num_procs(void);
Determines whether or not the calling thread is executing within the dynamic extent of a parallel region.
LOGICAL(4) FUNCTION OMP_IN_PARALLEL()
Returns .TRUE. if called within the dynamic extent of an active parallel region, .FALSE. otherwise.
#include <omp.h>
int omp_in_parallel(void);
Returns nonzero if called within the dynamic extent of an active parallel region, zero otherwise.
An active parallel region is a parallel region where the IF clause evaluates to TRUE.
Enables or disables dynamic adjustment of the number of available threads. (Dynamic adjustment is enabled by default.) This call affects only the subsequent parallel regions encountered by the calling thread at the same or inner nesting level.
SUBROUTINE OMP_SET_DYNAMIC(logical_expr)
Dynamic adjustment is enabled when logical_expr evaluates to .TRUE., and is disabled otherwise.
#include <omp.h>
void omp_set_dynamic(int dynamic);
If dynamic evaluates to nonzero, dynamic adjustment is enabled; otherwise it is disabled.
Determines whether or not dynamic thread adjustment is enabled at this point in the program.
LOGICAL(4) FUNCTION OMP_GET_DYNAMIC()
Returns .TRUE. if dynamic thread adjustment is enabled, .FALSE. otherwise.
#include <omp.h>
int omp_get_dynamic(void);
Returns nonzero if dynamic thread adjustment is enabled, zero otherwise.
Enables or disables nested parallelism. This call affects only the subsequent parallel regions encountered by the calling thread at the same or inner nesting level.
SUBROUTINE OMP_SET_NESTED(logical_expr)
Nested parallelism is enabled if logical_expr evaluates to .TRUE., and is disabled otherwise.
#include <omp.h>
void omp_set_nested(int nested);
Nested parallelism is enabled if nested evaluates to non-zero, and is disabled otherwise.
Nested parallelism is disabled by default. See Chapter 2 for information on nested parallelism.
Determines whether or not nested parallelism is enabled at this point in the program.
LOGICAL(4) FUNCTION OMP_GET_NESTED()
Returns .TRUE. if nested parallelism is enabled, .FALSE. otherwise.
#include <omp.h>
int omp_get_nested(void);
Returns nonzero if nested parallelism is enabled, zero otherwise.
See Chapter 2 for information on nested parallelism.
Two types of locks are supported: simple locks and nestable locks. Nestable locks may be locked multiple times by the same thread before being unlocked; simple locks may not be locked if they are already in a locked state. Simple lock variables may only be passed to simple lock routines, and nested lock variables only to nested lock routines.
The lock variable var must be accessed only through these routines. Use the parameters OMP_LOCK_KIND and OMP_NEST_LOCK_KIND (defined in omp_lib.h INCLUDE file and the omp_lib MODULE) for this purpose. For example,
INTEGER(KIND=OMP_LOCK_KIND) :: var
INTEGER(KIND=OMP_NEST_LOCK_KIND) :: nvar
Simple lock variables must have type omp_lock_t and must be accessed only through these functions. All simple lock functions require an argument that points to omp_lock_t type.
Nested lock variables must have type omp_nest_lock_t, and similarly all nested lock functions require an argument that points to omp_nest_lock_t type.
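For example, a C sketch that serializes access to shared state with a simple lock:

    #include <omp.h>

    omp_lock_t lk;

    void demo(void)
    {
        omp_init_lock(&lk);  /* initialize before first use */
        #pragma omp parallel
        {
            omp_set_lock(&lk);    /* block until ownership is granted */
            /* ... exclusive access to shared state ... */
            omp_unset_lock(&lk);  /* release ownership */
        }
        omp_destroy_lock(&lk);    /* discard when no longer needed */
    }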
Initialize a lock variable for subsequent calls.
SUBROUTINE OMP_INIT_LOCK(var)
SUBROUTINE OMP_INIT_NEST_LOCK(nvar)
#include <omp.h>
void omp_init_lock(omp_lock_t *lock);
void omp_init_nest_lock(omp_nest_lock_t *lock);
Disassociates the given lock variable from any locks.
SUBROUTINE OMP_DESTROY_LOCK(var)
SUBROUTINE OMP_DESTROY_NEST_LOCK(nvar)
void omp_destroy_lock(omp_lock_t *lock);
void omp_destroy_nest_lock(omp_nest_lock_t *lock);
Forces the executing thread to wait until the specified lock is available. The thread is granted ownership of the lock when it is available.
SUBROUTINE OMP_SET_LOCK(var)
SUBROUTINE OMP_SET_NEST_LOCK(nvar)
void omp_set_lock(omp_lock_t *lock);
void omp_set_nest_lock(omp_nest_lock_t *lock);
Releases the executing thread from ownership of the lock. Behavior is undefined if the thread does not own that lock.
SUBROUTINE OMP_UNSET_LOCK(var)
SUBROUTINE OMP_UNSET_NEST_LOCK(nvar)
void omp_unset_lock(omp_lock_t *lock);
void omp_unset_nest_lock(omp_nest_lock_t *lock);
OMP_TEST_LOCK attempts to set the lock associated with the lock variable; the call does not block execution of the thread.
OMP_TEST_NEST_LOCK attempts to set the nestable lock associated with the lock variable, and returns the new nesting count if the lock was set successfully, zero otherwise; the call does not block execution of the thread.
LOGICAL(4) FUNCTION OMP_TEST_LOCK(var)
Returns .TRUE. if the lock was set, .FALSE. otherwise.
INTEGER(4) FUNCTION OMP_TEST_NEST_LOCK(nvar)
Returns nesting count if lock was set successfully, zero otherwise.
#include <omp.h>
int omp_test_lock(omp_lock_t *lock);
Returns a nonzero value if lock was set successfully, zero otherwise.
int omp_test_nest_lock(omp_nest_lock_t *lock);
Returns lock nest count if lock was set successfully, zero otherwise.
Two functions support a portable wall clock timer.
Returns the elapsed wall clock time in seconds "since some arbitrary time in the past".
REAL(8) FUNCTION OMP_GET_WTIME()
#include <omp.h>
double omp_get_wtime(void);
Returns the number of seconds between successive clock ticks.
REAL(8) FUNCTION OMP_GET_WTICK()
#include <omp.h>
double omp_get_wtick(void);
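For example, a C sketch that times a section of code (compute is hypothetical):

    #include <stdio.h>
    #include <omp.h>

    extern void compute(void);  /* hypothetical work */

    void timed(void)
    {
        double start = omp_get_wtime();
        compute();
        printf("elapsed = %g s (clock tick = %g s)\n",
               omp_get_wtime() - start, omp_get_wtick());
    }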