JavaScript is required to for searching.
Skip Navigation Links
Exit Print View
Oracle Solaris Studio 12.3: OpenMP API User's Guide     Oracle Solaris Studio 12.3 Information Library
search filter icon
search icon

Document Information

Preface

1.  Introducing the OpenMP API

2.  Compiling and Running OpenMP Programs

3.  Implementation-Defined Behaviors

4.  Nested Parallelism

4.1 Execution Model

4.2 Control of Nested Parallelism

4.2.1 OMP_NESTED

4.2.2 OMP_THREAD_LIMIT

4.2.3 OMP_MAX_ACTIVE_LEVELS

4.3 Using OpenMP Library Routines Within Nested Parallel Regions

4.4 Some Tips on Using Nested Parallelism

5.  Tasking

6.  Automatic Scoping of Variables

7.  Scope Checking

8.  Performance Considerations

A.  Placement of Clauses on Directives

Index

4.2 Control of Nested Parallelism

Nested parallelism can be controlled at runtime by setting various environment variables prior to execution of the program.

4.2.1 OMP_NESTED

Nested parallelism can be enabled or disabled by setting the OMP_NESTED environment variable or calling omp_set_nested().

The following example has three levels of nested parallel constructs.

Example 4-1 Nested Parallelism Example

#include <omp.h>
#include <stdio.h>
void report_num_threads(int level)
{
    #pragma omp single
    {
        printf("Level %d: number of threads in the team - %d\n",
                  level, omp_get_num_threads());
    }
 }
int main()
{
    omp_set_dynamic(0);
    #pragma omp parallel num_threads(2)
    {
        report_num_threads(1);
        #pragma omp parallel num_threads(2)
        {
            report_num_threads(2);
            #pragma omp parallel num_threads(2)
            {
                report_num_threads(3);
            }
        }
    }
    return(0);
}

Compiling and running this program with nested parallelism enabled produces the following (sorted) output:

% setenv OMP_NESTED TRUE
% a.out
Level 1: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2

The following example runs the same program but with nested parallelism disabled:

% setenv OMP_NESTED FALSE
% a.out
Level 1: number of threads in the team - 2
Level 2: number of threads in the team - 1
Level 3: number of threads in the team - 1
Level 2: number of threads in the team - 1
Level 3: number of threads in the team - 1

4.2.2 OMP_THREAD_LIMIT

The OpenMP runtime library maintains a pool of threads that can be used as slave threads in parallel regions. The setting of the OMP_THREAD_LIMIT environment variable controls the number of threads in the pool. By default, the number of threads in the pool is at most 1023.

The thread pool consists of only non-user threads that the runtime library creates. The pool does not include the initial thread or any thread created explicitly by the user's program.

If OMP_THREAD_LIMIT is set to 1 (or SUNW_MP_MAX_POOL_THREADS is set to 0), then the thread pool will be empty and all parallel regions will be executed by one thread.

The following example shows that a parallel region can get fewer slave threads if the pool does not contain sufficient threads. The code is the same as Example 4-1. The number of threads needed for all the parallel regions to be active at the same time is 8. Therefore, the pool needs to contain at least 7 threads. If OMP_THREAD_LIMIT is set to 6 (or SUNW_MP_MAX_POOL_THREADS to 5), then the pool contains at most 5 slave threads. This implies that two of the four innermost parallel regions might not be able to get all the slave threads requested. The following example shows one possible result:

% setenv OMP_NESTED TRUE
% OMP_THREAD_LIMIT 6
% a.out
Level 1: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 1
Level 3: number of threads in the team - 1

4.2.3 OMP_MAX_ACTIVE_LEVELS

The environment variable OMP_MAX_ACTIVE_LEVELS controls the maximum number of nested active parallel regions. A parallel region is active if it is executed by a team consisting of more than one thread. The default maximum number of nested active parallel regions is 4.

Note that setting this environment variable is not sufficient to enable nested parallelism. This environment variable simply controls the the maximum number of nested active parallel regions; it does not enable nested parallelism. To enable nested parallelism, OMP_NESTED must be set to TRUE, or omp_set_nested() must be called with an argument that evaluates to true.

The following code will create 4 levels of nested parallel regions. If OMP_MAX_ACTIVE_LEVELS is set to 2, then nested parallel regions at nested depth of 3 and 4 are executed single-threaded.

#include <omp.h>
#include <stdio.h>
#define DEPTH 5
void report_num_threads(int level)
{
    #pragma omp single
    {
        printf("Level %d: number of threads in the team - %d\n",
               level, omp_get_num_threads());
    }
}
void nested(int depth)
{
    if (depth == DEPTH)
        return;

    #pragma omp parallel num_threads(2)
    {
        report_num_threads(depth);
        nested(depth+1);
    }
}
int main()
{
    omp_set_dynamic(0);
    omp_set_nested(1);
    nested(1);
    return(0);
}

The following example shows the possible results from compiling and running this program with a maximum nesting level of 4. (Actual results would depend on how the OS schedules threads.)

% setenv OMP_MAX_ACTIVE_LEVELS 4
% a.out |sort 
Level 1: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 3: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2
Level 4: number of threads in the team - 2

The following example shows a possible result running with the nesting level set at 2:

% setenv OMP_MAX_ACTIVE_LEVELS 2
% a.out |sort 
Level 1: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 2: number of threads in the team - 2
Level 3: number of threads in the team - 1
Level 3: number of threads in the team - 1
Level 3: number of threads in the team - 1
Level 3: number of threads in the team - 1
Level 4: number of threads in the team - 1
Level 4: number of threads in the team - 1
Level 4: number of threads in the team - 1
Level 4: number of threads in the team - 1

Again, these examples only show some possible results. Actual results would depend on how the OS schedules threads.