Oracle® Developer Studio 12.5: OpenMP API User's Guide


Updated: July 2016
 
 

3.2 Control of Nested Parallelism

Nested parallelism can be controlled by setting environment variables before the program is executed, or by calling the omp_set_nested() runtime routine. This section discusses the environment variables that control nested parallelism.

3.2.1 OMP_NESTED

Nested parallelism can be enabled or disabled by setting the OMP_NESTED environment variable to TRUE or FALSE. By default, nested parallelism is disabled.

The following example has three levels of nested parallel constructs.

Example 1  Nested Parallelism Example
#include <omp.h>
#include <stdio.h>
void report_num_threads(int level)
{
    #pragma omp single
    {
        printf("Level %d: number of threads in the team = %d\n",
               level, omp_get_num_threads());
    }
}
int main()
{
    omp_set_dynamic(0);
    #pragma omp parallel num_threads(2)
    {
        report_num_threads(1);
        #pragma omp parallel num_threads(2)
        {
            report_num_threads(2);
            #pragma omp parallel num_threads(2)
            {
                report_num_threads(3);
            }
        }
    }
    return(0);
}

Compiling and running this program with nested parallelism enabled produces the following (sorted) output:

% setenv OMP_NESTED TRUE
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2

Running the program with nested parallelism disabled produces the following output:

% setenv OMP_NESTED FALSE
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 1
Level 2: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
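
The same switch can be flipped from inside the program with the omp_set_nested() routine mentioned at the start of this section. The following is a minimal sketch, not part of the original guide. The function name inner_team_size and the serial stand-in routines are assumptions made for illustration, so that the code also builds when the compiler's OpenMP option is not used.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) so that it also builds
   when the compiler's OpenMP option is not used. */
static void omp_set_dynamic(int flag) { (void)flag; }
static void omp_set_nested(int flag)  { (void)flag; }
static int  omp_get_num_threads(void) { return 1; }
#endif

/* Runs one parallel region nested inside a two-thread region and returns
   the size of the inner team. enable_nested must be 0 or 1. */
int inner_team_size(int enable_nested)
{
    int size = 1;

    assert(enable_nested == 0 || enable_nested == 1);
    omp_set_dynamic(0);              /* keep requested team sizes fixed */
    omp_set_nested(enable_nested);   /* runtime counterpart of OMP_NESTED */

    #pragma omp parallel num_threads(2)
    {
        /* Only one outer thread opens an inner region, so exactly one
           thread ever writes to size. */
        #pragma omp single
        {
            #pragma omp parallel num_threads(2)
            {
                #pragma omp single
                size = omp_get_num_threads();
            }
        }
    }
    return size;
}
```

Calling inner_team_size(0) should return 1, which matches the single-threaded inner regions in the output above. With inner_team_size(1), a return value of 2 would be expected when enough threads are available, but the runtime may still supply fewer.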

3.2.2 OMP_THREAD_LIMIT

The OMP_THREAD_LIMIT environment variable sets the maximum number of OpenMP threads to use for the whole program. This number includes the initial (or main) thread as well as the OpenMP helper threads that the OpenMP runtime library creates. By default, the maximum is 1024 (one initial or main thread and 1023 OpenMP helper threads).

Note that the thread pool consists of only OpenMP helper threads that the OpenMP runtime library creates. The pool does not include the initial (or main) thread or any thread created explicitly by the user's program.

If OMP_THREAD_LIMIT is set to 1, then the helper thread pool will be empty and all parallel regions will be executed by one thread (the initial or main thread).

The following example output shows that a parallel region might get fewer helper threads than it requests if the pool does not contain enough of them. The code is the same as in Example 1, but the program is run with the environment variable OMP_THREAD_LIMIT set to 6. For all the parallel regions to be active at the same time, 8 threads are needed (2 × 2 × 2 at the innermost level), so the pool would need to contain at least 7 helper threads. With OMP_THREAD_LIMIT set to 6, the pool contains at most 5 helper threads. Therefore, two of the four innermost parallel regions might not be able to get the helper thread that each requests. The following example shows one possible result.

% setenv OMP_NESTED TRUE
% setenv OMP_THREAD_LIMIT 6
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
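
The arithmetic behind this result can be written out as a short sketch. The function names threads_needed, helpers_available, and helper_shortfall are hypothetical and are not part of the OpenMP API. The sketch assumes that every parallel region at every level requests the same team size, as in Example 1.

```c
#include <assert.h>

/* Threads needed for every region at the innermost level to be active at
   once: team_size multiplied by itself once per nesting level. */
long threads_needed(int levels, int team_size)
{
    long total = 1;

    assert(levels >= 0 && team_size >= 1);
    for (int i = 0; i < levels; i++)
        total *= team_size;
    return total;
}

/* Helper threads available under a given OMP_THREAD_LIMIT value. The
   initial thread counts toward the limit but is not part of the pool. */
long helpers_available(long thread_limit)
{
    assert(thread_limit >= 1);
    return thread_limit - 1;
}

/* Number of requested helper threads that cannot be supplied (0 if none). */
long helper_shortfall(int levels, int team_size, long thread_limit)
{
    long required  = threads_needed(levels, team_size) - 1;
    long available = helpers_available(thread_limit);

    return required > available ? required - available : 0;
}
```

For Example 1, threads_needed(3, 2) is 8 and helpers_available(6) is 5, so helper_shortfall(3, 2, 6) is 2. That matches the two innermost regions that report a team of one thread in the output above. Which regions come up short depends on scheduling.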

3.2.3 OMP_MAX_ACTIVE_LEVELS

The environment variable OMP_MAX_ACTIVE_LEVELS controls the maximum number of nested active parallel regions. A parallel region is active if it is executed by a team consisting of more than one thread. If the variable is not set, the default is 4.

Note that setting this environment variable simply controls the maximum number of nested active parallel regions; it does not enable nested parallelism. To enable nested parallelism, OMP_NESTED must be set to TRUE, or omp_set_nested() must be called with an argument that evaluates to true.

The following sample code creates 4 levels of nested parallel regions.

#include <omp.h>
#include <stdio.h>
#define DEPTH 4
void report_num_threads(int level)
{
    #pragma omp single
    {
        printf("Level %d: number of threads in the team = %d\n",
               level, omp_get_num_threads());
    }
}
void nested(int depth)
{
    if (depth > DEPTH)
        return;

    #pragma omp parallel num_threads(2)
    {
        report_num_threads(depth);
        nested(depth+1);
    }
}
int main()
{
    omp_set_dynamic(0);
    omp_set_nested(1);
    nested(1);
    return(0);
}

The following output shows a possible result from compiling and running the sample code when DEPTH is set to 4. Actual results depend on how the operating system schedules the threads.

% setenv OMP_NESTED TRUE
% setenv OMP_MAX_ACTIVE_LEVELS 4
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2

If OMP_MAX_ACTIVE_LEVELS is set to 2, then nested parallel regions at nesting depths of 3 and 4 are executed single-threaded. The following example shows a possible result.

% setenv OMP_NESTED TRUE
% setenv OMP_MAX_ACTIVE_LEVELS 2
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1
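
The line counts in both outputs follow a simple pattern, which the sketch below models. The function names team_size_at and regions_at are hypothetical and are not part of the OpenMP API. The model assumes that nested parallelism is enabled, that every region requests the same team size, and that the thread limit is high enough for every request to be met.

```c
#include <assert.h>

/* Size of each team at a 1-based nesting level: regions deeper than
   max_active_levels are executed by a single thread. */
int team_size_at(int level, int team_size, int max_active_levels)
{
    assert(level >= 1 && team_size >= 1 && max_active_levels >= 0);
    return level <= max_active_levels ? team_size : 1;
}

/* Number of parallel regions at a nesting level: one region for each
   thread in the enclosing levels. Each region prints one line in the
   sample code, so this is also the number of output lines per level. */
long regions_at(int level, int team_size, int max_active_levels)
{
    long regions = 1;

    assert(level >= 1);
    for (int l = 1; l < level; l++)
        regions *= team_size_at(l, team_size, max_active_levels);
    return regions;
}
```

With max_active_levels set to 4, regions_at(4, 2, 4) is 8 and each team has 2 threads. With it set to 2, regions_at(3, 2, 2) and regions_at(4, 2, 2) are both 4 and each team has 1 thread. Both cases agree with the outputs shown in this section.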