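The OMP_MAX_ACTIVE_LEVELS environment variable controls the maximum number of nested active parallel regions. A parallel region is active if it is executed by a team consisting of more than one thread. If OMP_MAX_ACTIVE_LEVELS is not set, the default value is 4.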
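Note that setting this environment variable only limits the maximum number of nested active parallel regions; it does not enable nested parallelism. To enable nested parallelism, OMP_NESTED must be set to TRUE, or omp_set_nested() must be called with an argument that evaluates to true.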
The following sample code creates four levels of nested parallel regions.
#include <omp.h>
#include <stdio.h>

#define DEPTH 4

void report_num_threads(int level)
{
    #pragma omp single
    {
        printf("Level %d: number of threads in the team = %d\n",
               level, omp_get_num_threads());
    }
}

void nested(int depth)
{
    if (depth > DEPTH)
        return;

    #pragma omp parallel num_threads(2)
    {
        report_num_threads(depth);
        nested(depth+1);
    }
}

int main()
{
    omp_set_dynamic(0);
    omp_set_nested(1);
    nested(1);
    return(0);
}
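The following output shows a possible result of compiling and running the sample code with DEPTH set to 4. The actual result depends on how the operating system schedules threads.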
% setenv OMP_NESTED TRUE
% setenv OMP_MAX_ACTIVE_LEVELS 4
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 3: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
Level 4: number of threads in the team = 2
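If OMP_MAX_ACTIVE_LEVELS is set to 2, the nested parallel regions at nesting depths 3 and 4 are each executed by a single thread. The following example shows a possible result.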
% setenv OMP_NESTED TRUE
% setenv OMP_MAX_ACTIVE_LEVELS 2
% a.out | sort
Level 1: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 2: number of threads in the team = 2
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 3: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1
Level 4: number of threads in the team = 1