Oracle Solaris Studio 12.3: OpenMP API User's Guide
Nesting parallel regions provides an immediate way to allow more threads to participate in the computation.
For example, suppose you have a program that contains two levels of parallelism, and the degree of parallelism at each level is 2. Also suppose your system has four CPUs and you want to use all four to speed up the execution of this program. Parallelizing only one level uses only two CPUs. To use all four, parallelize both levels.
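The two-level scenario above can be sketched in C as follows. This is a minimal illustration, not code from this guide: `run_nested` is a hypothetical helper, the loop bodies stand in for real work, and the sketch assumes that nested parallelism is enabled (for example with OMP_NESTED=TRUE) and that the runtime grants the requested team sizes. The `#ifdef _OPENMP` fallbacks exist only so that the sketch also builds as a serial program.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static void omp_set_max_active_levels(int n) { (void)n; }
#endif

/* Hypothetical helper: runs `outer` iterations in an outer parallel loop,
   each of which runs `inner` iterations in a nested parallel loop.
   Returns the number of inner work items executed (outer * inner). */
int run_nested(int outer, int inner)
{
    int items = 0;

    /* Permit two active levels of parallelism. */
    omp_set_max_active_levels(2);

    #pragma omp parallel for num_threads(outer) reduction(+:items)
    for (int i = 0; i < outer; i++) {
        int outer_id = omp_get_thread_num();

        /* Each outer thread requests its own inner team. */
        #pragma omp parallel for num_threads(inner) reduction(+:items)
        for (int j = 0; j < inner; j++) {
            printf("item (%d,%d): outer thread %d, inner thread %d\n",
                   i, j, outer_id, omp_get_thread_num());
            items += 1;
        }
    }
    return items;
}
```

Calling `run_nested(2, 2)` requests two outer threads, each of which requests a two-thread inner team, so up to four threads can be busy at once on a four-CPU system.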
Nesting parallel regions can easily create too many threads and oversubscribe the system. Set OMP_THREAD_LIMIT and OMP_MAX_ACTIVE_LEVELS appropriately to limit the number of threads in use and prevent runaway oversubscription.
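To see how quickly nesting multiplies thread counts, the following C sketch estimates the worst-case demand and compares it with the limit in effect. The helpers `nested_thread_demand` and `exceeds_thread_limit` are hypothetical names introduced for illustration. The runtime queries `omp_get_thread_limit()` and `omp_get_max_active_levels()` report the values that OMP_THREAD_LIMIT and OMP_MAX_ACTIVE_LEVELS control. The serial fallbacks are an assumption so that the sketch builds without OpenMP.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch still builds without OpenMP support. */
static int omp_get_thread_limit(void) { return 1; }
static int omp_get_max_active_levels(void) { return 1; }
#endif

/* Hypothetical helper: worst-case number of threads if every thread at
   each nesting level starts a team of the given size. */
long nested_thread_demand(const int *team_sizes, int levels)
{
    long total = 1;
    for (int i = 0; i < levels; i++)
        total *= team_sizes[i];
    return total;
}

/* Hypothetical helper: prints the limits currently in effect and
   returns 1 if the demand is larger than the thread limit, else 0. */
int exceeds_thread_limit(long demand)
{
    int limit = omp_get_thread_limit();

    printf("thread limit: %d, max active levels: %d, demand: %ld\n",
           limit, omp_get_max_active_levels(), demand);
    return demand > (long)limit;
}
```

With three nested levels of eight threads each, `nested_thread_demand` returns 512, far more than a small system has CPUs. Setting OMP_THREAD_LIMIT and OMP_MAX_ACTIVE_LEVELS in the environment before the run keeps the runtime from creating that many threads.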
Creating nested parallel regions adds overhead. If the outer level has enough parallelism and the load is balanced, using all the threads at the outer level of the computation will be more efficient than creating nested parallel regions at the inner levels.
For example, suppose you have a program that contains two levels of parallelism. The degree of parallelism at the outer level is four and the load is balanced. You have a system with four CPUs and want to use all four CPUs to speed up the execution of this program. In general, using all four threads for the outer level could yield better performance than using two threads for the outer parallel region and using the other two threads as slave threads for the inner parallel regions.
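The outer-level-only alternative described above might look like the following C sketch. The helper `sum_outer_only` and its arithmetic are hypothetical stand-ins for real work. The point is only that all four threads are assigned to the outer loop and the inner loop runs serially within each thread, so no nested teams are created.

```c
/* Hypothetical helper: parallelizes only the outer loop with four
   threads and leaves the inner loop serial.
   Returns the sum of i * j over all (i, j) pairs. */
long sum_outer_only(int outer, int inner)
{
    long total = 0;

    /* All four threads work at the outer level. */
    #pragma omp parallel for num_threads(4) reduction(+:total)
    for (int i = 0; i < outer; i++) {
        /* Inner level stays serial: no nested region, no extra overhead. */
        for (int j = 0; j < inner; j++)
            total += (long)i * j;
    }
    return total;
}
```

With an outer degree of parallelism of four, a call such as `sum_outer_only(4, 1000)` gives each thread one balanced outer iteration. Whether this actually beats a two-by-two nested arrangement depends on the workload and should be measured.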