Oracle Solaris Studio 12.3: OpenMP API User's Guide
Tasking introduces a layer of complexity to an OpenMP program. The programmer needs to pay special attention to how a program with tasks works. This section discusses some programming issues to consider.
When a thread encounters a task scheduling point, the implementation may choose to suspend the current task and schedule the thread to work on another task. This behavior implies that the value of a threadprivate variable, or other thread-specific information such as the thread number, may change across a task scheduling point.
If the suspended task is tied, then the thread that resumes executing the task will be the same thread that suspended it. Therefore, the thread number will remain the same after the task is resumed. However, the value of a threadprivate variable may change because the thread might have been scheduled to work on another task that modified the threadprivate variable before resuming the suspended task.
If the suspended task is untied, then the thread that resumes executing the task may be different from the thread that suspended it. Therefore, both the thread number and the value of a threadprivate variable before and after the task scheduling point might be different.
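One way to avoid depending on a threadprivate value across a task scheduling point is to copy it into a task-local variable first. The following is a minimal sketch of that idea, not taken from the guide: the names scratch, clobber(), and safe_read() are hypothetical and exist only for illustration. The taskwait is a task scheduling point, and the child task may overwrite the thread's copy of scratch before the parent task resumes.

```c
#include <stdio.h>

/* Hypothetical per-thread scratch value, for illustration only. */
static int scratch = 0;
#pragma omp threadprivate(scratch)

/* Child task body: overwrites the executing thread's copy of scratch. */
static void clobber(void)
{
    scratch = -1;
}

/* Returns the value the parent task relies on after the scheduling point. */
int safe_read(int value)
{
    int result = 0;

    #pragma omp parallel num_threads(2) shared(result)
    {
        #pragma omp single
        {
            #pragma omp task shared(result) firstprivate(value)
            {
                scratch = value;
                int saved = scratch;   /* task-local snapshot */

                #pragma omp task
                clobber();

                /* Task scheduling point: scratch may have been changed
                   by the child task if it ran on this thread. */
                #pragma omp taskwait

                result = saved;        /* use the snapshot, not scratch */
            }
        }
    }
    return result;
}
```

Reading scratch directly after the taskwait could yield either the original value or the value written by the child task, depending on which thread ran the child. The snapshot in saved is unaffected either way.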
Beginning with OpenMP 3.0, locks are owned by tasks rather than by threads. Once a task acquires a lock, that task owns it, and the same task must release it before the task completes.
The critical construct, on the other hand, remains a thread-based mutual exclusion mechanism.
The change in lock ownership requires extra care when using locks. The following example (which appears in Appendix A of the OpenMP 3.1 Specification) is conforming in OpenMP 2.5 because the thread that releases the lock lck in the parallel region is the same thread that acquired the lock in the sequential part of the program. The master thread of a parallel region and the initial thread are the same. However, the example is not conforming in OpenMP 3.0 and later, including 3.1, because the task region that releases the lock lck is different from the task region that acquires the lock.
Example 5-2 Example Using Locks: Non-Conforming in OpenMP 3.0
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>

int main()
{
    int x;
    omp_lock_t lck;

    omp_init_lock (&lck);
    omp_set_lock (&lck);
    x = 0;

    #pragma omp parallel shared (x)
    {
        #pragma omp master
        {
            x = x + 1;
            omp_unset_lock (&lck);
        }
    }

    omp_destroy_lock (&lck);
}
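For comparison, one possible way to keep the lock usage conforming is to acquire and release the lock inside the same task region. The sketch below is an illustration under that assumption, not code from the guide or the specification. The function name locked_increment() is hypothetical, and the serial fallback stubs exist only so that the sketch also builds when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback stubs: an assumption for builds without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Hypothetical helper: increments x once under the lock and returns it. */
int locked_increment(void)
{
    int x = 0;
    omp_lock_t lck;

    omp_init_lock(&lck);

    #pragma omp parallel shared(x, lck)
    {
        #pragma omp master
        {
            /* The lock is set and unset in the same task region,
               so the task that owns the lock is the one releasing it. */
            omp_set_lock(&lck);
            x = x + 1;
            omp_unset_lock(&lck);
        }
    }

    omp_destroy_lock(&lck);
    return x;
}
```

The lock is still initialized and destroyed in the sequential part of the program. Only the acquire and release pair has moved, so that both calls are made by the same task.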
A task is likely to have references to data on the stack of the routine where the task construct appears. Because the execution of a task may be deferred until the next implicit or explicit barrier, a given task could execute after the stack of the routine where it appears has already been popped and the stack data overwritten, thereby destroying the stack data listed as shared by the task.
The programmer is responsible for inserting the needed synchronizations to ensure that variables are still on the stack when the task references them, as illustrated in the following two examples.
In the first example, i is specified to be shared in the task construct, and the task accesses the copy of i that is allocated on the stack of work().
Because task execution may be deferred, the tasks might not run until the implicit barrier at the end of the parallel region in main(), after the work() routine has already returned. When such a task references i, it reads whatever value happens to be at that stack location at the time.
For correct results, the programmer needs to make sure that work() does not exit before the tasks have completed. This is done by inserting a taskwait directive after the task construct. Alternatively, i can be specified to be firstprivate in the task construct, instead of shared.
Example 5-3 Stack Data: First Example, Incorrect Version
#include <stdio.h>
#include <omp.h>

void work()
{
    int i;

    i = 10;
    #pragma omp task shared(i)
    {
        #pragma omp critical
        printf("In Task, i = %d\n",i);
    }
}

int main(int argc, char** argv)
{
    omp_set_num_threads(8);
    omp_set_dynamic(0);

    #pragma omp parallel
    {
        work();
    }
}
Example 5-4 Stack Data: First Example, Corrected Version
#include <stdio.h>
#include <omp.h>

void work()
{
    int i;

    i = 10;
    #pragma omp task shared(i)
    {
        #pragma omp critical
        printf("In Task, i = %d\n",i);
    }

    /* Use TASKWAIT for synchronization. */
    #pragma omp taskwait
}

int main(int argc, char** argv)
{
    omp_set_num_threads(8);
    omp_set_dynamic(0);

    #pragma omp parallel
    {
        work();
    }
}
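The firstprivate alternative mentioned above can be sketched as follows. This is an illustrative variant rather than code from the guide. The counters tasks_run and bad_reads and the driver run_work() are hypothetical additions used only to check what the tasks observed, and the team size is requested with a num_threads clause instead of omp_set_num_threads().

```c
#include <stdio.h>

/* Hypothetical counters, used only to check what the tasks observed. */
static int tasks_run = 0;
static int bad_reads = 0;

void work(void)
{
    int i = 10;

    /* firstprivate(i): each task gets its own copy of i, initialized
       when the task is created, so the task no longer refers to the
       stack of work() and no taskwait is required. */
    #pragma omp task firstprivate(i)
    {
        #pragma omp critical
        {
            printf("In Task, i = %d\n", i);
            tasks_run++;
            if (i != 10)
                bad_reads++;
        }
    }
}

/* Returns the number of tasks that saw a value other than 10. */
int run_work(void)
{
    tasks_run = 0;
    bad_reads = 0;

    #pragma omp parallel num_threads(8)
    {
        work();
    }
    /* All tasks have completed at the implicit barrier above. */
    return bad_reads;
}
```

Compared with the taskwait version, this variant lets work() return immediately, at the cost of copying i into each task. That trade-off is usually reasonable for small scalar values.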
In the second example, j in the task construct references the j in the sections construct. Therefore, the task accesses the firstprivate copy of j in the sections construct, which (in some implementations, including the Oracle Solaris Studio compilers) is a local variable on the stack of the outlined routine for the sections construct.
Because task execution may be deferred, the task might not run until the implicit barrier at the end of the sections region, after the outlined routine for the sections construct has exited. When the task then references j, it reads an undetermined value from the stack.
For correct results, the programmer needs to make sure that the task is executed before the sections region reaches its implicit barrier by inserting a taskwait directive after the task construct. Alternatively, j can be specified to be firstprivate in the task construct, instead of shared.
Example 5-5 Stack Data: Second Example, Incorrect Version
#include <stdio.h>
#include <omp.h>

int main(int argc, char** argv)
{
    omp_set_num_threads(2);
    omp_set_dynamic(0);
    int j=100;

    #pragma omp parallel shared(j)
    {
        #pragma omp sections firstprivate(j)
        {
            #pragma omp section
            {
                #pragma omp task shared(j)
                {
                    #pragma omp critical
                    printf("In Task, j = %d\n",j);
                }
            }
        }
    }

    printf("After parallel, j = %d\n",j);
}
Example 5-6 Stack Data: Second Example, Corrected Version
#include <stdio.h>
#include <omp.h>

int main(int argc, char** argv)
{
    omp_set_num_threads(2);
    omp_set_dynamic(0);
    int j=100;

    #pragma omp parallel shared(j)
    {
        #pragma omp sections firstprivate(j)
        {
            #pragma omp section
            {
                #pragma omp task shared(j)
                {
                    #pragma omp critical
                    printf("In Task, j = %d\n",j);
                }

                /* Use TASKWAIT for synchronization. */
                #pragma omp taskwait
            }
        }
    }

    printf("After parallel, j = %d\n",j);
}
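The firstprivate alternative for the second example might look like the sketch below. It is an illustration only. The function run_sections() and the variable seen are hypothetical names introduced so that the value observed by the task can be returned and checked, and the team size is requested with a num_threads clause.

```c
#include <stdio.h>

/* Returns the value of j that the task observed. */
int run_sections(void)
{
    int j = 100;
    int seen = -1;   /* hypothetical: records what the task read */

    #pragma omp parallel num_threads(2) shared(j, seen)
    {
        #pragma omp sections firstprivate(j)
        {
            #pragma omp section
            {
                /* firstprivate(j): the task copies the sections
                   construct's j when the task is created, so it does
                   not depend on that copy still being on the stack. */
                #pragma omp task firstprivate(j) shared(seen)
                {
                    #pragma omp critical
                    {
                        printf("In Task, j = %d\n", j);
                        seen = j;
                    }
                }
            }
        }
    }

    /* seen belongs to run_sections(), which outlives the parallel
       region, so sharing it with the task is safe here. */
    return seen;
}
```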