CHAPTER 4

Implementation-Defined Behaviors

This chapter notes specific issues in the OpenMP 2.0 Fortran and C/C++ specifications that are implementation dependent. For last-minute information regarding the latest compiler releases, see the C, C++, and Fortran readme files.

 

In the absence of an explicit OMP_SCHEDULE environment variable or an explicit SCHEDULE clause, the default is static scheduling.

Without an explicit num_threads() clause, a call to the omp_set_num_threads() function, or an explicit definition of the OMP_NUM_THREADS environment variable, the default number of threads in a team is 1.

If dynamic adjustment is enabled, the number of threads in the team is adjusted to be the minimum of:

the number of threads the user requested

1 + the number of available threads in the pool

the number of available processors

If dynamic adjustment is disabled, then the number of threads in the team will be the minimum of:

the number of threads the user requested

1 + the number of available threads in the pool

If the number of threads supplied is less than the number the user requested and SUNW_MP_WARN is set to TRUE or a callback function is registered through a call to sunw_mp_register_warn(), a warning message will be issued.

In exceptional situations, such as when there is a lack of system resources, the number of threads supplied will be less than described above. In these situations, if dynamic adjustment is disabled and SUNW_MP_WARN is set to TRUE or a callback function is registered via a call to sunw_mp_register_warn(), a warning message will be issued.
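The team-size rules above can be condensed into a small helper. This is a minimal sketch, not the runtime library's code: team_size() and its parameter names are hypothetical, introduced only for illustration, and the sketch ignores the exceptional low-resource situations just described.

```c
/* Return the smaller of two ints. */
static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical helper (not an OpenMP or libmtsk routine).
 *   requested      - number of threads the user requested
 *   pool_available - number of available threads in the pool
 *   processors     - number of available processors
 *   dynamic        - nonzero if dynamic adjustment is enabled
 */
int team_size(int requested, int pool_available, int processors, int dynamic)
{
    /* Both cases are bounded by the request and by 1 + the pool size. */
    int n = min_int(requested, 1 + pool_available);

    /* With dynamic adjustment, the processor count is a third bound. */
    if (dynamic)
        n = min_int(n, processors);

    return n;
}
```

For instance, with 8 threads requested, 3 available pool threads, and 2 available processors, the sketch gives a team of 2 when dynamic adjustment is enabled and a team of 4 when it is disabled.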

Refer to Chapter 2 for more information about the pool of threads and the nested parallelism execution model.

Nested parallelism is supported. Nested parallel regions can be executed by multiple threads. Nested parallelism is disabled by default. Set the OMP_NESTED environment variable, or call the omp_set_nested() function to enable it. See Chapter 2.

This implementation replaces all ATOMIC directives and pragmas by enclosing the target statement in a critical region.
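To illustrate what this replacement means for user code, the two functions below should produce the same result under this implementation: the first protects an update with an atomic pragma, the second encloses the same statement in a critical region. This is only a sketch. The function names are invented for the example, and the use of an unnamed critical region is an assumption; the text above does not say which critical region the compiler actually generates.

```c
/* Count n loop iterations, protecting the shared counter with ATOMIC. */
long count_atomic(int n)
{
    long hits = 0;
    int i;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
#pragma omp atomic
        hits++;
    }
    return hits;
}

/* The same loop with the update enclosed in a critical region instead. */
long count_critical(int n)
{
    long hits = 0;
    int i;

#pragma omp parallel for
    for (i = 0; i < n; i++) {
#pragma omp critical
        {
            hits++;
        }
    }
    return hits;
}
```

Both functions return n. The code uses only pragmas, so it also builds as a serial program when OpenMP support is not enabled.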

The default chunk size for SCHEDULE(GUIDED) when chunksize is not specified is 1. The OpenMP runtime library uses the following formula for computing the chunk sizes for a loop with GUIDED scheduling:

chunksize = unassigned_iterations / (weight * num_threads)

where:

unassigned_iterations is the number of iterations in the loop that have not yet been assigned to any thread;

weight is a floating-point constant that can be set by the user at runtime with the SUNW_MP_GUIDED_WEIGHT environment variable (Section 5.3, OpenMP Environment Variables). The current default, if not specified, assumes weight is 2.0;

num_threads is the number of threads used to execute the loop.

The choice of weighting value affects the size of the initial and subsequent chunks of iterations assigned to threads in loops, and has a direct effect on load balancing. Experimental results show that the default weight of 2.0 works well generally. However, some applications could benefit from a different weight value.
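The effect of the weight can be explored with a small model of the formula above. This is a hedged sketch rather than the library's implementation: guided_chunk() and print_guided_chunks() are hypothetical names, and the rounding (up to a whole iteration), the minimum chunk of 1, and the cap at the number of remaining iterations are assumptions that the formula itself does not specify.

```c
#include <stdio.h>

/*
 * Hypothetical model of the GUIDED chunk formula:
 *   chunksize = unassigned_iterations / (weight * num_threads)
 * Assumptions: round up, never below min_chunk, never above what remains.
 */
long guided_chunk(long unassigned, double weight, int num_threads,
                  long min_chunk)
{
    double q;
    long c;

    if (unassigned <= 0)
        return 0;

    q = (double)unassigned / (weight * (double)num_threads);
    c = (long)q;
    if ((double)c < q)          /* round up to a whole iteration */
        c++;
    if (c < min_chunk)
        c = min_chunk;
    if (c > unassigned)
        c = unassigned;
    return c;
}

/* Print the successive chunk sizes for a loop of 'total' iterations. */
void print_guided_chunks(long total, double weight, int num_threads)
{
    long left = total;

    while (left > 0) {
        long c = guided_chunk(left, weight, num_threads, 1);
        printf("%ld ", c);
        left -= c;
    }
    printf("\n");
}
```

Under these assumptions, a loop of 1000 iterations run by 4 threads with the default weight of 2.0 starts with a chunk of 125 iterations followed by one of 110. A weight of 1.0 would start with 250, giving fewer but larger chunks.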

Programs using POSIX or Solaris threads can contain OpenMP directives or call routines that contain OpenMP directives.

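For example, the following code hangs because threads wait at different barriers: the odd-numbered threads wait at the explicit barrier while the other threads wait at the implicit barrier that ends the parallel region. The program must be terminated with a Control-C from the terminal: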


% cat bad1.c
 
#include <omp.h>
#include <stdio.h>
 
int
main(void)
{
   omp_set_dynamic(0);
   omp_set_num_threads(4);
 
   #pragma omp parallel
   {
       int i = omp_get_thread_num();
 
       if (i % 2) {
           printf("At barrier 1.\n");
           #pragma omp barrier
       }
   }
   return 0;
}
% cc -xopenmp -xO3 bad1.c
% ./a.out                    run the program
At barrier 1.
At barrier 1.
                  program hung in endless loop
Control-C   to terminate execution
 

But if SUNW_MP_WARN is set to TRUE before execution, the runtime library detects the problem:


% setenv SUNW_MP_WARN TRUE
% ./a.out
At barrier 1.
At barrier 1.
WARNING (libmtsk): Threads at barrier from different directives.
    Thread at barrier from bad1.c:11.
    Thread at barrier from bad1.c:17.
    Possible Reasons:
    Worksharing constructs not encountered by all threads in the team in the same order.
    Incorrect placement of barrier directives.
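One possible repair for bad1.c is to move the barrier out of the conditional so that every thread in the team reaches the same barrier. The sketch below is one way to do that and is not taken from the original example: barrier_demo() is a hypothetical name, the num_threads(4) clause stands in for the omp_set_num_threads(4) call, and the serial fallback for omp_get_thread_num() exists only so the code also builds without OpenMP support.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Returns the number of threads that passed the barrier. */
int barrier_demo(void)
{
    int passed = 0;

#pragma omp parallel num_threads(4)
    {
        int id = omp_get_thread_num();

        if (id % 2) {
            printf("Thread %d finished its odd-numbered work.\n", id);
        }

        /* Every thread in the team reaches this one barrier. */
#pragma omp barrier

#pragma omp critical
        {
            passed++;
        }
    }
    return passed;
}
```

Because the barrier is no longer inside the if block, no thread is left waiting at a barrier that the rest of the team never reaches.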

A callback function is registered by calling:

int sunw_mp_register_warn(void (*func)(void *));

To access the prototype for this function, add
#include <sunw_mp_misc.h>

For example:


% cat bad2.c
#include <omp.h>
#include <sunw_mp_misc.h>
#include <stdio.h>
 
void handle_warn(void *msg)
{
    printf("handle_warn: %s\n", (char *)msg);
}
 
void set(int i)
{
    static int k;
#pragma omp critical
    {
        k++;
    }
#pragma omp barrier
}
 
int main(void)
{
  int i, rc;
  omp_set_dynamic(0);
  omp_set_num_threads(4);
  if (sunw_mp_register_warn(handle_warn) != 0) {
      printf ("Installing callback failed\n");
  }
#pragma omp parallel for
  for (i = 0; i < 20; i++) {
      set(i);
  }
  return 0;
}
 
% cc -xopenmp -xO3 bad2.c
% ./a.out
handle_warn: WARNING (libmtsk): at bad2.c:21 Barrier is not permitted in dynamic extent of for / DO.

handle_warn() is installed as the callback handler function that the OpenMP runtime library calls when it detects an error. The handler in this example merely prints the error message passed to it from the library, but a handler could also be used to trap certain errors.
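As an illustration of trapping certain errors, a handler can inspect the message text before deciding what to do. The following is a sketch only: count_barrier_warn() and barrier_warning_count() are hypothetical names, matching on the word "barrier" is an assumption based on the two sample messages shown in this chapter, and the counter is not protected against concurrent calls. The handler has the same void (*)(void *) signature as handle_warn() and would be registered the same way, with sunw_mp_register_warn().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical counter of barrier-related warnings seen so far. */
static int barrier_warnings = 0;

/* Callback with the signature expected by sunw_mp_register_warn(). */
void count_barrier_warn(void *msg)
{
    const char *text = (const char *)msg;

    if (text == NULL)
        return;

    /* Assumption: barrier problems mention "barrier" or "Barrier". */
    if (strstr(text, "barrier") != NULL || strstr(text, "Barrier") != NULL)
        barrier_warnings++;

    /* Still report every message, as handle_warn() does. */
    fprintf(stderr, "count_barrier_warn: %s\n", text);
}

/* Let the program check later whether any barrier warning occurred. */
int barrier_warning_count(void)
{
    return barrier_warnings;
}
```

A program could call barrier_warning_count() after a parallel region and exit with a diagnostic if the count is nonzero, rather than continuing with a possibly incorrect result.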