Oracle® Developer Studio 12.5: OpenMP API User's Guide

Updated: July 2016
 
 

9.6 Processor Binding (Thread Affinity)

The OpenMP 4.0 specification defines the term processor as an implementation-defined hardware unit on which one or more OpenMP threads can execute.

In this implementation, the term processor is defined as the smallest hardware execution unit on which one or more OpenMP threads can be scheduled, bound, and executed, as documented in the processor_bind(2) Oracle Solaris man page. Synonyms for processor include CPU, virtual processor, and hardware thread. For clarity, the term hardware thread is used consistently in this manual.

In this implementation, the precise definitions of the abstract names threads, cores, and sockets used with the OMP_PLACES environment variable are as follows:

  • threads refers to the hardware threads on the machine.

  • cores refers to the physical cores on the machine.

  • sockets refers to the physical sockets (processor chips) on the machine.

For more information, see OMP_PLACES and OMP_PROC_BIND.

The implementation-defined behaviors of Oracle Developer Studio that are related to OpenMP 4.0 thread affinity are as follows:

  • With the close thread binding policy, when T > P and P does not divide T evenly (where T is the number of threads in the team and P is the number of places in the parent's place partition), the assignment of threads to places is as follows: First, each of the P places is assigned S = floor(T/P) threads; the IDs of the threads assigned to a place are a contiguous subset of the thread IDs in the team. Second, each of the first T - (P*S) places (starting with the place of the parent thread, and with wrap around) is assigned one additional thread.

  • With the spread thread binding policy, when T > P and P does not divide T evenly, the assignment of threads to subpartitions is as follows: First, each of the P subpartitions is assigned S = floor(T/P) threads; the IDs of the threads assigned to a subpartition are a contiguous subset of the thread IDs in the team. Second, each of the first T - (P*S) subpartitions (starting with the subpartition containing the place of the parent thread, and with wrap around) is assigned one additional thread.

  • If an affinity request cannot be fulfilled, the process is exited with a nonzero status.

  • The numbers specified in the OMP_PLACES environment variable refer to hardware thread IDs.

  • When creating a place list of n elements by appending the number n to an abstract name, the place list will consist of n consecutive resources beginning at the resource containing the hardware thread on which the main thread is executing at the time the place list is constructed, with wrap around occurring after the last available named resource is reached.

  • If more resources are requested than are available on the machine, an error message is issued and the process is exited with a nonzero status. A resource is available if it contains at least one online hardware thread.

  • When the execution environment cannot map a numeric value (either explicitly defined or implicitly derived from an interval) within the OMP_PLACES list to a hardware thread on the target platform, or if it maps to an unavailable hardware thread, an error message is issued and the process is exited with a nonzero status.

  • When the OMP_PLACES environment variable is defined using an abstract name, each unit of the resource represented by the abstract name is allocated as a single place. The number of allocated units can be specified by a count n whose value is no greater than the total number of available units on the machine. On Oracle Solaris platforms, hardware threads pre-emptively reserved by an administrator using psrset(1M) are not considered available. If no available hardware threads remain in the set defined by OMP_PLACES, an error message is issued and the process is exited with a nonzero status.

  • If the affinity request for a parallel construct cannot be fulfilled (because, for example, the system call to bind an OpenMP thread to a hardware thread fails), the resulting behavior is undefined.

  • When using OMP_PLACES, intervals may be used to specify places. This implementation assumes that when an interval specifies a sequence of places, length is the number of places in the sequence, and stride is the number of hardware thread IDs separating successive places in the sequence. If no stride value is specified, then unit stride is assumed.
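
The thread-count arithmetic described for the close and spread policies, and the interval rule for places, can be sketched in C. This is only an illustration under stated assumptions, not the code of the Oracle Developer Studio runtime: the function names distribute_threads and expand_place_interval are hypothetical, places and subpartitions are indexed from 0, and only the number of threads per place is computed (not which thread IDs land where).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative sketch only; not the Oracle Developer Studio runtime code.
 * Function names and 0-based indexing are assumptions made for this example.
 */

/*
 * Thread counts for the close policy (places) or the spread policy
 * (subpartitions) when T > P and P does not divide T evenly.
 *   T      - number of threads in the team
 *   P      - number of places (close) or subpartitions (spread)
 *   parent - 0-based index of the place or subpartition of the parent thread
 *   counts - output array of P elements
 */
void distribute_threads(size_t T, size_t P, size_t parent, size_t *counts)
{
    assert(P > 0 && parent < P && counts != NULL);

    size_t S = T / P;          /* S = floor(T/P): base count per place */
    size_t extra = T - P * S;  /* number of places that get one more thread */

    for (size_t i = 0; i < P; i++)
        counts[i] = S;

    /* Hand out the extra threads starting at the parent's place, wrapping around. */
    for (size_t k = 0; k < extra; k++)
        counts[(parent + k) % P] += 1;
}

/*
 * Lowest hardware thread ID of each place in an interval of places.
 *   first_id - lowest hardware thread ID in the first place
 *   length   - number of places in the sequence
 *   stride   - hardware thread IDs separating successive places
 *              (pass 1 when no stride is given)
 *   starts   - output array of length elements
 */
void expand_place_interval(long first_id, size_t length, long stride, long *starts)
{
    assert(starts != NULL || length == 0);

    for (size_t i = 0; i < length; i++)
        starts[i] = first_id + (long)i * stride;
}
```

For example, with T = 10, P = 4, and the parent thread in the place at index 2, S is 2 and two places receive an extra thread, so the counts are 2, 2, 3, 3 for indices 0 through 3. Calling expand_place_interval(0, 4, 4, starts) yields starting IDs 0, 4, 8, 12, which corresponds to the way OMP_PLACES="{0:4}:4:4" expands into four places of four hardware threads each.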