Oracle® Solaris Studio 12.4: C User's Guide

Exit Print View

Updated: March 2015
 
 

3.2.2 Private Scalars and Private Arrays

For some data dependences, the compiler might still be able to parallelize a loop. Consider the following example.

Example 3-4  Parallelizable Loop With Dependence
for (i=1; i < 1000; i++) {
    t = 2 * a[i];           /* S1 */
    b[i] = t;               /* S2 */
}

In this example, assuming that arrays a and b are non-overlapping arrays, there appears to be a data dependence in any two iterations due to the variable t. The following statements execute during iterations one and two.

Example 3-5  Iterations One and Two
t = 2*a[1];  /* 1 */
b[1] = t;    /* 2 */
t = 2*a[2];  /* 3 */
b[2] = t;    /* 4 */

Because statements one and three modify the variable t, the compiler cannot execute them in parallel. However, because the value of t is always computed and used in the same iteration, the compiler can use a separate copy of t for each iteration. This method eliminates the interference between different iterations due to such variables. In effect, variable t is used as a private variable for each thread executing that iteration, as shown in the following example.

Example 3-6  Variable t as a Private Variable for Each Thread
for (i=1; i < 1000; i++) {
    pt[i] = 2 * a[i];       /* S1 */
    b[i] = pt[i];           /* S2 */
}

In this example, each scalar variable reference t is replaced by an array reference pt. Each iteration now uses a different element of pt, eliminating any data dependencies between any two iterations. One problem with this is that it may lead to an extra large array. In practice, the compiler only allocates one copy of the variable for each thread that participates in the execution of the loop. Each such variable is, in effect, private to the thread.

The compiler can also privatize array variables to create opportunities for parallel execution of loops. Consider the following example:

Example 3-7  Parallelizable Loop With an Array Variable
for (i=1; i < 1000; i++) {
    for (j=1; j < 1000; j++) {
            x[j] = 2 * a[i];        /* S1 */
            b[i][j] = x[j];         /* S2 */
    }
}

In this example, different iterations of the outer loop modify the same elements of array x, and thus the outer loop cannot be parallelized. However, if each thread executing the outer loop iterations has a private copy of the entire array x, then no interference between any two iterations of the outer loop would occur. The following example illustrates this point.

Example 3-8  Parallelizable Loop Using a Privatized Array
for (i=1; i < 1000; i++) {
    for (j=1; j < 1000; j++) {
            px[i][j] = 2 * a[i];    /* S1 */
            b[i][j] = px[i][j];     /* S2 */
    }
}

As in the case of private scalars, you need to expand the array only up to the number of threads executing in the system. This expansion is done automatically by the compiler by allocating one copy of the original array in the private space of each thread.