Sun Studio 12: Fortran Programming Guide

10.2.3 Automatic Parallelization Criteria

DO loops that have no cross-iteration data dependencies are automatically parallelized by -autopar. The general criteria for automatic parallelization are:
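
As a small illustration of what "no cross-iteration data dependencies" means, the following sketch contrasts a loop whose iterations are independent with one that carries a dependence from each iteration to the next. The program name depdemo, the array names, and the size n = 8 are hypothetical and chosen only for this illustration. Whether a given compiler release actually parallelizes either loop is not something the sketch demonstrates.

```fortran
      program depdemo
      implicit none
      integer, parameter :: n = 8
      real a(n), b(n), s(n)
      integer i

! Give b defined values so that both loops below read valid data.
      do i = 1, n
        b(i) = real(i)
      end do

! Independent iterations: a(i) is computed only from b(i), so the
! iterations could run in any order, or concurrently.
      do i = 1, n
        a(i) = 2.0 * b(i)
      end do

! Cross-iteration dependence: s(i) reads s(i-1), which was written
! by the previous iteration, so the iterations must run in order.
      s(1) = b(1)
      do i = 2, n
        s(i) = s(i-1) + b(i)
      end do

      print *, a(n), s(n)
      end program depdemo
```

The second loop is the kind of loop that the criterion above excludes: reordering its iterations would change the values stored in s.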

10.2.3.1 Apparent Dependencies

The compilers may automatically eliminate a reference that appears to create a data dependence in the loop. One of the many such transformations makes use of private versions of some of the arrays. Typically, the compiler does this if it can determine that such arrays are used in the original loops only as temporary storage.

Example: Using -autopar, with dependencies eliminated by private arrays:


      parameter (n=1000)
      real a(n), b(n), c(n,n)
      do i = 1, 1000             ! <-- Parallelized
        do k = 1, n
          a(k) = b(k) + 2.0
        end do
        do j = 1, n-1
          c(i,j) = a(j+1) + 2.3
        end do
      end do
      end

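In the example, the outer loop is parallelized and its iterations run on independent processors. Every outer iteration writes array a and then reads it, so the references to a in the inner loops appear to create a data dependence between outer iterations. Because each iteration assigns every element of a before reading any of them, a serves only as temporary storage, and the compiler generates temporary private copies of the array to make the outer loop iterations independent.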
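
The effect of privatization can be imitated by hand. The following is a minimal sketch, not the code the compiler generates: it marks array a as private with an OpenMP directive (an assumption, since the example above relies only on -autopar), and it initializes b, which the original fragment leaves undefined. The program name privdemo is hypothetical.

```fortran
      program privdemo
      implicit none
      integer, parameter :: n = 1000
      real a(n), b(n), c(n,n)
      integer i, j, k

! Give b defined values (not part of the original example).
      do k = 1, n
        b(k) = real(k)
      end do

! Each thread gets its own copy of a, so no outer iteration can
! observe values of a written by another outer iteration.
!$omp parallel do private(a, j, k)
      do i = 1, n
        do k = 1, n
          a(k) = b(k) + 2.0
        end do
        do j = 1, n-1
          c(i,j) = a(j+1) + 2.3
        end do
      end do
!$omp end parallel do

! Column n of c is never assigned, so only assigned elements
! are printed.
      print *, c(1,1), c(n,n-1)
      end program privdemo
```

When OpenMP is not enabled, the directive lines are treated as comments and the program runs serially, producing the same values in c. That equivalence is what makes the privatization safe here: every element of a that an iteration reads was written earlier in the same iteration.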

10.2.3.2 Inhibitors to Automatic Parallelization

Under automatic parallelization, the compilers do not parallelize a loop if:

10.2.3.3 Nested Loops

In a multithreaded, multiprocessor environment, it is most effective to parallelize the outermost loop in a loop nest, rather than the innermost. Because parallel processing typically involves relatively large loop overhead, parallelizing the outermost loop minimizes the overhead and maximizes the work done for each thread. Under automatic parallelization, the compilers start their loop analysis from the outermost loop in a nest and work inward until a parallelizable loop is found. Once a loop within the nest is parallelized, loops contained within the parallel loop are passed over.
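
The outermost-first analysis described above can be pictured with two loop nests. This is a sketch under stated assumptions: the program name nestdemo and the arrays x and y are hypothetical, and the comments describe which loop the stated rule would select, not the verified behavior of any particular compiler release (other transformations, such as loop interchange, could change the outcome).

```fortran
      program nestdemo
      implicit none
      integer, parameter :: n = 200
      real x(n,n), y(n,n)
      integer i, j

      x = 1.0
      y = 0.0

! Nest A: iterations of the outer j loop are independent, so the
! outer loop is the candidate. The inner i loop would then be
! passed over and run serially inside each thread.
      do j = 1, n
        do i = 1, n
          y(i,j) = 2.0 * x(i,j)
        end do
      end do

! Nest B: column j of y depends on column j-1, so the outer j loop
! carries a dependence. Working inward, the inner i loop has
! independent iterations and is the first parallelizable loop.
      do j = 2, n
        do i = 1, n
          y(i,j) = y(i,j-1) + x(i,j)
        end do
      end do

      print *, y(1,n)
      end program nestdemo
```

In nest A each thread would receive whole columns of work, so the parallel overhead is paid once for the nest. In nest B, parallelizing the inner loop means that overhead is paid on every one of the n-1 outer iterations, which is why the outermost parallelizable loop is preferred when one exists.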