It is generally more profitable to parallelize the outermost loop in a nest of loops, since the overheads incurred are small. However, it is not always safe to parallelize the outermost loops due to dependences that might be carried by such loops. This is illustrated in the following:
for (i=0; i <n; i++) { for (j=0; j <n; j++) { a[j][i+1] = 2.0*a[j][i-1]; } } |
In this example, the loop with the index variable i cannot be parallelized, because of a dependency between two successive iterations of the loop. The two loops can be interchanged and the parallel loop (the j-loop) becomes the outer loop:
for (j=0; j<n; j++) { for (i=0; i<n; i++) { a[j][i+1] = 2.0*a[j][i-1]; } } |
The resulting loop incurs an overhead of parallel work distribution only once, while previously, the overhead was incurred n times. The compiler performs safety and profitability analysis to determine whether to perform loop interchange.