Oracle® Solaris Studio 12.4: C User's Guide


Updated: March 2015
 
 

3.5.1 Amdahl’s Law

Fixed problem-size speedup is generally governed by Amdahl's law, which states that the parallel speedup of a given problem is limited by the sequential portion of the problem. The following equation describes the speedup, S, of a problem where F is the fraction of time spent in sequential code and the remaining fraction of the time (1 - F) is divided uniformly among P processors. As P grows, the second term of the equation, (1 - F) / P, approaches zero, and the speedup is then bounded by 1 / F, the reciprocal of the first term, which remains fixed.

\[ \frac{1}{S} = F + \frac{1 - F}{P} \]
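As a worked instance of the equation above, take a hypothetical program that spends 5% of its time in sequential code (F = 0.05) and runs on eight processors (P = 8). These values are chosen only for illustration and do not come from a measured program.

```latex
\begin{align*}
\frac{1}{S} &= F + \frac{1 - F}{P} = 0.05 + \frac{0.95}{8} = 0.16875, \\
S &\approx 5.93, \\
\lim_{P \to \infty} S &= \frac{1}{F} = 20.
\end{align*}
```

Under these assumptions, eight processors yield a speedup of just under six, and no number of processors could make the program run more than 20 times faster.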

The following figure illustrates this concept diagrammatically. The darkly shaded portion represents the sequential part of the program, and remains constant for one, two, four, and eight processors. The lightly shaded portion represents the parallel portion of the program that can be divided uniformly among an arbitrary number of processors.

Figure 3-1  Fixed Problem Speedups


As the number of processors increases, the amount of time required for the parallel portion of each program decreases whereas the serial portion of each program stays the same.

In reality, however, you might incur overheads due to communication and the distribution of work to multiple processors. These overheads can vary with the number of processors used.

The following figure illustrates the ideal speedups for a program containing 0%, 2%, 5%, and 10% sequential portions. No overhead is assumed.

Figure 3-2  Amdahl's Law Speedup Curve

image:The graph shows that the greatest speedup occurs with the program that has no sequential portion.
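The ideal curves described above can be tabulated with a few lines of C. The following is a minimal sketch, not part of the compiler or its libraries: the function names amdahl_speedup and print_amdahl_table are hypothetical, and the model assumes zero overhead, as the figure does.

```c
#include <stdio.h>

/*
 * Ideal Amdahl speedup with no overhead: 1/S = F + (1 - F)/P.
 * f is the sequential fraction (0.0 to 1.0); p is the processor count (>= 1).
 */
double amdahl_speedup(double f, int p)
{
    return 1.0 / (f + (1.0 - f) / (double)p);
}

/*
 * Print one row per processor count, from 1 to max_procs, with one column
 * for each sequential fraction named in the text: 0%, 2%, 5%, and 10%.
 */
void print_amdahl_table(int max_procs)
{
    static const double fractions[] = { 0.00, 0.02, 0.05, 0.10 };
    const int nfrac = (int)(sizeof fractions / sizeof fractions[0]);

    printf("%4s", "P");
    for (int i = 0; i < nfrac; i++)
        printf("  F=%4.2f", fractions[i]);
    printf("\n");

    for (int p = 1; p <= max_procs; p++) {
        printf("%4d", p);
        for (int i = 0; i < nfrac; i++)
            printf("  %6.2f", amdahl_speedup(fractions[i], p));
        printf("\n");
    }
}
```

Calling print_amdahl_table(8) from main prints the speedups for one through eight processors. Only the F = 0 column grows linearly with P; every other column flattens toward its 1 / F limit.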

3.5.1.1 Overheads

Once overheads are incorporated in the model, the speedup curves change dramatically. For the purposes of illustration, assume that the overheads consist of two parts: a fixed part that is independent of the number of processors, and a non-fixed part that grows quadratically with the number of processors used:

\[ \frac{1}{S} = F + \frac{1 - F}{P} + K_1 + K_2 P^2 \]

In this equation, K1 and K2 are fixed factors. Under these assumptions, the speedup curve is shown in the following figure. Note that in this case the speedups reach a peak. Beyond that point, adding more processors is detrimental to performance.

Figure 3-3  Speedup Curve With Overheads


The graph shows that all programs reach the greatest speedup at five processors and then lose this benefit as up to eight processors are added. The x-axis measures the number of processors and the y-axis measures the speedup.
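The peak can be reproduced numerically. The following is a minimal sketch that assumes the overhead model reads 1/S = F + (1 - F)/P + K1 + K2 * P * P. The function names are hypothetical, and the constants suggested in the note after the code (K1 = 0.01, K2 = 0.004) are illustrative values chosen for this sketch; the source does not state the values behind Figure 3-3.

```c
/*
 * Speedup with overhead, assuming the model
 *   1/S = F + (1 - F)/P + K1 + K2 * P * P
 * f  : sequential fraction (0.0 to 1.0)
 * p  : processor count (>= 1)
 * k1 : fixed overhead, independent of p (illustrative value supplied by caller)
 * k2 : coefficient of the overhead that grows quadratically with p
 */
double speedup_with_overhead(double f, int p, double k1, double k2)
{
    double pd = (double)p;
    return 1.0 / (f + (1.0 - f) / pd + k1 + k2 * pd * pd);
}

/*
 * Return the processor count in 1..max_procs that gives the greatest
 * speedup under the model above. Ties keep the smaller count.
 */
int best_processor_count(double f, double k1, double k2, int max_procs)
{
    int best = 1;
    double best_speedup = speedup_with_overhead(f, 1, k1, k2);

    for (int p = 2; p <= max_procs; p++) {
        double s = speedup_with_overhead(f, p, k1, k2);
        if (s > best_speedup) {
            best_speedup = s;
            best = p;
        }
    }
    return best;
}
```

With the assumed constants K1 = 0.01 and K2 = 0.004, best_processor_count returns 5 for sequential fractions of 0%, 2%, 5%, and 10% when searching up to eight processors. Different constants move the peak, so treat the result as a property of the chosen values rather than of the model.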

3.5.1.2 Gustafson’s Law

Amdahl's law can be misleading for predicting parallel speedups in real problems. The fraction of time spent in sequential sections of the program sometimes depends on the problem size. That is, by scaling the problem size, you might improve the chances of speedup, as shown in the following example.

Example 3-21  Scaling the Problem Size Might Improve Chances of Speedup
/*
 * Initialize the arrays.
 */
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        a[i][j] = 0.0;
        b[i][j] = ...
        c[i][j] = ...
    }
}
/*
 * Matrix multiply: accumulate each dot product into a[i][j].
 */
for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
        for (k = 0; k < n; k++) {
            a[i][j] += b[i][k] * c[k][j];
        }
    }
}

Assume an ideal overhead of zero and that only the second loop nest is executed in parallel. The initialization nest performs on the order of n^2 assignments, whereas the matrix multiply performs on the order of n^3 multiply-add operations. For small problem sizes (that is, small values of n), the sequential and parallel parts of the program therefore take comparable amounts of time. However, as n grows larger, the time spent in the parallel part grows faster than the time spent in the sequential part, so the sequential fraction shrinks. For this problem, increasing the number of processors as the problem size increases is beneficial.
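The effect of problem size on the sequential fraction can be estimated with a simple operation-count model. The following is a minimal sketch under stated assumptions: the initialization nest costs n * n units of sequential work, the matrix multiply costs n * n * n units of perfectly parallel work, every unit takes the same time, and overhead is zero. The function names are hypothetical.

```c
/*
 * Sequential fraction of the example under an operation-count model
 * (an assumption of this sketch, not a measurement):
 *   sequential work = n * n      (initialization nest)
 *   parallel work   = n * n * n  (matrix multiply nest)
 */
double sequential_fraction(double n)
{
    double seq = n * n;
    double par = n * n * n;
    return seq / (seq + par);
}

/*
 * Ideal speedup on p processors for problem size n, obtained by feeding
 * the size-dependent sequential fraction into 1/S = F + (1 - F)/P.
 */
double scaled_speedup(double n, int p)
{
    double f = sequential_fraction(n);
    return 1.0 / (f + (1.0 - f) / (double)p);
}
```

Under this model the sequential fraction simplifies to 1 / (1 + n), so it is 10% at n = 9 and 0.1% at n = 999. On eight processors, the modeled speedup rises accordingly, from about 4.7 at n = 9 toward the ideal value of 8 as n grows.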