Sun Studio 12: Fortran Programming Guide

10.1.2 Steps to Parallelizing a Program

Here is a very general outline of the steps needed to parallelize an application:

  1. Optimize. Use the appropriate set of compiler options to get the best serial performance on a single processor.

  2. Profile. Using typical test data, determine the performance profile of the program. Identify the most significant loops.

  3. Benchmark. Confirm that the serial test results are correct. Use these results and the performance profile as the baseline for later comparisons.

  4. Parallelize. Use a combination of options and directives to compile and build a parallelized executable.

  5. Verify. Run the parallelized program on a single processor with a single thread, and check the results for instabilities and programming errors that might have crept in. (Set $PARALLEL or $OMP_NUM_THREADS to 1; see 10.1.5 Number of Threads.)

  6. Test. Make several runs on multiple processors and check that the results remain correct.

  7. Benchmark. Make performance measurements with various numbers of processors on a dedicated system. Measure how performance changes as the problem size changes (scalability).

  8. Repeat steps 4 to 7, improving your parallelization scheme based on the measured performance.
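
The parallelize and verify steps above can be illustrated with a small, self-checking program. This is only a minimal sketch, not a method prescribed by this guide: the names verify_mod, serial_sum, and parallel_sum, the sum-of-squares loop, and the 1.0d-10 tolerance are all hypothetical choices made for illustration. It keeps the serial loop as a reference and compares it with the same loop under an OpenMP directive.

```fortran
! Illustrative sketch only: all names and the tolerance are assumptions,
! not part of the guide.
module verify_mod
  implicit none
contains

  ! Serial reference version: sum of squares of x.
  function serial_sum(x) result(s)
    double precision, intent(in) :: x(:)
    double precision :: s
    integer :: i
    s = 0.0d0
    do i = 1, size(x)
      s = s + x(i)*x(i)
    end do
  end function serial_sum

  ! Same loop with an OpenMP directive (step 4). When compiled without
  ! OpenMP enabled, the directive is treated as a comment and the loop
  ! runs serially.
  function parallel_sum(x) result(s)
    double precision, intent(in) :: x(:)
    double precision :: s
    integer :: i
    s = 0.0d0
    !$omp parallel do reduction(+:s)
    do i = 1, size(x)
      s = s + x(i)*x(i)
    end do
    !$omp end parallel do
  end function parallel_sum

end module verify_mod

program verify_parallel
  use verify_mod
  implicit none
  integer, parameter :: n = 10000
  double precision :: x(n), s_ser, s_par, rel_err
  integer :: i

  ! Deterministic test data so that runs are repeatable.
  do i = 1, n
    x(i) = dble(i) / dble(n)
  end do

  s_ser = serial_sum(x)
  s_par = parallel_sum(x)

  ! A reduction may add terms in a different order, so compare with a
  ! relative tolerance instead of testing for exact equality (step 5).
  rel_err = abs(s_par - s_ser) / abs(s_ser)

  print *, 'serial   =', s_ser
  print *, 'parallel =', s_par
  if (rel_err < 1.0d-10) then
    print *, 'PASS'
  else
    print *, 'FAIL'
  end if
end program verify_parallel
```

To try it, compile with OpenMP enabled (for example, -xopenmp with the Sun Studio f95 compiler), run first with OMP_NUM_THREADS set to 1 as in step 5, and then repeat with larger thread counts as in step 6. A relative tolerance is used because a parallel reduction can differ from the serial sum by rounding error even when the program is correct.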