To create a loop timing file, you compile your program with compiler options that automatically parallelize and optimize your code (-xparallel and -xO4). You also add the -Zlp option to compile for LoopTool or LoopReport. When you run the program compiled with these options, Sun WorkShop creates a timing file for LoopTool or LoopReport to process.
The three compiler options are illustrated in this example:
% f77 -xO4 -xparallel -Zlp source_file
All examples apply to FORTRAN 77, Fortran 90, and C programs.
There are a number of other useful options for looking at and parallelizing loops:
Option | Effect |
---|---|
-o program | Renames the executable to program |
-xexplicitpar | Parallelizes loops marked with DOALL pragma |
-xloopinfo | Prints hints to stderr for redirection to files |
Many combinations of compiler options work for LoopTool and LoopReport.
To compile for automatic parallelization, typical compilation switches are -xparallel and -xO4. To compile for LoopTool and LoopReport, add -Zlp.
% f77 -xO4 -xparallel -Zlp source_file
You can use either -xO3 or -xO4 with -xparallel. If you don't specify -xO3 or -xO4 but you do use -xparallel, then the compiler uses -xO3. Table 3-1 summarizes how optimization level options are added for specific options.
Table 3-1 Optimization Level Options and What They Imply
You type: | Bumped up to: |
---|---|
-xparallel | -xparallel -xO3 |
-xparallel -Zlp | -xparallel -xO3 -Zlp |
-xexplicitpar | -xexplicitpar -xO3 |
-xexplicitpar -Zlp | -xexplicitpar -xO3 -Zlp |
-Zlp | -xdepend -xO3 -Zlp |
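The rules in Table 3-1 can be sketched in a few lines of Python. This is only an illustrative model of the table, not the compiler's actual logic. The function name effective_options is hypothetical, and the behavior for combinations the table does not list (for example, -Zlp with -xO4 and no parallelization option) is an assumption.

```python
def effective_options(options):
    """Model Table 3-1: return the set of options in effect after the
    implied ones are added. Illustrative only, not the compiler's logic."""
    opts = set(options)
    has_level = "-xO3" in opts or "-xO4" in opts
    parallelizing = "-xparallel" in opts or "-xexplicitpar" in opts

    # -Zlp by itself pulls in dependency analysis (-xdepend).
    if "-Zlp" in opts and not parallelizing:
        opts.add("-xdepend")

    # With no explicit -xO3 or -xO4, the level is bumped up to -xO3.
    if not has_level and ("-Zlp" in opts or parallelizing):
        opts.add("-xO3")

    return opts


print(sorted(effective_options(["-Zlp"])))
print(sorted(effective_options(["-xO4", "-xparallel", "-Zlp"])))
```

The first call adds -xdepend and -xO3, as in the last row of the table. The second call adds nothing, because an optimization level and a parallelization option are already present.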
Other compilation options include -xexplicitpar and -xloopinfo.
The Fortran compiler option -xexplicitpar is used with the pragma DOALL. If you insert DOALL before a loop in your source code, you are explicitly marking that loop for parallelization. The compiler parallelizes the loop when you compile with -xexplicitpar.
The following code fragment shows how to mark a loop explicitly for parallelization.
      subroutine adj(a,b,c,x,n)
      real*8 a(n), b(n), c(-n:0), x
      integer n
c$par DOALL
      do 19 i = 1, n*n
         do 29 k = i, n*n
            a(i) = a(i) + x*b(k)*c(i-k)
 29      continue
 19   continue
      return
      end
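One way to see why the outer loop is a reasonable candidate for DOALL is that each outer iteration writes only its own a(i) and reads b and c, which are never modified. The Python sketch below mirrors the loop nest and runs the outer iterations in different orders. It is an illustration under stated assumptions: the array sizes are chosen for the example, the order parameter is hypothetical, and c is indexed by the distance k - i rather than by the negative offset i - k used in the Fortran fragment.

```python
def adj(a, b, c, x, order):
    """Run the loop nest with the outer iterations visited in 'order'.
    Returns a new list; the input list a is left unchanged."""
    a = list(a)
    m = len(a)
    for i in order:                 # outer loop: the one marked DOALL
        for k in range(i, m):       # inner loop: stays sequential
            # Only a[i] is written; b and c are read-only.
            a[i] = a[i] + x * b[k] * c[k - i]
    return a


a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 1.5, 2.5, 3.5]
c = [1.0, 0.5, 0.25, 0.125]

forward = adj(a, b, c, 2.0, range(4))
backward = adj(a, b, c, 2.0, reversed(range(4)))
print(forward == backward)
```

Because no outer iteration reads a value that another outer iteration writes, the result does not depend on the order in which the outer iterations run. That is the property you assert when you mark a loop with DOALL.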
When you use -Zlp by itself, -xdepend and -xO3 are added. The -xdepend option instructs the compiler to perform the data dependency analysis it needs to identify loops. The option -xparallel includes -xdepend, but -xdepend does not imply (or trigger) -xparallel.
The -xloopinfo option prints hints about loops to stderr (the UNIX standard error file, on file descriptor 2) when you compile your program. The hints include the routine names, the line number for the start of the loop, whether the loop was parallelized, and the reason it was not parallelized, if applicable.
The following example redirects hints about loops in the source file gamteb.F to the file gamteb.loopinfo:
% f77 -xO3 -parallel -xloopinfo -Zlp gamteb.F 2> gamteb.loopinfo
The main difference between -Zlp and -xloopinfo is that, in addition to providing compiler hints about loops, -Zlp instruments your program so that timing statistics are recorded at runtime. For this reason, LoopTool and LoopReport analyze only programs that have been compiled with -Zlp.
After compiling with -Zlp, run the executable. This creates the loop timing file, program.looptimes. Both LoopTool and LoopReport process two files: the instrumented executable and the loop timing file.
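Because both tools need the instrumented executable and its timing file, it can be useful to check that both exist before starting an analysis. The following Python sketch is one hedged way to do that. The helper names looptimes_path and ready_for_looptool are hypothetical, and the sketch assumes the timing file sits next to the executable; the text above states only that it is named program.looptimes.

```python
from pathlib import Path


def looptimes_path(executable):
    """Return the expected timing file name: the executable name
    with '.looptimes' appended (assumed to be in the same directory)."""
    exe = Path(executable)
    return exe.with_name(exe.name + ".looptimes")


def ready_for_looptool(executable):
    """True only if both the executable and its timing file exist."""
    exe = Path(executable)
    return exe.is_file() and looptimes_path(exe).is_file()


print(looptimes_path("gamteb"))
print(ready_for_looptool("gamteb"))
```

If the check fails for a program compiled with -Zlp, the likely cause is that the executable has not been run yet, since the timing file is created only at runtime.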