Sun Studio 12: Fortran User's Guide

3.4.137 -xipo[={0|1|2}]

Perform interprocedural optimizations.

Performs whole-program optimizations by invoking an interprocedural analysis pass. Unlike -xcrossfile, -xipo optimizes across all object files in the link step and is not limited to the source files named on the compile command.

-xipo is particularly useful when compiling and linking large multi-file applications. Object files compiled with this flag have analysis information embedded in them that enables interprocedural analysis across source and pre-compiled program files. However, analysis and optimization are limited to the object files compiled with -xipo, and do not extend to object files in libraries.

-xipo=0 disables, and -xipo=1 enables, interprocedural analysis. -xipo=2 adds interprocedural aliasing analysis and memory allocation and layout optimizations to improve cache performance. The default is -xipo=0, and if -xipo is specified without a value, -xipo=1 is used.
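The levels described above differ only in the value given to the flag. As a minimal sketch, the following script prints one f95 command line per level, plus the form with no value. It is a dry run that does not invoke the compiler; the function name print_xipo_levels and the file names part1.f and part2.f are hypothetical.

```shell
# Dry run: print (do not execute) an f95 command line for each -xipo level.
# print_xipo_levels, part1.f, and part2.f are illustrative names only.
print_xipo_levels() {
  for level in 0 1 2; do
    echo "f95 -xipo=$level -xO4 -o prog part1.f part2.f"
  done
  # With no value, -xipo is treated as -xipo=1.
  echo "f95 -xipo -xO4 -o prog part1.f part2.f"
}

print_xipo_levels
```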

When compiling with -xipo=2, there should be no calls from functions or subroutines compiled without -xipo=2 (for example, from libraries) to functions or subroutines compiled with -xipo=2.

As an example, if you interpose on the function malloc() and compile your own version of malloc() with -xipo=2, all the functions that reference malloc() in any library linked with your code would also have to be compiled with -xipo=2. Since this might not be possible for system libraries, your version of malloc should not be compiled with -xipo=2.

When compiling and linking are performed in separate steps, -xipo must be specified in both steps to be effective.

Example using -xipo in a single compile/link step:


demo% f95 -xipo -xO4 -o prog  part1.f part2.f part3.f

The optimizer performs crossfile inlining across all three source files. Because this happens in the final link step, the source files need not all be compiled in a single command; they can be compiled in several separate steps, each specifying -xipo.

Example using -xipo in separate compile/link steps:


demo% f95 -xipo -xO4 -c part1.f part2.f
demo% f95 -xipo -xO4 -c part3.f
demo% f95 -xipo -xO4 -o prog  part1.o part2.o part3.o

The object files created in the compile steps have additional analysis information compiled within them to permit crossfile optimizations to take place at the link step.
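One way to make sure -xipo reaches both the compile steps and the link step is to keep the flags in a single variable in the build script. The following is a sketch under that assumption, not a recommended build system. The names IPO_FLAGS, SOURCES, DRY_RUN, and run are hypothetical, and by default the script only prints the commands; setting DRY_RUN=0 would execute them.

```shell
# Sketch of a build script; IPO_FLAGS, SOURCES, DRY_RUN, and run are
# hypothetical names chosen for this illustration.
IPO_FLAGS="-xipo -xO4"            # shared by every compile step and the link step
SOURCES="part1.f part2.f part3.f"
DRY_RUN=${DRY_RUN:-1}             # default: print commands instead of running them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$@"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

OBJECTS=""
for src in $SOURCES; do
  # IPO_FLAGS is left unquoted on purpose so it splits into separate flags.
  run f95 $IPO_FLAGS -c "$src"    # compile step carries -xipo
  OBJECTS="$OBJECTS ${src%.f}.o"
done

run f95 $IPO_FLAGS -o prog $OBJECTS   # link step carries the same flags
```

Because both steps read the same variable, changing the level (for example to -xipo=2) in one place changes it everywhere.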

A restriction is that libraries, even if compiled with -xipo, do not participate in crossfile interprocedural analysis, as shown in this example:


demo% f95 -xipo -xO4 one.f two.f three.f
demo% ar -r mylib.a one.o two.o three.o
...
demo% f95 -xipo -xO4 -o myprog main.f four.f mylib.a

Here interprocedural optimizations are performed among one.f, two.f, and three.f, and between main.f and four.f, but not between main.f or four.f and the routines in mylib.a. (The first compilation may generate warnings about undefined symbols, but the interprocedural optimizations are still performed because it is a compile and link step.)
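If the routines in one.f, two.f, and three.f need to take part in interprocedural optimization together with main.f and four.f, one option consistent with the restriction above is to name their object files on the link line directly instead of going through the archive. The following dry-run sketch prints such a pair of commands without running them. The function name direct_link_commands and the variable LIB_OBJECTS are hypothetical, and whether this layout suits a given project is an assumption.

```shell
# Dry run: print commands that link the object files directly rather than
# through mylib.a. direct_link_commands and LIB_OBJECTS are illustrative names.
LIB_OBJECTS="one.o two.o three.o"

direct_link_commands() {
  # Compile the former library sources with -xipo, without linking.
  echo "f95 -xipo -xO4 -c one.f two.f three.f"
  # List the object files themselves in the final compile and link step.
  echo "f95 -xipo -xO4 -o myprog main.f four.f $LIB_OBJECTS"
}

direct_link_commands
```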

Other important information about -xipo:

When Not To Compile With -xipo:

Working with the set of object files in the link step, the compiler tries to perform whole-program analysis and optimizations. For any function or subroutine foo() defined in this set of object files, the compiler makes the following two assumptions:

(1) At runtime, foo() will not be called explicitly by another routine defined outside this set of object files, and

(2) calls to foo() from any routine in the set of object files will not be interposed upon by a different version of foo() defined outside this set of object files.

If assumption (1) is not true for the given application, do not compile with -xipo=2. If assumption (2) is not true, do not compile with either -xipo=1 or -xipo=2.

As an example, consider interposing on the function malloc() with your own source version and compiling it with -xipo=2. Then every function that references malloc(), in any library linked with your code, would also have to be compiled with -xipo=2, and the object files for those functions would need to participate in the link step. Since this might not be possible for system libraries, your version of malloc() should not be compiled with -xipo=2.

As another example, suppose that you build a shared library with two external routines, foo() and bar(), defined in two different source files, and bar() calls foo() inside its body. If there is a possibility that the call to foo() could be interposed at runtime, then compile neither the source file for foo() nor the source file for bar() with -xipo=1 or -xipo=2. Otherwise, foo() could be inlined into bar(), and the interposed version would never be called, producing incorrect results.
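In a build script, one way to follow this advice is to keep a list of source files that must not be compiled with -xipo and choose the flags per file. The following is a dry-run sketch that prints compile commands without running them. The names NO_IPO, flags_for, foo.f, bar.f, and util.f are hypothetical, and the step that builds the shared library is left out.

```shell
# Hypothetical names: NO_IPO lists sources whose routines may be interposed
# at runtime and therefore must not be compiled with -xipo.
NO_IPO="foo.f bar.f"

# Print the compiler flags to use for the source file given as $1.
flags_for() {
  for excluded in $NO_IPO; do
    if [ "$1" = "$excluded" ]; then
      echo "-xO4"             # no -xipo for files on the exclusion list
      return 0
    fi
  done
  echo "-xipo -xO4"           # all other files are compiled with -xipo
}

# Dry run: print one compile command per source file.
for src in foo.f bar.f util.f; do
  echo "f95 $(flags_for "$src") -c $src"
done
```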