Specifies optimization level; note the uppercase letter O followed by the digit 1, 2, 3, 4, or 5. In general, program execution speed depends on the level of optimization. The higher the level of optimization, the better the runtime performance. However, higher optimization levels can result in increased compilation time and larger executable files.
In a few cases, -xO2 might perform better than the others, and -xO3 might outperform -xO4. Try compiling with each level to see if you have one of these rare cases.
If the optimizer runs out of memory, it tries to recover by retrying the current procedure at a lower level of optimization. The optimizer resumes subsequent procedures at the original level specified in the -xOlevel option.
There are five levels that you can use with -xO. The following sections describe how they operate on the SPARC platform and the x86 platform.
On the SPARC Platform:
-xO1 does only the minimum amount of optimization (peephole), which is post-pass, assembly-level optimization. Do not use -xO1 unless using -xO2 or -xO3 results in excessive compilation time, or you are running out of swap space.
-xO2 does basic local and global optimization, which includes:
Induction-variable elimination
Local and global common-subexpression elimination
Algebraic simplification
Copy propagation
Constant propagation
Loop-invariant optimization
Register allocation
Basic block merging
Tail recursion elimination
Dead-code elimination
Tail-call elimination
Complicated expression expansion
This level does not optimize references or definitions for external or indirect variables.
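As a rough illustration of a few of the transformations listed above (algebraic simplification, common-subexpression elimination, and loop-invariant optimization), the following C sketch shows a function as a programmer might write it next to a hand-transformed equivalent. The function names are hypothetical, and the code the compiler actually generates is not shown; this only indicates the kind of rewrite involved.

```c
#include <stddef.h>

/* As written: (a * b) is evaluated twice per iteration, and
 * the index expression i * 1 contains a redundant multiply. */
long sum_scaled_naive(const long *v, size_t n, long a, long b)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i * 1] * (a * b) + (a * b);
    }
    return total;
}

/* Hand-transformed equivalent:
 *  - i * 1 is simplified to i (algebraic simplification),
 *  - a * b is computed once per use site (common subexpression),
 *  - and hoisted out of the loop, since it never changes there
 *    (loop-invariant optimization). */
long sum_scaled_hoisted(const long *v, size_t n, long a, long b)
{
    const long ab = a * b;
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i] * ab + ab;
    }
    return total;
}
```

Both functions return the same value for the same arguments; only the amount of work done inside the loop differs.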
-xO3, in addition to optimizations performed at the -xO2 level, also optimizes references and definitions for external variables. This level does not trace the effects of pointer assignments. When compiling either device drivers that are not properly protected by volatile or programs that modify external variables from within signal handlers, use -xO2. In general, this level results in increased code size unless combined with the -xspace option.
-xO4 does automatic inlining of functions contained in the same file in addition to performing -xO3 optimizations. This automatic inlining usually improves execution speed but sometimes makes it worse. In general, this level results in increased code size unless combined with the -xspace option.
-xO5 generates the highest level of optimization. It is suitable only for the small fraction of a program that uses the largest fraction of computer time. This level uses optimization algorithms that take more compilation time or that do not have as high a certainty of improving execution time. Optimization at this level is more likely to improve performance if it is done with profile feedback. See A.2.165 -xprofile=p.
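The signal-handler case mentioned for -xO3 can be sketched in standard C. The example below is a minimal illustration, not taken from the compiler documentation: the names got_signal, on_signal, and wait_for_flag are hypothetical. It shows a variable that is written by a signal handler and declared volatile sig_atomic_t, which is the standard C way to tell an optimizer that the value can change outside the normal flow of control.

```c
#include <signal.h>

/* Hypothetical flag shared between main-line code and a signal handler.
 * volatile forces every read in wait_for_flag() to go to memory, so an
 * optimizer cannot keep a stale copy in a register. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;        /* unused */
    got_signal = 1;     /* modifies a file-scope variable from a handler */
}

/* Installs the handler, raises SIGINT against this process, and
 * returns the flag value (1 on success, -1 if signal() fails). */
int wait_for_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);      /* the handler runs before raise() returns */
    while (!got_signal) {
        /* Without volatile, this test could legally be evaluated once
         * and the loop turned into an infinite loop. */
    }
    return (int)got_signal;
}
```

Code written this way does not depend on the optimizer leaving external variables alone; code that lacks the volatile qualifier is the kind for which the text above recommends -xO2.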
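To make "functions contained in the same file" concrete, here is a small C sketch with a hypothetical helper, square, defined in the same source file as its caller. Whether the optimizer actually inlines a given function at -xO4 is its own decision; the comment only describes what inlining would amount to.

```c
/* Small helper defined in the same file as its caller: the kind of
 * function that is a candidate for automatic inlining. */
static int square(int x)
{
    return x * x;
}

/* If square() is inlined, the loop body behaves as if it had been
 * written "total += v[i] * v[i];", with no call overhead. */
int sum_of_squares(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}
```

Inlining duplicates the body of square at each call site, which is one reason code size tends to grow at this level.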
On the x86 Platform:
-xO1 does basic optimization. This includes algebraic simplification, register allocation, basic block merging, dead code and store elimination, and peephole optimization.
-xO2 performs local common subexpression elimination, local copy and constant propagation, and tail recursion elimination, as well as the optimization done by level 1.
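Tail recursion elimination can be pictured with a short C sketch. The two hypothetical functions below compute the same greatest common divisor; the second is what the first looks like once the self-call in tail position has been turned into a loop. This is a hand-written illustration, not compiler output.

```c
/* Tail-recursive form: the recursive call is the last action taken,
 * so nothing in the current frame is needed after it returns. */
unsigned gcd_recursive(unsigned a, unsigned b)
{
    if (b == 0)
        return a;
    return gcd_recursive(b, a % b);
}

/* Equivalent loop: the call is replaced by updating the parameters
 * and jumping back to the top, so no new stack frame is created. */
unsigned gcd_loop(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}
```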
-xO3 performs global common subexpression elimination, global copy and constant propagation, loop strength reduction, induction variable elimination, and loop-invariant optimization, as well as the optimization done by level 2.
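Loop strength reduction and induction variable elimination are easier to follow with an example. The C sketch below uses hypothetical function names and shows a loop as written and a hand-transformed version; it assumes the values involved are small enough that no signed overflow occurs.

```c
#include <stddef.h>

/* As written: each iteration multiplies the index by the stride. */
void fill_naive(int *out, size_t n, int stride)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i * stride;
}

/* Hand-transformed equivalent:
 *  - the multiply is replaced by a running addition
 *    (strength reduction),
 *  - the index i is dropped in favor of a pointer that is compared
 *    against the end of the array (induction variable elimination). */
void fill_reduced(int *out, size_t n, int stride)
{
    int value = 0;
    for (int *p = out, *end = out + n; p != end; p++) {
        *p = value;
        value += stride;    /* assumes no signed overflow */
    }
}
```

Both functions store the same sequence 0, stride, 2*stride, and so on; the second does so with an add per iteration instead of a multiply.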
-xO4 does automatic inlining of functions contained in the same file as well as the optimization done by level 3. This automatic inlining usually improves execution speed, but sometimes makes it worse. This level also frees the frame pointer register (ebp) for general-purpose use. In general, this level results in increased code size.
-xO5 generates the highest level of optimization. It uses optimization algorithms that take more compilation time or that do not have as high a certainty of improving execution time.
If you use -g or -g0 and the optimization level is -xO3 or lower, the compiler provides best-effort symbolic information with almost full optimization. Tail-call optimization and back-end inlining are disabled.
If you use -g or -g0 and the optimization level is -xO4 or higher, the compiler provides best-effort symbolic information with full optimization.
Debugging with -g does not suppress -xOlevel, but -xOlevel limits -g in certain ways. For example, the -xOlevel options reduce the utility of debugging so that you cannot display variables from dbx, but you can still use the dbx where command to get a symbolic traceback. For more information, see Debugging a Program With dbx.
The -xcrossfile option is effective only if it is used with -xO4 or -xO5.
The -xinline option has no effect for optimization levels below -xO3. At -xO4, the optimizer decides which functions should be inlined, and does so regardless of whether you specify the -xinline option. At -xO4, the compiler also attempts to determine which functions will improve performance if they are inlined. If you force the inlining of a function with -xinline, you might actually diminish performance.
The default is no optimization. However, this is only possible if you do not specify an optimization level. If you specify an optimization level, there is no option for turning optimization off.
If you are trying to avoid setting an optimization level, be sure not to specify any option that implies an optimization level. For example, -fast is a macro option that sets optimization at -xO5. All other options that imply an optimization level give a warning message that optimization has been set. The only way to compile without any optimization is to delete all options that specify an optimization level from the command line or makefile.
If you optimize at -xO3 or -xO4 with very large procedures (thousands of lines of code in a single procedure), the optimizer might require an unreasonable amount of memory. In such cases, machine performance can be degraded.
To prevent this degradation from taking place, use the limit command to limit the amount of virtual memory available to a single process (see the csh(1) man page). For example, to limit virtual memory to 16 megabytes:
example% limit datasize 16M
This command causes the optimizer to try to recover if it reaches 16 megabytes of data space.
The limit cannot be greater than the total available swap space of the machine, and should be small enough to permit normal use of the machine while a large compilation is in progress.
The best setting for data size depends on the degree of optimization requested and on the amounts of real memory and virtual memory available.
To find the actual swap space, type: swap -l
To find the actual real memory, type: dmesg | grep mem
-xldscope, -fast, -xcrossfile=n, -xprofile=p, csh(1) man page