4.3 Optimizing gcc Compilation

The -Olevel option to gcc turns on compiler optimization, where the specified value of level has the following effects:


0

The default. This level reduces compilation time and ensures that debugging yields the results that you would expect from reading the source code. It is equivalent to not specifying the -O option at all. However, a number of optimization options are still enabled, for example: -falign-loops, -finline-functions-called-once, and -fmove-loop-invariants.

1

The compiler attempts to reduce both the size of the output binary code and the execution time, but it does not perform optimizations that might substantially increase the compilation time.

2

The compiler performs optimizations that do not require a tradeoff of space for speed. Compared to level 1, level 2 optimization improves the performance of the output binary but it also increases the compilation time.

3

The compiler turns on the -fgcse-after-reload, -finline-functions, -fipa-cp-clone, -fpredictive-commoning, -ftree-vectorize, and -funswitch-loops options, which require a tradeoff of space for speed, in addition to the level 2 optimizations.

s

The compiler optimizes to reduce the size of the binary rather than to improve execution speed.

If you do not specify an optimization option, gcc attempts to reduce the compilation time and to make debugging always yield the result expected from reading the source code. If you enable optimization, the compiler tries to improve performance, to reduce the size of the output binary, or both, but compilation takes longer and you can lose the ability to debug the program effectively. If you compile several source files together to a single output binary file, the compiler uses information that it gathers from all of the source files when compiling each individual source file.

You can use the following command to find out which optimization options are enabled at a specified optimization level:

$ gcc -c -Q -Olevel --help=optimizers

To improve the speed of compilation, you can specify the -pipe option, which instructs gcc to use pipes rather than temporary files for communication between the various stages of compilation.

Taking advantage of hardware properties specific to target platforms can result in a significant performance improvement. By default, gcc compiles code that is optimized for the most common processors. However, you can use the -mtune and -march options with gcc to optimize instruction scheduling and instruction set selection respectively. Specifying an architecture with -march implicitly selects the same value for -mtune unless you specify a value for -mtune explicitly. If you specify an incorrect value for -mtune, your program can still run, albeit probably not optimally. If you specify an incorrect value for -march, the program is likely to fail because it can contain instructions that the processor does not support. For more information, see the gcc(1) manual page and the GCC 4.4.4 Manual: Hardware Models and Configurations.