SPARC Assembly Language Reference Manual

Updated: July 2014
 
 

7.1.4 Late and Early Inlining

The code generator of the compiler processes template inlining. There are two opportunities for inlining: before and after optimization. If the inline template is complicated, the compiler may choose to do the inlining after optimization (late inlining), in which case the generated code appears largely as it is written in the template. Otherwise, the code is inlined before optimization (early inlining) and is merged and optimized with the rest of the code around the call site.

Early inlining generally leads to better performance, because the template code is optimized together with the code around the call site. Conditions that cause late inlining include:

  • Use of instructions that the compiler cannot generate

  • Instructions in the delay slots of branches

  • Call instructions

To see whether a routine is late inlined, compile with -g and view the compiler commentary. The following example shows a template that fails early inlining because it uses the frame pointer (%fp) rather than the stack pointer (%sp).

.inline sum_val,16
  st   %o0,[%fp+0x48]
  st   %o1,[%fp+0x4c]
  ldd  [%fp+0x48],%f0
  st   %o2,[%fp+0x48]
  st   %o3,[%fp+0x4c]
  ldd  [%fp+0x48],%f2
  faddd %f0,%f2,%f0
.end

The compiler still inlines the template, but only as a late inline, so the inlined code does not participate in the compiler's optimization.
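
For comparison, the following is a minimal sketch of the same template addressed through %sp instead of %fp. It assumes that the offsets 0x48 and 0x4c remain valid scratch locations relative to %sp in 32-bit code, and that the template file accepts ! comments. The text above states only that the %fp references prevent early inlining, so check the compiler commentary with er_src to confirm how this variant is treated.

! Sketch only: assumes [%sp+0x48] and [%sp+0x4c] are usable scratch space in 32-bit code
.inline sum_val,16
  st   %o0,[%sp+0x48]
  st   %o1,[%sp+0x4c]
  ldd  [%sp+0x48],%f0
  st   %o2,[%sp+0x48]
  st   %o3,[%sp+0x4c]
  ldd  [%sp+0x48],%f2
  faddd %f0,%f2,%f0
.end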

The following example compiles a 32-bit executable with compiler commentary information and displays the commentary using the Oracle Solaris Studio er_src command. The debug information is stored in the .o files by default, so these files must remain available.

% cc -g -O inline32.il driver32.c
% er_src a.out main
Source file: /home/AUser/code/inline/driver32.c
Object file: /home/AUser/code/inline/driver32.o
Load Object: a.out

     1. #include <stdio.h>
     2.
     3. void do_nothing();
     4. int add_up(int v1,int v2, int v3, int v4, int v5, int v6, int v7);
     5. double sum_val(double a, double b);
     6. double sum_ref(double *a, double *b);
     7. int is_true(int i);
     8.
     9.
     10. void main()
     11. {
     12.   double a=3.11,b=7.22;
     13.   do_nothing();
     14.   printf("add_up  %i\n",add_up(1,2,3,4,5,6,7));

   Template could not be early inlined because it references the register %fp
   Template could not be early inlined because it references the register %fp
   Template could not be early inlined because it references the register %fp
   Template could not be early inlined because it references the register %fp
   Template could not be early inlined because it references the register %fp
   Template could not be early inlined because it references the register %fp
     15.   printf("sum_val %f\n",sum_val(a,b));
     16.   printf("sum_ref %f\n",sum_ref(&a,&b));
     17.   printf("is_true 0=%i,1=%i\n", is_true(0),is_true(1));
     18. }

Use the Oracle Solaris Studio er_src command to examine the compiler commentary for a particular file. It takes two parameters: the name of the executable and the name of the function to examine. In this case, the template that cannot be early inlined is sum_val. The compiler inserts one commentary message for each reference to the %fp register, so the six messages correspond to the six references to %fp in the template.