Man Page inline.1




NAME

     inline - in-line procedure	call expander


DESCRIPTION

     Assembly language call instructions are replaced by  a  copy
     of	 their	corresponding  function	 body  obtained	 from the
     inline template (*.il) file.

     Inline template files have the suffix .il and are named on
     the compiler command line along with the source files,

          for example: % CC foo.il hello.c

     Inlining is done by the compiler's	code generator.


USAGE

     Each inline file contains one or more labeled assembly
     language templates of the form:
	  inline-directive
	  instructions
	  ...
	  .end

     where the instructions constitute an  in-line  expansion  of
     the  named	routine.  An inline-directive is a command of the
     form:

	  .inline   identifier,	argsize

     This declares a block of code for the routine named by iden-
     tifier,  with  argsize  as	 the  total size of the	routine's
     arguments,	 in  bytes.   Calls  to	 the  named  routine  are
     replaced by the code in the in-line template.

     NOTE:
     The value of argsize is ignored but the argument  should  be
     included  for compatibility with compiler versions	predating
     the Sun WorkShop[tm] 5.0 compilers.

     An inline file may contain multiple templates. If more than
     one template matches a routine, templates after the first
     are ignored.
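
     As an illustration of the template form, the following is a
     minimal sketch of an inline file for a SPARC system. The
     routine names add_i and sub_i are hypothetical. The register
     usage assumes the SPARC conventions described later in this
     page: two int arguments in %o0 and %o1, result in %o0. The
     "!" comment lines are assembler comments and are kept
     outside the templates.

```asm
! Hypothetical inline file containing two templates.

! int add_i(int a, int b): returns a + b
     .inline add_i,8
     add     %o0,%o1,%o0
     .end

! int sub_i(int a, int b): returns a - b
     .inline sub_i,8
     sub     %o0,%o1,%o0
     .end
```

     A caller would declare add_i and sub_i as ordinary external
     functions and name the inline file on the compiler command
     line together with the source file.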


  Coding Conventions for all Solaris Studio Compilers
     Inline  templates	should	be  coded  as  expansions  of  C-
     compatible	 procedure  calls,  with  the difference that the
     return address cannot be depended upon to be in the expected
     place, since no call instruction will have	been executed.

     Inline templates must conform  to	standard  Solaris  Studio
     parameter	 passing   and	register  usage	 conventions,  as
     detailed below.  They must	not call  routines  that  violate
     these  conventions;  for example, assembly	language routines
     such as setjmp(3C) may cause problems.

     Registers other than the ones mentioned below  must  not  be
     used or set.

     Branch instructions in an in-line template	may only transfer
     to	 numeric  labels  (1f,	2b, and	so on) defined within the
     in-line template.	No other control transfers are allowed.

     Templates do not need ret or retl instructions,  and  should
     not include them.

     Only opcodes and addressing modes generated by Solaris  Stu-
     dio  compilers  are guaranteed to work.  Binary encodings of
     instructions are not supported.

  Coding Conventions for SPARC Systems
     The first six arguments are  passed  in  registers	 %o0-%o5.
     Arguments	beyond the sixth are passed using stack	locations
     in	accordance with	the target ABI.	 %sp is	guaranteed to  be
     64-bit aligned.  The contents of %o7 are undefined, since no
     call instruction will have	been executed.

     Results are returned in %o0 or %f0/%f1.

     Registers %o0-%o5 and %f0-%f31 may	be used	as temporaries.

     Integral and single-precision floating-point  arguments  are
     32-bit aligned.

     Double-precision floating-point arguments are guaranteed  to
     be	64-bit aligned if their	offsets	are multiples of 8.

     Each control-transfer instruction (branches and calls)  must
     be	immediately followed by	a nop.

     Call instructions must include  an	 extra	(final)	 argument
     which indicates the number	of registers used to pass parame-
     ters to the called	routine.

     Note that for SPARC systems, the  instruction  following  an
     expanded 'call' is	deleted.
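
     The SPARC register and branch rules above can be sketched
     with a template for a hypothetical routine max_i, declared
     in C as int max_i(int a, int b). This is an illustration
     under the stated conventions, not an excerpt from a shipped
     inline file. The arguments arrive in %o0 and %o1, the branch
     targets a numeric label defined inside the template, the
     branch is immediately followed by a nop, and the result is
     left in %o0.

```asm
! int max_i(int a, int b): returns the larger of a and b
! (hypothetical routine name, for illustration only)
     .inline max_i,8
     cmp     %o0,%o1
     bl      1f
     nop
     mov     %o0,%o1
1:
     mov     %o1,%o0
     .end
```

     When a is less than b the branch skips the first mov, so
     %o1 still holds b and is copied to %o0. Otherwise a is
     first copied into %o1 and then returned. Only %o0 and %o1
     are modified, and no ret or retl instruction is present.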

  Coding Conventions for 32-bit	x86 Systems
     Arguments are passed on the stack. Since no call instruction
     was issued, the first argument is at (%esp), the second
     argument is at 4(%esp), and so on. Integer results of 32 bits
     or less are returned in %eax; 64-bit integer results are
     returned in %edx:%eax. Floating point results are returned
     in %st(0).

     The code may use registers	%eax, %ecx and %edx.  The  values
     in	any other registers must be preserved. The floating point
     stack will	be empty at the	start  of  the	inline	expansion
     template,	and must be empty (except for a	returned floating
     point value) at the end.
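
     A minimal sketch of these 32-bit x86 conventions follows.
     The routine names add_i and root_d are hypothetical, and the
     "/" comment lines are assembler comments kept outside the
     templates. The first template reads two int arguments from
     the stack and returns the sum in %eax. The second loads a
     double argument from (%esp) and leaves its square root in
     %st(0), which is then the only value on the floating point
     stack.

```asm
/ int add_i(int a, int b): returns a + b
     .inline add_i,8
     movl    (%esp),%eax
     addl    4(%esp),%eax
     .end

/ double root_d(double x): returns the square root of x
     .inline root_d,8
     fldl    (%esp)
     fsqrt
     .end
```

     Both templates modify only %eax or the floating point
     stack, so no other register needs to be saved or restored.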


SPECIAL x86 NOTE

     Programs compiled with -xarch={sse|sse2} to run  on  Solaris
     x86 SSE/SSE2 Pentium 4-compatible platforms must be run only
     on	platforms that are SSE/SSE2 enabled.  Running  such  pro-
     grams  on	platforms  that	 are  not  SSE/SSE2-enabled could
     result in segmentation faults or incorrect results occurring
     without  any  explicit  warning  messages.	Starting with the
     Solaris 10	release, the OS	and compilers will prevent execu-
     tion   of	 SSE/SSE2-compiled   binaries  on  platforms  not
     SSE/SSE2-enabled.

     OS	releases starting with Solaris 9 update	6  are	SSE/SSE2-
     enabled  on Pentium 4-compatible platforms. Earlier versions
     of	Solaris	OS are not SSE/SSE2-enabled.

     This warning extends also to programs that	employ .il inline
     assembly  language	 functions or __asm() assembler	code that
     utilize SSE/SSE2 instructions.

     If	you compile and	link in	separate steps,	always link using
     the  compiler  and	with -xarch={sse|sse2} to ensure that the
     correct startup routine is	linked.

  Coding Conventions for x64 Platforms
     Arguments are passed according to their classification as
     integer, sse, or memory arguments.

     Arguments of types (signed and unsigned) _Bool, char, short,
     int, long, long long and pointers are integer arguments.
     Arguments of aggregate types (struct, union, array) of size
     less than or equal to 16 bytes that contain aligned
     members of types _Bool, char, short, int, long, long long
     and pointers are also integer arguments.

     Arguments of types	 float	and  double  are  sse  arguments.
     Arguments	of  aggregate types of size less than or equal to
     16	bytes and that contain aligned members of types	float and
     double are	also sse.

     Arguments of types	long double and	 of  aggregate	types  of
     size  greater  than  16 bytes, or with unaligned members are
     memory arguments.

     Integer arguments are passed in integer registers in the
     following order: %rdi, %rsi, %rdx, %rcx, %r8 and %r9. One
     integer argument of aggregate type can occupy up to 2
     integer registers. If there are more than 6 integer
     arguments, the 7th and subsequent integer arguments are
     treated as memory arguments.

     Sse arguments are passed in sse registers in order from
     %xmm0 to %xmm7. One sse argument of aggregate type can
     occupy up to 2 sse registers; each sse register holds up to
     8 bytes of the argument. For example, an argument of type
     double complex is passed in 2 consecutive sse registers, and
     an argument of type float complex is passed in 1 sse
     register. If there are more than 8 sse arguments, the 9th
     and subsequent sse arguments are treated as memory
     arguments.

     Integer and sse arguments are numbered independently.
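
     The independent numbering can be sketched with two
     hypothetical templates. In add_li the double parameter sits
     between the two long parameters in the C prototype, yet the
     second long is still the second integer argument and so
     arrives in %rsi. The result registers %rax and %xmm0 follow
     the return value rules given later in this section. The
     routine names are illustrative, and the "/" comment lines
     are assembler comments kept outside the templates.

```asm
/ long add_li(long a, double x, long b): returns a + b
/ a is the first integer argument (%rdi), b the second (%rsi).
/ x is the first sse argument (%xmm0) and is not used here.
     .inline add_li,24
     movq    %rdi,%rax
     addq    %rsi,%rax
     .end

/ double add_d(double x, double y): returns x + y
/ x is in %xmm0 and y is in %xmm1.
     .inline add_d,16
     addsd   %xmm1,%xmm0
     .end
```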

     Memory arguments are passed on the stack in right-to-left
     order, as they appear in the function's argument list. Each
     argument on the stack is aligned according to its size: on
     8 bytes if its size is less than or equal to 8 bytes, on 16
     bytes otherwise. At the start of the inline expansion
     template the stack is aligned on 16 bytes.

     Since no call instruction was issued, the first memory argu-
     ment  is  at (%rsp), the second argument is at 8(%rsp) or at
     16(%rsp) depending	on the first memory argument size and the
     second memory argument alignment, etc.
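
     The stack offsets can be illustrated with two hypothetical
     templates. In seventh, the six integer registers are used up
     by the first six long arguments, so the seventh is the first
     memory argument and is found at (%rsp). In second_ld, both
     arguments are of type long double and are therefore memory
     arguments. This sketch assumes a long double occupies 16
     bytes on the stack, which puts the second one at 16(%rsp).
     The routine names are illustrative only.

```asm
/ long seventh(long a, long b, long c, long d,
/              long e, long f, long g): returns g
/ a through f use the six integer registers.
/ g is the first memory argument.
     .inline seventh,56
     movq    (%rsp),%rax
     .end

/ long double second_ld(long double x, long double y): returns y
/ x is at (%rsp) and y is at 16(%rsp).
     .inline second_ld,32
     fldt    16(%rsp)
     .end
```

     The second template leaves exactly one value on the
     floating point stack, the long double result in %st(0).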

     Return values are classified in the same way as arguments.

     Integer results of	8 bytes	or less	 are  returned	in  %rax,
     integer results of	9 to 16	bytes are returned in %rdx:%rax.

     Sse results are likewise returned according to their size,
     in %xmm0 or in %xmm1:%xmm0.

     Results of	type long double are returned in %st(0).

     If the return value is of type long double complex, the
     real part of the value is returned in %st(0) and the
     imaginary part in %st(1).

     For memory	results	the caller provides space for the  return
     value  and	 passes	the address of this storage in %rdi as if
     it	were the first argument	to the function.  In effect, this
     address  becomes  a  hidden  first	argument.  On return %rax
     will contain the address that has	been  passed  in  by  the
     caller in %rdi.
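
     A memory result can be sketched with a hypothetical routine
     make_triple that returns a 24-byte structure. Because the
     result is larger than 16 bytes it is a memory result, so
     %rdi carries the address of the caller's storage and the
     three declared arguments shift to %rsi, %rdx and %rcx. The
     structure layout and routine name are assumptions made for
     this illustration.

```asm
/ struct triple { long a; long b; long c; };
/ struct triple make_triple(long a, long b, long c)
/ %rdi holds the address of the caller's result storage, so
/ a, b and c arrive in %rsi, %rdx and %rcx.
     .inline make_triple,24
     movq    %rsi,(%rdi)
     movq    %rdx,8(%rdi)
     movq    %rcx,16(%rdi)
     movq    %rdi,%rax
     .end
```

     The final movq copies the storage address into %rax, as
     required on return for memory results.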

     The code must not change register %rbp.  The floating point
     stack  will  be  empty  at	the start of the inline	expansion
     template, and must	be empty (except for a returned	 floating
     point value) at the end.

     In	addition to %rbp, the values in	registers %rbx and  %r12-
     %r15 must be preserved across the inlined code.


EXAMPLES

     Please review libm.il or vis.il for examples. You can find	a
     version  of  these	 libraries  that is specific to	each sup-
     ported architecture under the compiler's lib/ directory.


WARNING

     inline does not check for violations of the  coding  conven-
     tions described above.


SEE ALSO

     "Techniques for Optimizing	 Applications:	High  Performance
     Computing"	 by  Rajat P. Garg and Ilya Sharapov uses Fortran
     to	provide	a useful explanation  of  inline  templates.  See
     Chapter 8.

     "The SPARC	Architecture Manual Version 9" provided	by  SPARC
     International Inc.	at http://www.sparc.com/resource.htm. See
     Appendix G.