CHAPTER 3

Optimization

This chapter describes how to use compiler and linking options to optimize applications for:

TABLE 3-1 shows a comparison of the 32-bit and 64-bit operating environments. These items are described in greater detail in the following sections.


TABLE 3-1 Comparison of 32-bit and 64-bit Operating Environments

                           32-bit (ILP32)                       64-bit (LP64)
-xarch on SPARC platforms  v8, sparcvis, sparcvis2, sparcfmaf   sparcvis, sparcvis2, sparcfmaf
-xarch on x86 platforms    generic, sse2                        sse2
addressing                 -m32                                 -m64
Fortran Integers           INTEGER, INTEGER*4                   INTEGER*8
C Integers                 int                                  long
Floating-point             S/D/C/Z                              S/D/C/Z
API                        Names of routines                    Names of routines with _64 suffix



3.1 Using the Sun Performance Library

The Sun Performance Library was compiled using the f95 compiler provided with this release. The library routines were compiled with the -dalign and -xparallel options.

 

3.1.1 Fortran and C

When linking the program, use -dalign -xlic_lib=sunperf and the same command line options that were used when compiling.

Sun Performance Library is linked into an application with the -xlic_lib switch rather than the -l switch that is used to link in other libraries, as shown here.


 my_system% f95 -dalign my_file.f -xlic_lib=sunperf

3.1.2 C++

When linking your program, use -dalign -library=sunperf, and the same command line options that were used when compiling. Sun Performance Library is linked into an application with the -library switch, as shown here, rather than the -l switch.


 my_system% CC -dalign my_file.cpp -library=sunperf

If -dalign cannot be used in the program, supply a trap 6 handler as described in Getting Started With Sun Performance Library.


3.2 Compiling

Compile with the most appropriate -xarch= option for best performance. At link time, use the same -xarch= option that was used at compile time to select the version of the Sun Performance Library optimized for a specific instruction-set architecture.



Note - Using instruction-set specific optimization options improves application performance on the selected instruction set architecture, but limits code portability.


For a detailed description of the different -xarch options, refer to the Fortran User’s Guide or the C User’s Guide.

The following lists the -xarch values for SPARC instruction-set architectures:

The following are the -xarch values for x86 instruction-set architectures:

3.2.1 Compiling Code for a 64-Bit Enabled Operating Environment

To compile code for a 64-bit enabled operating environment, use -m64 and convert all integer arguments to 64-bit arguments. 64-bit routines require the use of 64-bit integers.

Sun Performance Library provides 32-bit and 64-bit interfaces. To use the 64-bit interfaces:

3.2.2 64-Bit Integer Arguments

These additional 64-bit-integer interfaces are available only when linking with -m64. Code compiled for 32-bit operating environments (-m32) cannot call the 64-bit-integer interfaces.

To call the 64-bit-integer interfaces directly, append the suffix _64 to the standard library name. For example, use daxpy_64() in place of daxpy().

However, if calling the 64-bit integer interfaces indirectly, do not append _64 to the name of the Sun Performance Library routine. Calls to the Sun Performance Library routine will access a 32-bit wrapper that promotes the 32-bit integers to 64-bit integers, calls the 64-bit routine, and then demotes the 64-bit integers to 32-bit integers.

For best performance, call the routine directly by appending _64 to the routine name.
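The promote, call, and demote sequence performed on the indirect path can be pictured with a small stand-in. The sketch below is an illustration only, not the library's implementation: demo_iamax_64 and demo_iamax are hypothetical names that do not exist in Sun Performance Library. Because daxpy has no integer outputs, the sketch uses a routine that returns an index so that the demotion step is visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit-integer routine: returns the 1-based
   index of the element of x with the largest absolute value, or 0 if n < 1.
   This is NOT a Sun Performance Library routine. */
int64_t demo_iamax_64(int64_t n, const double *x, int64_t incx)
{
    assert(incx > 0);   /* positive stride only, to keep the sketch short */
    if (n < 1)
        return 0;

    int64_t best = 1;
    double best_abs = x[0] < 0 ? -x[0] : x[0];
    for (int64_t i = 1; i < n; ++i) {
        double v = x[i * incx];
        double a = v < 0 ? -v : v;
        if (a > best_abs) {
            best_abs = a;
            best = i + 1;
        }
    }
    return best;
}

/* Hypothetical 32-bit wrapper showing the three steps of the indirect path. */
int32_t demo_iamax(int32_t n, const double *x, int32_t incx)
{
    int64_t n64 = n;          /* 1. promote the 32-bit inputs to 64 bits */
    int64_t incx64 = incx;

    int64_t result = demo_iamax_64(n64, x, incx64);   /* 2. call the 64-bit routine */

    assert(result <= INT32_MAX);   /* always true here, since result <= n */
    return (int32_t)result;        /* 3. demote the integer result to 32 bits */
}
```

Each call through the wrapper pays for the extra conversions and the additional function call, which is why calling the _64 routine directly is the faster path.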

For C programs, use long instead of int arguments. The following code example shows calling the 64-bit integer interfaces directly.


#include <sunperf.h>
long n, incx, incy;
double alpha, *x, *y;
daxpy_64(n, alpha, x, incx, y, incy);

The following code example shows calling the 64-bit integer interfaces indirectly.


#include <sunperf.h>
int  n, incx, incy;
double alpha, *x, *y;
daxpy   (n, alpha, x, incx, y, incy);

For Fortran programs, use 64-bit integers for all integer arguments. The following methods can be used to convert integer arguments to 64 bits:

Consider the following code example.


INTEGER*8 N
REAL*8 ALPHA, X(N), Y(N)
 
! _64 SUFFIX: N AND 1_8 ARE 64-BIT INTEGERS
CALL DAXPY_64(N,ALPHA,X,1_8,Y,1_8)

INTEGER*8 arguments cannot be used in a 32-bit environment. Routines in the 32-bit libraries (v8, v8plusa, v8plusb) cannot be called with 64-bit arguments. However, the 64-bit routines can be called with 32-bit arguments.

When passing constants in Fortran 95 code that has not been compiled with -xtypemap, append _8 to literal constants to effect the promotion. For example, when using Fortran 95, change CALL DSCAL(20,5.26D0,X,1) to CALL DSCAL(20_8,5.26D0,X,1_8). This example assumes that USE SUNPERF is included in the code, because _64 has not been appended to the routine name.

The following code example shows calling CAXPY from Fortran 95 using 32-bit arguments.


       PROGRAM TEST
       COMPLEX ALPHA
       INTEGER,PARAMETER :: INCX=1, INCY=1, N=10
       COMPLEX X(N), Y(N)
 
       CALL CAXPY(N, ALPHA, X, INCX, Y, INCY) 

The following code example shows calling CAXPY from Fortran 95 (without the USE SUNPERF statement) using 64-bit arguments.


       PROGRAM TEST
       COMPLEX   ALPHA
       INTEGER*8, PARAMETER :: INCX=1, INCY=1, N=10
       COMPLEX   X(N), Y(N)
 
       CALL CAXPY_64(N, ALPHA, X, INCX, Y, INCY)

When using 64-bit arguments, the _64 must be appended to the routine name if the USE SUNPERF statement is not used.

The following Fortran 95 code example shows calling CAXPY using 64-bit arguments.


       PROGRAM TEST
       USE SUNPERF
       .
       .
       .
       COMPLEX   ALPHA
       INTEGER*8, PARAMETER :: INCX=1, INCY=1, N=10
       COMPLEX   X(N), Y(N)
 
       CALL CAXPY(N, ALPHA, X, INCX, Y, INCY)

In C routines, the size of long is 32 bits when compiling for V8 or V8plus and 64 bits when compiling for V9. The following code example shows the C prototype of the dgbcon routine, which takes 32-bit integer arguments.


void dgbcon(char norm, int n, int nsub, int nsuper, double *da,
            int lda, int *ipivot, double danorm, double *drcond,
            int *info);

The following code example shows the C prototype of the dgbcon_64 routine, which takes 64-bit integer arguments.


void dgbcon_64(char norm, long n, long nsub, long nsuper,
               double *da, long lda, long *ipivot, double danorm,
               double *drcond, long *info);
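Because the width of long depends on the compilation target, a program that passes long arguments to the _64 interfaces may want to confirm the width it was built with. The following is a minimal sketch using only standard C. The function names long_width_bits and long_is_64_bit are hypothetical helpers introduced for illustration and are not part of Sun Performance Library.

```c
#include <limits.h>

/* Hypothetical helper: width of long, in bits, for the current
   compilation target. */
int long_width_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Hypothetical helper: returns 1 when long is 64 bits wide, so that it
   can carry the integer arguments of the _64 interfaces; 0 otherwise. */
int long_is_64_bit(void)
{
    return long_width_bits() == 64;
}
```

A program could call long_is_64_bit() once at startup and report an error if it was built for a 32-bit target but expects to use the 64-bit-integer interfaces.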