Specify the instruction set architecture (ISA).
The architectures accepted for the -xarch keyword isa are shown in Table 3–11:
Table 3–11 -xarch ISA Keywords
Platform | Valid -xarch Keywords |
---|---|
SPARC | generic, generic64, native, native64, sparc, sparcvis, sparcvis2, sparcfmaf, v9, v9a, v9b |
x86 | generic, native, 386, pentium_pro, sse, sse2, amd64, pentium_proa, ssea, sse2a, amd64a, sse3, sse3a, ssse3, sse4_1, sse4_2, amdsse4a |
Note that although -xarch can be used alone, it is part of the expansion of the -xtarget option and may be used to override the -xarch value that is set by a specific -xtarget option. For example:
% f95 -xtarget=ultra2 -xarch=sparcfmaf ...
overrides the -xarch set by -xtarget=ultra2.
This option limits the code generated by the compiler to the instructions of the specified instruction set architecture. It does not guarantee that any target-specific instructions are used.
If this option is used with optimization, an appropriate choice can provide good performance of the executable on the specified architecture. An inappropriate choice results in a binary program that is not executable on the intended target platform.
Note the following:
Legacy 32-bit SPARC instruction set architectures V7 and V8 imply -m32 and cannot be combined with -m64.
Object binary files (.o) compiled with sparc and sparcvis can be linked and can execute together, but will run only on a sparcvis-compatible platform.
Object binary files (.o) compiled with sparc, sparcvis, and sparcvis2 can be linked and can execute together, but will run only on a sparcvis2-compatible platform.
For any particular choice, the generated executable may run much more slowly on earlier architectures. Also, although quad-precision (REAL*16 and long double) floating-point instructions are available in many of these instruction set architectures, the compiler does not use these instructions in the code it generates.
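The linking rules noted above for sparc, sparcvis, and sparcvis2 objects amount to taking the most capable ISA among the objects being linked. The following is a minimal Python sketch of that reading, not part of the compiler. The helper name `minimum_platform` is hypothetical, and it covers only the three keywords that the rules mention.

```python
# Assumption: each keyword in this list includes the ISAs listed before it,
# as described by the linking rules in this section.
SPARC_ISA_ORDER = ["sparc", "sparcvis", "sparcvis2"]


def minimum_platform(object_archs):
    """Return the least capable platform able to run the linked objects.

    `object_archs` is a list of the -xarch keywords used to compile each
    object file. Hypothetical helper for illustration only.
    """
    if not object_archs:
        raise ValueError("at least one -xarch keyword is required")
    ranks = []
    for arch in object_archs:
        if arch not in SPARC_ISA_ORDER:
            raise ValueError("keyword not covered by this sketch: " + arch)
        ranks.append(SPARC_ISA_ORDER.index(arch))
    # The most demanding object decides where the executable can run.
    return SPARC_ISA_ORDER[max(ranks)]
```

For example, `minimum_platform(["sparc", "sparcvis"])` returns `"sparcvis"`, which matches the first linking rule.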
The default when -xarch is not specified is generic.
Table 3–12 gives details for each of the -xarch keywords on SPARC platforms.
Table 3–12 -xarch Values for SPARC Platforms
-xarch= | Meaning (SPARC) |
---|---|
generic | Compile using the instruction set common to most processors. This is v8plus when compiling with -m32, and sparc with -m64. |
generic64 | Compile for most 64-bit platforms. (Solaris only) This option is equivalent to -m64 -xarch=generic and is provided for compatibility with earlier releases. Use -m64 to specify 64-bit compilation instead of -xarch=generic64. |
native | Compile for good performance on this system. The compiler chooses the appropriate setting for the processor of the system it is running on. This is the default for the -fast option. |
native64 | Compile for good performance in 64-bit mode on this system. (Solaris only) This option is equivalent to -m64 -xarch=native and is provided for compatibility with earlier releases. |
sparc | Compile for the SPARC-V9 ISA, but without the Visual Instruction Set (VIS) and without other implementation-specific ISA extensions. This option enables the compiler to generate code for good performance on the V9 ISA. |
sparcvis | Compile for the SPARC-V9 ISA plus the Visual Instruction Set (VIS) version 1.0 and the UltraSPARC extensions. This option enables the compiler to generate code for good performance on the UltraSPARC architecture. |
sparcvis2 | Compile for the SPARC-V9 ISA with UltraSPARC III extensions. Enables the compiler to generate object code for the UltraSPARC architecture, plus the Visual Instruction Set (VIS) version 2.0 and the UltraSPARC III extensions. |
sparcfmaf | Compile for the sparcfmaf version of the SPARC-V9 ISA. Enables the compiler to use instructions from the SPARC-V9 instruction set, plus the UltraSPARC extensions, including the Visual Instruction Set (VIS) version 1.0, the UltraSPARC III extensions, including the Visual Instruction Set (VIS) version 2.0, and the SPARC64 VI extensions for floating-point multiply-add. Note that you must use -xarch=sparcfmaf in conjunction with -fma=fused and some optimization level to get the compiler to attempt to find opportunities to use the multiply-add instructions automatically. |
v9 | Equivalent to -m64 -xarch=sparc. Legacy makefiles and scripts that use -xarch=v9 to obtain the 64-bit memory model need only use -m64. |
v9a | Equivalent to -m64 -xarch=sparcvis and is provided for compatibility with earlier releases. |
v9b | Equivalent to -m64 -xarch=sparcvis2 and is provided for compatibility with earlier releases. |
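
The legacy SPARC keywords in the table above (generic64, native64, v9, v9a, v9b) are each described as equivalent to -m64 plus a current keyword. The following is a minimal Python sketch of how a build script might rewrite them. The names `LEGACY_SPARC_XARCH` and `modernize_flags` are hypothetical, and the sketch assumes each command-line word is a separate list element. It handles the SPARC keywords only.

```python
# Legacy SPARC -xarch keywords mapped to the keyword that the table above
# pairs with -m64.
LEGACY_SPARC_XARCH = {
    "generic64": "generic",
    "native64": "native",
    "v9": "sparc",
    "v9a": "sparcvis",
    "v9b": "sparcvis2",
}

PREFIX = "-xarch="


def modernize_flags(args):
    """Return a copy of `args` with legacy -xarch keywords rewritten.

    Hypothetical helper: each legacy keyword becomes -m64 followed by the
    equivalent current -xarch keyword. -m64 is added at most once and only
    if the original arguments do not already contain it.
    """
    has_m64 = "-m64" in args
    result = []
    for arg in args:
        keyword = arg[len(PREFIX):] if arg.startswith(PREFIX) else None
        if keyword in LEGACY_SPARC_XARCH:
            if not has_m64:
                result.append("-m64")
                has_m64 = True
            result.append(PREFIX + LEGACY_SPARC_XARCH[keyword])
        else:
            # Everything else, including current keywords, passes through.
            result.append(arg)
    return result
```

For example, `modernize_flags(["f95", "-xarch=v9a", "prog.f95"])` returns `["f95", "-m64", "-xarch=sparcvis", "prog.f95"]`.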
Table 3–13 details each of the -xarch keywords on x86 platforms. If -xarch is not specified, the default on x86 is generic (or generic64 if -m64 is specified).
Table 3–13 -xarch Values for x86 Platforms
-xarch= | Meaning (x86) |
---|---|
generic | Compile for good performance on most 32-bit x86 platforms. This is the default, and is equivalent to -xarch=pentium_pro. |
generic64 | Compile for good performance on most 64-bit x86 platforms. It is equivalent to sse2. |
native | Compile for good performance on this x86 architecture. Use the best instruction set for good performance on most x86 processors. With each new release, the definition of “best” instruction set may be adjusted, if appropriate. |
native64 | Compile for good performance on this 64-bit x86 architecture. |
386 | Limits the instruction set to the Intel 386/486 architecture. |
pentium_pro | Limits the instruction set to the Pentium Pro architecture. |
pentium_proa | Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit Pentium Pro architecture. |
sse | Adds the SSE instruction set to pentium_pro. (See Note below.) |
ssea | Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit SSE architecture. |
sse2 | Adds the SSE2 instruction set to pentium_pro. (See Note below.) |
sse2a | Adds the AMD extensions (3DNow!, 3DNow! extensions, and MMX extensions) to the 32-bit SSE2 architecture. |
sse3 | Adds the SSE3 instruction set to the SSE2 instruction set. |
amd64 | On Solaris platforms, this is equivalent to -m64 -xarch=sse2. Legacy makefiles and scripts that use -xarch=amd64 to obtain the 64-bit memory model should use -m64. |
amd64a | On Solaris platforms, this is equivalent to -m64 -xarch=sse2a. |
sse3a | Adds AMD extended instructions, including 3DNow!, to the SSE3 instruction set. |
ssse3 | Adds the SSSE3 instructions to the SSE3 instruction set. |
sse4_1 | Adds the SSE4.1 instructions to the SSSE3 instruction set. |
sse4_2 | Adds the SSE4.2 instructions to the SSE4.1 instruction set. |
amdsse4a | Adds the SSE4a instructions to the AMD instruction set. |
There are some important considerations when compiling for x86 Solaris platforms.
Programs compiled with -xarch set to sse, sse2, sse2a, or sse3 and beyond must be run on platforms supporting these features and extensions.
OS releases starting with Solaris 9 4/04 are SSE/SSE2-enabled on Pentium 4-compatible platforms. Earlier versions of Solaris OS are not SSE/SSE2-enabled.
If you compile and link in separate steps, always link using the compiler and with the same -xarch setting to ensure that the correct startup routine is linked.
Arithmetic results on x86 may differ from results on SPARC due to the x86 80-bit floating-point registers. To minimize these differences, use the -fstore option or compile with -xarch=sse2 if the hardware supports SSE2.
Starting with Sun Studio 11 and the Solaris 10 OS, program binaries compiled and built using these specialized -xarch hardware flags are checked to verify that they are being run on the appropriate platform.
On systems prior to Solaris 10, no verification is done and it is the user’s responsibility to ensure objects built using these flags are deployed on suitable hardware.
Running programs compiled with these -xarch options on platforms that are not enabled with the appropriate features or instruction set extensions could result in segmentation faults or incorrect results occurring without any explicit warning messages.
This warning also extends to programs that employ .il inline assembly language functions or __asm() assembler code that use SSE, SSE2, SSE2a, and SSE3 instructions and extensions.