Oracle® Solaris Studio 12.4: Numerical Computation Guide

Updated: January 2015
 
 

x86 Behavior and Implementation

This appendix discusses x86/x64 and SPARC compatibility issues related to the floating-point units used in x86/x64 based systems.

C.1 Code Generation for Supported Systems

Oracle Solaris supports many systems from Oracle, Sun, and other system vendors that contain x86 processors from Intel, AMD, and other chip vendors. Each Oracle Solaris release supports a specific set of such systems; for details, see the Hardware Compatibility List for that release.

Oracle Solaris 11 supports x86 processors that support 64-bit addressing. Oracle Solaris 10 Update 10 supports those 64-bit processors and many 32-bit-only x86 processors with hardware floating-point and 120 MHz or faster clock rates.

Compile with the -m32 -xarch=generic -xchip=generic flags to generate code that is satisfactory for the largest number of systems. The following table lists some specific code generation options for a few typical Oracle and Sun x86 systems:

System          Code Generation Options
Ultra 20        -xarch=sse2a -xchip=opteron
X2200           -xarch=amdsse4a -xchip=amdfam10
X6250           -xarch=sse3 -xchip=core2
X4170           -xarch=aes -xchip=westmere
X2-4            -xarch=sse4_2 -xchip=nehalem
X3-2            -xarch=avx -xchip=sandybridge
X4-2, X4-4      -xarch=avx_i -xchip=ivybridge
?               -xarch=avx2 -xchip=haswell

There are hundreds of distinct x86 chips, each with complicated nomenclature.

Running cc -dryrun -native is the best way to find out which options the compiler would choose to optimize for a particular system. When generating code intended for several different x86 systems, using the options for the oldest system is often satisfactory for all of them.

C.2 Differences from SPARC

Oracle Solaris Studio compilers generate code that usually performs similarly on SPARC and x86. However, be aware of the following important differences on x86-based systems:

  • The x87 floating-point registers are 80 bits wide. When the x87 floating-point register stack is in use, intermediate results of arithmetic computations can be held in double extended (80-bit) precision, so computation results can differ from those on SPARC. The -fstore flag minimizes these discrepancies, at some cost in performance. Oracle Solaris Studio 12.4 does not use x87 registers by default for single and double precision expression evaluation, but they are used if -xarch=386 is specified, if the x87 hardware transcendental instructions are used, or if double-extended variables are used.

  • Each time a single or double precision floating-point number is loaded onto the x87 floating-point register stack or stored into memory, a conversion to or from double extended (80-bit) precision occurs. Thus loads and stores of floating-point numbers can cause exceptions. With -m32, floating-point subroutine operands and results are passed in x87 registers.

  • When the x87 floating-point register stack is in use, gradual underflow is implemented in hardware with microcode assist; there is no nonstandard mode.

  • There is no fpversion utility.

  • The double extended (80-bit) format admits certain bit patterns that do not represent any floating-point values (see Table 2–8). The hardware generally treats these "unsupported formats" as signaling NaNs, but the math libraries are not consistent in their handling of such representations. Since these bit patterns are never generated by the hardware, they can only be created by invalid memory references, such as reading beyond the end of an array, or by explicit coercions of data in memory from one type to another, for example via C's union construct. Therefore, in most numerical programs, these bit patterns do not arise.
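
The effect described in the first item above, where results change depending on whether intermediates are kept in double or in double extended precision, can be illustrated with a small C sketch. The function names here are hypothetical, and the sketch assumes that long double maps to the 80-bit double extended format, which is typical for x86 compilers but not guaranteed. It makes the wider intermediate explicit rather than depending on how a particular compiler uses the x87 stack.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: computes (big + small) - big with every
   intermediate rounded to double. Storing into a volatile double
   forces the rounding even if the expression is evaluated in a
   wider register. */
double sum_rounded_to_double(double big, double small)
{
    volatile double t = big + small;   /* rounded to 53 significant bits */
    volatile double r = t - big;
    return r;
}

/* Hypothetical helper: the same computation with intermediates held
   in long double (assumed here to be the 80-bit format with a 64-bit
   significand). */
double sum_in_long_double(double big, double small)
{
    volatile long double t = (long double)big + (long double)small;
    volatile long double r = t - (long double)big;
    return (double)r;
}

/* Prints both results for big = 2^53, small = 1. */
void print_comparison(void)
{
    double big = 9007199254740992.0;   /* 2^53 */
    printf("long double significand bits: %d\n", LDBL_MANT_DIG);
    printf("double intermediates:      %g\n",
           sum_rounded_to_double(big, 1.0));
    printf("long double intermediates: %g\n",
           sum_in_long_double(big, 1.0));
}
```

With big = 2^53 and small = 1, the exact sum 2^53 + 1 is not representable in double and rounds back to 2^53, so the first function returns 0. Where long double has a 64-bit significand, the sum is held exactly and the second function returns 1. Calling print_comparison() from main shows both values side by side.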