x86 Assembly Language Reference Manual

SIMD Single-Precision Floating-Point Instructions (SSE)

The SSE SIMD instructions operate on packed and scalar single-precision floating-point values located in the XMM registers or memory.

Data Transfer Instructions (SSE)

The SSE data transfer instructions move packed and scalar single-precision floating-point operands between XMM registers and between XMM registers and memory.

Table 3–27 Data Transfer Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`movaps`	`MOVAPS`	move four aligned packed single-precision floating-point values between XMM registers or memory
`movhlps`	`MOVHLPS`	move two packed single-precision floating-point values from the high quadword of an XMM register to the low quadword of another XMM register
`movhps`	`MOVHPS`	move two packed single-precision floating-point values to or from the high quadword of an XMM register or memory
`movlhps`	`MOVLHPS`	move two packed single-precision floating-point values from the low quadword of an XMM register to the high quadword of another XMM register
`movlps`	`MOVLPS`	move two packed single-precision floating-point values to or from the low quadword of an XMM register or memory
`movmskps`	`MOVMSKPS`	extract sign mask from four packed single-precision floating-point values
`movss`	`MOVSS`	move scalar single-precision floating-point value between XMM registers or memory
`movups`	`MOVUPS`	move four unaligned packed single-precision floating-point values between XMM registers or memory
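
The aligned and unaligned moves and the sign-mask extraction listed above can be sketched as follows. This is a minimal illustration, not part of the manual: it assumes GCC or Clang extended inline assembly on an x86 target, which uses the same source-first, destination-last operand order as the Solaris mnemonics. The function names `copy4` and `sign_mask4` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: copy four floats from src (any alignment) to dst.
   movaps faults if its memory operand is not 16-byte aligned, so that
   precondition on dst is checked first. */
void copy4(const float *src, float *dst)
{
    assert(src != 0 && ((uintptr_t)dst & 15) == 0);
    __asm__ volatile(
        "movups (%0), %%xmm0\n\t"   /* unaligned load of four floats */
        "movaps %%xmm0, (%1)\n\t"   /* aligned store of four floats  */
        :
        : "r"(src), "r"(dst)
        : "xmm0", "memory");
}

/* Hypothetical helper: bit i of the result is the sign bit of v[i]. */
int sign_mask4(const float *v)
{
    int mask;
    assert(v != 0);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"
        "movmskps %%xmm0, %0\n\t"   /* copy the four sign bits to an integer register */
        : "=&r"(mask)
        : "r"(v)
        : "xmm0", "memory");
    return mask;
}
```

A caller would typically declare the destination with an alignment attribute such as `__attribute__((aligned(16)))`. For the input `{1.0, -2.0, 3.0, -4.0}`, elements 1 and 3 are negative, so `sign_mask4` returns binary 1010.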

Packed Arithmetic Instructions (SSE)

The SSE packed arithmetic instructions perform packed and scalar arithmetic operations on packed and scalar single-precision floating-point operands.

Table 3–28 Packed Arithmetic Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`addps`	`ADDPS`	add packed single-precision floating-point values
`addss`	`ADDSS`	add scalar single-precision floating-point values
`divps`	`DIVPS`	divide packed single-precision floating-point values
`divss`	`DIVSS`	divide scalar single-precision floating-point values
`maxps`	`MAXPS`	return maximum packed single-precision floating-point values
`maxss`	`MAXSS`	return maximum scalar single-precision floating-point values
`minps`	`MINPS`	return minimum packed single-precision floating-point values
`minss`	`MINSS`	return minimum scalar single-precision floating-point values
`mulps`	`MULPS`	multiply packed single-precision floating-point values
`mulss`	`MULSS`	multiply scalar single-precision floating-point values
`rcpps`	`RCPPS`	compute approximate reciprocals of packed single-precision floating-point values
`rcpss`	`RCPSS`	compute approximate reciprocal of scalar single-precision floating-point values
`rsqrtps`	`RSQRTPS`	compute approximate reciprocals of square roots of packed single-precision floating-point values
`rsqrtss`	`RSQRTSS`	compute approximate reciprocal of square root of scalar single-precision floating-point values
`sqrtps`	`SQRTPS`	compute square roots of packed single-precision floating-point values
`sqrtss`	`SQRTSS`	compute square root of scalar single-precision floating-point values
`subps`	`SUBPS`	subtract packed single-precision floating-point values
`subss`	`SUBSS`	subtract scalar single-precision floating-point values
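
The difference between the packed (`ps`) and scalar (`ss`) forms can be sketched as follows. This is an illustrative example only, assuming GCC or Clang extended inline assembly on an x86 target; the helper names `mul_add4` and `add_low` are hypothetical. The packed instructions operate on all four elements, while the scalar instructions operate on the lowest element and leave the upper three elements of the destination unchanged.

```c
#include <assert.h>

/* Hypothetical helper: out[i] = a[i] * b[i] + c[i] for i = 0..3. */
void mul_add4(const float *a, const float *b, const float *c, float *out)
{
    assert(a && b && c && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"     /* xmm0 = a                 */
        "movups (%2), %%xmm1\n\t"     /* xmm1 = b                 */
        "mulps  %%xmm1, %%xmm0\n\t"   /* xmm0 = a * b (4 lanes)   */
        "movups (%3), %%xmm1\n\t"     /* xmm1 = c                 */
        "addps  %%xmm1, %%xmm0\n\t"   /* xmm0 = a * b + c         */
        "movups %%xmm0, (%0)\n\t"
        :
        : "r"(out), "r"(a), "r"(b), "r"(c)
        : "xmm0", "xmm1", "memory");
}

/* Hypothetical helper: out[0] = a[0] + b[0]; out[1..3] = a[1..3]. */
void add_low(const float *a, const float *b, float *out)
{
    assert(a && b && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"     /* xmm0 = a                          */
        "movups (%2), %%xmm1\n\t"     /* xmm1 = b                          */
        "addss  %%xmm1, %%xmm0\n\t"   /* only the lowest element is added  */
        "movups %%xmm0, (%0)\n\t"
        :
        : "r"(out), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
}
```

The unaligned `movups` is used here so that the sketch places no alignment requirement on its arguments.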

Comparison Instructions (SSE)

The SSE compare instructions compare packed and scalar single-precision floating-point operands.

Table 3–29 Comparison Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`cmpps`	`CMPPS`	compare packed single-precision floating-point values
`cmpss`	`CMPSS`	compare scalar single-precision floating-point values
`comiss`	`COMISS`	perform ordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register
`ucomiss`	`UCOMISS`	perform unordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register
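
The two styles of comparison in this table can be sketched as follows: `cmpps` writes an all-ones or all-zeros mask into each element of the destination, whereas `ucomiss` sets EFLAGS for use by a following conditional instruction. This is a hedged illustration assuming GCC or Clang extended inline assembly on an x86 target; the helper names are hypothetical. The immediate operand of `cmpps` selects the predicate, and the value 1 used here selects less-than.

```c
#include <assert.h>

/* Hypothetical helper: bit i of the result is set when a[i] < b[i]. */
int less_than_mask4(const float *a, const float *b)
{
    int mask;
    assert(a && b);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"          /* xmm0 = a                        */
        "movups (%2), %%xmm1\n\t"          /* xmm1 = b                        */
        "cmpps $1, %%xmm1, %%xmm0\n\t"     /* each lane: all ones if a < b    */
        "movmskps %%xmm0, %0\n\t"          /* collect the lane sign bits      */
        : "=&r"(mask)
        : "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
    return mask;
}

/* Hypothetical helper: returns 1 if a > b, else 0 (also 0 if either is NaN). */
int scalar_gt(float a, float b)
{
    unsigned char r;
    __asm__ volatile(
        "movss (%1), %%xmm0\n\t"           /* xmm0 = a                        */
        "movss (%2), %%xmm1\n\t"           /* xmm1 = b                        */
        "ucomiss %%xmm1, %%xmm0\n\t"       /* compare a with b, set ZF/PF/CF  */
        "seta %0\n\t"                      /* 1 when CF = 0 and ZF = 0        */
        : "=&q"(r)
        : "r"(&a), "r"(&b)
        : "xmm0", "xmm1", "cc", "memory");
    return r;
}
```

An unordered result (either operand is NaN) sets ZF, PF, and CF together, which is why `seta` yields 0 in that case.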

Logical Instructions (SSE)

The SSE logical instructions perform bitwise AND, AND NOT, OR, and XOR operations on packed single-precision floating-point operands.

Table 3–30 Logical Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`andnps`	`ANDNPS`	perform bitwise logical AND NOT of packed single-precision floating-point values
`andps`	`ANDPS`	perform bitwise logical AND of packed single-precision floating-point values
`orps`	`ORPS`	perform bitwise logical OR of packed single-precision floating-point values
`xorps`	`XORPS`	perform bitwise logical XOR of packed single-precision floating-point values
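
A common use of the logical instructions is manipulating the sign bit of each element. The sketch below is an illustration under stated assumptions (GCC or Clang extended inline assembly on an x86 target; the helper names `abs4` and `negate4` and the mask constants are not from the manual): `andps` with `0x7fffffff` clears the sign bits, and `xorps` with `0x80000000` flips them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = |v[i]| for i = 0..3. */
void abs4(const float *v, float *out)
{
    const uint32_t mask[4] = {0x7fffffffu, 0x7fffffffu, 0x7fffffffu, 0x7fffffffu};
    assert(v && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"     /* xmm0 = v                      */
        "movups (%2), %%xmm1\n\t"     /* xmm1 = mask                   */
        "andps  %%xmm1, %%xmm0\n\t"   /* clear the sign bit per lane   */
        "movups %%xmm0, (%0)\n\t"
        :
        : "r"(out), "r"(v), "r"(mask)
        : "xmm0", "xmm1", "memory");
}

/* Hypothetical helper: out[i] = -v[i] for i = 0..3. */
void negate4(const float *v, float *out)
{
    const uint32_t mask[4] = {0x80000000u, 0x80000000u, 0x80000000u, 0x80000000u};
    assert(v && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"     /* xmm0 = v                      */
        "movups (%2), %%xmm1\n\t"     /* xmm1 = mask                   */
        "xorps  %%xmm1, %%xmm0\n\t"   /* flip the sign bit per lane    */
        "movups %%xmm0, (%0)\n\t"
        :
        : "r"(out), "r"(v), "r"(mask)
        : "xmm0", "xmm1", "memory");
}
```

Because these instructions operate on raw bits, they raise no floating-point exceptions and treat NaN and infinity like any other bit pattern.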

Shuffle and Unpack Instructions (SSE)

The SSE shuffle and unpack instructions shuffle or interleave single-precision floating-point values in packed single-precision floating-point operands.

Table 3–31 Shuffle and Unpack Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`shufps`	`SHUFPS`	shuffle values in packed single-precision floating-point operands
`unpckhps`	`UNPCKHPS`	unpack and interleave the two high-order values from two packed single-precision floating-point operands
`unpcklps`	`UNPCKLPS`	unpack and interleave the two low-order values from two packed single-precision floating-point operands
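
The element selection performed by these instructions can be sketched as follows. This is an illustrative example assuming GCC or Clang extended inline assembly on an x86 target, with hypothetical helper names. In `shufps`, each 2-bit field of the immediate selects one source element: the two low result elements come from the destination operand and the two high result elements come from the source operand. Using the same register for both operands with the immediate `0x1b` therefore reverses the four elements.

```c
#include <assert.h>

/* Hypothetical helper: out = { v[3], v[2], v[1], v[0] }. */
void reverse4(const float *v, float *out)
{
    assert(v && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"
        "shufps $0x1b, %%xmm0, %%xmm0\n\t"  /* 0x1b = 00 01 10 11: picks 3,2,1,0 */
        "movups %%xmm0, (%0)\n\t"
        :
        : "r"(out), "r"(v)
        : "xmm0", "memory");
}

/* Hypothetical helper: out[0..7] = a0 b0 a1 b1 a2 b2 a3 b3. */
void interleave4(const float *a, const float *b, float *out)
{
    assert(a && b && out);
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"       /* xmm0 = a0 a1 a2 a3     */
        "movups (%2), %%xmm1\n\t"       /* xmm1 = b0 b1 b2 b3     */
        "movaps %%xmm0, %%xmm2\n\t"     /* keep a second copy of a */
        "unpcklps %%xmm1, %%xmm0\n\t"   /* xmm0 = a0 b0 a1 b1     */
        "unpckhps %%xmm1, %%xmm2\n\t"   /* xmm2 = a2 b2 a3 b3     */
        "movups %%xmm0, (%0)\n\t"
        "movups %%xmm2, 16(%0)\n\t"
        :
        : "r"(out), "r"(a), "r"(b)
        : "xmm0", "xmm1", "xmm2", "memory");
}
```

`interleave4` writes eight floats, so the caller must supply an output buffer of at least that size.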

Conversion Instructions (SSE)

The SSE conversion instructions convert packed and individual doubleword integers into packed and scalar single-precision floating-point values, and vice versa.

Table 3–32 Conversion Instructions (SSE)


Solaris Mnemonic	Intel/AMD Mnemonic	Description
`cvtpi2ps`	`CVTPI2PS`	convert packed doubleword integers to packed single-precision floating-point values
`cvtps2pi`	`CVTPS2PI`	convert packed single-precision floating-point values to packed doubleword integers
`cvtsi2ss`	`CVTSI2SS`	convert doubleword integer to scalar single-precision floating-point value
`cvtss2si`	`CVTSS2SI`	convert scalar single-precision floating-point value to doubleword integer
`cvttps2pi`	`CVTTPS2PI`	convert packed single-precision floating-point values to packed doubleword integers, with truncation
`cvttss2si`	`CVTTSS2SI`	convert scalar single-precision floating-point value to doubleword integer, with truncation
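
The scalar conversions, and the difference between the rounding and truncating forms, can be sketched as follows. This is a minimal illustration assuming GCC or Clang extended inline assembly on an x86 target, with hypothetical helper names. It also assumes the MXCSR rounding mode is left at its default (round to nearest, ties to even), which governs `cvtss2si`; `cvttss2si` always truncates toward zero. The packed forms `cvtpi2ps`, `cvtps2pi`, and `cvttps2pi` use MMX registers and are not shown.

```c
#include <assert.h>

/* Hypothetical helper: integer to float with cvtsi2ss. */
float int_to_float(int n)
{
    float r;
    __asm__ volatile(
        "cvtsi2ss %1, %%xmm0\n\t"     /* low element of xmm0 = (float)n */
        "movss %%xmm0, (%0)\n\t"
        :
        : "r"(&r), "r"(n)
        : "xmm0", "memory");
    return r;
}

/* Hypothetical helper: float to int using the current MXCSR rounding mode. */
int round_to_int(float x)
{
    int r;
    /* Out-of-range or NaN inputs would yield the value 0x80000000. */
    assert(x >= -2147483648.0f && x < 2147483648.0f);
    __asm__ volatile(
        "movss (%1), %%xmm0\n\t"
        "cvtss2si %%xmm0, %0\n\t"
        : "=&r"(r)
        : "r"(&x)
        : "xmm0", "memory");
    return r;
}

/* Hypothetical helper: float to int, truncating toward zero. */
int truncate_to_int(float x)
{
    int r;
    assert(x >= -2147483648.0f && x < 2147483648.0f);
    __asm__ volatile(
        "movss (%1), %%xmm0\n\t"
        "cvttss2si %%xmm0, %0\n\t"
        : "=&r"(r)
        : "r"(&x)
        : "xmm0", "memory");
    return r;
}
```

Under the default rounding mode, `round_to_int(2.5f)` gives 2 and `round_to_int(3.5f)` gives 4, while `truncate_to_int(2.9f)` gives 2 and `truncate_to_int(-2.9f)` gives -2.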