x86 Assembly Language Reference Manual

SIMD Single-Precision Floating-Point Instructions (SSE)

The SSE SIMD instructions operate on packed and scalar single-precision floating-point values located in the XMM registers or memory.

Data Transfer Instructions (SSE)

The SSE data transfer instructions move packed and scalar single-precision floating-point operands between XMM registers and between XMM registers and memory.

Table 3–27 Data Transfer Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                                                                                   Notes
movaps            MOVAPS              move four aligned packed single-precision floating-point values between XMM registers or memory
movhlps           MOVHLPS             move two packed single-precision floating-point values from the high quadword of an XMM register to the low quadword of another XMM register
movhps            MOVHPS              move two packed single-precision floating-point values to or from the high quadword of an XMM register or memory
movlhps           MOVLHPS             move two packed single-precision floating-point values from the low quadword of an XMM register to the high quadword of another XMM register
movlps            MOVLPS              move two packed single-precision floating-point values to or from the low quadword of an XMM register or memory
movmskps          MOVMSKPS            extract sign mask from four packed single-precision floating-point values
movss             MOVSS               move scalar single-precision floating-point value between XMM registers or memory
movups            MOVUPS              move four unaligned packed single-precision floating-point values between XMM registers or memory
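
As an illustration of the data transfer mnemonics, the following minimal sketch wraps movups and movmskps in C functions using GCC-style extended inline assembly, which takes operands in the same AT&T order (source first, destination last) as the Solaris assembler. It assumes an x86-64 target and a GCC- or Clang-compatible compiler. The function names copy4_unaligned and sign_mask4 are hypothetical and exist only for this example. Inside the asm template a register is written %%xmm0; a standalone assembler source file would use %xmm0.

```c
/* Hypothetical helper: copy four floats from src to dst.
   movups places no alignment requirement on either address. */
static void copy4_unaligned(float *dst, const float *src)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   /* load four floats from src into xmm0 */
        "movups %%xmm0, (%0)"       /* store xmm0 to dst */
        :
        : "r"(dst), "r"(src)
        : "xmm0", "memory");
}

/* Hypothetical helper: return a 4-bit mask in which bit i is the
   sign bit of src[i]. */
static int sign_mask4(const float *src)
{
    int mask;
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   /* load four floats from src */
        "movmskps %%xmm0, %0"       /* copy the four sign bits to a general register */
        : "=r"(mask)
        : "r"(src)
        : "xmm0", "memory");
    return mask;
}
```

movaps could replace movups when both addresses are known to be 16-byte aligned; with a misaligned memory operand movaps raises a general-protection exception.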

 

Packed Arithmetic Instructions (SSE)

The SSE packed arithmetic instructions perform packed and scalar arithmetic operations on packed and scalar single-precision floating-point operands.

Table 3–28 Packed Arithmetic Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                           Notes
addps             ADDPS               add packed single-precision floating-point values
addss             ADDSS               add scalar single-precision floating-point values
divps             DIVPS               divide packed single-precision floating-point values
divss             DIVSS               divide scalar single-precision floating-point values
maxps             MAXPS               return maximum packed single-precision floating-point values
maxss             MAXSS               return maximum scalar single-precision floating-point values
minps             MINPS               return minimum packed single-precision floating-point values
minss             MINSS               return minimum scalar single-precision floating-point values
mulps             MULPS               multiply packed single-precision floating-point values
mulss             MULSS               multiply scalar single-precision floating-point values
rcpps             RCPPS               compute reciprocals of packed single-precision floating-point values
rcpss             RCPSS               compute reciprocal of scalar single-precision floating-point values
rsqrtps           RSQRTPS             compute reciprocals of square roots of packed single-precision floating-point values
rsqrtss           RSQRTSS             compute reciprocal of square root of scalar single-precision floating-point values
sqrtps            SQRTPS              compute square roots of packed single-precision floating-point values
sqrtss            SQRTSS              compute square root of scalar single-precision floating-point values
subps             SUBPS               subtract packed single-precision floating-point values
subss             SUBSS               subtract scalar single-precision floating-point values
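
The difference between the packed (ps) and scalar (ss) forms can be sketched as follows. This example uses GCC-style extended inline assembly in C and assumes an x86-64 target with a GCC- or Clang-compatible compiler. The function names add4 and add_low are hypothetical. addps adds all four elements, while addss adds only the lowest element and leaves the upper three elements of the destination register unchanged.

```c
/* Hypothetical helper: dst[i] = a[i] + b[i] for i = 0..3, using addps. */
static void add4(float *dst, const float *a, const float *b)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   /* xmm0 = a[0..3] */
        "movups (%2), %%xmm1\n\t"   /* xmm1 = b[0..3] */
        "addps %%xmm1, %%xmm0\n\t"  /* xmm0 += xmm1 in all four elements */
        "movups %%xmm0, (%0)"       /* store the result to dst */
        :
        : "r"(dst), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
}

/* Hypothetical helper: dst[0] = a[0] + b[0]; dst[1..3] = a[1..3],
   using addss. */
static void add_low(float *dst, const float *a, const float *b)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   /* xmm0 = a[0..3] */
        "movups (%2), %%xmm1\n\t"   /* xmm1 = b[0..3] */
        "addss %%xmm1, %%xmm0\n\t"  /* only element 0 of xmm0 is updated */
        "movups %%xmm0, (%0)"       /* store the result to dst */
        :
        : "r"(dst), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
}
```

The same pattern should carry over to the other ps/ss pairs in the table, such as mulps and mulss, by substituting the mnemonic.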

 

Comparison Instructions (SSE)

The SSE compare instructions compare packed and scalar single-precision floating-point operands.

Table 3–29 Comparison Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                                                     Notes
cmpps             CMPPS               compare packed single-precision floating-point values
cmpss             CMPSS               compare scalar single-precision floating-point values
comiss            COMISS              perform ordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register
ucomiss           UCOMISS             perform unordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register
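
The two styles of comparison can be sketched as follows: cmpps writes an all-ones or all-zeros mask into each element of the destination, whereas comiss reports its result through EFLAGS. The example uses GCC-style extended inline assembly in C and assumes an x86-64 target with a GCC- or Clang-compatible compiler. The function names less_than_mask4 and scalar_greater are hypothetical. The immediate operand of cmpps selects the predicate; the value 1 selects less-than.

```c
/* Hypothetical helper: return a 4-bit mask in which bit i is set
   when a[i] < b[i]. */
static int less_than_mask4(const float *a, const float *b)
{
    int mask;
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"        /* xmm0 = a[0..3] */
        "movups (%2), %%xmm1\n\t"        /* xmm1 = b[0..3] */
        "cmpps $1, %%xmm1, %%xmm0\n\t"   /* each element of xmm0 becomes all ones
                                            if xmm0 < xmm1, otherwise all zeros */
        "movmskps %%xmm0, %0"            /* collect the top bit of each element */
        : "=r"(mask)
        : "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
    return mask;
}

/* Hypothetical helper: return 1 when a > b, otherwise 0. */
static int scalar_greater(float a, float b)
{
    unsigned char result;
    __asm__(
        "comiss %2, %1\n\t"   /* compare a with b and set ZF, PF, and CF */
        "seta %0"             /* result = 1 when CF = 0 and ZF = 0, that is, a > b */
        : "=q"(result)
        : "x"(a), "x"(b)
        : "cc");
    return result;
}
```

When either operand is a NaN, comiss sets ZF, PF, and CF to 1, so scalar_greater returns 0 in that case.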

 

Logical Instructions (SSE)

The SSE logical instructions perform bitwise AND, AND NOT, OR, and XOR operations on packed single-precision floating-point operands.

Table 3–30 Logical Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                       Notes
andnps            ANDNPS              perform bitwise logical AND NOT of packed single-precision floating-point values
andps             ANDPS               perform bitwise logical AND of packed single-precision floating-point values
orps              ORPS                perform bitwise logical OR of packed single-precision floating-point values
xorps             XORPS               perform bitwise logical XOR of packed single-precision floating-point values
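
One common use of the logical instructions is to manipulate the sign bit directly. The following minimal sketch clears the sign bit of four values with andps, which yields their absolute values. It uses GCC-style extended inline assembly in C and assumes an x86-64 target with a GCC- or Clang-compatible compiler. The function name abs4 and the mask table are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: dst[i] = |src[i]| for i = 0..3. */
static void abs4(float *dst, const float *src)
{
    /* Every bit set except bit 31, the IEEE 754 sign bit. */
    static const uint32_t mask[4] = {
        0x7FFFFFFFu, 0x7FFFFFFFu, 0x7FFFFFFFu, 0x7FFFFFFFu
    };
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"   /* xmm0 = src[0..3] */
        "movups (%2), %%xmm1\n\t"   /* xmm1 = sign-clearing mask */
        "andps %%xmm1, %%xmm0\n\t"  /* clear the sign bit of each element */
        "movups %%xmm0, (%0)"       /* store the result to dst */
        :
        : "r"(dst), "r"(src), "r"(mask)
        : "xmm0", "xmm1", "memory");
}
```

Replacing the mask with 0x80000000 in each element and andps with xorps would flip the sign bits instead, negating each value.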

 

Shuffle and Unpack Instructions (SSE)

The SSE shuffle and unpack instructions shuffle or interleave single-precision floating-point values in packed single-precision floating-point operands.

Table 3–31 Shuffle and Unpack Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                                          Notes
shufps            SHUFPS              shuffles values in packed single-precision floating-point operands
unpckhps          UNPCKHPS            unpacks and interleaves the two high-order values from two single-precision floating-point operands
unpcklps          UNPCKLPS            unpacks and interleaves the two low-order values from two single-precision floating-point operands
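
The element movement performed by unpcklps and shufps can be sketched as follows. The example uses GCC-style extended inline assembly in C and assumes an x86-64 target with a GCC- or Clang-compatible compiler. The function names interleave_low and reverse4 are hypothetical. For shufps, each 2-bit field of the immediate operand selects one source element: the two low fields pick from the destination register and the two high fields pick from the source register.

```c
/* Hypothetical helper: dst = { a[0], b[0], a[1], b[1] }. */
static void interleave_low(float *dst, const float *a, const float *b)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"       /* xmm0 = a[0..3] */
        "movups (%2), %%xmm1\n\t"       /* xmm1 = b[0..3] */
        "unpcklps %%xmm1, %%xmm0\n\t"   /* interleave the two low elements */
        "movups %%xmm0, (%0)"           /* store the result to dst */
        :
        : "r"(dst), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
}

/* Hypothetical helper: dst = { src[3], src[2], src[1], src[0] }. */
static void reverse4(float *dst, const float *src)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"           /* xmm0 = src[0..3] */
        "shufps $0x1B, %%xmm0, %%xmm0\n\t"  /* 0x1B = 00 01 10 11 binary:
                                               selects elements 3, 2, 1, 0 */
        "movups %%xmm0, (%0)"               /* store the result to dst */
        :
        : "r"(dst), "r"(src)
        : "xmm0", "memory");
}
```

An immediate of $0x00 with the same register as both operands would instead copy element 0 into all four positions.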

 

Conversion Instructions (SSE)

The SSE conversion instructions convert packed and individual doubleword integers into packed and scalar single-precision floating-point values, and vice versa.

Table 3–32 Conversion Instructions (SSE)

Solaris Mnemonic  Intel/AMD Mnemonic  Description                                                                                          Notes
cvtpi2ps          CVTPI2PS            convert packed doubleword integers to packed single-precision floating-point values
cvtps2pi          CVTPS2PI            convert packed single-precision floating-point values to packed doubleword integers
cvtsi2ss          CVTSI2SS            convert doubleword integer to scalar single-precision floating-point value
cvtss2si          CVTSS2SI            convert scalar single-precision floating-point value to a doubleword integer
cvttps2pi         CVTTPS2PI           convert with truncation packed single-precision floating-point values to packed doubleword integers
cvttss2si         CVTTSS2SI           convert with truncation scalar single-precision floating-point value to scalar doubleword integer
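
The difference between cvtss2si and cvttss2si can be sketched as follows. The example uses GCC-style extended inline assembly in C and assumes an x86-64 target with a GCC- or Clang-compatible compiler. The function names round_to_int, truncate_to_int, and int_to_float are hypothetical. cvtss2si rounds according to the rounding mode in the MXCSR register, assumed here to be the default round-to-nearest-even, while cvttss2si always truncates toward zero.

```c
/* Hypothetical helper: convert using the current MXCSR rounding mode. */
static int round_to_int(float x)
{
    int result;
    __asm__("cvtss2si %1, %0" : "=r"(result) : "x"(x));
    return result;
}

/* Hypothetical helper: convert with truncation toward zero. */
static int truncate_to_int(float x)
{
    int result;
    __asm__("cvttss2si %1, %0" : "=r"(result) : "x"(x));
    return result;
}

/* Hypothetical helper: convert a doubleword integer to a
   single-precision value. */
static float int_to_float(int n)
{
    float result;
    __asm__("cvtsi2ss %1, %0" : "=x"(result) : "r"(n));
    return result;
}
```

Under the default rounding mode, round_to_int(2.5f) yields 2 and round_to_int(3.5f) yields 4, because ties go to the even integer; truncate_to_int(-2.7f) yields -2.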