x86 Assembly Language Reference Manual     Oracle Solaris 11 Express 11/10


SSE2 Instructions

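SSE2 instructions are an extension of the SIMD execution model introduced with the MMX technology and the SSE extensions. SSE2 instructions are divided into four subgroups: packed and scalar double-precision floating-point instructions, packed single-precision floating-point instructions, 128-bit SIMD integer instructions, and miscellaneous instructions (cache control and instruction ordering).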

SSE2 Packed and Scalar Double-Precision Floating-Point Instructions

The SSE2 packed and scalar double-precision floating-point instructions operate on double-precision floating-point operands.

SSE2 Data Movement Instructions

The SSE2 data movement instructions move double-precision floating-point data between XMM registers and memory.

Table 3-36 SSE2 Data Movement Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| movapd | MOVAPD | move two aligned packed double-precision floating-point values between XMM registers and memory | |
| movhpd | MOVHPD | move high packed double-precision floating-point value to or from the high quadword of an XMM register and memory | |
| movlpd | MOVLPD | move low packed double-precision floating-point value to or from the low quadword of an XMM register and memory | |
| movmskpd | MOVMSKPD | extract sign mask from two packed double-precision floating-point values | |
| movsd | MOVSD | move scalar double-precision floating-point value between XMM registers and memory | |
| movupd | MOVUPD | move two unaligned packed double-precision floating-point values between XMM registers and memory | |

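The data movement instructions in Table 3-36 can be exercised from C through the SSE2 intrinsics in `<emmintrin.h>`. The following is a minimal sketch, not part of the original manual: the function names `sign_mask_pd` and `combine_halves` are hypothetical, and the instruction named in each comment is the one a compiler typically selects for that intrinsic, which is not guaranteed.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Returns the 2-bit sign mask of two doubles:
   bit 0 is the sign of p[0], bit 1 is the sign of p[1]. */
int sign_mask_pd(const double *p)
{
    __m128d v = _mm_loadu_pd(p);   /* unaligned load, typically MOVUPD */
    return _mm_movemask_pd(v);     /* typically MOVMSKPD */
}

/* Builds the pair {*lo, *hi} from two separate memory locations. */
void combine_halves(const double *lo, const double *hi, double out[2])
{
    __m128d v = _mm_setzero_pd();
    v = _mm_loadl_pd(v, lo);       /* fill low quadword, typically MOVLPD */
    v = _mm_loadh_pd(v, hi);       /* fill high quadword, typically MOVHPD */
    _mm_storeu_pd(out, v);         /* unaligned store, typically MOVUPD */
}
```

The aligned form (`_mm_load_pd`, typically MOVAPD) requires a 16-byte aligned address, so the sketch uses the unaligned form to stay safe for arbitrary pointers.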
SSE2 Packed Arithmetic Instructions

The SSE2 arithmetic instructions operate on packed and scalar double-precision floating-point operands.

Table 3-37 SSE2 Packed Arithmetic Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| addpd | ADDPD | add packed double-precision floating-point values | |
| addsd | ADDSD | add scalar double-precision floating-point values | |
| divpd | DIVPD | divide packed double-precision floating-point values | |
| divsd | DIVSD | divide scalar double-precision floating-point values | |
| maxpd | MAXPD | return maximum packed double-precision floating-point values | |
| maxsd | MAXSD | return maximum scalar double-precision floating-point value | |
| minpd | MINPD | return minimum packed double-precision floating-point values | |
| minsd | MINSD | return minimum scalar double-precision floating-point value | |
| mulpd | MULPD | multiply packed double-precision floating-point values | |
| mulsd | MULSD | multiply scalar double-precision floating-point values | |
| sqrtpd | SQRTPD | compute packed square roots of packed double-precision floating-point values | |
| sqrtsd | SQRTSD | compute scalar square root of scalar double-precision floating-point value | |
| subpd | SUBPD | subtract packed double-precision floating-point values | |
| subsd | SUBSD | subtract scalar double-precision floating-point values | |

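The difference between the packed (`pd`) and scalar (`sd`) forms in Table 3-37 can be illustrated with C intrinsics. This is a sketch under stated assumptions: the function names `hypot2_pd` and `add_low_sd` are hypothetical, and the instruction in each comment is the usual, not guaranteed, compiler choice.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Packed form: out[i] = sqrt(a[i]*a[i] + b[i]*b[i]) for both lanes at once. */
void hypot2_pd(const double a[2], const double b[2], double out[2])
{
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    __m128d sum = _mm_add_pd(_mm_mul_pd(va, va),   /* typically MULPD */
                             _mm_mul_pd(vb, vb));  /* then ADDPD */
    _mm_storeu_pd(out, _mm_sqrt_pd(sum));          /* typically SQRTPD */
}

/* Scalar form: only the low lane is added; the high lane is copied from a. */
void add_low_sd(const double a[2], const double b[2], double out[2])
{
    __m128d r = _mm_add_sd(_mm_loadu_pd(a), _mm_loadu_pd(b)); /* typically ADDSD */
    _mm_storeu_pd(out, r);
}
```

With `a = {3, 5}` and `b = {4, 12}`, the packed routine yields `{5, 13}`, while the scalar routine yields `{7, 5}` because the high lane passes through from the first operand.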
SSE2 Logical Instructions

The SSE2 logical instructions operate on packed double-precision floating-point values.

Table 3-38 SSE2 Logical Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| andnpd | ANDNPD | perform bitwise logical AND NOT of packed double-precision floating-point values | |
| andpd | ANDPD | perform bitwise logical AND of packed double-precision floating-point values | |
| orpd | ORPD | perform bitwise logical OR of packed double-precision floating-point values | |
| xorpd | XORPD | perform bitwise logical XOR of packed double-precision floating-point values | |

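A common use of the logical instructions in Table 3-38 is manipulating the sign bit of a double directly. The sketch below is illustrative only: `abs_pd` and `negate_pd` are hypothetical names, and the mapping from intrinsic to instruction is the typical one rather than a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Clears the sign bit of both lanes: out[i] = |in[i]|. */
void abs_pd(const double in[2], double out[2])
{
    const __m128d sign = _mm_set1_pd(-0.0);        /* only the sign bit set */
    /* _mm_andnot_pd(m, v) computes (~m) & v, typically ANDNPD */
    _mm_storeu_pd(out, _mm_andnot_pd(sign, _mm_loadu_pd(in)));
}

/* Flips the sign bit of both lanes: out[i] = -in[i]. */
void negate_pd(const double in[2], double out[2])
{
    const __m128d sign = _mm_set1_pd(-0.0);
    _mm_storeu_pd(out, _mm_xor_pd(sign, _mm_loadu_pd(in))); /* typically XORPD */
}
```

Note the operand order of the AND NOT form: the first operand is the one that is complemented.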
SSE2 Compare Instructions

The SSE2 compare instructions compare packed and scalar double-precision floating-point values and return the results of the comparison to either the destination operand or to the EFLAGS register.

Table 3-39 SSE2 Compare Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| cmppd | CMPPD | compare packed double-precision floating-point values | |
| cmpsd | CMPSD | compare scalar double-precision floating-point values | |
| comisd | COMISD | perform ordered comparison of scalar double-precision floating-point values and set flags in EFLAGS register | |
| ucomisd | UCOMISD | perform unordered comparison of scalar double-precision floating-point values and set flags in EFLAGS register | |

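The two result styles described above (a mask written to the destination operand versus flags set in EFLAGS) can be sketched with C intrinsics. The function names `less_than_mask_pd` and `scalar_less_than` are hypothetical, and the instructions named in the comments are the usual compiler choices, not guaranteed ones.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Packed compare: each lane of the result is all ones where a[i] < b[i]
   and all zeros otherwise. The mask is then reduced to two bits. */
int less_than_mask_pd(const double a[2], const double b[2])
{
    __m128d m = _mm_cmplt_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)); /* typically CMPPD */
    return _mm_movemask_pd(m);  /* bit i set when lane i compared true */
}

/* Scalar compare through the flags: returns 1 when a < b, else 0. */
int scalar_less_than(double a, double b)
{
    return _mm_comilt_sd(_mm_set_sd(a), _mm_set_sd(b)); /* typically COMISD */
}
```

The packed form suits branch-free selection (combine the mask with the logical instructions), whereas the flag-setting form suits an ordinary conditional branch.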
SSE2 Shuffle and Unpack Instructions

The SSE2 shuffle and unpack instructions operate on packed double-precision floating-point operands.

Table 3-40 SSE2 Shuffle and Unpack Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| shufpd | SHUFPD | shuffle values in packed double-precision floating-point operands | |
| unpckhpd | UNPCKHPD | unpack and interleave the high values from two packed double-precision floating-point operands | |
| unpcklpd | UNPCKLPD | unpack and interleave the low values from two packed double-precision floating-point operands | |

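As an illustration of Table 3-40, the unpack instructions transpose a 2x2 matrix of doubles in two steps, and the shuffle instruction swaps the two lanes of a register. This is a sketch with hypothetical function names (`transpose2x2`, `swap_lanes`); the instruction noted for each intrinsic is the typical compiler output, not a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Transposes the 2x2 matrix whose rows are row0 and row1. */
void transpose2x2(const double row0[2], const double row1[2],
                  double col0[2], double col1[2])
{
    __m128d r0 = _mm_loadu_pd(row0);
    __m128d r1 = _mm_loadu_pd(row1);
    _mm_storeu_pd(col0, _mm_unpacklo_pd(r0, r1)); /* {row0[0], row1[0]}, UNPCKLPD */
    _mm_storeu_pd(col1, _mm_unpackhi_pd(r0, r1)); /* {row0[1], row1[1]}, UNPCKHPD */
}

/* Swaps the low and high lanes: out = {in[1], in[0]}. */
void swap_lanes(const double in[2], double out[2])
{
    __m128d v = _mm_loadu_pd(in);
    /* Immediate bit 0 selects the low result from the first operand,
       bit 1 selects the high result from the second operand. */
    _mm_storeu_pd(out, _mm_shuffle_pd(v, v, 1));  /* typically SHUFPD */
}
```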
SSE2 Conversion Instructions

The SSE2 conversion instructions convert packed and individual doubleword integers into packed and scalar double-precision floating-point values (and vice versa). These instructions also convert between packed and scalar single-precision and double-precision floating-point values.

Table 3-41 SSE2 Conversion Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| cvtdq2pd | CVTDQ2PD | convert packed doubleword integers to packed double-precision floating-point values | |
| cvtpd2dq | CVTPD2DQ | convert packed double-precision floating-point values to packed doubleword integers | |
| cvtpd2pi | CVTPD2PI | convert packed double-precision floating-point values to packed doubleword integers | |
| cvtpd2ps | CVTPD2PS | convert packed double-precision floating-point values to packed single-precision floating-point values | |
| cvtpi2pd | CVTPI2PD | convert packed doubleword integers to packed double-precision floating-point values | |
| cvtps2pd | CVTPS2PD | convert packed single-precision floating-point values to packed double-precision floating-point values | |
| cvtsd2si | CVTSD2SI | convert scalar double-precision floating-point value to a doubleword integer | |
| cvtsd2ss | CVTSD2SS | convert scalar double-precision floating-point value to scalar single-precision floating-point value | |
| cvtsi2sd | CVTSI2SD | convert doubleword integer to scalar double-precision floating-point value | |
| cvtss2sd | CVTSS2SD | convert scalar single-precision floating-point value to scalar double-precision floating-point value | |
| cvttpd2dq | CVTTPD2DQ | convert with truncation packed double-precision floating-point values to packed doubleword integers | |
| cvttpd2pi | CVTTPD2PI | convert with truncation packed double-precision floating-point values to packed doubleword integers | |
| cvttsd2si | CVTTSD2SI | convert with truncation scalar double-precision floating-point value to a doubleword integer | |
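Table 3-41 lists several conversions in two variants, with and without truncation. The following sketch contrasts them for the scalar case. The function names `round_to_int` and `truncate_to_int` are hypothetical, and the example assumes the MXCSR rounding mode is left at its default (round to nearest, ties to even).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Converts using the current MXCSR rounding mode (typically CVTSD2SI). */
int round_to_int(double x)
{
    return _mm_cvtsd_si32(_mm_set_sd(x));
}

/* Converts by discarding the fraction, regardless of the rounding mode
   (typically CVTTSD2SI). This matches a C cast from double to int. */
int truncate_to_int(double x)
{
    return _mm_cvttsd_si32(_mm_set_sd(x));
}
```

Under the default rounding mode, 2.7 converts to 3 with the first routine and to 2 with the second, and a tie such as 2.5 rounds to the even neighbor 2.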

SSE2 Packed Single-Precision Floating-Point Instructions

The SSE2 packed single-precision floating-point instructions operate on single-precision floating-point and integer operands.

Table 3-42 SSE2 Packed Single-Precision Floating-Point Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| cvtdq2ps | CVTDQ2PS | convert packed doubleword integers to packed single-precision floating-point values | |
| cvtps2dq | CVTPS2DQ | convert packed single-precision floating-point values to packed doubleword integers | |
| cvttps2dq | CVTTPS2DQ | convert with truncation packed single-precision floating-point values to packed doubleword integers | |
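The packed single-precision conversions in Table 3-42 handle four values per instruction. A minimal sketch follows, assuming the default MXCSR rounding mode (round to nearest, ties to even); the function name `floats_to_ints` is hypothetical, and the instruction in each comment is the usual compiler choice.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE float types) */

/* Converts four floats to ints twice: once with the current rounding
   mode and once with truncation toward zero. */
void floats_to_ints(const float in[4], int rounded[4], int truncated[4])
{
    __m128 v = _mm_loadu_ps(in);
    _mm_storeu_si128((__m128i *)rounded,   _mm_cvtps_epi32(v));  /* CVTPS2DQ */
    _mm_storeu_si128((__m128i *)truncated, _mm_cvttps_epi32(v)); /* CVTTPS2DQ */
}
```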

SSE2 128-Bit SIMD Integer Instructions

The SSE2 SIMD integer instructions operate on packed words, doublewords, and quadwords contained in XMM and MMX registers.

Table 3-43 SSE2 128-Bit SIMD Integer Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| movdq2q | MOVDQ2Q | move quadword integer from XMM to MMX registers | |
| movdqa | MOVDQA | move aligned double quadword | |
| movdqu | MOVDQU | move unaligned double quadword | |
| movq2dq | MOVQ2DQ | move quadword integer from MMX to XMM registers | |
| paddq | PADDQ | add packed quadword integers | |
| pmuludq | PMULUDQ | multiply packed unsigned doubleword integers | |
| pshufd | PSHUFD | shuffle packed doublewords | |
| pshufhw | PSHUFHW | shuffle packed high words | |
| pshuflw | PSHUFLW | shuffle packed low words | |
| pslldq | PSLLDQ | shift double quadword left logical | |
| psrldq | PSRLDQ | shift double quadword right logical | |
| psubq | PSUBQ | subtract packed quadword integers | |
| punpckhqdq | PUNPCKHQDQ | unpack high quadwords | |
| punpcklqdq | PUNPCKLQDQ | unpack low quadwords | |
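Two entries in Table 3-43 have lane behavior that is easy to misread: PMULUDQ multiplies only the even-numbered doublewords and produces quadword results, and PSHUFD selects each result doubleword through a 2-bit field of its immediate. The sketch below illustrates both; the function names `mul_even_u32` and `reverse_dwords` are hypothetical, and the instruction noted for each intrinsic is the typical compiler output.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* out[0] = a[0] * b[0] and out[1] = a[2] * b[2], each as a full
   64-bit product; a[1], a[3], b[1], and b[3] are ignored. */
void mul_even_u32(const uint32_t a[4], const uint32_t b[4], uint64_t out[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);   /* typically MOVDQU */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mul_epu32(va, vb)); /* PMULUDQ */
}

/* Reverses the order of the four doublewords. */
void reverse_dwords(const uint32_t in[4], uint32_t out[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    /* _MM_SHUFFLE(0, 1, 2, 3): result dwords 0..3 take source dwords 3..0 */
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 1, 2, 3));  /* typically PSHUFD */
    _mm_storeu_si128((__m128i *)out, v);
}
```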

SSE2 Miscellaneous Instructions

The SSE2 instructions described below provide additional functionality for caching non-temporal data when storing data from XMM registers to memory, and provide additional control of instruction ordering on store operations.

Table 3-44 SSE2 Miscellaneous Instructions

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| clflush | CLFLUSH | flushes and invalidates a memory operand and its associated cache line from all levels of the processor's cache hierarchy | |
| lfence | LFENCE | serializes load operations | |
| maskmovdqu | MASKMOVDQU | non-temporal store of selected bytes from an XMM register into memory | |
| mfence | MFENCE | serializes load and store operations | |
| movntdq | MOVNTDQ | non-temporal store of double quadword from an XMM register into memory | |
| movnti | MOVNTI | non-temporal store of a doubleword from a general-purpose register into memory | movntiq valid only under -xarch=amd64 |
| movntpd | MOVNTPD | non-temporal store of two packed double-precision floating-point values from an XMM register into memory | |
| pause | PAUSE | improves the performance of spin-wait loops | |
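The following sketch shows how the non-temporal store, fence, and pause instructions of Table 3-44 are commonly combined. It is illustrative only: the function names `stream_fill` and `spin_until_set` are hypothetical, the destination buffer is assumed to be 16-byte aligned, and the `volatile` flag stands in for the atomic operations a real multithreaded program would use.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Fills n_pairs pairs of doubles with value, bypassing the cache.
   Assumption: dst is 16-byte aligned. */
void stream_fill(double *dst, size_t n_pairs, double value)
{
    __m128d v = _mm_set1_pd(value);
    for (size_t i = 0; i < n_pairs; i++)
        _mm_stream_pd(dst + 2 * i, v);  /* non-temporal store, MOVNTPD */
    _mm_mfence();  /* MFENCE: order the streamed stores before later accesses */
}

/* Waits until *flag becomes nonzero, hinting to the processor that
   this is a spin-wait loop. */
void spin_until_set(volatile int *flag)
{
    while (!*flag)
        _mm_pause();                    /* PAUSE */
}
```

Non-temporal stores are weakly ordered, which is why the sketch issues a fence before any other code is allowed to rely on the written data.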