x86 Assembly Language Reference Manual

SSE Instructions

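SSE instructions extend the SIMD execution model introduced with the MMX technology. They are divided into four subgroups: SIMD single-precision floating-point instructions that operate on the XMM registers, MXCSR state management instructions, 64-bit SIMD integer instructions that operate on the MMX registers, and miscellaneous instructions that control caching, prefetching, and instruction ordering.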

SIMD Single-Precision Floating-Point Instructions (SSE)

The SSE SIMD instructions operate on packed and scalar single-precision floating-point values located in the XMM registers or memory.

Data Transfer Instructions (SSE)

The SSE data transfer instructions move packed and scalar single-precision floating-point operands between XMM registers and between XMM registers and memory.

Table 3-27 Data Transfer Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| movaps | MOVAPS | move four aligned packed single-precision floating-point values between XMM registers or memory | |
| movhlps | MOVHLPS | move two packed single-precision floating-point values from the high quadword of an XMM register to the low quadword of another XMM register | |
| movhps | MOVHPS | move two packed single-precision floating-point values to or from the high quadword of an XMM register or memory | |
| movlhps | MOVLHPS | move two packed single-precision floating-point values from the low quadword of an XMM register to the high quadword of another XMM register | |
| movlps | MOVLPS | move two packed single-precision floating-point values to or from the low quadword of an XMM register or memory | |
| movmskps | MOVMSKPS | extract sign mask from four packed single-precision floating-point values | |
| movss | MOVSS | move scalar single-precision floating-point value between XMM registers or memory | |
| movups | MOVUPS | move four unaligned packed single-precision floating-point values between XMM registers or memory | |

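
As a rough illustration of the unaligned moves and the sign-mask extraction listed in Table 3-27, the following C sketch uses the SSE intrinsics from `<xmmintrin.h>`. It assumes an x86 compiler with SSE enabled. The function names `copy_floats` and `sign_mask` are hypothetical, and the instruction named in each comment is what compilers typically emit, not a guarantee.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Copy n floats, where n is assumed to be a multiple of 4.
 * _mm_loadu_ps / _mm_storeu_ps accept any address (typically movups).
 * The aligned forms _mm_load_ps / _mm_store_ps (typically movaps)
 * require 16-byte-aligned addresses. */
void copy_floats(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 v = _mm_loadu_ps(src + i);
        _mm_storeu_ps(dst + i, v);
    }
}

/* Return a 4-bit mask of the sign bits of four floats (movmskps):
 * bit i is set when the sign bit of element i is set. */
int sign_mask(const float *p)
{
    return _mm_movemask_ps(_mm_loadu_ps(p));
}
```

For the input `{-1.0, 2.0, -3.0, 4.0}`, `sign_mask` returns 5, because the sign bits of elements 0 and 2 are set.
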
Packed Arithmetic Instructions (SSE)

The SSE arithmetic instructions perform packed and scalar arithmetic operations on single-precision floating-point operands.

Table 3-28 Packed Arithmetic Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| addps | ADDPS | add packed single-precision floating-point values | |
| addss | ADDSS | add scalar single-precision floating-point values | |
| divps | DIVPS | divide packed single-precision floating-point values | |
| divss | DIVSS | divide scalar single-precision floating-point values | |
| maxps | MAXPS | return maximum packed single-precision floating-point values | |
| maxss | MAXSS | return maximum scalar single-precision floating-point values | |
| minps | MINPS | return minimum packed single-precision floating-point values | |
| minss | MINSS | return minimum scalar single-precision floating-point values | |
| mulps | MULPS | multiply packed single-precision floating-point values | |
| mulss | MULSS | multiply scalar single-precision floating-point values | |
| rcpps | RCPPS | compute reciprocals of packed single-precision floating-point values | |
| rcpss | RCPSS | compute reciprocal of scalar single-precision floating-point values | |
| rsqrtps | RSQRTPS | compute reciprocals of square roots of packed single-precision floating-point values | |
| rsqrtss | RSQRTSS | compute reciprocal of square root of scalar single-precision floating-point values | |
| sqrtps | SQRTPS | compute square roots of packed single-precision floating-point values | |
| sqrtss | SQRTSS | compute square root of scalar single-precision floating-point values | |
| subps | SUBPS | subtract packed single-precision floating-point values | |
| subss | SUBSS | subtract scalar single-precision floating-point values | |

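
The difference between the packed (`ps`) and scalar (`ss`) forms can be sketched in C with the `<xmmintrin.h>` intrinsics. This is a minimal example that assumes an x86 compiler with SSE enabled. The helper names `add_packed` and `add_scalar` are hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Packed add (addps): out[i] = a[i] + b[i] for all four elements. */
void add_packed(float *out, const float *a, const float *b)
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Scalar add (addss): out[0] = a[0] + b[0]; the upper three
 * elements are passed through unchanged from a. */
void add_scalar(float *out, const float *a, const float *b)
{
    _mm_storeu_ps(out, _mm_add_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}`, the packed form produces `{11, 22, 33, 44}` and the scalar form produces `{11, 2, 3, 4}`.
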
Comparison Instructions (SSE)

The SSE compare instructions compare packed and scalar single-precision floating-point operands.

Table 3-29 Comparison Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| cmpps | CMPPS | compare packed single-precision floating-point values | |
| cmpss | CMPSS | compare scalar single-precision floating-point values | |
| comiss | COMISS | perform ordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register | |
| ucomiss | UCOMISS | perform unordered comparison of scalar single-precision floating-point values and set flags in EFLAGS register | |

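
The two styles of comparison in Table 3-29 give different kinds of results: `cmpps` writes a per-element mask into an XMM register, while `comiss` sets flags in EFLAGS. The following C sketch shows both through the `<xmmintrin.h>` intrinsics. It assumes an x86 compiler with SSE enabled, and the names `less_than_mask` and `low_less_than` are hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Bit i of the result is set when a[i] < b[i].
 * _mm_cmplt_ps (cmpps with the less-than predicate) yields all-ones or
 * all-zeros in each element; _mm_movemask_ps gathers the top bit of each. */
int less_than_mask(const float *a, const float *b)
{
    __m128 m = _mm_cmplt_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    return _mm_movemask_ps(m);
}

/* Compare only the low elements; returns 1 when a[0] < b[0].
 * _mm_comilt_ss is typically compiled to comiss plus a flag test. */
int low_less_than(const float *a, const float *b)
{
    return _mm_comilt_ss(_mm_load_ss(a), _mm_load_ss(b));
}
```

With `a = {1, 5, 3, 9}` and `b = {2, 4, 3, 10}`, `less_than_mask` returns 9, because only elements 0 and 3 satisfy the comparison.
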
Logical Instructions (SSE)

The SSE logical instructions perform bitwise AND, AND NOT, OR, and XOR operations on packed single-precision floating-point operands.

Table 3-30 Logical Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| andnps | ANDNPS | perform bitwise logical AND NOT of packed single-precision floating-point values | |
| andps | ANDPS | perform bitwise logical AND of packed single-precision floating-point values | |
| orps | ORPS | perform bitwise logical OR of packed single-precision floating-point values | |
| xorps | XORPS | perform bitwise logical XOR of packed single-precision floating-point values | |

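
A common use of the logical instructions is to manipulate the sign bit of floating-point values directly. The C sketch below shows this with the `<xmmintrin.h>` intrinsics. It assumes an x86 compiler with SSE enabled and default floating-point options. The names `abs4` and `neg4` are hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Clear the sign bit of four floats: out[i] = |in[i]|.
 * _mm_andnot_ps(m, v) computes (~m) & v, matching andnps. */
void abs4(float *out, const float *in)
{
    const __m128 sign = _mm_set1_ps(-0.0f);  /* only the sign bit is set */
    _mm_storeu_ps(out, _mm_andnot_ps(sign, _mm_loadu_ps(in)));
}

/* Flip the sign bit of four floats: out[i] = -in[i] (xorps). */
void neg4(float *out, const float *in)
{
    const __m128 sign = _mm_set1_ps(-0.0f);
    _mm_storeu_ps(out, _mm_xor_ps(sign, _mm_loadu_ps(in)));
}
```
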
Shuffle and Unpack Instructions (SSE)

The SSE shuffle and unpack instructions shuffle or interleave single-precision floating-point values in packed single-precision floating-point operands.

Table 3-31 Shuffle and Unpack Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| shufps | SHUFPS | shuffles values in packed single-precision floating-point operands | |
| unpckhps | UNPCKHPS | unpacks and interleaves the two high-order values from two single-precision floating-point operands | |
| unpcklps | UNPCKLPS | unpacks and interleaves the two low-order values from two single-precision floating-point operands | |

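
The element selection performed by `shufps` and the interleaving performed by `unpcklps` can be sketched in C with the `<xmmintrin.h>` intrinsics. The example assumes an x86 compiler with SSE enabled, and the names `reverse4` and `interleave_low` are hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Reverse the order of four floats with shufps.
 * _MM_SHUFFLE(z, y, x, w) picks source elements for result positions
 * 3, 2, 1, 0 respectively, so (0, 1, 2, 3) reverses the vector. */
void reverse4(float *out, const float *in)
{
    __m128 v = _mm_loadu_ps(in);
    _mm_storeu_ps(out, _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3)));
}

/* Interleave the two low-order elements of a and b (unpcklps):
 * out = { a[0], b[0], a[1], b[1] }. */
void interleave_low(float *out, const float *a, const float *b)
{
    _mm_storeu_ps(out, _mm_unpacklo_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With `a = {1, 2, 3, 4}` and `b = {5, 6, 7, 8}`, `reverse4` on `a` gives `{4, 3, 2, 1}` and `interleave_low` gives `{1, 5, 2, 6}`.
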
Conversion Instructions (SSE)

The SSE conversion instructions convert between doubleword integers and single-precision floating-point values, in both packed and scalar forms.

Table 3-32 Conversion Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| cvtpi2ps | CVTPI2PS | convert packed doubleword integers to packed single-precision floating-point values | |
| cvtps2pi | CVTPS2PI | convert packed single-precision floating-point values to packed doubleword integers | |
| cvtsi2ss | CVTSI2SS | convert doubleword integer to scalar single-precision floating-point value | |
| cvtss2si | CVTSS2SI | convert scalar single-precision floating-point value to a doubleword integer | |
| cvttps2pi | CVTTPS2PI | convert with truncation packed single-precision floating-point values to packed doubleword integers | |
| cvttss2si | CVTTSS2SI | convert with truncation scalar single-precision floating-point value to scalar doubleword integer | |
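
The practical difference between `cvtss2si` and `cvttss2si` is the rounding behavior: the first rounds according to the current MXCSR rounding mode, the second always truncates toward zero. The C sketch below covers the scalar forms only, using the `<xmmintrin.h>` intrinsics. It assumes an x86 compiler with SSE enabled and the default round-to-nearest mode. The function names are hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* cvtss2si: round according to the current MXCSR rounding mode
 * (round-to-nearest-even unless the mode has been changed). */
int round_to_int(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* cvttss2si: always truncate toward zero. */
int truncate_to_int(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}

/* cvtsi2ss: convert a doubleword integer to a scalar float. */
float int_to_float(int i)
{
    return _mm_cvtss_f32(_mm_cvtsi32_ss(_mm_setzero_ps(), i));
}
```

Under the default rounding mode, `round_to_int(2.7f)` returns 3 while `truncate_to_int(2.7f)` returns 2.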

MXCSR State Management Instructions (SSE)

The MXCSR state management instructions save and restore the state of the MXCSR control and status register.

Table 3-33 MXCSR State Management Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| ldmxcsr | LDMXCSR | load %mxcsr register | |
| stmxcsr | STMXCSR | save %mxcsr register state | |
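
A typical save, modify, and restore sequence for MXCSR can be sketched in C with `_mm_getcsr` and `_mm_setcsr` from `<xmmintrin.h>`, which correspond to `stmxcsr` and `ldmxcsr`. The example assumes an x86 compiler with SSE enabled. The function name `set_rounding_mode` is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Replace the rounding-control field of MXCSR with mode (one of the
 * _MM_ROUND_* constants) and return the previous MXCSR value so the
 * caller can restore it later with _mm_setcsr(). */
unsigned int set_rounding_mode(unsigned int mode)
{
    unsigned int saved = _mm_getcsr();              /* stmxcsr */
    _mm_setcsr((saved & ~_MM_ROUND_MASK) | mode);   /* ldmxcsr */
    return saved;
}
```

A caller would keep the returned value and pass it back to `_mm_setcsr` once the code that needs the modified mode has run. Compilers do not always track MXCSR as a dependency of floating-point operations, so code that relies on a changed rounding mode may need compiler-specific options to prevent reordering.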

64-Bit SIMD Integer Instructions (SSE)

The SSE 64-bit SIMD integer instructions perform operations on packed bytes, words, or doublewords in MMX registers.

Table 3-34 64-Bit SIMD Integer Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| pavgb | PAVGB | compute average of packed unsigned byte integers | |
| pavgw | PAVGW | compute average of packed unsigned word integers | |
| pextrw | PEXTRW | extract word | |
| pinsrw | PINSRW | insert word | |
| pmaxsw | PMAXSW | maximum of packed signed word integers | |
| pmaxub | PMAXUB | maximum of packed unsigned byte integers | |
| pminsw | PMINSW | minimum of packed signed word integers | |
| pminub | PMINUB | minimum of packed unsigned byte integers | |
| pmovmskb | PMOVMSKB | move byte mask | |
| pmulhuw | PMULHUW | multiply packed unsigned integers and store high result | |
| psadbw | PSADBW | compute sum of absolute differences | |
| pshufw | PSHUFW | shuffle packed integer words in MMX register | |
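
The per-element arithmetic behind three of these instructions can be modeled in plain C. The sketch below is a scalar reference model, not MMX code: it deliberately avoids MMX intrinsics and only shows what each lane computes. The function names `avg_u8`, `sad_u8x8`, and `mulhi_u16` are hypothetical.

```c
#include <stdint.h>

/* One pavgb lane: rounded average of two unsigned bytes, (a + b + 1) >> 1. */
uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* psadbw on one pair of 64-bit operands: sum of the absolute
 * differences of eight unsigned bytes. */
uint16_t sad_u8x8(const uint8_t a[8], const uint8_t b[8])
{
    unsigned sum = 0;
    for (int i = 0; i < 8; i++)
        sum += (a[i] > b[i]) ? (unsigned)(a[i] - b[i])
                             : (unsigned)(b[i] - a[i]);
    return (uint16_t)sum;
}

/* One pmulhuw lane: high 16 bits of the 32-bit unsigned product. */
uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}
```

For example, `avg_u8(1, 2)` returns 2 because the average is rounded up, and `mulhi_u16(0xFFFF, 0xFFFF)` returns `0xFFFE`.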

Miscellaneous Instructions (SSE)

The following instructions control caching, prefetching, and instruction ordering.

Table 3-35 Miscellaneous Instructions (SSE)

| Solaris Mnemonic | Intel/AMD Mnemonic | Description | Notes |
|---|---|---|---|
| maskmovq | MASKMOVQ | non-temporal store of selected bytes from an MMX register into memory | |
| movntps | MOVNTPS | non-temporal store of four packed single-precision floating-point values from an XMM register into memory | |
| movntq | MOVNTQ | non-temporal store of quadword from an MMX register into memory | |
| prefetchnta | PREFETCHNTA | prefetch data into non-temporal cache structure and into a location close to the processor | |
| prefetcht0 | PREFETCHT0 | prefetch data into all levels of the cache hierarchy | |
| prefetcht1 | PREFETCHT1 | prefetch data into level 2 cache and higher | |
| prefetcht2 | PREFETCHT2 | prefetch data into level 2 cache and higher | |
| sfence | SFENCE | serialize store operations | |
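
How non-temporal stores, `sfence`, and prefetch hints are commonly combined can be sketched in C with the `<xmmintrin.h>` intrinsics. The example assumes an x86 compiler with SSE enabled. The function names `fill_streaming` and `sum_with_prefetch` are hypothetical, and the prefetch distance of 16 elements is an arbitrary choice for illustration, not a tuned value.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Fill n floats with value using non-temporal stores (movntps).
 * Assumes dst is 16-byte aligned and n is a multiple of 4.
 * The trailing sfence orders the streaming stores before later stores. */
void fill_streaming(float *dst, int n, float value)
{
    const __m128 v = _mm_set1_ps(value);
    for (int i = 0; i < n; i += 4)
        _mm_stream_ps(dst + i, v);
    _mm_sfence();
}

/* Sum n floats while hinting that data 16 elements ahead will be
 * needed soon (prefetcht0). The hint does not change the result. */
float sum_with_prefetch(const float *p, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; i++) {
        if (i + 16 < n)
            _mm_prefetch((const char *)(p + i + 16), _MM_HINT_T0);
        s += p[i];
    }
    return s;
}
```

A suitably aligned buffer for `fill_streaming` can be obtained with `_mm_malloc(size, 16)` and released with `_mm_free`.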