SPARC Assembly Language Reference Manual

Appendix E SPARC-V9 Instruction Set

This appendix describes the changes made to the SPARC instruction set by the SPARC-V9 architecture. Application software for the 32-bit SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.

E.1 SPARC-V9 Changes

The SPARC-V9 architecture differs from SPARC-V8 architecture in the following areas, expanded below: registers, alternate space access, byte order, and instruction set.

E.1.1 Registers

These registers have been deleted:

Table E–1

PSR    Processor State Register
TBR    Trap Base Register
WIM    Window Invalid Mask

These registers have been widened from 32 to 64 bits:

Table E–2

Integer registers
All state registers    FSR, PC, nPC, and Y


Note –

FSR (Floating-Point State Register): the fcc1, fcc2, and fcc3 fields (additional floating-point condition codes) are added, and the register is widened to 64 bits.


These SPARC-V9 registers were fields within SPARC-V8 registers:

Table E–3

CCR          Condition Codes Register
CWP          Current Window Pointer
PIL          Processor Interrupt Level
TBA          Trap Base Address
TT[MAXTL]    Trap Type
VER          Version

These registers have been added:

Table E–4

ASI              Address Space Identifier
CANRESTORE       Restorable Windows
CANSAVE          Savable Windows
CLEANWIN         Clean Windows
FPRS             Floating-Point Register State
OTHERWIN         Other Windows
PSTATE           Processor State
TICK             Hardware clock tick counter
TL               Trap Level
TNPC[MAXTL]      Trap Next Program Counter
TPC[MAXTL]       Trap Program Counter
TSTATE[MAXTL]    Trap State
WSTATE           Windows State

Also, there are sixteen additional double-precision floating-point registers, f[32] .. f[62]. These registers overlap (and are aliased with) eight additional quad-precision floating-point registers, f[32] .. f[60].

The SPARC-V9 CWP register is decremented during a RESTORE instruction and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC-V8. This change has no effect on nonprivileged instructions.
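The direction of the CWP update can be sketched as a small model. This is only an illustration: the number of windows (NWINDOWS) is implementation dependent, the value 8 is an assumption, and the function names are hypothetical. Window spill and fill traps are not modeled.

```python
# Minimal model of the SPARC-V9 CWP update direction.
# NWINDOWS = 8 is an assumed value; the real number is implementation dependent.
NWINDOWS = 8


def cwp_after_save(cwp: int) -> int:
    """SAVE increments CWP (modulo the number of windows) in SPARC-V9."""
    return (cwp + 1) % NWINDOWS


def cwp_after_restore(cwp: int) -> int:
    """RESTORE decrements CWP (modulo the number of windows) in SPARC-V9."""
    return (cwp - 1) % NWINDOWS
```

A SAVE followed by a RESTORE returns to the original window, whichever direction the hardware counts in; this is why nonprivileged code is unaffected.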

E.1.2 Alternate Space Access

Load- and store-alternate instructions to one-half of the alternate spaces can now be included in user code. In SPARC-V9, loads and stores to ASIs 0x00 .. 0x7F are privileged; those to ASIs 0x80 .. 0xFF are nonprivileged. In SPARC-V8, all access to alternate address spaces is privileged.
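The privilege split described above amounts to a test of the top bit of the 8-bit ASI. The following sketch shows that rule; the function name is hypothetical and is not part of any assembler or system interface.

```python
def asi_is_privileged(asi: int) -> bool:
    """Return True if a SPARC-V9 load/store-alternate to this ASI is privileged.

    ASIs 0x00..0x7F are privileged; ASIs 0x80..0xFF are nonprivileged.
    """
    if not 0 <= asi <= 0xFF:
        raise ValueError("an ASI is an 8-bit value")
    return asi <= 0x7F
```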

E.1.3 Byte Order

SPARC-V9 supports both little- and big-endian byte orders for data accesses only; instruction accesses are always performed using big-endian byte order. In SPARC-V8, all data and instruction accesses are performed in big-endian byte order.
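As an illustration of the two data byte orders, the sketch below shows how the same 32-bit value would be laid out in memory under each order. It uses Python's standard struct module purely as a model; the helper name is hypothetical.

```python
import struct


def word_bytes(value: int, little_endian: bool) -> bytes:
    """Return the in-memory byte sequence of a 32-bit word in the given byte order."""
    fmt = "<I" if little_endian else ">I"
    return struct.pack(fmt, value)


# Big-endian stores the most significant byte at the lowest address.
print(word_bytes(0x11223344, little_endian=False).hex())  # 11223344
# Little-endian stores the least significant byte at the lowest address.
print(word_bytes(0x11223344, little_endian=True).hex())   # 44332211
```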

E.2 SPARC-V9 Instruction Set Changes

Application software written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.

E.2.1 Extended Instruction Definitions to Support the 64-bit Model

Table E–5

FCMP, FCMPE        Floating-point compare; can set any of the four floating-point condition codes
LDFSR, STFSR       Load/store FSR; affect only the low-order 32 bits of FSR
LDUW, LDUWA        Same as LD, LDA in SPARC-V8
RDASR/WRASR        Read/write state registers; access additional registers
SAVE/RESTORE
SETHI
SRA, SRL, SLL      Shifts; split into 32-bit and 64-bit versions
Tcc                (was Ticc) Operates with either the 32-bit integer condition codes (icc) or the 64-bit integer condition codes (xcc)

All other arithmetic operations operate on 64-bit operands and produce 64-bit results.
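The split between 32-bit and 64-bit shifts can be modeled as follows. This is a sketch under the usual SPARC-V9 reading: the 32-bit forms take a 5-bit shift count, the 64-bit forms (written here with an x suffix) take a 6-bit count, and the 32-bit right shifts operate on the low word and then zero-extend or sign-extend the result to 64 bits. The Python function names are illustrative only.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sll(x: int, n: int) -> int:
    """32-bit form: 5-bit count, but the full 64-bit register is shifted."""
    return (x << (n & 31)) & MASK64


def sllx(x: int, n: int) -> int:
    """64-bit form: 6-bit count."""
    return (x << (n & 63)) & MASK64


def srl(x: int, n: int) -> int:
    """32-bit logical right shift: low word only, upper 32 bits become zero."""
    return (x & MASK32) >> (n & 31)


def srlx(x: int, n: int) -> int:
    """64-bit logical right shift."""
    return (x & MASK64) >> (n & 63)


def sra(x: int, n: int) -> int:
    """32-bit arithmetic right shift: bit 31 is the sign, result sign-extended."""
    low = x & MASK32
    if low & 0x80000000:
        low -= 1 << 32
    return (low >> (n & 31)) & MASK64


def srax(x: int, n: int) -> int:
    """64-bit arithmetic right shift: bit 63 is the sign."""
    value = x & MASK64
    if value >> 63:
        value -= 1 << 64
    return (value >> (n & 63)) & MASK64
```

With a shift count of zero, srl clears the upper word and sra sign-extends the lower word.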

E.2.2 Added Instructions to Support 64 bits

Table E–6

F[sdq]TOx                   Convert floating point to 64-bit word
FxTO[sdq]                   Convert 64-bit word to floating point
FMOV[dq]                    Floating-point move, double and quad
FNEG[dq]                    Floating-point negate, double and quad
FABS[dq]                    Floating-point absolute value, double and quad
LDDFA, STDFA, LDFA, STFA    Alternate address space forms of LDDF, STDF, LDF, and STF
LDSW                        Load a signed word
LDSWA                       Load a signed word from an alternate space
LDX                         Load an extended word
LDXA                        Load an extended word from an alternate space
LDXFSR                      Load all 64 bits of the FSR register
STX                         Store an extended word
STXA                        Store an extended word into an alternate space
STXFSR                      Store all 64 bits of the FSR register

E.2.3 Added Instructions to Support High-Performance System Implementation

Table E–7

BPcc                   Branch on integer condition code with prediction
BPr                    Branch on integer register contents with prediction
CASA, CASXA            Compare and swap from an alternate space
FBPfcc                 Branch on floating-point condition code with prediction
FLUSHW                 Flush windows
FMOVcc                 Move floating-point register if condition code is satisfied
FMOVr                  Move floating-point register if integer register satisfies condition
LDQF(A), STQF(A)       Load/store quad floating-point (in an alternate space)
MOVcc                  Move integer register if condition code is satisfied
MOVr                   Move integer register if register contents satisfy condition
MULX                   Generic 64-bit multiply
POPC                   Population count
PREFETCH, PREFETCHA    Prefetch data
SDIVX, UDIVX           Signed and unsigned 64-bit divide

E.2.4 Deleted Instructions

Table E–8

Coprocessor loads and stores
RDTBR and WRTBR    TBR no longer exists. It is replaced by TBA, which can be read and written with the RDPR and WRPR instructions.
RDWIM and WRWIM    WIM no longer exists. It has been replaced by several register-window registers.
RDPSR and WRPSR    PSR no longer exists. It has been replaced by several separate registers that are read and written with other instructions.
RETT               Return from trap (replaced by DONE/RETRY)
STDFQ              Store Double from Floating-point Queue (replaced by the RDPR FQ instruction)

E.2.5 Miscellaneous Instruction Changes

Table E–9

IMPDEPn    (Changed) Implementation-dependent instructions (replace SPARC-V8 CPop instructions)
MEMBAR     (Added) Memory barrier (memory synchronization support)

E.3 SPARC-V9 Instruction Set Mapping

Table E–10

Opcode 

Mnemonic 

Argument List 

Operation 

Comments 

 

BPA

 

ba{,a}

{,pt|,pn}

 

%icc or %xcc, label 

(Branch on cc with prediction)  

Branch always  

 

1  

BPN

bn{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch never  

0  

BPNE

bne{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on not equal  

not Z 

BPE

be{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on equal

Z

BPG

bg{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on greater  

not (Z or (N xor V))  

BPLE

ble{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on less or equal  

Z or (N xor V)  

BPGE

bge{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on greater or equal  

not (N xor V)  

BPL

bl{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on less  

N xor V  

BPGU

bgu{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on greater unsigned  

not (C or Z)  

BPLEU

bleu{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on less or equal unsigned  

C or Z  

BPCC

bcc{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on carry clear (greater than or equal, unsigned)  

not C  

BPCS

bcs{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on carry set (less than, unsigned)  

C  

BPPOS

bpos{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on positive  

not N  

BPNEG

bneg{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on negative  

N  

BPVC

bvc{,a}

{,pt|,pn}

%icc or %xcc, label  

Branch on overflow clear  

not V 

BPVS

bvs{,a}

{,pt|,pn}

%icc or %xcc, label 

Branch on overflow set

V

BRZ

brz{,a}

{,pt|,pn}

regrs1, label

Branch on register zero  

Z  

BRLEZ

brlez{,a}

{,pt|,pn}

regrs1, label

Branch on register less than or equal to zero  

N or Z  

BRLZ

brlz{,a}

{,pt|,pn}

regrs1, label

Branch on register less than zero  

N  

BRNZ

brnz{,a}

{,pt|,pn}

regrs1, label

Branch on register not zero  

not Z  

BRGZ

brgz{,a}

{,pt|,pn}

regrs1, label

Branch on register greater than zero  

not (N or Z)  

BRGEZ

brgez{,a}

{,pt|,pn}

regrs1, label

Branch on register greater than or equal to zero 

not N  

CASA

casa  

casa  

[regrs1]imm_asi,regrs2,regrd

[regrs1]%asi,regrs2,regrd

Compare and swap word from alternate space  

 

CASXA

casxa  

casxa 

[regrs1]imm_asi,regrs2,regrd

[regrs1]%asi,regrs2,regrd

Compare and swap extended from alternate space 

 

 

FBPA

 

fba{,a}

{,pt|,pn}

 

%fccn, label

(Branch on cc with prediction)  

Branch always

1

FBPN

fbn{,a}

{,pt|,pn}

%fccn, label

Branch never

0  

FBPU

fbu{,a}

{,pt|,pn}

%fccn, label

Branch on unordered  

U  

FBPG

fbg{,a}

{,pt|,pn}

%fccn, label

Branch on greater  

G  

FBPUG

fbug{,a}

{,pt|,pn}

%fccn, label

Branch on unordered or greater  

G or U  

FBPL

fbl{,a}

{,pt|,pn}

%fccn, label

Branch on less  

L  

FBPUL

fbul{,a}

{,pt|,pn}

%fccn, label

Branch on unordered or less  

L or U  

FBPLG

fblg{,a}

{,pt|,pn}

%fccn, label

Branch on less or greater  

L or G  

FBPNE

fbne{,a}

{,pt|,pn}

%fccn, label

Branch on not equal  

L or G or U  

FBPE

fbe{,a}

{,pt|,pn}

%fccn, label

Branch on equal  

E  

FBPUE

fbue{,a}

{,pt|,pn}

%fccn, label

Branch on unordered or equal  

E or U  

FBPGE

fbge{,a}

{,pt|,pn}

%fccn, label

Branch on greater or equal  

E or G  

FBPUGE

fbuge{,a}

{,pt|,pn}

%fccn, label

Branch on unordered or greater or equal  

E or G or U  

FBPLE

fble{,a}

{,pt|,pn}

%fccn, label

Branch on less or equal  

E or L  

FBPULE

fbule{,a}

{,pt|,pn}

%fccn, label

Branch on unordered or less or equal  

E or L or U

FBPO

fbo{,a}

{,pt|,pn}

%fccn, label

 

Branch on ordered 

E or L or G 

FLUSHW

flushw

 

Flush register windows 

 

 

FMOVA

 

fmov

{s,d,q}a

 

%icc or %xcc, fregrs2, fregrd

(Move on integer cc)  

Move always  

 

1  

FMOVN

fmov

{s,d,q}n

%icc or %xcc, fregrs2, fregrd

Move never  

0  

FMOVNE

fmov

{s,d,q}ne

%icc or %xcc, fregrs2, fregrd

Move if not equal  

not Z  

FMOVE

fmov

{s,d,q}e

%icc or %xcc, fregrs2, fregrd

Move if equal  

Z  

FMOVG

fmov

{s,d,q}g

%icc or %xcc, fregrs2, fregrd

Move if greater  

not (Z or (N xor V))  

FMOVLE

fmov

{s,d,q}le

%icc or %xcc, fregrs2, fregrd

Move if less or equal  

Z or (N xor V)  

FMOVGE

fmov

{s,d,q}ge

%icc or %xcc, fregrs2, fregrd

Move if greater or equal  

not (N xor V)  

FMOVL

fmov

{s,d,q}l

%icc or %xcc, fregrs2, fregrd

Move if less  

N xor V  

FMOVGU

fmov

{s,d,q}gu

%icc or %xcc, fregrs2, fregrd

Move if greater unsigned  

not (C or Z)  

FMOVLEU

fmov

{s,d,q}leu

%icc or %xcc, fregrs2, fregrd

Move if less or equal unsigned  

C or Z  

FMOVCC

fmov

{s,d,q}cc

%icc or %xcc, fregrs2, fregrd

Move if carry clear (greater or equal, unsigned)  

not C  

FMOVCS

fmov

{s,d,q}cs

%icc or %xcc, fregrs2, fregrd

Move if carry set (less than, unsigned)  

C  

FMOVPOS

fmov

{s,d,q}pos

%icc or %xcc, fregrs2, fregrd

Move if positive  

not N  

FMOVNEG

fmov

{s,d,q}neg

%icc or %xcc, fregrs2, fregrd

Move if negative  

N  

FMOVVC

fmov

{s,d,q}vc

%icc or %xcc, fregrs2, fregrd

Move if overflow clear  

not V 

FMOVVS

fmov

{s,d,q}vs

%icc or %xcc, fregrs2, fregrd

Move if overflow set 

V  

 

FMOVRZ

 

fmovr

{s,d,q}e

 

regrs1, fregrs2, fregrd

(Move f-p register on cc)  

Move if register zero  

 

FMOVRLEZ

fmovr

{s,d,q}lez

regrs1, fregrs2, fregrd

Move if register less than or equal to zero

 

FMOVRLZ

fmovr

{s,d,q}lz

regrs1, fregrs2, fregrd

Move if register less than zero  

 

FMOVRNZ

fmovr

{s,d,q}ne

regrs1, fregrs2, fregrd

Move if register not zero

FMOVRGZ

fmovr

{s,d,q}gz

regrs1, fregrs2, fregrd

Move if register greater than zero

FMOVRGEZ

fmovr

{s,d,q}gez

regrs1, fregrs2, fregrd

Move if register greater than or equal to zero

 

FMOVFA

FMOVFN

FMOVFU

FMOVFG

FMOVFUG

FMOVFL

FMOVFUL

FMOVFLG

FMOVFNE

FMOVFE

FMOVFUE

FMOVFGE

FMOVFUGE

FMOVFLE

FMOVFULE

FMOVFO

 

fmov{s,d,q}a

fmov{s,d,q}n

fmov{s,d,q}u

fmov{s,d,q}g

fmov{s,d,q}ug

fmov{s,d,q}l

fmov{s,d,q}ul

fmov{s,d,q}lg

fmov{s,d,q}ne

fmov{s,d,q}e

fmov{s,d,q}ue

fmov{s,d,q}ge

fmov{s,d,q}uge

fmov{s,d,q}le

fmov{s,d,q}ule

fmov{s,d,q}o

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

%fccn,fregrs2,fregrd

 

(Move on floating-point cc)  

Move always  

Move never  

Move if unordered  

Move if greater  

Move if unordered or greater  

Move if less  

Move if unordered or less  

Move if less or greater  

Move if not equal  

Move if equal  

Move if unordered or equal  

Move if greater or equal  

Move if unordered or greater or equal  

Move if less or equal  

Move if unordered or less or equal  

Move if ordered 

1  

0  

U  

G  

G or U  

L  

L or U  

L or G  

L or G or U  

E  

E or U  

E or G  

E or G or U  

E or L  

E or L or U

E or L or G 

LDSW

ldsw

[address], regrd

Load a signed word

LDSWA

ldswa

[regaddr] imm_asi, regrd

Load signed word from alternate space

 

LDX

ldx

[address], regrd

Load extended word

LDXA

ldxa

[regaddr] imm_asi, regrd

Load extended word from alternate space

ldxa

[reg_plus_imm] %asi, regrd

LDXFSR

ldx

[address], %fsr

Load floating-point state register

 

MEMBAR

membar

membar_mask

Memory barrier 

 

MOVA

MOVN

MOVNE

MOVE

MOVG

MOVLE

MOVGE

MOVL

MOVGU

MOVLEU

MOVCC

MOVCS

MOVPOS

MOVNEG

MOVVC

MOVVS

mova

movn

movne

move

movg

movle

movge

movl

movgu

movleu

movcc

movcs

movpos

movneg

movvc

movvs

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

%icc or %xcc, reg_or_imm11, regrd

(Move integer register on cc)  

Move always  

Move never  

Move if not equal  

Move if equal  

Move if greater  

Move if less or equal  

Move if greater or equal  

Move if less  

Move if greater unsigned  

Move if less or equal unsigned  

Move if carry clear (greater or equal, unsigned)  

Move if carry set (less than, unsigned)  

Move if positive  

Move if negative  

Move if overflow clear  

Move if overflow set 

1  

0  

not Z  

Z  

not (Z or (N xor V))  

Z or (N xor V)  

not (N xor V)  

N xor V  

not (C or Z)  

C or Z  

not C  

C  

not N  

N  

not V

V

MOVFA

MOVFN

MOVFU

MOVFG

MOVFUG

MOVFL

MOVFUL

MOVFLG

MOVFNE

MOVFE

MOVFUE

MOVFGE

MOVFUGE

MOVFLE

MOVFULE

MOVFO

mova

movn

movu

movg

movug

movl

movul

movlg

movne

move

movue

movge

movuge

movle

movule

movo

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

%fccn,reg_or_imm11,regrd

(Move on floating-point cc)  

Move always  

Move never  

Move if unordered  

Move if greater  

Move if unordered or greater  

Move if less  

Move if unordered or less  

Move if less or greater  

Move if not equal  

Move if equal  

Move if unordered or equal  

Move if greater or equal  

Move if unordered or greater or equal  

Move if less or equal  

Move if unordered or less or equal  

Move if ordered 

1  

0  

U  

G  

G or U  

L  

L or U  

L or G  

L or G or U  

E

E or U

E or G  

E or G or U  

E or L  

E or L or U

E or L or G 

MOVRZ

MOVRLEZ

MOVRLZ

MOVRNZ

MOVRGZ

MOVRGEZ

movre

movrlez

movrlz

movrnz

movrgz

movrgez

regrs1, reg_or_imm10,regrd

regrs1, reg_or_imm10,regrd

regrs1, reg_or_imm10,regrd

regrs1, reg_or_imm10,regrd

regrs1, reg_or_imm10,regrd

regrs1, reg_or_imm10,regrd

(Move register on register cc)  

Move if register zero  

Move if register less than or equal to zero  

Move if register less than zero  

Move if register not zero  

Move if register greater than zero  

Move if register greater than or equal to zero 

Z  

N or Z  

N  

not Z  

not (N or Z)

not N  

MULX

mulx

regrs1, reg_or_imm,regrd

(Generic 64-bit Multiply) Multiply (signed or unsigned) 

See SDIVX and UDIVX 

POPC

popc

reg_or_imm, regrd

Population count 

 

PREFETCH

prefetch

[address], prefetch_fcn

Prefetch data

PREFETCHA

prefetcha

[regaddr] imm_asi, prefetch_fcn

Prefetch data from alternate space

prefetcha

[reg_plus_imm] %asi, prefetch_fcn

See The SPARC Architecture Manual, Version 9

 

SDIVX

sdivx

regrs1, reg_or_imm,regrd

(64-bit signed divide) Signed Divide 

See MULX and UDIVX 

STX

stx

regrd, [address]

Store extended word

STXA

stxa

regrd, [address] imm_asi

Store extended word into alternate space

stxa

regrd, [reg_plus_imm] %asi

STXFSR

stx

%fsr, [address]

Store floating-point state register (all 64 bits)

 

UDIVX

udivx

regrs1, reg_or_imm, regrd

(64-bit unsigned divide) Unsigned divide 

See MULX and SDIVX 

E.4 SPARC-V9 Floating-Point Instruction Set Mapping

SPARC-V9 floating-point instructions are shown in the following table.

Table E–11

SPARC 

Mnemonic (operand types are denoted by the following lowercase letters: i = 32-bit integer, x = 64-bit integer, s = single, d = double, q = quad)

Argument List 

Description 

F[sdq]TOx

fstox

fdtox

fqtox

fregrs2, fregrd

fregrs2, fregrd

fregrs2, fregrd

Convert floating point to 64-bit integer  

 

fstoi

fdtoi

fqtoi

fregrs2, fregrd

fregrs2, fregrd

fregrs2, fregrd

Convert floating-point to 32-bit integer 

FxTO[sdq]

fxtos

fxtod

fxtoq

fregrs2, fregrd

fregrs2, fregrd

fregrs2, fregrd

Convert 64-bit integer to floating point  

 

fitos

fitod

fitoq

fregrs2, fregrd

fregrs2, fregrd

fregrs2, fregrd

Convert 32-bit integer to floating point 

FMOV[dq]

fmovd

fmovq

fregrs2, fregrd

fregrs2, fregrd

Move double  

Move quad 

FNEG[dq]

fnegd

fnegq

fregrs2, fregrd

fregrs2, fregrd

Negate double  

Negate quad 

FABS[dq]

fabsd

fabsq

fregrs2, fregrd

fregrs2, fregrd

Absolute value double  

Absolute value quad 

LDFA

 

LDDFA

 

LDQFA

lda

lda

ldda

ldda

ldqa

ldqa

[regaddr] imm_asi, fregrd

[reg_plus_imm] %asi, fregrd

[regaddr] imm_asi, fregrd

[reg_plus_imm] %asi, fregrd

[regaddr] imm_asi, fregrd

[reg_plus_imm] %asi, fregrd

Load floating-point register from alternate space  

Load double floating-point register from alternate space.  

Load quad floating-point register from alternate space 

STFA

 

STDFA

 

STQFA

sta

sta

stda

stda

stqa

stqa

fregrd, [regaddr] imm_asi

fregrd, [reg_plus_imm] %asi

fregrd, [regaddr] imm_asi

fregrd, [reg_plus_imm] %asi

fregrd, [regaddr] imm_asi

fregrd, [reg_plus_imm] %asi

Store floating-point register to alternate space  

Store double floating-point register to alternate space 

Store quad floating-point register to alternate space 

E.5 SPARC-V9 Synthetic Instruction-Set Mapping

Here is a mapping of synthetic instructions to hardware equivalent instructions.

Table E–12

Synthetic Instruction 

Hardware Equivalent(s) 

Comment 

cas

casl

casx

casxl

[regrs1], regrs2, regrd

[regrs1], regrs2, regrd

[regrs1], regrs2, regrd

[regrs1], regrs2, regrd

casa

casa

casxa

casxa

[regrs1]ASI_P, regrs2, regrd

[regrs1]ASI_P_L, regrs2, regrd

[regrs1]ASI_P, regrs2, regrd

[regrs1]ASI_P_L, regrs2, regrd

Compare & swap (cas)  

cas little-endian  

cas extended  

cas little-endian, extended 

clrx

[address]

stx

%g0, [address]

Clear extended word 

clruw

clruw

regrs1, regrd

regrd

srl

srl

regrs1, %g0, regrd

regrd, %g0, regrd

Copy and clear upper word  

Clear upper word 

iprefetch

label

bn, pt

%xcc, label

Instruction prefetch

mov

mov

mov

%y, regrd

%asrn, regrd

reg_or_imm, %asrn

rd

rd

wr

%y, regrd

%asrn, regrd

%g0, reg_or_imm, %asrn

 

ret

retl

 

jmpl

jmpl

%i7+8, %g0  

%o7+8, %g0 

Return from subroutine  

Return from leaf subroutine 

setn

value, r1, r2

for -xarch=v9, same as setx value, r1, r2

for -xarch=v8, same as set value, r2

 

setnhi

value, r1, r2

for -xarch=v9, same as setxhi value, r1, r2

for -xarch=v8, same as sethi value, r2

 

setuw

value,regrd

sethi

or

sethi

or

%hi(value), regrd

%g0, value, regrd

%hi(value), regrd;

regrd, %lo(value), regrd

(value & 0x3FF)==0

when 0 ≤ value ≤ 4095

(otherwise) 

Do not use setuw in a DCTI delay slot. 

setsw

value,regrd

sethi

or

sethi

sra

sethi

or

sethi

or

sra

%hi(value), regrd

%g0, value, regrd

%hi(value), regrd

regrd, %g0, regrd

%hi(value), regrd;

regrd, %lo(value), regrd

%hi(value), regrd;

regrd, %lo(value), regrd

regrd, %g0, regrd

value>=0 and (value & 0x3FF)==0

-4096 ≤ value ≤ 4095

if (value<0) and ((value & 0x3FF)==0)

(otherwise, if value>=0)  

(otherwise, if value<0)  

Do not use setsw in a CTI delay slot. 

setx

value, r1, r2

sethi

or

sethi

or

sllx

or

%hh(value), r1

r1, %hm(value), r1

%lm(value), r2

r2, %lo(value), r2

r1, 32, r1  

r1, r2, r2 

 

setxhi

value, r1, r2

sethi

or

sethi

sllx

or

%hh(value), r1  

r1, %hm(value), r1  

%lm(value), r2 

r1, 32, r1  

r1, r2, r2 

 

signx

signx

regrs1, regrd

regrd

sra

sra

regrs1, %g0, regrd

regrd, %g0, regrd

Sign-extend 32-bit value to 64 bits 
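The setx expansion in the table above can be checked with a small model. The sketch assumes the usual meaning of the relocation operators: %hh selects bits 63..42 of the value, %hm bits 41..32, %lm bits 31..10, and %lo bits 9..0, and sethi places its 22-bit operand in bits 31..10 of the destination. The Python names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def hh(value: int) -> int:
    """Assumed %hh(value): bits 63..42 (22 bits)."""
    return (value >> 42) & 0x3FFFFF


def hm(value: int) -> int:
    """Assumed %hm(value): bits 41..32 (10 bits)."""
    return (value >> 32) & 0x3FF


def lm(value: int) -> int:
    """Assumed %lm(value): bits 31..10 (22 bits)."""
    return (value >> 10) & 0x3FFFFF


def lo(value: int) -> int:
    """Assumed %lo(value): bits 9..0 (10 bits)."""
    return value & 0x3FF


def sethi(imm22: int) -> int:
    """sethi writes a 22-bit immediate into bits 31..10 and clears the rest."""
    return (imm22 & 0x3FFFFF) << 10


def setx(value: int) -> int:
    """Follow the six-instruction setx expansion; return the final r2."""
    value &= MASK64
    r1 = sethi(hh(value))        # sethi %hh(value), r1
    r1 = r1 | hm(value)          # or    r1, %hm(value), r1
    r2 = sethi(lm(value))        # sethi %lm(value), r2
    r2 = r2 | lo(value)          # or    r2, %lo(value), r2
    r1 = (r1 << 32) & MASK64     # sllx  r1, 32, r1
    r2 = r1 | r2                 # or    r1, r2, r2
    return r2
```

The scratch register r1 holds the upper 32 bits until the final or merges them into r2, which is why setx needs two registers.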

E.6 UltraSPARC and VIS Instruction Set Extensions

This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.


Note –

SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.


E.6.1 Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

E.6.2 Eight-bit Format

A 32-bit word contains a pixel of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store the color components of a point together) and band-sequential images (which store all values of one color component together).

E.6.3 Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.

A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.

Enough precision and dynamic range (for filtering and simple image computations on pixel values) can be provided by an intermediate format of fixed data values. Pixel multiplication is used to convert from pixel data to fixed data. Pack instructions are used to convert from fixed data to pixel data (clip and truncate to an 8-bit unsigned value). The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. You should use floating-point data to perform complex calculations needing more precision or dynamic range.
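The two fixed data formats and the clipping step of a pack operation can be illustrated with the sketch below. It assumes that the first element occupies the most significant bits of the 64-bit word, and it models only the clip to an 8-bit unsigned value; any scaling or rounding performed by the real pack instructions is not modeled. The function names are hypothetical.

```python
import struct


def unpack_fixed16(word64: int) -> tuple:
    """Split a 64-bit word into four signed 16-bit values (assumed MSB first)."""
    return struct.unpack(">4h", word64.to_bytes(8, "big"))


def unpack_fixed32(word64: int) -> tuple:
    """Split a 64-bit word into two signed 32-bit values (assumed MSB first)."""
    return struct.unpack(">2i", word64.to_bytes(8, "big"))


def clip_to_pixel(value: int) -> int:
    """Clip a fixed value to the 8-bit unsigned pixel range 0..255."""
    return max(0, min(255, value))
```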

E.6.4 SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.

Table E–13

SPARC 

Mnemonic 

Argument List 

Description 

SHUTDOWN

shutdown

 

Shut down to enter power-down mode

E.6.5 Graphics Status Register (GSR)

Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register (GSR).

Table E–14

SPARC 

Mnemonic 

Argument List 

Description 

RDASR

WRASR

rdasr

wrasr

%gsr, regrd

regrs1, reg_or_imm, %gsr

read GSR 

write GSR 

E.6.6 Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.

The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.

Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values in the source operands.

Table E–15

SPARC 

Mnemonic 

Argument List 

Description 

FPADD16

FPADD16S

FPADD32

FPADD32S

FPSUB16

FPSUB16S

FPSUB32

FPSUB32S

fpadd16

fpadd16s

fpadd32

fpadd32s

fpsub16

fpsub16s

fpsub32

fpsub32s

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

four 16-bit add 

two 16-bit add 

two 32-bit add 

one 32-bit add 

four 16-bit subtract 

two 16-bit subtract 

two 32-bit subtract 

one 32-bit subtract 
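A partitioned operation treats each lane of the register independently, so a carry or borrow never crosses a lane boundary. The sketch below models this with plain integers; it assumes wraparound (modular) arithmetic within each lane, and the helper names are illustrative only.

```python
def partitioned(op, a: int, b: int, width: int, total: int = 64) -> int:
    """Apply op lane by lane to two registers of `total` bits, `width` bits per lane."""
    mask = (1 << width) - 1
    result = 0
    for shift in range(0, total, width):
        lane = op((a >> shift) & mask, (b >> shift) & mask) & mask
        result |= lane << shift
    return result


def fpadd16(a: int, b: int) -> int:
    """Four independent 16-bit adds on 64-bit operands."""
    return partitioned(lambda x, y: x + y, a, b, 16)


def fpadd16s(a: int, b: int) -> int:
    """Two independent 16-bit adds on 32-bit (single-precision) operands."""
    return partitioned(lambda x, y: x + y, a, b, 16, total=32)


def fpsub32(a: int, b: int) -> int:
    """Two independent 32-bit subtracts on 64-bit operands."""
    return partitioned(lambda x, y: x - y, a, b, 32)
```

In fpadd16(0x0001FFFF00020003, 0x0001000100020003), the lane holding 0xFFFF + 0x0001 wraps to 0x0000 without disturbing the lane above it.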

Pack instructions convert to a lower-precision fixed format or to a pixel format.

Table E–16

SPARC 

Mnemonic 

Argument List 

Description 

FPACK16

FPACK32

FPACKFIX

FEXPAND

FPMERGE

fpack16

fpack32

fpackfix

fexpand

fpmerge

fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs2, fregrd

fregrs2, fregrd

fregrs1, fregrs2, fregrd

four 16-bit packs 

two 32-bit packs 

four 16-bit packs 

four 16-bit expands 

two 32-bit merges 

Partitioned multiply instructions have the following variations.

Table E–17

SPARC 

Mnemonic 

Argument List 

Description 

FMUL8x16

FMUL8x16AU

FMUL8x16AL

FMUL8SUx16

FMUL8ULx16

FMULD8SUx16

FMULD8ULx16

fmul8x16

fmul8x16au

fmul8x16al

fmul8sux16

fmul8ulx16

fmuld8sux16

fmuld8ulx16

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

8x16-bit partition 

8x16-bit upper α partition

8x16-bit lower α partition

upper 8x16-bit partition 

lower unsigned 8x16-bit partition 

upper 8x16-bit partition 

lower unsigned 8x16-bit partition 

Alignment instructions have the following variations.

Table E–18

SPARC 

Mnemonic 

Argument List 

Description 

ALIGNADDRESS

ALIGNADDRESS_LITTLE

FALIGNDATA

alignaddr

alignaddrl

 

faligndata

regrs1, regrs2, regrd

regrs1, regrs2, regrd

 

fregrs1, fregrs2, fregrd

find misaligned data access address 

same as above, but little-endian 

 

do misaligned data, data alignment 

Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).

Table E–19

SPARC 

Mnemonic 

Argument List 

Description 

FZERO

FZEROS

FONE

FONES

FSRC1

fzero

fzeros

fone

fones

fsrc1

fregrd

fregrd

fregrd

fregrd

fregrs1, fregrd

zero fill 

zero fill, single precision 

one fill 

one fill, single precision 

copy src1 

FSRC1S

FSRC2

FSRC2S

FNOT1

FNOT1S

fsrc1s

fsrc2

fsrc2s

fnot1

fnot1s

fregrs1, fregrd

fregrs2, fregrd

fregrs2, fregrd

fregrs1, fregrd

fregrs1, fregrd

copy src1, single precision 

copy src2 

copy src2, single precision 

negate src1, 1's complement 

same as above, single precision 

FNOT2

FNOT2S

FOR

FORS

FNOR

fnot2

fnot2s

for

fors

fnor

fregrs2, fregrd

fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

negate src2, 1's complement 

same as above, single precision 

logical OR 

logical OR, single precision 

logical NOR 

FNORS

FAND

FANDS

FNAND

FNANDS

fnors

fand

fands

fnand

fnands

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

logical NOR, single precision 

logical AND 

logical AND, single precision 

logical NAND 

logical NAND, single precision 

FXOR

FXORS

FXNOR

FXNORS

FORNOT1

fxor

fxors

fxnor

fxnors

fornot1

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

logical XOR 

logical XOR, single precision 

logical XNOR 

logical XNOR, single precision 

negated src1 OR src2 

FORNOT1S

FORNOT2

FORNOT2S

FANDNOT1

fornot1s

fornot2

fornot2s

fandnot1

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

same as above, single precision 

src1 OR negated src2 

same as above, single precision 

negated src1 AND src2 

FANDNOT1S

FANDNOT2

FANDNOT2S

fandnot1s

fandnot2

fandnot2s

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

fregrs1, fregrs2, fregrd

 

same as above, single precision 

src1 AND negated src2 

same as above, single precision 
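The less obvious entries in the table above are the ones that negate a single source. The sketch below models a few of the 64-bit forms on plain integers, following the descriptions in the table (for example, fandnot1 is "negated src1 AND src2"). The Python functions are illustrative only.

```python
MASK64 = (1 << 64) - 1


def fnor(src1: int, src2: int) -> int:
    """Logical NOR."""
    return ~(src1 | src2) & MASK64


def fxnor(src1: int, src2: int) -> int:
    """Logical XNOR."""
    return ~(src1 ^ src2) & MASK64


def fornot1(src1: int, src2: int) -> int:
    """Negated src1 OR src2."""
    return (~src1 | src2) & MASK64


def fornot2(src1: int, src2: int) -> int:
    """src1 OR negated src2."""
    return (src1 | ~src2) & MASK64


def fandnot1(src1: int, src2: int) -> int:
    """Negated src1 AND src2."""
    return (~src1 & src2) & MASK64


def fandnot2(src1: int, src2: int) -> int:
    """src1 AND negated src2."""
    return (src1 & ~src2) & MASK64
```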

Pixel compare instructions compare fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).

Table E–20

SPARC 

Mnemonic 

Argument List 

Description 

FCMPGT16

FCMPGT32

FCMPLE16

FCMPLE32

fcmpgt16

fcmpgt32

fcmple16

fcmple32

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

4 16-bit compare, set rd if src1>src2 

2 32-bit compare, set rd if src1>src2 

4 16-bit compare, set rd if src1≤src2 

2 32-bit compare, set rd if src1≤src2 

FCMPNE16

FCMPNE32

FCMPEQ16

FCMPEQ32

fcmpne16

fcmpne32

fcmpeq16

fcmpeq32

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

fregrs1, fregrs2, regrd

4 16-bit compare, set rd if src1≠src2 

2 32-bit compare, set rd if src1≠src2 

4 16-bit compare, set rd if src1=src2 

2 32-bit compare, set rd if src1=src2 
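A pixel compare produces one result bit per lane in the integer destination register. The sketch below models this with signed lane comparisons. The bit ordering is an assumption: bit 0 of the result is taken to correspond to the least significant lane. The function names are illustrative only.

```python
def _signed(value: int, width: int) -> int:
    """Interpret a `width`-bit field as a two's-complement signed integer."""
    return value - (1 << width) if value >> (width - 1) else value


def pixel_compare(pred, src1: int, src2: int, width: int) -> int:
    """Compare 64-bit operands lane by lane; return one mask bit per lane."""
    mask = (1 << width) - 1
    result = 0
    for lane in range(64 // width):
        x = _signed((src1 >> (lane * width)) & mask, width)
        y = _signed((src2 >> (lane * width)) & mask, width)
        if pred(x, y):
            result |= 1 << lane  # assumed: bit 0 = least significant lane
    return result


def fcmpgt16(src1: int, src2: int) -> int:
    """Four 16-bit compares; a bit is set where src1 > src2."""
    return pixel_compare(lambda x, y: x > y, src1, src2, 16)


def fcmpeq32(src1: int, src2: int) -> int:
    """Two 32-bit compares; a bit is set where src1 = src2."""
    return pixel_compare(lambda x, y: x == y, src1, src2, 32)
```

A mask of this kind is the sort of value a partial store takes as its regmask operand.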

Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.

Table E–21

SPARC 

Mnemonic 

Argument List 

Description 

EDGE8

EDGE8L

EDGE16

edge8

edge8l

edge16

regrs1, regrs2, regrd

regrs1, regrs2, regrd

regrs1, regrs2, regrd

8 8-bit edge boundary processing 

same as above, little-endian 

4 16-bit edge boundary processing 

EDGE16L

EDGE32

EDGE32L

edge16l

edge32

edge32l

regrs1, regrs2, regrd

regrs1, regrs2, regrd

regrs1, regrs2, regrd

same as above, little-endian 

2 32-bit edge boundary processing 

same as above, little-endian 

Pixel component distance instructions are used for motion estimation in video compression algorithms.

Table E–22

SPARC 

Mnemonic 

Argument List 

Description 

PDIST

pdist

fregrs1, fregrs2, fregrd

8 8-bit components, distance between 
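The distance computed here is a sum of absolute differences over the eight unsigned 8-bit components. The sketch below models it on plain integers. It assumes that the sum is accumulated into the previous contents of the destination register, which is how a motion-estimation loop would use it; the acc parameter stands in for that previous value and is part of this illustration only.

```python
def pdist(src1: int, src2: int, acc: int = 0) -> int:
    """Add the sum of absolute differences of eight unsigned bytes to acc."""
    total = acc
    for shift in range(0, 64, 8):
        a = (src1 >> shift) & 0xFF
        b = (src2 >> shift) & 0xFF
        total += abs(a - b)
    return total & ((1 << 64) - 1)
```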

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E–23

SPARC 

Mnemonic 

Argument List 

Description 

ARRAY8

 

ARRAY16

ARRAY32

array8

 

array16

array32

regrs1, regrs2, regrd

 

regrs1, regrs2, regrd

regrs1, regrs2, regrd

convert 8-bit 3-D address to blocked byte address 

same as above, but 16-bit 

same as above, but 32-bit 

E.6.7 Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.

Table E–24

SPARC 

imm_asi 

Argument List 

Description 

 

STDFA

STDFA

STDFA

STDFA

 

ASI_PST8_P

ASI_PST8_S

ASI_PST8_PL

ASI_PST8_SL

 

stda fregrd, [fregrs1] regmask, imm_asi

eight 8-bit conditional stores to: 

primary address space 

secondary address space 

primary address space, little endian 

secondary address space, little endian 

 

STDFA

STDFA

STDFA

STDFA

 

ASI_PST16_P

ASI_PST16_S

ASI_PST16_PL

ASI_PST16_SL

 

four 16-bit conditional stores to: 

primary address space 

secondary address space 

primary address space, little endian 

secondary address space, little endian 

 

STDFA

STDFA

STDFA

STDFA

 

ASI_PST32_P

ASI_PST32_S

ASI_PST32_PL

ASI_PST32_SL

 

two 32-bit conditional stores to: 

primary address space 

secondary address space 

primary address space, little endian 

secondary address space, little endian 


Note –

To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.


Table E–25

SPARC 

imm_asi 

Argument List 

Description 

 

LDDFA

STDFA

 

ASI_FL8_P

 

ldda [reg_addr] imm_asi, fregrd

stda fregrd, [reg_addr] imm_asi

8-bit load/store from/to: 

primary address space  

LDDFA

STDFA

ASI_FL8_S

ldda [reg_plus_imm] %asi, fregrd

stda fregrd, [reg_plus_imm] %asi

secondary address space 

LDDFA

STDFA

ASI_FL8_PL

 

primary address space, little endian 

LDDFA

STDFA

ASI_FL8_SL

 

secondary address space, little endian 

 

LDDFA

STDFA

 

ASI_FL16_P

 

16-bit load/store from/to: 

primary address space 

LDDFA

STDFA

ASI_FL16_S

 

secondary address space 

LDDFA

STDFA

ASI_FL16_PL

 

primary address space, little endian 

LDDFA

STDFA

ASI_FL16_SL

 

secondary address space, little endian 


Note –

To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.


Table E–26

SPARC 

imm_asi 

Argument List 

Description 

LDDA

LDDA

ASI_NUCLEUS_QUAD_LDD

ASI_NUCLEUS_QUAD_LDD_L

[reg_addr] imm_asi, regrd

[reg_plus_imm] %asi, regrd

128-bit atomic load 

128-bit atomic load, little endian 

 

LDDFA

STDFA

 

ASI_BLK_AIUP

 

ldda [reg_addr] imm_asi, fregrd

stda fregrd, [reg_addr] imm_asi

64-byte block load/store from/to: 

primary address space, user privilege  

LDDFA

STDFA

ASI_BLK_AIUS

ldda [reg_plus_imm] %asi, fregrd

stda fregrd, [reg_plus_imm] %asi

secondary address space, user privilege

LDDFA

STDFA

ASI_BLK_AIUPL

 

primary address space, user privilege, little endian 

LDDFA

STDFA

ASI_BLK_AIUSL

 

secondary address space, user privilege, little endian

LDDFA

STDFA

ASI_BLK_P

 

primary address space 

LDDFA

STDFA

ASI_BLK_S

 

secondary address space 

LDDFA

STDFA

ASI_BLK_PL

 

primary address space, little endian 

LDDFA

STDFA

ASI_BLK_SL

 

secondary address space, little endian 

LDDFA

STDFA

ASI_BLK_COMMIT_P

 

64-byte block commit store to primary address space 

LDDFA

STDFA

ASI_BLK_COMMIT_S

 

64-byte block commit store to secondary address space 


Note –

To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.