SPARC Assembly Language Reference Manual


Appendix E SPARC-V9 Instruction Set

This appendix describes the changes made to the SPARC instruction set by the SPARC-V9 architecture. Application software for the 32-bit SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.

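This appendix is organized into the following sections:

E.1 SPARC-V9 Changes
E.2 SPARC-V9 Instruction Set Changes
E.3 SPARC-V9 Instruction Set Mapping
E.4 SPARC-V9 Floating-Point Instruction Set Mapping
E.5 SPARC-V9 Synthetic Instruction-Set Mapping
E.6 UltraSPARC and VIS Instruction Set Extensions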

E.1 SPARC-V9 Changes

The SPARC-V9 architecture differs from the SPARC-V8 architecture in the following areas, each described below: registers, alternate space access, byte order, and instruction set.

E.1.1 Registers

These registers have been deleted:

Table E–1


PSR	Processor State Register
TBR	Trap Base Register
WIM	Window Invalid Mask

These registers have been widened from 32 to 64 bits:

Table E–2


Integer registers
All state registers	FSR, PC, nPC, and Y

Note –

FSR (Floating-Point State Register): the fcc1, fcc2, and fcc3 bits (additional floating-point condition codes) are added, and the register is widened to 64 bits.

These SPARC-V9 registers were fields within a register in SPARC-V8:

Table E–3


CCR	Condition Codes Register
CWP	Current Window Pointer
PIL	Processor Interrupt Level
TBA	Trap Base Address
TT[MAXTL]	Trap Type
VER	Version

These registers have been added:

Table E–4


ASI	Address Space Identifier
CANRESTORE	Restorable Windows
CANSAVE	Savable windows
CLEANWIN	Clean Windows
FPRS	Floating-point Register State
OTHERWIN	Other Windows
PSTATE	Processor State
TICK	Hardware clock tick-counter
TL	Trap Level
TNPC[MAXTL]	Trap Next Program Counter
TPC[MAXTL]	Trap Program Counter
TSTATE[MAXTL]	Trap State
WSTATE	Windows State

Also, there are sixteen additional double-precision floating-point registers, f[32] .. f[62]. These registers overlap (and are aliased with) eight additional quad-precision floating-point registers, f[32] .. f[60].
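The overlap between the added double and quad registers can be sketched in Python. This is an illustrative model only: the helper name `doubles_aliased_by_quad` is hypothetical, and it assumes the usual numbering in which doubles use even register numbers and quads use multiples of four.

```python
# Register numbers of the added registers (assumed numbering: doubles are
# even-numbered, quads are multiples of four).
ADDED_DOUBLES = list(range(32, 64, 2))   # f[32], f[34], ..., f[62]
ADDED_QUADS = list(range(32, 64, 4))     # f[32], f[36], ..., f[60]


def doubles_aliased_by_quad(q):
    """Return the two double registers that share storage with quad f[q]."""
    if q not in ADDED_QUADS:
        raise ValueError("expected one of f[32], f[36], ..., f[60]")
    # A quad is 128 bits wide, so it covers two consecutive 64-bit doubles.
    return (q, q + 2)
```

For example, under this model quad f[60] covers doubles f[60] and f[62], which is why the quad range stops at f[60].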

The SPARC-V9 CWP register is decremented during a RESTORE instruction and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC-V8. This change has no effect on nonprivileged instructions.

E.1.2 Alternate Space Access

Load- and store-alternate instructions to one-half of the alternate spaces can now be included in user code. In SPARC-V9, loads and stores to ASIs 00₁₆ .. 7F₁₆ are privileged; those to ASIs 80₁₆ .. FF₁₆ are nonprivileged. In SPARC-V8, all access to alternate address spaces is privileged.
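The privilege rule above reduces to a range check on the 8-bit ASI value. A minimal Python sketch follows; the function name and the `arch` parameter are illustrative and are not part of any assembler or toolchain.

```python
def asi_is_privileged(asi, arch="v9"):
    """Return True if a load/store-alternate using this ASI needs privilege."""
    if not 0x00 <= asi <= 0xFF:
        raise ValueError("an ASI is an 8-bit value")
    if arch == "v8":
        return True        # SPARC-V8: every alternate-space access is privileged
    return asi <= 0x7F     # SPARC-V9: 0x00..0x7F privileged, 0x80..0xFF nonprivileged
```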

E.1.3 Byte Order

SPARC-V9 supports both little- and big-endian byte orders for data accesses only; instruction accesses are always performed using big-endian byte order. In SPARC-V8, all data and instruction accesses are performed in big-endian byte order.
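As a reminder of what the two data byte orders mean, the following Python sketch lays out the same 32-bit value both ways using the standard `struct` module. It illustrates byte ordering in general and does not model how a particular SPARC-V9 implementation selects the order.

```python
import struct

word = 0x11223344

# Big-endian: most significant byte at the lowest address.
big = struct.pack(">I", word)
# Little-endian: least significant byte at the lowest address.
little = struct.pack("<I", word)

# Reading big-endian bytes with the wrong order yields a byte-swapped value.
misread = struct.unpack("<I", big)[0]
```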

E.2 SPARC-V9 Instruction Set Changes

Application software written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.

E.2.1 Extended Instruction Definitions to Support the 64-bit Model

Table E–5


FCMP, FCMPE	Floating-Point Compare; can set any of the four floating-point condition codes
LDFSR, STFSR	Load/Store FSR; affect only the low-order 32 bits of FSR
LDUW, LDUWA	Same as LD, LDA in SPARC-V8
RDASR/WRASR	Read/Write State Registers - access additional registers
SAVE/RESTORE
SETHI
SRA, SRL, SLL, Shifts	Split into 32-bit and 64-bit versions
Tcc	(was Ticc) Operates with either the 32-bit integer condition codes (icc), or the 64-bit integer condition codes (xcc)

All other arithmetic operations operate on 64-bit operands and produce 64-bit results.
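The difference between the 32-bit and 64-bit right shifts can be modeled in Python. This is a sketch under stated assumptions: registers are 64 bits wide, the 32-bit forms use a 5-bit shift count and the 64-bit forms (`srlx`, `srax`) a 6-bit count, and the function names simply mirror the mnemonics.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def srl(x, count):
    """32-bit logical shift: shifts the low word and zeroes the upper 32 bits."""
    return (x & MASK32) >> (count & 31)


def srlx(x, count):
    """64-bit logical shift right."""
    return (x & MASK64) >> (count & 63)


def sra(x, count):
    """32-bit arithmetic shift: sign comes from bit 31, result fills 64 bits."""
    low = x & MASK32
    if low & 0x80000000:
        low -= 1 << 32          # reinterpret the low word as signed
    return (low >> (count & 31)) & MASK64


def srax(x, count):
    """64-bit arithmetic shift right."""
    x &= MASK64
    if x >> 63:
        x -= 1 << 64            # reinterpret the register as signed
    return (x >> (count & 63)) & MASK64
```

With a shift count of zero, `srl` clears the upper word and `sra` sign-extends the lower word, which is the behavior the `clruw` and `signx` synthetic instructions rely on.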

E.2.2 Added Instructions to Support 64 bits

Table E–6


F[sdq]TOx	Convert floating point to 64-bit word
FxTO[sdq]	Convert 64-bit word to floating point
FMOV[dq]	Floating-Point Move, double and quad
FNEG[dq]	Floating-point Negate, double and quad
FABS[dq]	Floating-point Absolute Value, double and quad
LDDFA, STDFA, LDFA, STFA	Alternate address space forms of LDDF, STDF, LDF, and STF
LDSW	Load a signed word
LDSWA	Load a signed word from an alternate space
LDX	Load an extended word
LDXA	Load an extended word from an alternate space
LDXFSR	Load all 64 bits of the FSR register
STX	Store an extended word
STXA	Store an extended word into an alternate space
STXFSR	Store all 64 bits of the FSR register

E.2.3 Added Instructions to Support High-Performance System Implementation

Table E–7


BPcc	Branch on integer condition code with prediction
BPr	Branch on integer register contents with prediction
CASA, CASXA	Compare and Swap from an alternate space
FBPfcc	Branch on floating-point condition code with prediction
FLUSHW	Flush windows
FMOVcc	Move floating-point register if condition code is satisfied
FMOVr	Move floating-point register if integer register satisfies condition
LDQF(A), STQF(A)	Load/Store Quad Floating-point (in an alternate space)
MOVcc	Move integer register if condition code is satisfied
MOVr	Move integer register if register contents satisfy condition
MULX	Generic 64-bit multiply
POPC	Population count
PREFETCH, PREFETCHA	Prefetch Data
SDIVX, UDIVX	Signed and Unsigned 64-bit divide
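The arithmetic of a few of these instructions can be modeled on 64-bit values in Python. This is a sketch, not a definition: it assumes that signed division truncates toward zero, and it lets Python raise `ZeroDivisionError` where the hardware would trap.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x):
    """Reinterpret a 64-bit pattern as a signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def popc(x):
    """Population count: number of one bits in the 64-bit operand."""
    return bin(x & MASK64).count("1")


def mulx(a, b):
    """64-bit multiply; the low 64 bits are the same for signed and unsigned."""
    return (a * b) & MASK64


def udivx(a, b):
    """Unsigned 64-bit divide."""
    return (a & MASK64) // (b & MASK64)


def sdivx(a, b):
    """Signed 64-bit divide (assumed to truncate toward zero)."""
    sa, sb = to_signed64(a), to_signed64(b)
    quotient = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        quotient = -quotient
    return quotient & MASK64
```

A single `mulx` serves both signed and unsigned operands because only the low 64 bits of the product are kept, whereas division needs the separate `sdivx` and `udivx` forms.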

E.2.4 Deleted Instructions

Table E–8


Coprocessor loads and stores
RDTBR and WRTBR	TBR no longer exists. It is replaced by TBA, which can be read/written with RDPR/WRPR instructions
RDWIM and WRWIM	WIM no longer exists. WIM has been replaced by several register-window registers
RDPSR and WRPSR	PSR no longer exists. It has been replaced by several separate registers that are read/written with other instructions
RETT	Return from trap (replaced by DONE/RETRY)
STDFQ	Store Double from Floating-point Queue (replaced by the RDPR FQ instruction)

E.2.5 Miscellaneous Instruction Changes

Table E–9


IMPDEPn	(Changed) Implementation-dependent instructions (replace SPARC-V8 CPop instructions)
MEMBAR	(Added) Memory barrier (memory synchronization support)

E.3 SPARC-V9 Instruction Set Mapping

Table E–10


Opcode	Mnemonic	Argument List	Operation	Comments
`BPA`	`ba{,a}` `{,pt|,pn}`	%icc or %xcc, label	(Branch on cc with prediction) Branch always	1
`BPN`	`bn{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch never	0
`BPNE`	`bne{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on not equal	not Z
`BPE`	`be{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on equal	Z
`BPG`	`bg{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on greater	not (Z or (N xor V))
`BPLE`	`ble{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on less or equal	Z or (N xor V)
`BPGE`	`bge{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on greater or equal	not (N xor V)
`BPL`	`bl{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on less	N xor V
`BPGU`	`bgu{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on greater unsigned	not (C or Z)
`BPLEU`	`bleu{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on less or equal unsigned	C or Z
`BPCC`	`bcc{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on carry clear (greater than or equal, unsigned)	not C
`BPCS`	`bcs{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on carry set (less than, unsigned)	C
`BPPOS`	`bpos{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on positive	not N
`BPNEG`	`bneg{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on negative	N
`BPVC`	`bvc{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on overflow clear	not V
`BPVS`	`bvs{,a}` `{,pt|,pn}`	%icc or %xcc, label	Branch on overflow set	V
`BRZ`	`brz{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register zero	Z
`BRLEZ`	`brlez{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register less than or equal to zero	N or Z
`BRLZ`	`brlz{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register less than zero	N
`BRNZ`	`brnz{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register not zero	not Z
`BRGZ`	`brgz{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register greater than zero	not (N or Z)
`BRGEZ`	`brgez{,a}` `{,pt|,pn}`	reg_rs1, label	Branch on register greater than or equal to zero	not N
`CASA`	`casa`	[reg_rs1]imm_asi, reg_rs2, reg_rd	Compare and swap word from alternate space
	`casa`	[reg_rs1]%asi, reg_rs2, reg_rd	
`CASXA`	`casxa`	[reg_rs1]imm_asi, reg_rs2, reg_rd	Compare and swap extended from alternate space
	`casxa`	[reg_rs1]%asi, reg_rs2, reg_rd	
`FBPA`	`fba{,a}` `{,pt|,pn}`	`%fccn`, `label`	(Branch on cc with prediction) Branch always	1
`FBPN`	`fbn{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch never	0
`FBPU`	`fbu{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered	U
`FBPG`	`fbg{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on greater	G
`FBPUG`	`fbug{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered or greater	G or U
`FBPL`	`fbl{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on less	L
`FBPUL`	`fbul{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered or less	L or U
`FBPLG`	`fblg{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on less or greater	L or G
`FBPNE`	`fbne{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on not equal	L or G or U
`FBPE`	`fbe{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on equal	E
`FBPUE`	`fbue{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered or equal	E or U
`FBPGE`	`fbge{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on greater or equal	E or G
`FBPUGE`	`fbuge{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered or greater or equal	E or G or U
`FBPLE`	`fble{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on less or equal	E or L
`FBPULE`	`fbule{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on unordered or less or equal	E or L or U
`FBPO`	`fbo{,a}` `{,pt|,pn}`	`%fccn`, `label`	Branch on ordered	E or L or G
`FLUSHW`	`flushw`		Flush register windows
`FMOVA`	`fmov` `{s,d,q}a`	%icc or %xcc, `freg_rs2`, `freg_rd`	(Move on integer cc) Move always	1
`FMOVN`	`fmov` `{s,d,q}n`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move never	0
`FMOVNE`	`fmov` `{s,d,q}ne`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if not equal	not Z
`FMOVE`	`fmov` `{s,d,q}e`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if equal	Z
`FMOVG`	`fmov` `{s,d,q}g`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if greater	not (Z or (N xor V))
`FMOVLE`	`fmov` `{s,d,q}le`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if less or equal	Z or (N xor V)
`FMOVGE`	`fmov` `{s,d,q}ge`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if greater or equal	not (N xor V)
`FMOVL`	`fmov` `{s,d,q}l`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if less	N xor V
`FMOVGU`	`fmov` `{s,d,q}gu`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if greater unsigned	not (C or Z)
`FMOVLEU`	`fmov` `{s,d,q}leu`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if less or equal unsigned	C or Z
`FMOVCC`	`fmov` `{s,d,q}cc`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if carry clear (greater or equal, unsigned)	not C
`FMOVCS`	`fmov` `{s,d,q}cs`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if carry set (less than, unsigned)	C
`FMOVPOS`	`fmov` `{s,d,q}pos`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if positive	not N
`FMOVNEG`	`fmov` `{s,d,q}neg`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if negative	N
`FMOVVC`	`fmov` `{s,d,q}vc`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if overflow clear	not V
`FMOVVS`	`fmov` `{s,d,q}vs`	%icc or %xcc, `freg_rs2`, `freg_rd`	Move if overflow set	V
`FMOVRZ`	`fmovr` `{s,d,q}e`	`reg_rs1`, `freg_rs2`, `freg_rd`	(Move f-p register on register cc) Move if register zero
`FMOVRLEZ`	`fmovr` `{s,d,q}lez`	`reg_rs1`, `freg_rs2`, `freg_rd`	Move if register less than or equal to zero
`FMOVRLZ`	`fmovr` `{s,d,q}lz`	`reg_rs1`, `freg_rs2`, `freg_rd`	Move if register less than zero
`FMOVRNZ`	`fmovr` `{s,d,q}ne`	`reg_rs1`, `freg_rs2`, `freg_rd`	Move if register not zero
`FMOVRGZ`	`fmovr` `{s,d,q}gz`	`reg_rs1`, `freg_rs2`, `freg_rd`	Move if register greater than zero
`FMOVRGEZ`	`fmovr` `{s,d,q}gez`	`reg_rs1`, `freg_rs2`, `freg_rd`	Move if register greater than or equal to zero
`FMOVFA`	`fmov{s,d,q}a`	`%fccn`, `freg_rs2`, `freg_rd`	(Move on floating-point cc) Move always	1
`FMOVFN`	`fmov{s,d,q}n`	`%fccn`, `freg_rs2`, `freg_rd`	Move never	0
`FMOVFU`	`fmov{s,d,q}u`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered	U
`FMOVFG`	`fmov{s,d,q}g`	`%fccn`, `freg_rs2`, `freg_rd`	Move if greater	G
`FMOVFUG`	`fmov{s,d,q}ug`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered or greater	G or U
`FMOVFL`	`fmov{s,d,q}l`	`%fccn`, `freg_rs2`, `freg_rd`	Move if less	L
`FMOVFUL`	`fmov{s,d,q}ul`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered or less	L or U
`FMOVFLG`	`fmov{s,d,q}lg`	`%fccn`, `freg_rs2`, `freg_rd`	Move if less or greater	L or G
`FMOVFNE`	`fmov{s,d,q}ne`	`%fccn`, `freg_rs2`, `freg_rd`	Move if not equal	L or G or U
`FMOVFE`	`fmov{s,d,q}e`	`%fccn`, `freg_rs2`, `freg_rd`	Move if equal	E
`FMOVFUE`	`fmov{s,d,q}ue`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered or equal	E or U
`FMOVFGE`	`fmov{s,d,q}ge`	`%fccn`, `freg_rs2`, `freg_rd`	Move if greater or equal	E or G
`FMOVFUGE`	`fmov{s,d,q}uge`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered or greater or equal	E or G or U
`FMOVFLE`	`fmov{s,d,q}le`	`%fccn`, `freg_rs2`, `freg_rd`	Move if less or equal	E or L
`FMOVFULE`	`fmov{s,d,q}ule`	`%fccn`, `freg_rs2`, `freg_rd`	Move if unordered or less or equal	E or L or U
`FMOVFO`	`fmov{s,d,q}o`	`%fccn`, `freg_rs2`, `freg_rd`	Move if ordered	E or L or G
`LDSW`	`ldsw`	[address], reg_rd	Load a signed word
`LDSWA`	`ldswa`	[regaddr] imm_asi, reg_rd	Load signed word from alternate space
`LDX`	`ldx`	[address], reg_rd	Load extended word
`LDXA`	`ldxa`	[regaddr] imm_asi, reg_rd	Load extended word from alternate space
	`ldxa`	[reg_plus_imm] %asi, reg_rd	
`LDXFSR`	`ldx`	[address], %fsr	Load floating-point state register
`MEMBAR`	`membar`	membar_mask	Memory barrier
`MOVA`	`mova`	%icc or %xcc, reg_or_imm11, reg_rd	(Move integer register on cc) Move always	1
`MOVN`	`movn`	%icc or %xcc, reg_or_imm11, reg_rd	Move never	0
`MOVNE`	`movne`	%icc or %xcc, reg_or_imm11, reg_rd	Move if not equal	not Z
`MOVE`	`move`	%icc or %xcc, reg_or_imm11, reg_rd	Move if equal	Z
`MOVG`	`movg`	%icc or %xcc, reg_or_imm11, reg_rd	Move if greater	not (Z or (N xor V))
`MOVLE`	`movle`	%icc or %xcc, reg_or_imm11, reg_rd	Move if less or equal	Z or (N xor V)
`MOVGE`	`movge`	%icc or %xcc, reg_or_imm11, reg_rd	Move if greater or equal	not (N xor V)
`MOVL`	`movl`	%icc or %xcc, reg_or_imm11, reg_rd	Move if less	N xor V
`MOVGU`	`movgu`	%icc or %xcc, reg_or_imm11, reg_rd	Move if greater unsigned	not (C or Z)
`MOVLEU`	`movleu`	%icc or %xcc, reg_or_imm11, reg_rd	Move if less or equal unsigned	C or Z
`MOVCC`	`movcc`	%icc or %xcc, reg_or_imm11, reg_rd	Move if carry clear (greater or equal, unsigned)	not C
`MOVCS`	`movcs`	%icc or %xcc, reg_or_imm11, reg_rd	Move if carry set (less than, unsigned)	C
`MOVPOS`	`movpos`	%icc or %xcc, reg_or_imm11, reg_rd	Move if positive	not N
`MOVNEG`	`movneg`	%icc or %xcc, reg_or_imm11, reg_rd	Move if negative	N
`MOVVC`	`movvc`	%icc or %xcc, reg_or_imm11, reg_rd	Move if overflow clear	not V
`MOVVS`	`movvs`	%icc or %xcc, reg_or_imm11, reg_rd	Move if overflow set	V
`MOVFA`	`mova`	`%fccn`, reg_or_imm11, reg_rd	(Move on floating-point cc) Move always	1
`MOVFN`	`movn`	`%fccn`, reg_or_imm11, reg_rd	Move never	0
`MOVFU`	`movu`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered	U
`MOVFG`	`movg`	`%fccn`, reg_or_imm11, reg_rd	Move if greater	G
`MOVFUG`	`movug`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered or greater	G or U
`MOVFL`	`movl`	`%fccn`, reg_or_imm11, reg_rd	Move if less	L
`MOVFUL`	`movul`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered or less	L or U
`MOVFLG`	`movlg`	`%fccn`, reg_or_imm11, reg_rd	Move if less or greater	L or G
`MOVFNE`	`movne`	`%fccn`, reg_or_imm11, reg_rd	Move if not equal	L or G or U
`MOVFE`	`move`	`%fccn`, reg_or_imm11, reg_rd	Move if equal	E
`MOVFUE`	`movue`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered or equal	E or U
`MOVFGE`	`movge`	`%fccn`, reg_or_imm11, reg_rd	Move if greater or equal	E or G
`MOVFUGE`	`movuge`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered or greater or equal	E or G or U
`MOVFLE`	`movle`	`%fccn`, reg_or_imm11, reg_rd	Move if less or equal	E or L
`MOVFULE`	`movule`	`%fccn`, reg_or_imm11, reg_rd	Move if unordered or less or equal	E or L or U
`MOVFO`	`movo`	`%fccn`, reg_or_imm11, reg_rd	Move if ordered	E or L or G
`MOVRZ`	`movre`	reg_rs1, reg_or_imm10, reg_rd	(Move register on register cc) Move if register zero	Z
`MOVRLEZ`	`movrlez`	reg_rs1, reg_or_imm10, reg_rd	Move if register less than or equal to zero	N or Z
`MOVRLZ`	`movrlz`	reg_rs1, reg_or_imm10, reg_rd	Move if register less than zero	N
`MOVRNZ`	`movrnz`	reg_rs1, reg_or_imm10, reg_rd	Move if register not zero	not Z
`MOVRGZ`	`movrgz`	reg_rs1, reg_or_imm10, reg_rd	Move if register greater than zero	not (N or Z)
`MOVRGEZ`	`movrgez`	reg_rs1, reg_or_imm10, reg_rd	Move if register greater than or equal to zero	not N
`MULX`	`mulx`	reg_rs1, reg_or_imm,reg_rd	(Generic 64-bit Multiply) Multiply (signed or unsigned)	See SDIVX and UDIVX
`POPC`	`popc`	reg_or_imm, reg_rd	Population count
`PREFETCH`	`prefetch`	[address], prefetch_fcn	Prefetch data	See The SPARC Architecture Manual, Version 9
`PREFETCHA`	`prefetcha`	[regaddr] imm_asi, prefetch_fcn	Prefetch data from alternate space	
	`prefetcha`	[reg_plus_imm] `%asi`, prefetch_fcn		
`SDIVX`	`sdivx`	reg_rs1, reg_or_imm,reg_rd	(64-bit signed divide) Signed Divide	See MULX and UDIVX
`STX`	`stx`	reg_rd, [address]	Store extended word
`STXA`	`stxa`	reg_rd, [regaddr] imm_asi	Store extended word into alternate space
	`stxa`	reg_rd, [reg_plus_imm] %asi	
`STXFSR`	`stx`	%fsr, [address]	Store floating-point state register (all 64 bits)
`UDIVX`	`udivx`	reg_rs1, reg_or_imm, reg_rd	(64-bit unsigned divide) Unsigned divide	See MULX and SDIVX
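The boolean expressions in the Comments column for the integer condition codes can be evaluated directly from the N, Z, V, and C flags. The Python sketch below is a model for experimentation: it assumes the flags come from a subtract-and-compare of two operands, and the `bits` parameter selects between `%icc` (32) and `%xcc` (64). All names are illustrative.

```python
# One entry per condition suffix, written exactly as in the Comments column.
ICC_CONDITIONS = {
    "a":   lambda n, z, v, c: True,
    "n":   lambda n, z, v, c: False,
    "ne":  lambda n, z, v, c: not z,
    "e":   lambda n, z, v, c: z,
    "g":   lambda n, z, v, c: not (z or (n != v)),
    "le":  lambda n, z, v, c: z or (n != v),
    "ge":  lambda n, z, v, c: not (n != v),
    "l":   lambda n, z, v, c: n != v,
    "gu":  lambda n, z, v, c: not (c or z),
    "leu": lambda n, z, v, c: c or z,
    "cc":  lambda n, z, v, c: not c,
    "cs":  lambda n, z, v, c: c,
    "pos": lambda n, z, v, c: not n,
    "neg": lambda n, z, v, c: n,
    "vc":  lambda n, z, v, c: not v,
    "vs":  lambda n, z, v, c: v,
}


def flags_after_subtract(a, b, bits=64):
    """Return (N, Z, V, C) for a - b, computed on `bits`-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    n = bool(result & sign)
    z = result == 0
    v = bool((a ^ b) & (a ^ result) & sign)   # signed overflow
    c = a < b                                 # borrow out of the top bit
    return n, z, v, c


def condition_holds(cond, a, b, bits=64):
    """True if condition `cond` is satisfied after comparing a with b."""
    return bool(ICC_CONDITIONS[cond](*flags_after_subtract(a, b, bits)))
```

The same operands can satisfy a condition under one width and not the other, which is why the branch and move instructions name `%icc` or `%xcc` explicitly.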

E.4 SPARC-V9 Floating-Point Instruction Set Mapping

SPARC-V9 floating-point instructions are shown in the following table.

Table E–11


SPARC	Mnemonic (operand types: `i` 32-bit integer, `x` 64-bit integer, `s` single, `d` double, `q` quad)	Argument List	Description
`F[sdq]TOx`	`fstox`	freg_rs2, freg_rd	Convert floating point to 64-bit integer
	`fdtox`	freg_rs2, freg_rd	
	`fqtox`	freg_rs2, freg_rd	
	`fstoi`	freg_rs2, freg_rd	Convert floating point to 32-bit integer
	`fdtoi`	freg_rs2, freg_rd	
	`fqtoi`	freg_rs2, freg_rd	
`FxTO[sdq]`	`fxtos`	freg_rs2, freg_rd	Convert 64-bit integer to floating point
	`fxtod`	freg_rs2, freg_rd	
	`fxtoq`	freg_rs2, freg_rd	
	`fitos`	freg_rs2, freg_rd	Convert 32-bit integer to floating point
	`fitod`	freg_rs2, freg_rd	
	`fitoq`	freg_rs2, freg_rd	
`FMOV[dq]`	`fmovd`	freg_rs2, freg_rd	Move double
	`fmovq`	freg_rs2, freg_rd	Move quad
`FNEG[dq]`	`fnegd`	freg_rs2, freg_rd	Negate double
	`fnegq`	freg_rs2, freg_rd	Negate quad
`FABS[dq]`	`fabsd`	freg_rs2, freg_rd	Absolute value double
	`fabsq`	freg_rs2, freg_rd	Absolute value quad
`LDFA`	`lda`	[regaddr] imm_asi, freg_rd	Load floating-point register from alternate space
	`lda`	[reg_plus_imm] %asi, freg_rd	
`LDDFA`	`ldda`	[regaddr] imm_asi, freg_rd	Load double floating-point register from alternate space
	`ldda`	[reg_plus_imm] %asi, freg_rd	
`LDQFA`	`ldqa`	[regaddr] imm_asi, freg_rd	Load quad floating-point register from alternate space
	`ldqa`	[reg_plus_imm] %asi, freg_rd	
`STFA`	`sta`	freg_rd, [regaddr] imm_asi	Store floating-point register to alternate space
	`sta`	freg_rd, [reg_plus_imm] %asi	
`STDFA`	`stda`	freg_rd, [regaddr] imm_asi	Store double floating-point register to alternate space
	`stda`	freg_rd, [reg_plus_imm] %asi	
`STQFA`	`stqa`	freg_rd, [regaddr] imm_asi	Store quad floating-point register to alternate space
	`stqa`	freg_rd, [reg_plus_imm] %asi	

E.5 SPARC-V9 Synthetic Instruction-Set Mapping

The following table maps synthetic instructions to their hardware equivalents.

Table E–12


Synthetic Instruction		Hardware Equivalent(s)		Comment
`cas`	[reg_rs1], reg_rs2, reg_rd	`casa`	[reg_rs1]ASI_P, reg_rs2, reg_rd	Compare and swap (cas)
`casl`	[reg_rs1], reg_rs2, reg_rd	`casa`	[reg_rs1]ASI_P_L, reg_rs2, reg_rd	cas little-endian
`casx`	[reg_rs1], reg_rs2, reg_rd	`casxa`	[reg_rs1]ASI_P, reg_rs2, reg_rd	cas extended
`casxl`	[reg_rs1], reg_rs2, reg_rd	`casxa`	[reg_rs1]ASI_P_L, reg_rs2, reg_rd	cas little-endian, extended
`clrx`	[address]	`stx`	`%g0`, [address]	Clear extended word
`clruw`	reg_rs1, reg_rd	`srl`	reg_rs1, %g0, reg_rd	Copy and clear upper word
`clruw`	reg_rd	`srl`	reg_rd, %g0, reg_rd	Clear upper word
`iprefetch`	label	`bn,pt`	%xcc, label	Instruction prefetch
`mov`	%y, reg_rd	`rd`	%y, reg_rd	
`mov`	%asrn, reg_rd	`rd`	%asrn, reg_rd	
`mov`	reg_or_imm, %asrn	`wr`	%g0, reg_or_imm, %asrn	
`ret`		`jmpl`	%i7+8, %g0	Return from subroutine
`retl`		`jmpl`	%o7+8, %g0	Return from leaf subroutine
`setn`	value, r1, r2	`setx`	value, r1, r2	For `-xarch=v9`
		`set`	value, r2	For `-xarch=v8`
`setnhi`	value, r1, r2	`setxhi`	value, r1, r2	For `-xarch=v9`
		`sethi`	value, r2	For `-xarch=v8`
`setuw`	value, reg_rd	`sethi`	`%hi`(value), reg_rd	When (value & 3FF₁₆) == 0
		`or`	`%g0`, value, reg_rd	When 0 ≤ value ≤ 4095
		`sethi`	`%hi`(value), reg_rd	Otherwise (two instructions). Do not use setuw in a DCTI delay slot.
		`or`	reg_rd, `%lo`(value), reg_rd	
`setsw`	value, reg_rd	`sethi`	`%hi`(value), reg_rd	When value >= 0 and (value & 3FF₁₆) == 0
		`or`	`%g0`, value, reg_rd	When -4096 ≤ value ≤ 4095
		`sethi`	`%hi`(value), reg_rd	When value < 0 and (value & 3FF₁₆) == 0 (two instructions)
		`sra`	reg_rd, `%g0`, reg_rd	
		`sethi`	`%hi`(value), reg_rd	Otherwise, if value >= 0 (two instructions)
		`or`	reg_rd, `%lo`(value), reg_rd	
		`sethi`	`%hi`(value), reg_rd	Otherwise, if value < 0 (three instructions). Do not use setsw in a CTI delay slot.
		`or`	reg_rd, `%lo`(value), reg_rd	
		`sra`	reg_rd, `%g0`, reg_rd	
`setx`	value, r1, r2	`sethi`	%hh(value), r1	
		`or`	r1, %hm(value), r1	
		`sethi`	%lm(value), r2	
		`or`	r2, %lo(value), r2	
		`sllx`	r1, 32, r1	
		`or`	r1, r2, r2	
`setxhi`	value, r1, r2	`sethi`	%hh(value), r1	
		`or`	r1, %hm(value), r1	
		`sethi`	%lm(value), r2	
		`sllx`	r1, 32, r1	
		`or`	r1, r2, r2	
`signx`	reg_rs1, reg_rd	`sra`	reg_rs1, %g0, reg_rd	Sign-extend 32-bit value to 64 bits
`signx`	reg_rd	`sra`	reg_rd, %g0, reg_rd	Sign-extend 32-bit value to 64 bits
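The six-instruction `setx` expansion can be traced step by step in Python. This sketch assumes the usual meaning of the assembler operators, which this appendix does not define: `%hh` is bits 63..42 of the value, `%hm` bits 41..32, `%lm` bits 31..10, and `%lo` bits 9..0. It also assumes `sethi` places its 22-bit operand in bits 31..10 and clears the rest of the register.

```python
MASK64 = (1 << 64) - 1


def hh(value):
    return (value >> 42) & 0x3FFFFF   # bits 63..42


def hm(value):
    return (value >> 32) & 0x3FF      # bits 41..32


def lm(value):
    return (value >> 10) & 0x3FFFFF   # bits 31..10


def lo(value):
    return value & 0x3FF              # bits 9..0


def sethi(imm22):
    """Model of sethi: 22-bit immediate into bits 31..10, other bits zero."""
    return (imm22 & 0x3FFFFF) << 10


def setx(value):
    """Follow the setx expansion and return the final contents of r2."""
    r1 = sethi(hh(value))         # sethi %hh(value), r1
    r1 = r1 | hm(value)           # or    r1, %hm(value), r1
    r2 = sethi(lm(value))         # sethi %lm(value), r2
    r2 = r2 | lo(value)           # or    r2, %lo(value), r2
    r1 = (r1 << 32) & MASK64      # sllx  r1, 32, r1
    r2 = r1 | r2                  # or    r1, r2, r2
    return r2
```

Under these assumptions r1 builds the upper word and r2 the lower word, and the final `or` joins them, so `setx(value)` returns `value` for any 64-bit constant.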

E.6 UltraSPARC and VIS Instruction Set Extensions

This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.

Note –

SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.

E.6.1 Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

E.6.2 Eight-bit Format

A 32-bit word contains a pixel made up of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store the color components of a point together) and band-sequential images (which store all values of one color component together).

E.6.3 Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.

A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.

Enough precision and dynamic range (for filtering and simple image computations on pixel values) can be provided by an intermediate format of fixed data values. Pixel multiplication is used to convert from pixel data to fixed data. Pack instructions are used to convert from fixed data to pixel data (clip and truncate to an 8-bit unsigned value). The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. You should use floating-point data to perform complex calculations needing more precision or dynamic range.
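The fixed 16-bit layout and the clipping step of a pack can be illustrated in Python. This is a deliberately simplified sketch: it assumes the first value occupies the most significant 16 bits of the word, and it shows only clipping to an unsigned 8-bit range. It does not model the scaling or rounding that the actual pack instructions apply.

```python
import struct


def pack_fixed16(values):
    """Pack four signed 16-bit values into one 64-bit word (first value highest)."""
    return int.from_bytes(struct.pack(">4h", *values), "big")


def unpack_fixed16(word):
    """Split a 64-bit word back into four signed 16-bit values."""
    return list(struct.unpack(">4h", word.to_bytes(8, "big")))


def clip_to_pixel(value):
    """Clip an intermediate result to the unsigned 8-bit pixel range."""
    return max(0, min(255, value))
```

For instance, intermediate results of -5, 100, and 300 clip to pixel values 0, 100, and 255.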

E.6.4 SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.

Table E–13


SPARC	Mnemonic	Argument List	Description
`SHUTDOWN`	`shutdown`		Shut down to enter power-down mode

E.6.5 Graphics Status Register (GSR)

Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register.

Table E–14


SPARC	Mnemonic	Argument List	Description
`RDASR`	`rdasr`	%gsr, reg_rd	Read GSR
`WRASR`	`wrasr`	reg_rs1, reg_or_imm, %gsr	Write GSR

E.6.6 Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.

The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.

Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.

Table E–15


SPARC	Mnemonic	Argument List	Description
`FPADD16`	`fpadd16`	freg_rs1, freg_rs2, freg_rd	four 16-bit add
`FPADD16S`	`fpadd16s`	freg_rs1, freg_rs2, freg_rd	two 16-bit add
`FPADD32`	`fpadd32`	freg_rs1, freg_rs2, freg_rd	two 32-bit add
`FPADD32S`	`fpadd32s`	freg_rs1, freg_rs2, freg_rd	one 32-bit add
`FPSUB16`	`fpsub16`	freg_rs1, freg_rs2, freg_rd	four 16-bit subtract
`FPSUB16S`	`fpsub16s`	freg_rs1, freg_rs2, freg_rd	two 16-bit subtract
`FPSUB32`	`fpsub32`	freg_rs1, freg_rs2, freg_rd	two 32-bit subtract
`FPSUB32S`	`fpsub32s`	freg_rs1, freg_rs2, freg_rd	one 32-bit subtract
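A partitioned add treats a register as independent lanes, so a carry out of one lane never reaches the next. The Python sketch below models that behavior on integer bit patterns. It assumes each lane wraps around on overflow rather than saturating, and the helper `partitioned_add` is a hypothetical name.

```python
def partitioned_add(a, b, lane_bits, total_bits=64):
    """Add a and b lane by lane; each lane wraps independently."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & lane_mask) + ((b >> shift) & lane_mask)
        result |= (lane_sum & lane_mask) << shift   # drop the carry out of the lane
    return result


def fpadd16(a, b):
    return partitioned_add(a, b, 16)                 # four 16-bit adds


def fpadd32(a, b):
    return partitioned_add(a, b, 32)                 # two 32-bit adds


def fpadd16s(a, b):
    return partitioned_add(a, b, 16, total_bits=32)  # two 16-bit adds
```

Compare `fpadd16(0x0001FFFF7FFF0000, 0x0001000100010000)`, which gives `0x0002000080000000`, with an ordinary 64-bit add of the same operands, where the carry out of the second lane would spill into the first and give `0x0003000080000000`.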

Pack instructions convert to a lower-precision fixed format or to a pixel format.

Table E–16


SPARC	Mnemonic	Argument List	Description
`FPACK16`	`fpack16`	freg_rs2, freg_rd	four 16-bit packs
`FPACK32`	`fpack32`	freg_rs1, freg_rs2, freg_rd	two 32-bit packs
`FPACKFIX`	`fpackfix`	freg_rs2, freg_rd	four 16-bit packs
`FEXPAND`	`fexpand`	freg_rs2, freg_rd	four 16-bit expands
`FPMERGE`	`fpmerge`	freg_rs1, freg_rs2, freg_rd	two 32-bit merges

Partitioned multiply instructions have the following variations.

Table E–17


SPARC	Mnemonic	Argument List	Description
`FMUL8x16`	`fmul8x16`	freg_rs1, freg_rs2, freg_rd	8x16-bit partition
`FMUL8x16AU`	`fmul8x16au`	freg_rs1, freg_rs2, freg_rd	8x16-bit upper partition
`FMUL8x16AL`	`fmul8x16al`	freg_rs1, freg_rs2, freg_rd	8x16-bit lower partition
`FMUL8SUx16`	`fmul8sux16`	freg_rs1, freg_rs2, freg_rd	upper 8x16-bit partition
`FMUL8ULx16`	`fmul8ulx16`	freg_rs1, freg_rs2, freg_rd	lower unsigned 8x16-bit partition
`FMULD8SUx16`	`fmuld8sux16`	freg_rs1, freg_rs2, freg_rd	upper 8x16-bit partition
`FMULD8ULx16`	`fmuld8ulx16`	freg_rs1, freg_rs2, freg_rd	lower unsigned 8x16-bit partition

Alignment instructions have the following variations.

Table E–18


SPARC	Mnemonic	Argument List	Description
`ALIGNADDRESS`	`alignaddr`	reg_rs1, reg_rs2, reg_rd	find misaligned data access address
`ALIGNADDRESS_LITTLE`	`alignaddrl`	reg_rs1, reg_rs2, reg_rd	same as above, but little-endian
`FALIGNDATA`	`faligndata`	freg_rs1, freg_rs2, freg_rd	do misaligned data, data alignment
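The way the alignment instructions cooperate can be sketched in Python. The details here are assumptions not stated in this appendix: `alignaddr` is taken to add its operands, return the sum rounded down to an 8-byte boundary, and record the low three bits as a byte offset (held in the GSR on real hardware), and `faligndata` is taken to extract eight bytes at that offset from the 16-byte concatenation of its two sources.

```python
MASK64 = (1 << 64) - 1


def alignaddr(rs1, rs2):
    """Return (aligned address, byte offset) for the address rs1 + rs2."""
    address = (rs1 + rs2) & MASK64
    return address & ~7 & MASK64, address & 7


def faligndata(rs1, rs2, offset):
    """Extract 8 bytes starting `offset` bytes into the pair rs1:rs2."""
    combined = ((rs1 & MASK64) << 64) | (rs2 & MASK64)   # rs1 is the high half
    return (combined >> (8 * (8 - offset))) & MASK64
```

A loop over misaligned data would load consecutive aligned doublewords and apply `faligndata` to each adjacent pair; with an offset of 3, the two words `0x0001020304050607` and `0x08090A0B0C0D0E0F` yield `0x030405060708090A`.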

Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).

Table E–19


SPARC	Mnemonic	Argument List	Description
`FZERO`	`fzero`	freg_rd	zero fill
`FZEROS`	`fzeros`	freg_rd	zero fill, single precision
`FONE`	`fone`	freg_rd	one fill
`FONES`	`fones`	freg_rd	one fill, single precision
`FSRC1`	`fsrc1`	freg_rs1, freg_rd	copy src1
`FSRC1S`	`fsrc1s`	freg_rs1, freg_rd	copy src1, single precision
`FSRC2`	`fsrc2`	freg_rs2, freg_rd	copy src2
`FSRC2S`	`fsrc2s`	freg_rs2, freg_rd	copy src2, single precision
`FNOT1`	`fnot1`	freg_rs1, freg_rd	negate src1, 1's complement
`FNOT1S`	`fnot1s`	freg_rs1, freg_rd	same as above, single precision
`FNOT2`	`fnot2`	freg_rs2, freg_rd	negate src2, 1's complement
`FNOT2S`	`fnot2s`	freg_rs2, freg_rd	same as above, single precision
`FOR`	`for`	freg_rs1, freg_rs2, freg_rd	logical OR
`FORS`	`fors`	freg_rs1, freg_rs2, freg_rd	logical OR, single precision
`FNOR`	`fnor`	freg_rs1, freg_rs2, freg_rd	logical NOR
`FNORS`	`fnors`	freg_rs1, freg_rs2, freg_rd	logical NOR, single precision
`FAND`	`fand`	freg_rs1, freg_rs2, freg_rd	logical AND
`FANDS`	`fands`	freg_rs1, freg_rs2, freg_rd	logical AND, single precision
`FNAND`	`fnand`	freg_rs1, freg_rs2, freg_rd	logical NAND
`FNANDS`	`fnands`	freg_rs1, freg_rs2, freg_rd	logical NAND, single precision
`FXOR`	`fxor`	freg_rs1, freg_rs2, freg_rd	logical XOR
`FXORS`	`fxors`	freg_rs1, freg_rs2, freg_rd	logical XOR, single precision
`FXNOR`	`fxnor`	freg_rs1, freg_rs2, freg_rd	logical XNOR
`FXNORS`	`fxnors`	freg_rs1, freg_rs2, freg_rd	logical XNOR, single precision
`FORNOT1`	`fornot1`	freg_rs1, freg_rs2, freg_rd	negated src1 OR src2
`FORNOT1S`	`fornot1s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FORNOT2`	`fornot2`	freg_rs1, freg_rs2, freg_rd	src1 OR negated src2
`FORNOT2S`	`fornot2s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FANDNOT1`	`fandnot1`	freg_rs1, freg_rs2, freg_rd	negated src1 AND src2
`FANDNOT1S`	`fandnot1s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FANDNOT2`	`fandnot2`	freg_rs1, freg_rs2, freg_rd	src1 AND negated src2
`FANDNOT2S`	`fandnot2s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
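The less obvious entries in this table, those that negate one source, can be written as plain bitwise expressions. The Python sketch below models a handful of them on integer bit patterns; `make_logical_ops` is a hypothetical helper, and the 32-bit set stands in for the single-precision (`S`) variants.

```python
def make_logical_ops(bits):
    """Build bitwise models of selected logical operations for a given width."""
    mask = (1 << bits) - 1
    return {
        "fnot1":    lambda src1, src2: ~src1 & mask,
        "fnor":     lambda src1, src2: ~(src1 | src2) & mask,
        "fnand":    lambda src1, src2: ~(src1 & src2) & mask,
        "fxnor":    lambda src1, src2: ~(src1 ^ src2) & mask,
        "fornot1":  lambda src1, src2: (~src1 | src2) & mask,   # negated src1 OR src2
        "fornot2":  lambda src1, src2: (src1 | ~src2) & mask,   # src1 OR negated src2
        "fandnot1": lambda src1, src2: (~src1 & src2) & mask,   # negated src1 AND src2
        "fandnot2": lambda src1, src2: (src1 & ~src2) & mask,   # src1 AND negated src2
    }


OPS64 = make_logical_ops(64)   # standard 64-bit versions
OPS32 = make_logical_ops(32)   # single-precision versions
```

For example, `OPS64["fandnot1"](0xF0, 0xFF)` clears the bits of the second operand that are set in the first, giving `0x0F`.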

Pixel compare instructions compare the fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).

Table E–20


SPARC	Mnemonic	Argument List	Description
`FCMPGT16`	`fcmpgt16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1>src2
`FCMPGT32`	`fcmpgt32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1>src2
`FCMPLE16`	`fcmple16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1≤src2
`FCMPLE32`	`fcmple32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1≤src2
`FCMPNE16`	`fcmpne16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1≠src2
`FCMPNE32`	`fcmpne32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1≠src2
`FCMPEQ16`	`fcmpeq16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1=src2
`FCMPEQ32`	`fcmpeq32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1=src2
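A pixel compare produces one result bit per lane in an integer register. The following Python sketch models the four-lane greater-than compare. The bit assignment is an assumption: bit 0 of the result is taken to correspond to the least significant 16-bit lane.

```python
def signed16(x):
    """Reinterpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def fcmpgt16(src1, src2):
    """Return a 4-bit mask with bit i set when lane i of src1 > lane i of src2."""
    mask = 0
    for lane in range(4):
        a = signed16(src1 >> (16 * lane))
        b = signed16(src2 >> (16 * lane))
        if a > b:
            mask |= 1 << lane
    return mask
```

With lanes (1, -1, 5, 0) in src1 and (0, 0, 5, -3) in src2, listed from most to least significant, only the outer lanes compare greater, so the mask is `0b1001`.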

Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.

Table E–21


SPARC	Mnemonic	Argument List	Description
`EDGE8`	`edge8`	reg_rs1, reg_rs2, reg_rd	8 8-bit edge boundary processing
`EDGE8L`	`edge8l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian
`EDGE16`	`edge16`	reg_rs1, reg_rs2, reg_rd	4 16-bit edge boundary processing
`EDGE16L`	`edge16l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian
`EDGE32`	`edge32`	reg_rs1, reg_rs2, reg_rd	2 32-bit edge boundary processing
`EDGE32L`	`edge32l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian

Pixel component distance instructions are used for motion estimation in video compression algorithms.

Table E–22


SPARC	Mnemonic	Argument List	Description
`PDIST`	`pdist`	freg_rs1, freg_rs2, freg_rd	distance between 8 8-bit components
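The distance computed here is a sum of absolute differences over eight unsigned bytes, a common building block of block matching. The Python sketch below models it on 64-bit integer patterns. It assumes that the result accumulates into the previous contents of the destination register, which is passed in as `rd`.

```python
MASK64 = (1 << 64) - 1


def pdist(rs1, rs2, rd=0):
    """Add the sum of |byte differences| between rs1 and rs2 to rd."""
    total = 0
    for shift in range(0, 64, 8):
        a = (rs1 >> shift) & 0xFF
        b = (rs2 >> shift) & 0xFF
        total += abs(a - b)
    return (rd + total) & MASK64
```

Because the result accumulates, a motion-estimation loop can call this once per 8-pixel row and read the total distance for the whole block at the end.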

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E–23


SPARC	Mnemonic	Argument List	Description
`ARRAY8`	`array8`	reg_rs1, reg_rs2, reg_rd	convert 8-bit 3-D address to blocked byte address
`ARRAY16`	`array16`	reg_rs1, reg_rs2, reg_rd	same as above, but 16-bit
`ARRAY32`	`array32`	reg_rs1, reg_rs2, reg_rd	same as above, but 32-bit

E.6.7 Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.

Table E–24


SPARC	imm_asi	Argument List	Description
`STDFA`	`ASI_PST8_P`	stda freg_rd, [reg_rs1] reg_mask, imm_asi	eight 8-bit conditional stores to primary address space
`STDFA`	`ASI_PST8_S`		eight 8-bit conditional stores to secondary address space
`STDFA`	`ASI_PST8_PL`		eight 8-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST8_SL`		eight 8-bit conditional stores to secondary address space, little endian
`STDFA`	`ASI_PST16_P`		four 16-bit conditional stores to primary address space
`STDFA`	`ASI_PST16_S`		four 16-bit conditional stores to secondary address space
`STDFA`	`ASI_PST16_PL`		four 16-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST16_SL`		four 16-bit conditional stores to secondary address space, little endian
`STDFA`	`ASI_PST32_P`		two 32-bit conditional stores to primary address space
`STDFA`	`ASI_PST32_S`		two 32-bit conditional stores to secondary address space
`STDFA`	`ASI_PST32_PL`		two 32-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST32_SL`		two 32-bit conditional stores to secondary address space, little endian

Note –

To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.
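A partial store writes only the lanes selected by a mask and leaves the other bytes in memory untouched. The Python sketch below models the eight 8-bit case against a `bytearray`. The mask layout is an assumption: the least significant mask bit is taken to guard the least significant data byte, which in big-endian order is the byte at the highest address.

```python
def partial_store8(memory, address, data, mask):
    """Store the bytes of a 64-bit value whose mask bit is set (big-endian)."""
    data_bytes = data.to_bytes(8, "big")
    for i in range(8):
        # Mask bit 7 guards the byte at address+0, bit 0 the byte at address+7.
        if mask & (1 << (7 - i)):
            memory[address + i] = data_bytes[i]
```

Storing `0x1122334455667788` with mask `0b10000001` into eight zero bytes changes only the first and last byte, which is how an edge mask limits a store at the boundary of a scan line.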

Table E–25


SPARC	imm_asi	Argument List	Description
`LDDFA` `STDFA`	`ASI_FL8_P`	ldda [reg_addr] imm_asi, freg_rd; stda freg_rd, [reg_addr] imm_asi; ldda [reg_plus_imm] %asi, freg_rd; stda freg_rd, [reg_plus_imm] %asi	8-bit load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_FL8_S`		8-bit load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_FL8_PL`		8-bit load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_FL8_SL`		8-bit load/store from/to secondary address space, little endian
`LDDFA` `STDFA`	`ASI_FL16_P`		16-bit load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_FL16_S`		16-bit load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_FL16_PL`		16-bit load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_FL16_SL`		16-bit load/store from/to secondary address space, little endian

Note –

To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.

Table E–26


SPARC	imm_asi	Argument List	Description
`LDDA`	`ASI_NUCLEUS_QUAD_LDD`	[reg_addr] imm_asi, reg_rd; [reg_plus_imm] %asi, reg_rd	128-bit atomic load
`LDDA`	`ASI_NUCLEUS_QUAD_LDD_L`		128-bit atomic load, little endian
`LDDFA` `STDFA`	`ASI_BLK_AIUP`	ldda [reg_addr] imm_asi, freg_rd; stda freg_rd, [reg_addr] imm_asi; ldda [reg_plus_imm] %asi, freg_rd; stda freg_rd, [reg_plus_imm] %asi	64-byte block load/store from/to primary address space, user privilege
`LDDFA` `STDFA`	`ASI_BLK_AIUS`		64-byte block load/store from/to secondary address space, user privilege
`LDDFA` `STDFA`	`ASI_BLK_AIUPL`		64-byte block load/store from/to primary address space, user privilege, little endian
`LDDFA` `STDFA`	`ASI_BLK_AIUSL`		64-byte block load/store from/to secondary address space, user privilege, little endian
`LDDFA` `STDFA`	`ASI_BLK_P`		64-byte block load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_BLK_S`		64-byte block load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_BLK_PL`		64-byte block load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_BLK_SL`		64-byte block load/store from/to secondary address space, little endian
`LDDFA` `STDFA`	`ASI_BLK_COMMIT_P`		64-byte block commit store to primary address space
`LDDFA` `STDFA`	`ASI_BLK_COMMIT_S`		64-byte block commit store to secondary address space

Note –

To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.
