This appendix describes the changes that the SPARC-V9 architecture makes to the SPARC instruction set. Application software for the 32-bit SPARC-V8 (Version 8) architecture can execute, unchanged, on SPARC-V9 systems.
The SPARC-V9 architecture differs from SPARC-V8 architecture in the following areas, expanded below: registers, alternate space access, byte order, and instruction set.
These registers have been deleted:
Table E–1
Register | Description |
---|---|
PSR | Processor State Register |
TBR | Trap Base Register |
WIM | Window Invalid Mask |
These registers have been widened from 32 to 64 bits:
Table E–2
Register | Notes |
---|---|
Integer registers | |
All state registers | FSR, PC, nPC, and Y |
In the FSR (Floating-Point State Register), the additional floating-point condition code fields fcc1, fcc2, and fcc3 are added, and the register is widened to 64 bits.
These SPARC-V9 registers were fields within a larger register in SPARC-V8:
Table E–3
Register | Description |
---|---|
CCR | Condition Codes Register |
CWP | Current Window Pointer |
PIL | Processor Interrupt Level |
TBA | Trap Base Address |
TT[MAXTL] | Trap Type |
VER | Version |
These registers have been added:
Table E–4
Register | Description |
---|---|
ASI | Address Space Identifier |
CANRESTORE | Restorable Windows |
CANSAVE | Savable Windows |
CLEANWIN | Clean Windows |
FPRS | Floating-Point Register State |
OTHERWIN | Other Windows |
PSTATE | Processor State |
TICK | Hardware clock tick-counter |
TL | Trap Level |
TNPC[MAXTL] | Trap Next Program Counter |
TPC[MAXTL] | Trap Program Counter |
TSTATE[MAXTL] | Trap State |
WSTATE | Windows State |
Also, there are sixteen additional double-precision floating-point registers, f[32] .. f[62]. These registers overlap (and are aliased with) eight additional quad-precision floating-point registers, f[32] .. f[60].
The SPARC-V9 CWP register is decremented during a RESTORE instruction and incremented during a SAVE instruction. This is the opposite of PSR.CWP's behavior in SPARC-V8. This change has no effect on nonprivileged instructions.
Load-alternate and store-alternate instructions to one-half of the alternate spaces can now be included in user code. In SPARC-V9, loads and stores to ASIs 0x00 .. 0x7F are privileged; those to ASIs 0x80 .. 0xFF are nonprivileged. In SPARC-V8, all access to alternate address spaces is privileged.
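The SPARC-V9 privilege rule for alternate-space accesses reduces to one comparison on the 8-bit ASI value. The following C sketch models only that rule; the function name `asi_is_privileged_v9` is hypothetical and does not come from any SPARC toolchain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a SPARC-V9 load/store-alternate
 * through this ASI requires privileged mode (ASIs 0x00 .. 0x7F). */
bool asi_is_privileged_v9(uint8_t asi) {
    return asi < 0x80;
}
```

Under SPARC-V8 the corresponding check would simply return true for every ASI.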
SPARC-V9 supports both little- and big-endian byte orders for data accesses only; instruction accesses are always performed using big-endian byte order. In SPARC-V8, all data and instruction accesses are performed in big-endian byte order.
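To illustrate what the two data byte orders mean for a 32-bit load, the following C sketch assembles a word from the same four memory bytes in each order. The helper names `load_be32` and `load_le32` are hypothetical; they model the result of a load, not how the hardware selects the byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Big-endian: the byte at the lowest address is the most significant. */
uint32_t load_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Little-endian: the byte at the lowest address is the least significant. */
uint32_t load_le32(const uint8_t *p) {
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```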
Application software written for the SPARC-V8 processor runs unchanged on a SPARC-V9 processor.
Instruction | Description |
---|---|
FCMP, FCMPE | Floating-Point Compare: can set any of the four floating-point condition codes |
LDFSR, STFSR | Load/Store FSR: only affect the low-order 32 bits of FSR |
LDUW, LDUWA | Same as LD, LDA in SPARC-V8 |
RDASR/WRASR | Read/Write State Registers: access additional registers |
SAVE/RESTORE | |
SETHI | |
SRA, SRL, SLL (shifts) | Split into 32-bit and 64-bit versions |
Tcc | (was Ticc) Operates with either the 32-bit integer condition codes (icc) or the 64-bit integer condition codes (xcc) |
All other arithmetic operations operate on 64-bit operands and produce 64-bit results.
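As a rough illustration of the split between 32-bit and 64-bit shifts, the C sketch below models SRL, SRLX, and SRA on a 64-bit register value. It assumes the SPARC-V9 behavior that the 32-bit forms use the low 5 bits of the shift count and operate on the low word, while the extended form uses the low 6 bits. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* SRL: shift the low 32 bits right; the upper 32 bits of the result are zero. */
uint64_t v9_srl(uint64_t rs1, unsigned count) {
    return (uint64_t)((uint32_t)rs1 >> (count & 31));
}

/* SRLX: shift all 64 bits right, filling with zeros. */
uint64_t v9_srlx(uint64_t rs1, unsigned count) {
    return rs1 >> (count & 63);
}

/* SRA: shift the low 32 bits right and replicate bit 31 into all
 * vacated and upper bits (written without relying on signed shifts). */
uint64_t v9_sra(uint64_t rs1, unsigned count) {
    uint32_t low = (uint32_t)rs1;
    unsigned n = count & 31;
    uint64_t result = low >> n;
    if (low & 0x80000000u) {
        result |= ~UINT64_C(0) << (32 - n);
    }
    return result;
}
```

With a shift count of zero, `v9_sra` sign-extends a 32-bit value to 64 bits and `v9_srl` clears the upper word.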
Instruction | Description |
---|---|
F[sdq]TOx | Convert floating point to 64-bit word |
FxTO[sdq] | Convert 64-bit word to floating point |
FMOV[dq] | Floating-Point Move, double and quad |
FNEG[dq] | Floating-Point Negate, double and quad |
FABS[dq] | Floating-Point Absolute Value, double and quad |
LDDFA, STDFA, LDFA, STFA | Alternate address space forms of LDDF, STDF, LDF, and STF |
LDSW | Load a signed word |
LDSWA | Load a signed word from an alternate space |
LDX | Load an extended word |
LDXA | Load an extended word from an alternate space |
LDXFSR | Load all 64 bits of the FSR register |
STX | Store an extended word |
STXA | Store an extended word into an alternate space |
STXFSR | Store all 64 bits of the FSR register |
Instruction | Description |
---|---|
BPcc | Branch on integer condition code with prediction |
BPr | Branch on integer register contents with prediction |
CASA, CASXA | Compare and Swap from an alternate space |
FBPfcc | Branch on floating-point condition code with prediction |
FLUSHW | Flush windows |
FMOVcc | Move floating-point register if condition code is satisfied |
FMOVr | Move floating-point register if integer register satisfies condition |
LDQF(A), STQF(A) | Load/Store Quad Floating-point (in an alternate space) |
MOVcc | Move integer register if condition code is satisfied |
MOVr | Move integer register if register contents satisfy condition |
MULX | Generic 64-bit multiply |
POPC | Population count |
PREFETCH, PREFETCHA | Prefetch Data |
SDIVX, UDIVX | Signed and Unsigned 64-bit divide |
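Population count is the number of one bits in the operand. The following C sketch shows one portable way to compute the same value in software; the function name `popc64` is hypothetical, and the loop is an illustration of the result, not a description of how any processor implements POPC.

```c
#include <assert.h>
#include <stdint.h>

/* Count the one bits in a 64-bit value. Each iteration clears the
 * lowest set bit, so the loop runs once per one bit. */
unsigned popc64(uint64_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}
```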
Instruction | Description |
---|---|
Coprocessor loads and stores | |
RDTBR and WRTBR | TBR no longer exists. It is replaced by TBA, which can be read/written with RDPR/WRPR instructions |
RDWIM and WRWIM | WIM no longer exists. WIM has been replaced by several register-window registers |
RDPSR and WRPSR | PSR no longer exists. It has been replaced by several separate registers that are read/written with other instructions |
RETT | Return from trap (replaced by DONE/RETRY) |
STDFQ | Store Double from Floating-point Queue (replaced by the RDPR FQ instruction) |
IMPDEPn | (Changed) Implementation-dependent instructions (replace SPARC-V8 CPop instructions) |
MEMBAR | (Added) Memory barrier (memory synchronization support) |
Opcode | Mnemonic | Argument List | Operation | Comments |
---|---|---|---|---|
BPA | ba{,a} {,pt\|,pn} | %icc or %xcc, label | (Branch on cc with prediction) Branch always | 1 |
BPN | bn{,a} {,pt\|,pn} | %icc or %xcc, label | Branch never | 0 |
BPNE | bne{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on not equal | not Z |
BPE | be{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on equal | Z |
BPG | bg{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on greater | not (Z or (N xor V)) |
BPLE | ble{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on less or equal | Z or (N xor V) |
BPGE | bge{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on greater or equal | not (N xor V) |
BPL | bl{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on less | N xor V |
BPGU | bgu{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on greater unsigned | not (C or Z) |
BPLEU | bleu{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on less or equal unsigned | C or Z |
BPCC | bcc{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on carry clear (greater than or equal, unsigned) | not C |
BPCS | bcs{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on carry set (less than, unsigned) | C |
BPPOS | bpos{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on positive | not N |
BPNEG | bneg{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on negative | N |
BPVC | bvc{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on overflow clear | not V |
BPVS | bvs{,a} {,pt\|,pn} | %icc or %xcc, label | Branch on overflow set | V |
BRZ | brz{,a} {,pt\|,pn} | regrs1, label | Branch on register zero | Z |
BRLEZ | brlez{,a} {,pt\|,pn} | regrs1, label | Branch on register less than or equal to zero | N or Z |
BRLZ | brlz{,a} {,pt\|,pn} | regrs1, label | Branch on register less than zero | N |
BRNZ | brnz{,a} {,pt\|,pn} | regrs1, label | Branch on register not zero | not Z |
BRGZ | brgz{,a} {,pt\|,pn} | regrs1, label | Branch on register greater than zero | not (N or Z) |
BRGEZ | brgez{,a} {,pt\|,pn} | regrs1, label | Branch on register greater than or equal to zero | not N |
CASA | casa | [regrs1] imm_asi, regrs2, regrd | Compare and swap word from alternate space | |
 | casa | [regrs1] %asi, regrs2, regrd | | |
CASXA | casxa | [regrs1] imm_asi, regrs2, regrd | Compare and swap extended from alternate space | |
 | casxa | [regrs1] %asi, regrs2, regrd | | |
FBPA | fba{,a} {,pt\|,pn} | %fccn, label | (Branch on cc with prediction) Branch always | 1 |
FBPN | fbn{,a} {,pt\|,pn} | %fccn, label | Branch never | 0 |
FBPU | fbu{,a} {,pt\|,pn} | %fccn, label | Branch on unordered | U |
FBPG | fbg{,a} {,pt\|,pn} | %fccn, label | Branch on greater | G |
FBPUG | fbug{,a} {,pt\|,pn} | %fccn, label | Branch on unordered or greater | G or U |
FBPL | fbl{,a} {,pt\|,pn} | %fccn, label | Branch on less | L |
FBPUL | fbul{,a} {,pt\|,pn} | %fccn, label | Branch on unordered or less | L or U |
FBPLG | fblg{,a} {,pt\|,pn} | %fccn, label | Branch on less or greater | L or G |
FBPNE | fbne{,a} {,pt\|,pn} | %fccn, label | Branch on not equal | L or G or U |
FBPE | fbe{,a} {,pt\|,pn} | %fccn, label | Branch on equal | E |
FBPUE | fbue{,a} {,pt\|,pn} | %fccn, label | Branch on unordered or equal | E or U |
FBPGE | fbge{,a} {,pt\|,pn} | %fccn, label | Branch on greater or equal | E or G |
FBPUGE | fbuge{,a} {,pt\|,pn} | %fccn, label | Branch on unordered or greater or equal | E or G or U |
FBPLE | fble{,a} {,pt\|,pn} | %fccn, label | Branch on less or equal | E or L |
FBPULE | fbule{,a} {,pt\|,pn} | %fccn, label | Branch on unordered or less or equal | E or L or U |
FBPO | fbo{,a} {,pt\|,pn} | %fccn, label | Branch on ordered | E or L or G |
FLUSHW | flushw | | Flush register windows | |
FMOVA | fmov{s,d,q}a | %icc or %xcc, fregrs2, fregrd | (Move on integer cc) Move always | 1 |
FMOVN | fmov{s,d,q}n | %icc or %xcc, fregrs2, fregrd | Move never | 0 |
FMOVNE | fmov{s,d,q}ne | %icc or %xcc, fregrs2, fregrd | Move if not equal | not Z |
FMOVE | fmov{s,d,q}e | %icc or %xcc, fregrs2, fregrd | Move if equal | Z |
FMOVG | fmov{s,d,q}g | %icc or %xcc, fregrs2, fregrd | Move if greater | not (Z or (N xor V)) |
FMOVLE | fmov{s,d,q}le | %icc or %xcc, fregrs2, fregrd | Move if less or equal | Z or (N xor V) |
FMOVGE | fmov{s,d,q}ge | %icc or %xcc, fregrs2, fregrd | Move if greater or equal | not (N xor V) |
FMOVL | fmov{s,d,q}l | %icc or %xcc, fregrs2, fregrd | Move if less | N xor V |
FMOVGU | fmov{s,d,q}gu | %icc or %xcc, fregrs2, fregrd | Move if greater unsigned | not (C or Z) |
FMOVLEU | fmov{s,d,q}leu | %icc or %xcc, fregrs2, fregrd | Move if less or equal unsigned | C or Z |
FMOVCC | fmov{s,d,q}cc | %icc or %xcc, fregrs2, fregrd | Move if carry clear (greater or equal, unsigned) | not C |
FMOVCS | fmov{s,d,q}cs | %icc or %xcc, fregrs2, fregrd | Move if carry set (less than, unsigned) | C |
FMOVPOS | fmov{s,d,q}pos | %icc or %xcc, fregrs2, fregrd | Move if positive | not N |
FMOVNEG | fmov{s,d,q}neg | %icc or %xcc, fregrs2, fregrd | Move if negative | N |
FMOVVC | fmov{s,d,q}vc | %icc or %xcc, fregrs2, fregrd | Move if overflow clear | not V |
FMOVVS | fmov{s,d,q}vs | %icc or %xcc, fregrs2, fregrd | Move if overflow set | V |
FMOVRZ | fmovr{s,d,q}e | regrs1, fregrs2, fregrd | (Move f-p register on cc) Move if register zero | |
FMOVRLEZ | fmovr{s,d,q}lez | regrs1, fregrs2, fregrd | Move if register less than or equal to zero | |
FMOVRLZ | fmovr{s,d,q}lz | regrs1, fregrs2, fregrd | Move if register less than zero | |
FMOVRNZ | fmovr{s,d,q}ne | regrs1, fregrs2, fregrd | Move if register not zero | |
FMOVRGZ | fmovr{s,d,q}gz | regrs1, fregrs2, fregrd | Move if register greater than zero | |
FMOVRGEZ | fmovr{s,d,q}gez | regrs1, fregrs2, fregrd | Move if register greater than or equal to zero | |
FMOVFA | fmov{s,d,q}a | %fccn, fregrs2, fregrd | (Move on floating-point cc) Move always | 1 |
FMOVFN | fmov{s,d,q}n | %fccn, fregrs2, fregrd | Move never | 0 |
FMOVFU | fmov{s,d,q}u | %fccn, fregrs2, fregrd | Move if unordered | U |
FMOVFG | fmov{s,d,q}g | %fccn, fregrs2, fregrd | Move if greater | G |
FMOVFUG | fmov{s,d,q}ug | %fccn, fregrs2, fregrd | Move if unordered or greater | G or U |
FMOVFL | fmov{s,d,q}l | %fccn, fregrs2, fregrd | Move if less | L |
FMOVFUL | fmov{s,d,q}ul | %fccn, fregrs2, fregrd | Move if unordered or less | L or U |
FMOVFLG | fmov{s,d,q}lg | %fccn, fregrs2, fregrd | Move if less or greater | L or G |
FMOVFNE | fmov{s,d,q}ne | %fccn, fregrs2, fregrd | Move if not equal | L or G or U |
FMOVFE | fmov{s,d,q}e | %fccn, fregrs2, fregrd | Move if equal | E |
FMOVFUE | fmov{s,d,q}ue | %fccn, fregrs2, fregrd | Move if unordered or equal | E or U |
FMOVFGE | fmov{s,d,q}ge | %fccn, fregrs2, fregrd | Move if greater or equal | E or G |
FMOVFUGE | fmov{s,d,q}uge | %fccn, fregrs2, fregrd | Move if unordered or greater or equal | E or G or U |
FMOVFLE | fmov{s,d,q}le | %fccn, fregrs2, fregrd | Move if less or equal | E or L |
FMOVFULE | fmov{s,d,q}ule | %fccn, fregrs2, fregrd | Move if unordered or less or equal | E or L or U |
FMOVFO | fmov{s,d,q}o | %fccn, fregrs2, fregrd | Move if ordered | E or L or G |
LDSW | ldsw | [address], regrd | Load a signed word | |
LDSWA | ldswa | [regaddr] imm_asi, regrd | Load signed word from alternate space | |
LDX | ldx | [address], regrd | Load extended word | |
LDXA | ldxa | [regaddr] imm_asi, regrd | Load extended word from alternate space | |
 | ldxa | [reg_plus_imm] %asi, regrd | | |
LDXFSR | ldx | [address], %fsr | Load floating-point state register | |
MEMBAR | membar | membar_mask | Memory barrier | |
MOVA | mova | %icc or %xcc, reg_or_imm11, regrd | (Move integer register on cc) Move always | 1 |
MOVN | movn | %icc or %xcc, reg_or_imm11, regrd | Move never | 0 |
MOVNE | movne | %icc or %xcc, reg_or_imm11, regrd | Move if not equal | not Z |
MOVE | move | %icc or %xcc, reg_or_imm11, regrd | Move if equal | Z |
MOVG | movg | %icc or %xcc, reg_or_imm11, regrd | Move if greater | not (Z or (N xor V)) |
MOVLE | movle | %icc or %xcc, reg_or_imm11, regrd | Move if less or equal | Z or (N xor V) |
MOVGE | movge | %icc or %xcc, reg_or_imm11, regrd | Move if greater or equal | not (N xor V) |
MOVL | movl | %icc or %xcc, reg_or_imm11, regrd | Move if less | N xor V |
MOVGU | movgu | %icc or %xcc, reg_or_imm11, regrd | Move if greater unsigned | not (C or Z) |
MOVLEU | movleu | %icc or %xcc, reg_or_imm11, regrd | Move if less or equal unsigned | C or Z |
MOVCC | movcc | %icc or %xcc, reg_or_imm11, regrd | Move if carry clear (greater or equal, unsigned) | not C |
MOVCS | movcs | %icc or %xcc, reg_or_imm11, regrd | Move if carry set (less than, unsigned) | C |
MOVPOS | movpos | %icc or %xcc, reg_or_imm11, regrd | Move if positive | not N |
MOVNEG | movneg | %icc or %xcc, reg_or_imm11, regrd | Move if negative | N |
MOVVC | movvc | %icc or %xcc, reg_or_imm11, regrd | Move if overflow clear | not V |
MOVVS | movvs | %icc or %xcc, reg_or_imm11, regrd | Move if overflow set | V |
MOVFA | mova | %fccn, reg_or_imm11, regrd | (Move on floating-point cc) Move always | 1 |
MOVFN | movn | %fccn, reg_or_imm11, regrd | Move never | 0 |
MOVFU | movu | %fccn, reg_or_imm11, regrd | Move if unordered | U |
MOVFG | movg | %fccn, reg_or_imm11, regrd | Move if greater | G |
MOVFUG | movug | %fccn, reg_or_imm11, regrd | Move if unordered or greater | G or U |
MOVFL | movl | %fccn, reg_or_imm11, regrd | Move if less | L |
MOVFUL | movul | %fccn, reg_or_imm11, regrd | Move if unordered or less | L or U |
MOVFLG | movlg | %fccn, reg_or_imm11, regrd | Move if less or greater | L or G |
MOVFNE | movne | %fccn, reg_or_imm11, regrd | Move if not equal | L or G or U |
MOVFE | move | %fccn, reg_or_imm11, regrd | Move if equal | E |
MOVFUE | movue | %fccn, reg_or_imm11, regrd | Move if unordered or equal | E or U |
MOVFGE | movge | %fccn, reg_or_imm11, regrd | Move if greater or equal | E or G |
MOVFUGE | movuge | %fccn, reg_or_imm11, regrd | Move if unordered or greater or equal | E or G or U |
MOVFLE | movle | %fccn, reg_or_imm11, regrd | Move if less or equal | E or L |
MOVFULE | movule | %fccn, reg_or_imm11, regrd | Move if unordered or less or equal | E or L or U |
MOVFO | movo | %fccn, reg_or_imm11, regrd | Move if ordered | E or L or G |
MOVRZ | movre | regrs1, reg_or_imm10, regrd | (Move register on register cc) Move if register zero | Z |
MOVRLEZ | movrlez | regrs1, reg_or_imm10, regrd | Move if register less than or equal to zero | N or Z |
MOVRLZ | movrlz | regrs1, reg_or_imm10, regrd | Move if register less than zero | N |
MOVRNZ | movrnz | regrs1, reg_or_imm10, regrd | Move if register not zero | not Z |
MOVRGZ | movrgz | regrs1, reg_or_imm10, regrd | Move if register greater than zero | not (N or Z) |
MOVRGEZ | movrgez | regrs1, reg_or_imm10, regrd | Move if register greater than or equal to zero | not N |
MULX | mulx | regrs1, reg_or_imm, regrd | (Generic 64-bit Multiply) Multiply (signed or unsigned) | See SDIVX and UDIVX |
POPC | popc | reg_or_imm, regrd | Population count | |
PREFETCH | prefetch | [address], prefetch_fcn | Prefetch data | See The SPARC Architecture Manual, Version 9 |
PREFETCHA | prefetcha | [regaddr] imm_asi, prefetch_fcn | Prefetch data from alternate space | |
 | prefetcha | [reg_plus_imm] %asi, prefetch_fcn | | |
SDIVX | sdivx | regrs1, reg_or_imm, regrd | (64-bit signed divide) Signed divide | See MULX and UDIVX |
STX | stx | regrd, [address] | Store extended word | |
STXA | stxa | regrd, [address] imm_asi | Store extended word into alternate space | |
 | stxa | regrd, [reg_plus_imm] %asi | | |
STXFSR | stx | %fsr, [address] | Store floating-point state register (all 64 bits) | |
UDIVX | udivx | regrs1, reg_or_imm, regrd | (64-bit unsigned divide) Unsigned divide | See MULX and SDIVX |
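The Comments column of the table gives each integer condition as a Boolean expression over the N, Z, V, and C flags. The C sketch below evaluates a few of those expressions for flags produced by a 64-bit subtract-and-compare, which corresponds to testing %xcc. The type `cc_t` and all function names are hypothetical, and the flag computation assumes the usual SPARC convention that C is set when the subtraction borrows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one set of integer condition codes. */
typedef struct { bool n, z, v, c; } cc_t;

/* Flags after computing a - b on 64-bit operands (xcc view). */
cc_t subcc64(uint64_t a, uint64_t b) {
    uint64_t r = a - b;
    cc_t cc;
    cc.n = (r >> 63) != 0;                        /* result negative       */
    cc.z = (r == 0);                              /* result zero           */
    cc.v = (((a ^ b) & (a ^ r)) >> 63) != 0;      /* signed overflow       */
    cc.c = a < b;                                 /* borrow (unsigned a<b) */
    return cc;
}

/* Conditions exactly as written in the Comments column. */
bool cond_g(cc_t cc)   { return !(cc.z || (cc.n != cc.v)); }  /* not (Z or (N xor V)) */
bool cond_le(cc_t cc)  { return cc.z || (cc.n != cc.v); }     /* Z or (N xor V)       */
bool cond_l(cc_t cc)   { return cc.n != cc.v; }               /* N xor V              */
bool cond_gu(cc_t cc)  { return !(cc.c || cc.z); }            /* not (C or Z)         */
bool cond_leu(cc_t cc) { return cc.c || cc.z; }               /* C or Z               */
```

The same bit pattern can compare differently as signed and unsigned: with `a` holding all ones and `b` holding 1, `cond_l` is true (signed -1 is less than 1) while `cond_gu` is also true (the unsigned maximum is greater than 1).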
SPARC-V9 floating-point instructions are shown in the following table.
Table E–11
Here is a mapping of synthetic instructions to hardware equivalent instructions.
Table E–12
Synthetic Instruction | Argument List | Hardware Equivalent(s) | Argument List | Comment |
---|---|---|---|---|
cas | [regrs1], regrs2, regrd | casa | [regrs1] ASI_P, regrs2, regrd | Compare and swap (cas) |
casl | [regrs1], regrs2, regrd | casa | [regrs1] ASI_P_L, regrs2, regrd | cas little-endian |
casx | [regrs1], regrs2, regrd | casxa | [regrs1] ASI_P, regrs2, regrd | cas extended |
casxl | [regrs1], regrs2, regrd | casxa | [regrs1] ASI_P_L, regrs2, regrd | cas little-endian, extended |
clrx | [address] | stx | %g0, [address] | Clear extended word |
clruw | regrs1, regrd | srl | regrs1, %g0, regrd | Copy and clear upper word |
clruw | regrd | srl | regrd, %g0, regrd | Clear upper word |
iprefetch | label | bn,pt | %xcc, label | Instruction prefetch |
mov | %y, regrd | rd | %y, regrd | |
mov | %asrn, regrd | rd | %asrn, regrd | |
mov | reg_or_imm, %asrn | wr | %g0, reg_or_imm, %asrn | |
ret | | jmpl | %i7+8, %g0 | Return from subroutine |
retl | | jmpl | %o7+8, %g0 | Return from leaf subroutine |
setn | value, r1, r2 | For -xarch=v9, same as setx value, r1, r2; for -xarch=v8, same as set value, r2 | | |
setnhi | value, r1, r2 | For -xarch=v9, same as setxhi value, r1, r2; for -xarch=v8, same as sethi value, r2 | | |
setuw | value, regrd | sethi | %hi(value), regrd | When (value & 0x3FF) == 0 |
 | | or | %g0, value, regrd | When 0 ≤ value ≤ 4095 |
 | | sethi; or | %hi(value), regrd; regrd, %lo(value), regrd | Otherwise. Do not use setuw in a DCTI delay slot. |
setsw | value, regrd | sethi | %hi(value), regrd | When value ≥ 0 and (value & 0x3FF) == 0 |
 | | or | %g0, value, regrd | When -4096 ≤ value ≤ 4095 |
 | | sethi; sra | %hi(value), regrd; regrd, %g0, regrd | When value < 0 and (value & 0x3FF) == 0 |
 | | sethi; or | %hi(value), regrd; regrd, %lo(value), regrd | Otherwise, if value ≥ 0 |
 | | sethi; or; sra | %hi(value), regrd; regrd, %lo(value), regrd; regrd, %g0, regrd | Otherwise, if value < 0. Do not use setsw in a CTI delay slot. |
setx | value, r1, r2 | sethi | %hh(value), r1 | |
 | | or | r1, %hm(value), r1 | |
 | | sethi | %lm(value), r2 | |
 | | or | r2, %lo(value), r2 | |
 | | sllx | r1, 32, r1 | |
 | | or | r1, r2, r2 | |
setxhi | value, r1, r2 | sethi | %hh(value), r1 | |
 | | or | r1, %hm(value), r1 | |
 | | sethi | %lm(value), r2 | |
 | | sllx | r1, 32, r1 | |
 | | or | r1, r2, r2 | |
signx | regrs1, regrd | sra | regrs1, %g0, regrd | Sign-extend 32-bit value to 64 bits |
signx | regrd | sra | regrd, %g0, regrd | Sign-extend 32-bit value to 64 bits |
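The six-instruction setx expansion can be checked by modeling each step in C. The sketch below assumes the usual meaning of the assembler operators (%hh selects bits 63..42, %hm bits 41..32, %lm bits 31..10, and %lo bits 9..0) and that sethi places its 22-bit operand in bits 31..10 of the destination. The function names are hypothetical; `clruw_model` and `signx_model` mirror the srl and sra equivalents of clruw and signx.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a 64-bit constant the way the setx expansion does. */
uint64_t setx_model(uint64_t value) {
    uint64_t hh = value >> 42;               /* %hh(value): bits 63..42 */
    uint64_t hm = (value >> 32) & 0x3FF;     /* %hm(value): bits 41..32 */
    uint64_t lm = (value >> 10) & 0x3FFFFF;  /* %lm(value): bits 31..10 */
    uint64_t lo = value & 0x3FF;             /* %lo(value): bits 9..0   */

    uint64_t r1 = hh << 10;                  /* sethi %hh(value), r1    */
    r1 |= hm;                                /* or    r1, %hm(value), r1 */
    uint64_t r2 = lm << 10;                  /* sethi %lm(value), r2    */
    r2 |= lo;                                /* or    r2, %lo(value), r2 */
    r1 <<= 32;                               /* sllx  r1, 32, r1        */
    r2 |= r1;                                /* or    r1, r2, r2        */
    return r2;
}

/* clruw: srl by %g0 keeps the low word and clears the upper word. */
uint64_t clruw_model(uint64_t r) {
    return (uint32_t)r;
}

/* signx: sra by %g0 copies bit 31 into the upper word. */
uint64_t signx_model(uint64_t r) {
    uint64_t low = (uint32_t)r;
    return (low & 0x80000000u) ? (low | UINT64_C(0xFFFFFFFF00000000)) : low;
}
```

For any input, `setx_model(value)` returns `value`, which is the property the expansion relies on.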
This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.
SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.
The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.
A 32-bit word contains a pixel made up of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store the color components of a point together) and band-sequential images (which store all values of one color component together).
A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.
A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.
An intermediate format of fixed data values provides enough precision and dynamic range for filtering and simple image computations on pixel values. Pixel multiplication converts pixel data to fixed data. Pack instructions convert fixed data to pixel data by clipping and truncating to an 8-bit unsigned value. The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding can be done by adding one to the rounding bit position. Use floating-point data for complex calculations that need more precision or dynamic range.
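The round, clip, and truncate sequence can be sketched in C for a single value. This is a simplified illustration, not the exact definition of any pack instruction: the function name `fixed_to_pixel` and the `frac_bits` parameter (the number of fractional bits in the fixed value) are assumptions made for the example, and the real instructions take their scaling from processor state.

```c
#include <assert.h>
#include <stdint.h>

/* Convert one 16-bit signed fixed-point value to an 8-bit unsigned pixel.
 * frac_bits is a hypothetical parameter: the count of fractional bits. */
uint8_t fixed_to_pixel(int16_t fixed, unsigned frac_bits) {
    assert(frac_bits >= 1 && frac_bits <= 14);

    /* Round: add one at the rounding bit, just below the integer part. */
    int32_t rounded = (int32_t)fixed + ((int32_t)1 << (frac_bits - 1));

    /* Clip: negative values become 0, values above 255 become 255. */
    if (rounded < 0) {
        return 0;
    }
    int32_t whole = rounded >> frac_bits;    /* truncate the fraction */
    if (whole > 255) {
        return 255;
    }
    return (uint8_t)whole;
}
```

With four fractional bits, the fixed value 0x0108 (16.5) becomes pixel 17 and 0x0107 (16.4375) becomes pixel 16.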
All outstanding transactions are completed before the SHUTDOWN instruction completes.
Table E–13
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
SHUTDOWN | shutdown | | Shut down to enter power-down mode |
Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register (GSR).
Table E–14
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
RDASR | rdasr | %gsr, regrd | Read GSR |
WRASR | wrasr | regrs1, reg_or_imm, %gsr | Write GSR |
Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.
The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.
Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.
Table E–15
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
FPADD16 | fpadd16 | fregrs1, fregrs2, fregrd | four 16-bit add |
FPADD16S | fpadd16s | fregrs1, fregrs2, fregrd | two 16-bit add |
FPADD32 | fpadd32 | fregrs1, fregrs2, fregrd | two 32-bit add |
FPADD32S | fpadd32s | fregrs1, fregrs2, fregrd | one 32-bit add |
FPSUB16 | fpsub16 | fregrs1, fregrs2, fregrd | four 16-bit subtract |
FPSUB16S | fpsub16s | fregrs1, fregrs2, fregrd | two 16-bit subtract |
FPSUB32 | fpsub32 | fregrs1, fregrs2, fregrd | two 32-bit subtract |
FPSUB32S | fpsub32s | fregrs1, fregrs2, fregrd | one 32-bit subtract |
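A partitioned add treats one 64-bit register as independent lanes, so a carry out of one lane never reaches its neighbor. The C sketch below models the four-lane 16-bit case. The function name `fpadd16_model` is hypothetical, and the sketch assumes each lane wraps modulo 2^16 rather than saturating.

```c
#include <assert.h>
#include <stdint.h>

/* Add four independent 16-bit lanes packed into 64-bit operands. */
uint64_t fpadd16_model(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        unsigned shift = 16u * lane;
        uint16_t x = (uint16_t)(a >> shift);
        uint16_t y = (uint16_t)(b >> shift);
        uint16_t sum = (uint16_t)(x + y);    /* carry out of the lane is dropped */
        result |= (uint64_t)sum << shift;
    }
    return result;
}
```

For example, adding 0x0001 to a lane that holds 0xFFFF yields 0x0000 in that lane and leaves the lane above it unchanged.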
Pack instructions convert to a lower-precision fixed format or to a pixel format.
Table E–16
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
FPACK16 | fpack16 | fregrs2, fregrd | four 16-bit packs |
FPACK32 | fpack32 | fregrs1, fregrs2, fregrd | two 32-bit packs |
FPACKFIX | fpackfix | fregrs2, fregrd | four 16-bit packs |
FEXPAND | fexpand | fregrs2, fregrd | four 16-bit expands |
FPMERGE | fpmerge | fregrs1, fregrs2, fregrd | two 32-bit merges |
Partitioned multiply instructions have the following variations.
Table E–17
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
FMUL8x16 | fmul8x16 | fregrs1, fregrs2, fregrd | 8x16-bit partition |
FMUL8x16AU | fmul8x16au | fregrs1, fregrs2, fregrd | 8x16-bit upper partition |
FMUL8x16AL | fmul8x16al | fregrs1, fregrs2, fregrd | 8x16-bit lower partition |
FMUL8SUx16 | fmul8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
FMUL8ULx16 | fmul8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |
FMULD8SUx16 | fmuld8sux16 | fregrs1, fregrs2, fregrd | upper 8x16-bit partition |
FMULD8ULx16 | fmuld8ulx16 | fregrs1, fregrs2, fregrd | lower unsigned 8x16-bit partition |
Alignment instructions have the following variations.
Table E–18
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
ALIGNADDRESS | alignaddr | regrs1, regrs2, regrd | find misaligned data access address |
ALIGNADDRESS_LITTLE | alignaddrl | regrs1, regrs2, regrd | same as above, but little-endian |
FALIGNDATA | faligndata | fregrs1, fregrs2, fregrd | perform data alignment for misaligned data |
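The two alignment steps can be sketched in C. This model assumes that alignaddr adds its two operands, returns the sum rounded down to an 8-byte boundary, and records the low three bits of the sum as a byte offset, and that faligndata then extracts eight bytes starting at that offset from the 16-byte concatenation of its two source registers (first source most significant). All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of alignaddr: aligned address plus byte offset. */
typedef struct {
    uint64_t aligned;   /* sum with the low three bits cleared        */
    unsigned align;     /* low three bits of the sum (offset 0..7)    */
} alignaddr_result;

alignaddr_result alignaddr_model(uint64_t base, uint64_t offset) {
    uint64_t sum = base + offset;
    alignaddr_result r;
    r.aligned = sum & ~UINT64_C(7);
    r.align = (unsigned)(sum & 7);
    return r;
}

/* Take 8 bytes from the 16-byte value hi:lo, starting align bytes
 * from the most significant end. */
uint64_t faligndata_model(uint64_t hi, uint64_t lo, unsigned align) {
    assert(align < 8);
    if (align == 0) {
        return hi;                  /* avoids an undefined 64-bit shift */
    }
    return (hi << (8 * align)) | (lo >> (64 - 8 * align));
}
```

A loop over misaligned data would typically load two consecutive aligned doublewords and pass them as `hi` and `lo` with the offset from `alignaddr_model`.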
Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).
Table E–19
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
FZERO | fzero | fregrd | zero fill |
FZEROS | fzeros | fregrd | zero fill, single precision |
FONE | fone | fregrd | one fill |
FONES | fones | fregrd | one fill, single precision |
FSRC1 | fsrc1 | fregrs1, fregrd | copy src1 |
FSRC1S | fsrc1s | fregrs1, fregrd | copy src1, single precision |
FSRC2 | fsrc2 | fregrs2, fregrd | copy src2 |
FSRC2S | fsrc2s | fregrs2, fregrd | copy src2, single precision |
FNOT1 | fnot1 | fregrs1, fregrd | negate src1, 1's complement |
FNOT1S | fnot1s | fregrs1, fregrd | same as above, single precision |
FNOT2 | fnot2 | fregrs2, fregrd | negate src2, 1's complement |
FNOT2S | fnot2s | fregrs2, fregrd | same as above, single precision |
FOR | for | fregrs1, fregrs2, fregrd | logical OR |
FORS | fors | fregrs1, fregrs2, fregrd | logical OR, single precision |
FNOR | fnor | fregrs1, fregrs2, fregrd | logical NOR |
FNORS | fnors | fregrs1, fregrs2, fregrd | logical NOR, single precision |
FAND | fand | fregrs1, fregrs2, fregrd | logical AND |
FANDS | fands | fregrs1, fregrs2, fregrd | logical AND, single precision |
FNAND | fnand | fregrs1, fregrs2, fregrd | logical NAND |
FNANDS | fnands | fregrs1, fregrs2, fregrd | logical NAND, single precision |
FXOR | fxor | fregrs1, fregrs2, fregrd | logical XOR |
FXORS | fxors | fregrs1, fregrs2, fregrd | logical XOR, single precision |
FXNOR | fxnor | fregrs1, fregrs2, fregrd | logical XNOR |
FXNORS | fxnors | fregrs1, fregrs2, fregrd | logical XNOR, single precision |
FORNOT1 | fornot1 | fregrs1, fregrs2, fregrd | negated src1 OR src2 |
FORNOT1S | fornot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
FORNOT2 | fornot2 | fregrs1, fregrs2, fregrd | src1 OR negated src2 |
FORNOT2S | fornot2s | fregrs1, fregrs2, fregrd | same as above, single precision |
FANDNOT1 | fandnot1 | fregrs1, fregrs2, fregrd | negated src1 AND src2 |
FANDNOT1S | fandnot1s | fregrs1, fregrs2, fregrd | same as above, single precision |
FANDNOT2 | fandnot2 | fregrs1, fregrs2, fregrd | src1 AND negated src2 |
FANDNOT2S | fandnot2s | fregrs1, fregrs2, fregrd | same as above, single precision |
Pixel compare instructions compare the fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit values).
Table E–20
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
FCMPGT16 | fcmpgt16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1>src2 |
FCMPGT32 | fcmpgt32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1>src2 |
FCMPLE16 | fcmple16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1≤src2 |
FCMPLE32 | fcmple32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1≤src2 |
FCMPNE16 | fcmpne16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1≠src2 |
FCMPNE32 | fcmpne32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1≠src2 |
FCMPEQ16 | fcmpeq16 | fregrs1, fregrs2, regrd | 4 16-bit compare, set rd if src1=src2 |
FCMPEQ32 | fcmpeq32 | fregrs1, fregrs2, regrd | 2 32-bit compare, set rd if src1=src2 |
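A pixel compare produces one result bit per lane in an integer register. The C sketch below models the four-lane signed greater-than case. The function name `fcmpgt16_model` is hypothetical, and the mapping of the least significant lane to bit 0 of the result is an assumption of this sketch; check the processor manual for the exact bit assignment.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 16-bit pattern as a signed two's-complement value. */
static int32_t as_signed16(uint16_t u) {
    return (u & 0x8000u) ? (int32_t)u - 0x10000 : (int32_t)u;
}

/* Compare four signed 16-bit lanes; bit i of the result is set when
 * lane i of a is greater than lane i of b (lane 0 = bits 15..0). */
unsigned fcmpgt16_model(uint64_t a, uint64_t b) {
    unsigned mask = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        unsigned shift = 16u * lane;
        int32_t x = as_signed16((uint16_t)(a >> shift));
        int32_t y = as_signed16((uint16_t)(b >> shift));
        if (x > y) {
            mask |= 1u << lane;
        }
    }
    return mask;
}
```

The resulting mask is in the form that a partial store expects as its register mask operand.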
Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.
Table E–21
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
EDGE8 | edge8 | regrs1, regrs2, regrd | 8 8-bit edge boundary processing |
EDGE8L | edge8l | regrs1, regrs2, regrd | same as above, little-endian |
EDGE16 | edge16 | regrs1, regrs2, regrd | 4 16-bit edge boundary processing |
EDGE16L | edge16l | regrs1, regrs2, regrd | same as above, little-endian |
EDGE32 | edge32 | regrs1, regrs2, regrd | 2 32-bit edge boundary processing |
EDGE32L | edge32l | regrs1, regrs2, regrd | same as above, little-endian |
Pixel component distance instructions are used for motion estimation in video compression algorithms.
Table E–22
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
PDIST | pdist | fregrs1, fregrs2, fregrd | distance between 8 8-bit components |
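The distance used for motion estimation is a sum of absolute differences over the eight unsigned byte components. The C sketch below models that computation; the function name `pdist_model` is hypothetical, and the sketch assumes the instruction adds the sum to the previous contents of the destination register, passed here as `acc`.

```c
#include <assert.h>
#include <stdint.h>

/* Sum of absolute differences of eight unsigned 8-bit components,
 * accumulated onto acc (the prior destination value). */
uint64_t pdist_model(uint64_t a, uint64_t b, uint64_t acc) {
    for (unsigned i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        acc += (x > y) ? (uint64_t)(x - y) : (uint64_t)(y - x);
    }
    return acc;
}
```

Calling the function repeatedly with the running total as `acc` accumulates the distance over a whole block of pixels.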
The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.
Table E–23
SPARC | Mnemonic | Argument List | Description |
---|---|---|---|
ARRAY8 | array8 | regrs1, regrs2, regrd | convert 8-bit 3-D address to blocked byte address |
ARRAY16 | array16 | regrs1, regrs2, regrd | same as above, but 16-bit |
ARRAY32 | array32 | regrs1, regrs2, regrd | same as above, but 32-bit |
These memory access instructions are part of the SPARC-V9 instruction set extensions.
Table E–24
SPARC | imm_asi | Argument List | Description |
---|---|---|---|
STDFA | ASI_PST8_P | stda fregrd, [fregrs1] regmask, imm_asi | eight 8-bit conditional stores to primary address space |
STDFA | ASI_PST8_S | | eight 8-bit conditional stores to secondary address space |
STDFA | ASI_PST8_PL | | eight 8-bit conditional stores to primary address space, little endian |
STDFA | ASI_PST8_SL | | eight 8-bit conditional stores to secondary address space, little endian |
STDFA | ASI_PST16_P | | four 16-bit conditional stores to primary address space |
STDFA | ASI_PST16_S | | four 16-bit conditional stores to secondary address space |
STDFA | ASI_PST16_PL | | four 16-bit conditional stores to primary address space, little endian |
STDFA | ASI_PST16_SL | | four 16-bit conditional stores to secondary address space, little endian |
STDFA | ASI_PST32_P | | two 32-bit conditional stores to primary address space |
STDFA | ASI_PST32_S | | two 32-bit conditional stores to secondary address space |
STDFA | ASI_PST32_PL | | two 32-bit conditional stores to primary address space, little endian |
STDFA | ASI_PST32_SL | | two 32-bit conditional stores to secondary address space, little endian |
To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.
SPARC | imm_asi | Argument List | Description |
---|---|---|---|
LDDFA STDFA | ASI_FL8_P | ldda [reg_addr] imm_asi, fregrd; stda fregrd, [reg_addr] imm_asi | 8-bit load/store from/to primary address space |
LDDFA STDFA | ASI_FL8_S | ldda [reg_plus_imm] %asi, fregrd; stda fregrd, [reg_plus_imm] %asi | 8-bit load/store from/to secondary address space |
LDDFA STDFA | ASI_FL8_PL | | 8-bit load/store from/to primary address space, little endian |
LDDFA STDFA | ASI_FL8_SL | | 8-bit load/store from/to secondary address space, little endian |
LDDFA STDFA | ASI_FL16_P | | 16-bit load/store from/to primary address space |
LDDFA STDFA | ASI_FL16_S | | 16-bit load/store from/to secondary address space |
LDDFA STDFA | ASI_FL16_PL | | 16-bit load/store from/to primary address space, little endian |
LDDFA STDFA | ASI_FL16_SL | | 16-bit load/store from/to secondary address space, little endian |
To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.
SPARC | imm_asi | Argument List | Description |
---|---|---|---|
LDDA | ASI_NUCLEUS_QUAD_LDD | [reg_addr] imm_asi, regrd | 128-bit atomic load |
LDDA | ASI_NUCLEUS_QUAD_LDD_L | [reg_plus_imm] %asi, regrd | 128-bit atomic load, little endian |
LDDFA STDFA | ASI_BLK_AIUP | ldda [reg_addr] imm_asi, fregrd; stda fregrd, [reg_addr] imm_asi | 64-byte block load/store from/to primary address space, user privilege |
LDDFA STDFA | ASI_BLK_AIUS | ldda [reg_plus_imm] %asi, fregrd; stda fregrd, [reg_plus_imm] %asi | 64-byte block load/store from/to secondary address space, user privilege |
LDDFA STDFA | ASI_BLK_AIUPL | | 64-byte block load/store from/to primary address space, user privilege, little endian |
LDDFA STDFA | ASI_BLK_AIUSL | | 64-byte block load/store from/to secondary address space, user privilege, little endian |
LDDFA STDFA | ASI_BLK_P | | 64-byte block load/store from/to primary address space |
LDDFA STDFA | ASI_BLK_S | | 64-byte block load/store from/to secondary address space |
LDDFA STDFA | ASI_BLK_PL | | 64-byte block load/store from/to primary address space, little endian |
LDDFA STDFA | ASI_BLK_SL | | 64-byte block load/store from/to secondary address space, little endian |
LDDFA STDFA | ASI_BLK_COMMIT_P | | 64-byte block commit store to primary address space |
LDDFA STDFA | ASI_BLK_COMMIT_S | | 64-byte block commit store to secondary address space |
To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.