E.6 UltraSPARC and VIS Instruction Set Extensions

This section describes extensions that require SPARC-V9. The extensions support enhanced graphics functionality and improved memory access efficiency.

Note - SPARC-V9 instruction set extensions used in executables may not be portable to other SPARC-V9 systems.

E.6.1 Graphics Data Formats

The overhead of converting to and from floating-point arithmetic is high, so the graphics instructions are optimized for short-integer arithmetic. Image components are 8 or 16 bits. Intermediate results are 16 or 32 bits.

E.6.2 Eight-bit Format

A 32-bit word contains a pixel of four unsigned 8-bit integers. The integers represent image intensity values (α, G, B, R). Support is provided for band-interleaved images (which store all color components of a point together) and band-sequential images (which store all values of one color component together).
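As a rough illustration of the eight-bit format, the following Python sketch packs four unsigned 8-bit components into one 32-bit word and converts a band-interleaved image to a band-sequential one. The function names, and the choice to place the first component in the most significant byte, are assumptions made for illustration and are not taken from this manual.

```python
def pack_pixel(c0, c1, c2, c3):
    """Pack four unsigned 8-bit components into one 32-bit word.

    Assumption: c0 occupies the most significant byte.
    """
    for c in (c0, c1, c2, c3):
        if not 0 <= c <= 0xFF:
            raise ValueError("component out of 8-bit range")
    return (c0 << 24) | (c1 << 16) | (c2 << 8) | c3


def unpack_pixel(word):
    """Split a 32-bit word back into its four 8-bit components."""
    return tuple((word >> shift) & 0xFF for shift in (24, 16, 8, 0))


def interleaved_to_sequential(pixels):
    """Band interleaved (one tuple per point) -> band sequential
    (one list per color component)."""
    return [list(band) for band in zip(*pixels)]
```

For example, `interleaved_to_sequential([(1, 2, 3, 4), (5, 6, 7, 8)])` groups the first components of both points together, then the second components, and so on.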

E.6.3 Fixed Data Formats

A 64-bit word contains four 16-bit signed fixed-point values. This is the fixed 16-bit data format.

A 64-bit word contains two 32-bit signed fixed-point values. This is the fixed 32-bit data format.

An intermediate format of fixed data values provides enough precision and dynamic range for filtering and simple image computations on pixel values. Pixel multiplication converts pixel data to fixed data. Pack instructions convert fixed data to pixel data by clipping and truncating to an 8-bit unsigned value. The FPACKFIX instruction supports conversion from 32-bit fixed to 16-bit fixed. Rounding is done by adding one to the rounding bit position. For complex calculations that need more precision or dynamic range, use floating-point data.
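The conversion from fixed data back to pixel data can be sketched as follows. This is a generic model of the round, truncate, and clip steps described above, not a bit-exact model of any pack instruction. The name `fixed_to_pixel` and the `frac_bits` parameter (the number of fraction bits in the fixed value) are hypothetical.

```python
def fixed_to_pixel(value, frac_bits, do_round=True):
    """Convert a signed fixed-point value to an unsigned 8-bit pixel.

    frac_bits is a hypothetical parameter giving the number of
    fraction bits in the fixed-point value.
    """
    if do_round and frac_bits > 0:
        # Rounding: add one at the rounding bit position, which is
        # the most significant bit that will be discarded.
        value += 1 << (frac_bits - 1)
    value >>= frac_bits             # truncate the fraction
    return max(0, min(255, value))  # clip to the 8-bit unsigned range
```

With eight fraction bits, `fixed_to_pixel(0x0180, 8)` rounds 1.5 up to 2, negative inputs clip to 0, and values above 255 clip to 255.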

E.6.4 SHUTDOWN Instruction

All outstanding transactions are completed before the SHUTDOWN instruction completes.

Table E-13

SPARC	Mnemonic	Argument List	Description
`SHUTDOWN`	`shutdown`		shutdown to enter power down mode

E.6.5 Graphics Status Register (GSR)

Use the RDASR and WRASR instructions with ASR 0x13 to access the Graphics Status Register.

Table E-14

SPARC	Mnemonic	Argument List	Description
`RDASR`	`rdasr`	%gsr, reg_rd	read GSR
`WRASR`	`wrasr`	reg_rs1, reg_or_imm, %gsr	write GSR

E.6.6 Graphics Instructions

Unless otherwise specified, floating-point registers contain all instruction operands. There are 32 double-precision registers. Single-precision floating-point registers contain the pixel values, and double-precision floating-point registers contain the fixed values.

The graphics instruction set is mapped into the opcode space reserved for the Implementation-Dependent Instruction 1 (IMPDEP1) instructions.

Partitioned add/subtract instructions perform two 32-bit or four 16-bit partitioned adds or subtracts between the corresponding fixed-point values of the source operands.

Table E-15

SPARC	Mnemonic	Argument List	Description
`FPADD16`	`fpadd16`	freg_rs1, freg_rs2, freg_rd	four 16-bit add
`FPADD16S`	`fpadd16s`	freg_rs1, freg_rs2, freg_rd	two 16-bit add
`FPADD32`	`fpadd32`	freg_rs1, freg_rs2, freg_rd	two 32-bit add
`FPADD32S`	`fpadd32s`	freg_rs1, freg_rs2, freg_rd	one 32-bit add
`FPSUB16`	`fpsub16`	freg_rs1, freg_rs2, freg_rd	four 16-bit subtract
`FPSUB16S`	`fpsub16s`	freg_rs1, freg_rs2, freg_rd	two 16-bit subtract
`FPSUB32`	`fpsub32`	freg_rs1, freg_rs2, freg_rd	two 32-bit subtract
`FPSUB32S`	`fpsub32s`	freg_rs1, freg_rs2, freg_rd	one 32-bit subtract
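The partitioned behavior can be modeled in a few lines of Python. This is a sketch of the idea only: each lane is added or subtracted independently, so a carry or borrow never crosses into the neighboring lane. The function names are hypothetical, and the model assumes results wrap around within a lane rather than saturate.

```python
def partitioned_add(a, b, lane_bits, total_bits=64):
    """Add two operands lane by lane (wraparound assumed)."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        # Masking after the add discards any carry out of the lane.
        lane = ((a >> shift) + (b >> shift)) & lane_mask
        result |= lane << shift
    return result


def partitioned_sub(a, b, lane_bits, total_bits=64):
    """Subtract two operands lane by lane (wraparound assumed)."""
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        # Masking after the subtract discards any borrow.
        lane = ((a >> shift) - (b >> shift)) & lane_mask
        result |= lane << shift
    return result
```

Calling `partitioned_add(x, y, 16)` corresponds to the four 16-bit form, `partitioned_add(x, y, 32)` to the two 32-bit form, and passing `total_bits=32` corresponds to the single-precision variants.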

Pack instructions convert to a lower-precision fixed or pixel format.

Table E-16

SPARC	Mnemonic	Argument List	Description
`FPACK16`	`fpack16`	freg_rs2, freg_rd	four 16-bit packs
`FPACK32`	`fpack32`	freg_rs1, freg_rs2, freg_rd	two 32-bit packs
`FPACKFIX`	`fpackfix`	freg_rs2, freg_rd	two 32-bit packs to 16-bit fixed
`FEXPAND`	`fexpand`	freg_rs2, freg_rd	four 16-bit expands
`FPMERGE`	`fpmerge`	freg_rs1, freg_rs2, freg_rd	two 32-bit merges

Partitioned multiply instructions have the following variations.

Table E-17

SPARC	Mnemonic	Argument List	Description
`FMUL8x16`	`fmul8x16`	freg_rs1, freg_rs2, freg_rd	8x16-bit partition
`FMUL8x16AU`	`fmul8x16au`	freg_rs1, freg_rs2, freg_rd	8x16-bit upper partition
`FMUL8x16AL`	`fmul8x16al`	freg_rs1, freg_rs2, freg_rd	8x16-bit lower partition
`FMUL8SUx16`	`fmul8sux16`	freg_rs1, freg_rs2, freg_rd	upper 8x16-bit partition
`FMUL8ULx16`	`fmul8ulx16`	freg_rs1, freg_rs2, freg_rd	lower unsigned 8x16-bit partition
`FMULD8SUx16`	`fmuld8sux16`	freg_rs1, freg_rs2, freg_rd	upper 8x16-bit partition
`FMULD8ULx16`	`fmuld8ulx16`	freg_rs1, freg_rs2, freg_rd	lower unsigned 8x16-bit partition
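A simplified model of the basic 8x16-bit partitioned multiply is sketched below. It assumes that each of four unsigned 8-bit pixel components is multiplied by the corresponding signed 16-bit fixed value, and that the product is rounded and reduced to its upper 16 bits. The rounding rule and the lane ordering are assumptions that should be checked against the processor manual, and the function names are hypothetical.

```python
def to_signed16(x):
    """Interpret the low 16 bits of x as a signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def fmul8x16_model(pixels32, fixed64):
    """Multiply four 8-bit pixels by four 16-bit fixed values.

    Assumption: lane 0 is the least significant lane of each operand,
    and each product is rounded to its upper 16 bits.
    """
    result = 0
    for lane in range(4):
        pixel = (pixels32 >> (8 * lane)) & 0xFF       # unsigned 8-bit
        coeff = to_signed16(fixed64 >> (16 * lane))   # signed 16-bit
        product = pixel * coeff
        rounded = (product + 0x80) >> 8               # round, drop low 8 bits
        result |= (rounded & 0xFFFF) << (16 * lane)
    return result
```

With a coefficient of 0x0100 in every lane, the model returns each pixel component unchanged in a 16-bit lane, which is one way to picture the conversion from pixel data to fixed data.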

Alignment instructions have the following variations.

Table E-18

SPARC	Mnemonic	Argument List	Description
`ALIGNADDRESS`	`alignaddr`	reg_rs1, reg_rs2, reg_rd	find misaligned data access address
`ALIGNADDRESS_LITTLE`	`alignaddrl`	reg_rs1, reg_rs2, reg_rd	same as above, but little-endian
`FALIGNDATA`	`faligndata`	freg_rs1, freg_rs2, freg_rd	do misaligned data alignment
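The way the alignment instructions in Table E-18 work together can be sketched as follows. The model assumes that the address calculation returns the sum of its operands rounded down to an 8-byte boundary and remembers the low three bits as an alignment offset (held in the GSR on real hardware), and that the data alignment step extracts eight bytes at that offset from the 16-byte concatenation of its two sources. Only the big-endian case is modeled, and the Python names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def alignaddr(rs1, rs2):
    """Return (aligned address, byte offset within the 8-byte block)."""
    addr = (rs1 + rs2) & MASK64
    return addr & ~0x7 & MASK64, addr & 0x7


def faligndata(rs1, rs2, align):
    """Extract 8 bytes starting `align` bytes into the pair rs1:rs2."""
    combined = (rs1 << 64) | rs2      # rs1 supplies the first 8 bytes
    shift = (8 - align) * 8           # bits to drop from the right
    return (combined >> shift) & MASK64
```

A misaligned 8-byte read would then load the two aligned doublewords that straddle the data and pass them to `faligndata` together with the offset returned by `alignaddr`.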

Logical operate instructions perform one of sixteen 64-bit logical operations between rs1 and rs2 (in the standard 64-bit version).

Table E-19

SPARC	Mnemonic	Argument List	Description
`FZERO`	`fzero`	freg_rd	zero fill
`FZEROS`	`fzeros`	freg_rd	zero fill, single precision
`FONE`	`fone`	freg_rd	one fill
`FONES`	`fones`	freg_rd	one fill, single precision
`FSRC1`	`fsrc1`	freg_rs1, freg_rd	copy src1
`FSRC1S`	`fsrc1s`	freg_rs1, freg_rd	copy src1, single precision
`FSRC2`	`fsrc2`	freg_rs2, freg_rd	copy src2
`FSRC2S`	`fsrc2s`	freg_rs2, freg_rd	copy src2, single precision
`FNOT1`	`fnot1`	freg_rs1, freg_rd	negate src1, 1's complement
`FNOT1S`	`fnot1s`	freg_rs1, freg_rd	same as above, single precision
`FNOT2`	`fnot2`	freg_rs2, freg_rd	negate src2, 1's complement
`FNOT2S`	`fnot2s`	freg_rs2, freg_rd	same as above, single precision
`FOR`	`for`	freg_rs1, freg_rs2, freg_rd	logical OR
`FORS`	`fors`	freg_rs1, freg_rs2, freg_rd	logical OR, single precision
`FNOR`	`fnor`	freg_rs1, freg_rs2, freg_rd	logical NOR
`FNORS`	`fnors`	freg_rs1, freg_rs2, freg_rd	logical NOR, single precision
`FAND`	`fand`	freg_rs1, freg_rs2, freg_rd	logical AND
`FANDS`	`fands`	freg_rs1, freg_rs2, freg_rd	logical AND, single precision
`FNAND`	`fnand`	freg_rs1, freg_rs2, freg_rd	logical NAND
`FNANDS`	`fnands`	freg_rs1, freg_rs2, freg_rd	logical NAND, single precision
`FXOR`	`fxor`	freg_rs1, freg_rs2, freg_rd	logical XOR
`FXORS`	`fxors`	freg_rs1, freg_rs2, freg_rd	logical XOR, single precision
`FXNOR`	`fxnor`	freg_rs1, freg_rs2, freg_rd	logical XNOR
`FXNORS`	`fxnors`	freg_rs1, freg_rs2, freg_rd	logical XNOR, single precision
`FORNOT1`	`fornot1`	freg_rs1, freg_rs2, freg_rd	negated src1 OR src2
`FORNOT1S`	`fornot1s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FORNOT2`	`fornot2`	freg_rs1, freg_rs2, freg_rd	src1 OR negated src2
`FORNOT2S`	`fornot2s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FANDNOT1`	`fandnot1`	freg_rs1, freg_rs2, freg_rd	negated src1 AND src2
`FANDNOT1S`	`fandnot1s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
`FANDNOT2`	`fandnot2`	freg_rs1, freg_rs2, freg_rd	src1 AND negated src2
`FANDNOT2S`	`fandnot2s`	freg_rs1, freg_rs2, freg_rd	same as above, single precision
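The descriptions in Table E-19 map directly onto bitwise expressions. The sketch below models a few of the 64-bit forms in Python, following the table's wording for which source is negated. The Python function names simply mirror the mnemonics and are not part of any real library.

```python
MASK64 = (1 << 64) - 1


def fnot1(src1):
    """negate src1, 1's complement"""
    return ~src1 & MASK64


def fxnor(src1, src2):
    """logical XNOR"""
    return ~(src1 ^ src2) & MASK64


def fornot2(src1, src2):
    """src1 OR negated src2"""
    return (src1 | ~src2) & MASK64


def fandnot1(src1, src2):
    """negated src1 AND src2"""
    return ~src1 & src2 & MASK64
```

The single-precision forms would behave the same way with a 32-bit mask in place of `MASK64`.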

Pixel compare instructions compare fixed-point values in rs1 and rs2 (two 32-bit or four 16-bit).

Table E-20

SPARC	Mnemonic	Argument List	Description
`FCMPGT16`	`fcmpgt16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1>src2
`FCMPGT32`	`fcmpgt32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1>src2
`FCMPLE16`	`fcmple16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1≤src2
`FCMPLE32`	`fcmple32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1≤src2
`FCMPNE16`	`fcmpne16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1≠src2
`FCMPNE32`	`fcmpne32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1≠src2
`FCMPEQ16`	`fcmpeq16`	freg_rs1, freg_rs2, reg_rd	4 16-bit compare, set rd if src1=src2
`FCMPEQ32`	`fcmpeq32`	freg_rs1, freg_rs2, reg_rd	2 32-bit compare, set rd if src1=src2
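The compare instructions in Table E-20 produce a small bit mask in an integer register, one bit per lane. The following Python sketch models that behavior under two assumptions that should be verified against the processor manual: the lane values are treated as signed, and bit 0 of the result corresponds to the least significant lane. All function names are hypothetical.

```python
import operator


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def partitioned_compare(rs1, rs2, lane_bits, predicate):
    """Compare 64-bit operands lane by lane and return a bit mask.

    Assumption: bit 0 of the mask is the least significant lane.
    """
    mask = 0
    for lane in range(64 // lane_bits):
        a = to_signed(rs1 >> (lane * lane_bits), lane_bits)
        b = to_signed(rs2 >> (lane * lane_bits), lane_bits)
        if predicate(a, b):
            mask |= 1 << lane
    return mask


def fcmpgt16(rs1, rs2):
    return partitioned_compare(rs1, rs2, 16, operator.gt)


def fcmple16(rs1, rs2):
    return partitioned_compare(rs1, rs2, 16, operator.le)


def fcmpeq32(rs1, rs2):
    return partitioned_compare(rs1, rs2, 32, operator.eq)
```

A mask produced this way has the shape expected by a conditional (partial) store, so a compare followed by a partial store can update only the lanes that satisfy the condition.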

Edge handling instructions handle the boundary conditions for parallel pixel scan line loops.

Table E-21

SPARC	Mnemonic	Argument List	Description
`EDGE8`	`edge8`	reg_rs1, reg_rs2, reg_rd	8 8-bit edge boundary processing
`EDGE8L`	`edge8l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian
`EDGE16`	`edge16`	reg_rs1, reg_rs2, reg_rd	4 16-bit edge boundary processing
`EDGE16L`	`edge16l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian
`EDGE32`	`edge32`	reg_rs1, reg_rs2, reg_rd	2 32-bit edge boundary processing
`EDGE32L`	`edge32l`	reg_rs1, reg_rs2, reg_rd	same as above, little-endian
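To make the idea of boundary handling concrete, here is a simplified Python model of the 8-bit, big-endian case. It assumes that rs1 holds the address of the first byte to process, rs2 holds the address of the last byte of the scan line, and the result is an 8-bit mask whose most significant bit corresponds to the lowest byte address in the current 8-byte block. The model ignores condition codes and the little-endian variants, and the mask layout is an assumption that should be checked against the processor manual.

```python
def edge8_model(rs1, rs2):
    """Return an 8-bit mask of the bytes to process in the 8-byte
    block that contains rs1 (simplified, big-endian only)."""
    # Left edge: skip the bytes before rs1 within its block.
    left = 0xFF >> (rs1 & 7)
    # Right edge: keep the bytes up to and including rs2.
    right = (0xFF << (7 - (rs2 & 7))) & 0xFF
    # The right edge applies only when rs2 falls in the same block.
    same_block = (rs1 & ~7) == (rs2 & ~7)
    return left & right if same_block else left
```

In a scan line loop, the first iteration would use a mask such as `edge8_model(start, end)`, interior iterations would see a full mask, and the final block would be trimmed by the right edge.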

Pixel component distance instructions are used for motion estimation in video compression algorithms.

Table E-22

SPARC	Mnemonic	Argument List	Description
`PDIST`	`pdist`	freg_rs1, freg_rs2, freg_rd	distance between 8 8-bit components
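The distance computation used in motion estimation is commonly a sum of absolute differences. The sketch below models the pixel distance operation that way: the eight unsigned 8-bit components of the two sources are compared pairwise and the absolute differences are added to a running total. Treating the destination as an accumulator is an assumption of this model, and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def pdist_model(rs1, rs2, rd=0):
    """Accumulate the sum of absolute differences of eight
    unsigned 8-bit components into rd."""
    total = rd
    for lane in range(8):
        a = (rs1 >> (8 * lane)) & 0xFF
        b = (rs2 >> (8 * lane)) & 0xFF
        total += abs(a - b)
    return total & MASK64
```

A block-matching loop would call this once per 8-pixel group, passing the previous result back in as `rd`, and then pick the candidate block with the smallest total.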

The three-dimensional array addressing instructions convert three-dimensional fixed-point addresses (in rs1) to a blocked-byte address. The result is stored in rd.

Table E-23

SPARC	Mnemonic	Argument List	Description
`ARRAY8`	`array8`	reg_rs1, reg_rs2, reg_rd	convert 8-bit 3-D address to blocked byte address
`ARRAY16`	`array16`	reg_rs1, reg_rs2, reg_rd	same as above, but 16-bit
`ARRAY32`	`array32`	reg_rs1, reg_rs2, reg_rd	same as above, but 32-bit

E.6.7 Memory Access Instructions

These memory access instructions are part of the SPARC-V9 instruction set extensions.

Table E-24

SPARC	imm_asi	Argument List	Description
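`STDFA`	`ASI_PST8_P`	stda freg_rd, [reg_addr] reg_mask, imm_asi	eight 8-bit conditional stores to primary address space
`STDFA`	`ASI_PST8_S`	stda freg_rd, [reg_addr] reg_mask, imm_asi	eight 8-bit conditional stores to secondary address space
`STDFA`	`ASI_PST8_PL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	eight 8-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST8_SL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	eight 8-bit conditional stores to secondary address space, little endian
`STDFA`	`ASI_PST16_P`	stda freg_rd, [reg_addr] reg_mask, imm_asi	four 16-bit conditional stores to primary address space
`STDFA`	`ASI_PST16_S`	stda freg_rd, [reg_addr] reg_mask, imm_asi	four 16-bit conditional stores to secondary address space
`STDFA`	`ASI_PST16_PL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	four 16-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST16_SL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	four 16-bit conditional stores to secondary address space, little endian
`STDFA`	`ASI_PST32_P`	stda freg_rd, [reg_addr] reg_mask, imm_asi	two 32-bit conditional stores to primary address space
`STDFA`	`ASI_PST32_S`	stda freg_rd, [reg_addr] reg_mask, imm_asi	two 32-bit conditional stores to secondary address space
`STDFA`	`ASI_PST32_PL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	two 32-bit conditional stores to primary address space, little endian
`STDFA`	`ASI_PST32_SL`	stda freg_rd, [reg_addr] reg_mask, imm_asi	two 32-bit conditional stores to secondary address space, little endian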

Note - To select a partial store instruction, use one of the partial store ASIs with the STDA instruction.
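The effect of a partial store can be pictured with a short Python model of the eight 8-bit case. A `bytearray` stands in for memory, and each mask bit decides whether the corresponding byte of the 64-bit source is written. The model covers only big-endian data and assumes that the most significant mask bit controls the byte at the lowest address. The function name is hypothetical.

```python
def partial_store8(memory, addr, data64, mask):
    """Conditionally store the eight bytes of data64 at memory[addr:].

    Assumption: mask bit 7 controls the byte at the lowest address.
    """
    data_bytes = data64.to_bytes(8, "big")
    for i in range(8):
        if mask & (0x80 >> i):
            memory[addr + i] = data_bytes[i]   # byte selected by mask
        # Bytes whose mask bit is clear are left unchanged.
```

For example, storing 0x1122334455667788 with a mask of 0x3C into a zeroed 8-byte buffer writes only the four middle bytes.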

Table E-25

SPARC	imm_asi	Argument List	Description
`LDDFA` `STDFA`	`ASI_FL8_P`	ldda [reg_addr] imm_asi, freg_rd; stda freg_rd, [reg_addr] imm_asi	8-bit load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_FL8_S`	ldda [reg_plus_imm] %asi, freg_rd; stda freg_rd, [reg_plus_imm] %asi	8-bit load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_FL8_PL`		8-bit load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_FL8_SL`		8-bit load/store from/to secondary address space, little endian
`LDDFA` `STDFA`	`ASI_FL16_P`		16-bit load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_FL16_S`		16-bit load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_FL16_PL`		16-bit load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_FL16_SL`		16-bit load/store from/to secondary address space, little endian

Note - To select a short floating-point load and store instruction, use one of the short ASIs with the LDDA and STDA instructions.

Table E-26

SPARC	imm_asi	Argument List	Description
`LDDA`	`ASI_NUCLEUS_QUAD_LDD`	[reg_addr] imm_asi, reg_rd; [reg_plus_imm] %asi, reg_rd	128-bit atomic load
`LDDA`	`ASI_NUCLEUS_QUAD_LDD_L`	[reg_addr] imm_asi, reg_rd; [reg_plus_imm] %asi, reg_rd	128-bit atomic load, little endian
`LDDFA` `STDFA`	`ASI_BLK_AIUP`	ldda [reg_addr] imm_asi, freg_rd; stda freg_rd, [reg_addr] imm_asi	64-byte block load/store from/to primary address space, user privilege
`LDDFA` `STDFA`	`ASI_BLK_AIUS`	ldda [reg_plus_imm] %asi, freg_rd; stda freg_rd, [reg_plus_imm] %asi	64-byte block load/store from/to secondary address space, user privilege
`LDDFA` `STDFA`	`ASI_BLK_AIUPL`		64-byte block load/store from/to primary address space, user privilege, little endian
`LDDFA` `STDFA`	`ASI_BLK_AIUSL`		64-byte block load/store from/to secondary address space, user privilege, little endian
`LDDFA` `STDFA`	`ASI_BLK_P`		64-byte block load/store from/to primary address space
`LDDFA` `STDFA`	`ASI_BLK_S`		64-byte block load/store from/to secondary address space
`LDDFA` `STDFA`	`ASI_BLK_PL`		64-byte block load/store from/to primary address space, little endian
`LDDFA` `STDFA`	`ASI_BLK_SL`		64-byte block load/store from/to secondary address space, little endian
`STDFA`	`ASI_BLK_COMMIT_P`		64-byte block commit store to primary address space
`STDFA`	`ASI_BLK_COMMIT_S`		64-byte block commit store to secondary address space

Note - To select a block load and store instruction, use one of the block transfer ASIs with the LDDA and STDA instructions.

	SPARC Assembly Language Reference Manual