Instructions, Operands, and Addressing - x86 Assembly Language Reference Manual

2.2 Instructions, Operands, and Addressing

Instructions are operations performed by the CPU. Operands are entities operated upon by the instruction. Addresses are the locations in memory of specified data.

2.2.1 Instructions in Assembly Language

An instruction is a statement that is executed at runtime. An x86 instruction statement can consist of four parts:

Label (optional)
Instruction (required)
Operands (instruction specific)
Comment (optional)

See Statements in Assembly Language for the description of labels and comments.

The terms instruction and mnemonic are used interchangeably in this document to refer to the names of x86 instructions. Although the term opcode is sometimes used as a synonym for instruction, this document reserves the term opcode for the hexadecimal representation of the instruction value.

For most instructions, the Oracle Solaris x86 assembler mnemonics are the same as the Intel or AMD mnemonics. However, the Oracle Solaris x86 mnemonics might appear to be different because the Oracle Solaris mnemonics are suffixed with a one-character modifier that specifies the size of the instruction operands. That is, the Oracle Solaris assembler derives its operand type information from the instruction name and the suffix. If a mnemonic is specified with no type suffix, the operand type defaults to long. Possible operand types and their instruction suffixes are:

b: Byte (8-bit)
w: Word (16-bit)
l: Long (32-bit) (default)
q: Quadword (64-bit)
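The suffix rule above can be modeled in a few lines. The following is a minimal sketch, not the assembler's actual lookup: `BASE_MNEMONICS` is a hypothetical, deliberately tiny instruction table, and `operand_bits` is an illustrative helper name. It also shows why a table is needed at all: `sub` ends in `b`, yet it is an unsuffixed mnemonic rather than a byte-sized form of `su`.

```python
# Operand-size suffixes from the list above, in bits.
SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}

# Hypothetical, deliberately tiny set of base mnemonics for illustration.
# A real assembler consults its full instruction table instead.
BASE_MNEMONICS = {"mov", "add", "sub", "push", "pop"}


def operand_bits(mnemonic):
    """Return the operand size in bits implied by a mnemonic."""
    if mnemonic in BASE_MNEMONICS:
        # No suffix: the operand type defaults to long (32-bit).
        return SUFFIX_BITS["l"]
    base, suffix = mnemonic[:-1], mnemonic[-1:]
    if base in BASE_MNEMONICS and suffix in SUFFIX_BITS:
        return SUFFIX_BITS[suffix]
    raise ValueError(f"unknown mnemonic: {mnemonic!r}")


for name in ("movb", "movw", "movl", "movq", "mov", "sub"):
    print(name, operand_bits(name))
```

Checking the unsuffixed table first is the design choice that keeps `sub` from being misread as a suffixed form.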

The assembler recognizes the following suffixes for x87 floating-point instructions:

[no suffix]: Instruction operands are registers only
l ("long"): Instruction operands are 64-bit
s ("short"): Instruction operands are 32-bit
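Note that the `l` suffix means different widths in the two lists: 32 bits for the general instructions and 64 bits for x87 floating-point instructions. The sketch below only restates the two lists as lookup tables; `suffix_bits` and its `x87` flag are illustrative names, not part of any assembler interface. The final line assumes that the 32-bit and 64-bit operands are IEEE 754 single-precision and double-precision values, which is what Python's `struct` codes `f` and `d` pack.

```python
import struct

# Suffix tables taken from the two lists above (widths in bits).
INTEGER_SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}
X87_SUFFIX_BITS = {"s": 32, "l": 64}


def suffix_bits(suffix, x87=False):
    """Look up the operand width for a suffix in the appropriate table."""
    table = X87_SUFFIX_BITS if x87 else INTEGER_SUFFIX_BITS
    return table[suffix]


print(suffix_bits("l"))            # general instruction: long
print(suffix_bits("l", x87=True))  # x87 instruction: "long" floating point

# Assumption: the x87 memory operands are IEEE 754 single/double values.
# Their packed sizes match the "s" and "l" widths.
print(struct.calcsize("<f") * 8, struct.calcsize("<d") * 8)
```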

See Instruction Set Mapping for a mapping between Oracle Solaris x86 assembly language mnemonics and the equivalent Intel or AMD mnemonics.

2.2.2 Operands in Assembly Language

An x86 instruction can have zero to three operands. Operands are separated by commas (,) (ASCII 0x2C). For instructions with two operands, the first (left-hand) operand is the source operand, and the second (right-hand) operand is the destination operand (that is, source→destination).

Note - The Intel assembler uses the opposite order (destination←source) for operands.
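When reading code written for both conventions, it can help to make the operand order explicit. The following is a minimal sketch that splits a two-operand statement into source and destination and then lists the operands destination-first. The function names are illustrative assumptions. The sketch reorders operands only. It does not translate prefixes, suffixes, or memory-reference syntax into Intel notation.

```python
def split_operands(operand_text):
    """Split operands on commas that are not inside parentheses."""
    operands, current, depth = [], [], 0
    for ch in operand_text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            operands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        operands.append(tail)
    return operands


def source_and_destination(statement):
    """Return (mnemonic, source, destination) for a two-operand statement."""
    mnemonic, _, rest = statement.strip().partition(" ")
    operands = split_operands(rest)
    if len(operands) != 2:
        raise ValueError("expected a two-operand instruction")
    source, destination = operands
    return mnemonic, source, destination


def destination_first(statement):
    """List the operands in destination, source order (no syntax translation)."""
    _, source, destination = source_and_destination(statement)
    return [destination, source]


print(source_and_destination("movl $5, %eax"))
print(destination_first("movl (%ebx, %esi, 4), %eax"))
```

Tracking parenthesis depth matters because the commas inside a memory reference such as `(%ebx, %esi, 4)` do not separate operands.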

Operands can be immediate (that is, constant expressions that evaluate to an inline value), register (a value in one of the processor's general-purpose registers), or memory (a value stored in memory). An indirect operand contains the address of the actual operand value. Indirect operands are specified by prefixing the operand with an asterisk (*) (ASCII 0x2A). Only jump and call instructions can use indirect operands.

Immediate operands are prefixed with a dollar sign ($) (ASCII 0x24).
Register names are prefixed with a percent sign (%) (ASCII 0x25).
Memory operands are specified either by the name of a variable or by a register that contains the address of a variable. A variable name implies the address of a variable and instructs the computer to reference the contents of memory at that address. Memory references have the following syntax: segment:offset(base, index, scale).
- Segment is any of the x86 architecture segment registers. Segment is optional: if specified, it must be separated from offset by a colon (:). If segment is omitted, the value of %ds (the default segment register) is assumed.
- Offset is the displacement from the start of the segment to the desired memory value. Offset is optional.
- Base and index can be any of the 32-bit general-purpose registers.
- Scale is a factor by which index is to be multiplied before being added to base to specify the address of the operand. Scale can have the value of 1, 2, 4, or 8. If scale is not specified, the default value is 1.
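The prefix rules above are enough to tell the three operand kinds apart. The following is a minimal sketch under simplifying assumptions: `classify_operand` is an illustrative name, operands are assumed to be well formed, and anything that is neither an immediate nor a bare register name is treated as a memory reference.

```python
def classify_operand(operand):
    """Return (kind, indirect) for one operand.

    kind is "immediate", "register", or "memory";
    indirect is True when the operand carries the * prefix.
    """
    text = operand.strip()
    indirect = text.startswith("*")
    if indirect:
        text = text[1:]
    if text.startswith("$"):
        kind = "immediate"
    elif text.startswith("%") and ":" not in text:
        # A bare register name. "%cs:var" also starts with %, but the
        # colon marks it as a segment-qualified memory reference.
        kind = "register"
    else:
        kind = "memory"
    return kind, indirect


for op in ("$var", "%eax", "var", "%cs:var", "(%ebx, %esi, 4)", "*%eax"):
    print(op, classify_operand(op))
```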
Some examples of memory addresses are:

movl var, %eax
Move the contents of memory location var into general-purpose register %eax.

movl %cs:var, %eax
Move the contents of memory location var in the code segment (register %cs) into general-purpose register %eax.

movl $var, %eax
Move the address of var into general-purpose register %eax.

movl array_base(%esi), %eax
Add the address of memory location array_base to the contents of general-purpose register %esi to determine an address in memory. Move the contents of this address into general-purpose register %eax.

movl (%ebx, %esi, 4), %eax
Multiply the contents of general-purpose register %esi by 4 and add the result to the contents of general-purpose register %ebx to produce a memory reference. Move the contents of this memory location into general-purpose register %eax.

movl struct_base(%ebx, %esi, 4), %eax
Multiply the contents of general-purpose register %esi by 4, add the result to the contents of general-purpose register %ebx, and add the result to the address of struct_base to produce an address. Move the contents of this address into general-purpose register %eax.
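The address arithmetic in these examples follows one formula: offset + base + (index × scale). The sketch below works through it with made-up values. The addresses chosen for array_base and struct_base and the contents of %ebx and %esi are hypothetical. The segment part is left out, so the result is the offset within the segment. Wrapping the result to 32 bits is an assumption that matches 32-bit address arithmetic.

```python
def effective_address(offset=0, base=0, index=0, scale=1):
    """Compute offset + base + index * scale for offset(base, index, scale)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    # Assumption: addresses are 32 bits wide, so the sum wraps at 2**32.
    return (offset + base + index * scale) & 0xFFFFFFFF


# Hypothetical symbol addresses and register contents, for illustration only.
array_base = 0x1000
struct_base = 0x2000
ebx = 0x100
esi = 3

# movl array_base(%esi), %eax
print(hex(effective_address(offset=array_base, base=esi)))        # 0x1003

# movl (%ebx, %esi, 4), %eax
print(hex(effective_address(base=ebx, index=esi, scale=4)))       # 0x10c

# movl struct_base(%ebx, %esi, 4), %eax
print(hex(effective_address(struct_base, ebx, esi, 4)))           # 0x210c
```

In each case the instruction then moves the 32-bit value stored at the computed address into %eax. The sketch stops at the address itself.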