Linker and Libraries Guide

x64: Thread-Local Variable Access

On x64, the following code sequence models are available for accessing TLS

x64: General Dynamic (GD)

This code sequence implements the GD model described in Thread-Local Storage Access Models.

Table 8–18 x64: General Dynamic Thread-Local Variable Access Codes

Code Sequence

Initial Relocations

Symbol

0x00 .byte 0x66
0x01 leaq  x@tlsgd(%rip), %rdi
0x08 .word 0x666
0x0a rex64
0x0b call  __tls_get_addr@plt

# %iax - contains address of TLS variable
<none>
R_AMD64_TLSGD
<none>
<none>
R_AMD64_PLT32
 
x


__tls_get_addr
 

Outstanding Relocations

Symbol

GOT[n]
GOT[n + 1]
R_AMD64_DTPMOD64
R_AMD64_DTPOFF64
x
x

The __tls_get_addr() function takes a single parameter, the address of the tls_index structure. The R_AMD64_TLSGD relocation that is associated with the x@tlsgd(%rip) expression, instructs the link-editor to allocate a tls_index structure within the GOT. The two elements required for the tls_index structure are maintained in consecutive GOT entries, GOT[n] and GOT[n+1]. These GOT entries are associated to the R_AMD64_DTPMOD64 and R_AMD64_DTPOFF64 relocations.

The instruction at address 0x00 computes the address of the first GOT entry. This computation adds the PC relative address of the beginning of the GOT, which is known at link-edit time, to the current instruction pointer. The result is passed using the %rdi register to the __tls_get_addr() function.


Note –

The leaq instruction computes the address of the first GOT entry. This computation is carried out by adding the PC-relative address of the GOT, which was determined at link-edit time, to the current instruction pointer. The .byte, .word, and .rex64 prefixes insure that the whole instruction sequence occupies 16 bytes. Prefixes are employed, as prefixes have no negative inpact on the code.


x64: Local Dynamic (LD)

This code sequence implements the LD model described in Thread-Local Storage Access Models.

Table 8–19 x64: Local Dynamic Thread-Local Variable Access Codes

Code Sequence

Initial Relocations

Symbol

0x00 leaq  x1@tlsld(%rip), %rdi
0x07 call  __tls_get_addr@plt

# %rax - contains address of TLS block

0x10 leaq  x1@dtpoff(%rax), %rcx

# %rcx - contains address of TLS variable x1

0x20 leaq  x2@dtpoff(%rax), %r9

# %rcx - contains address of TLS variable x2
R_AMD64_TLSLD
R_AMD64_PLT32
 


R_AMD64_DTOFF32
 


R_AMD64_DTOFF32
x1
__tls_get_addr



 
x1


 
x2
 

Outstanding Relocations

Symbol

GOT[n]
R_AMD64_DTMOD64
x1

The first two instructions are equivalent to the code sequence used for the general dynamic model, although without any padding. The two instructions must be consecutive. The x1@tlsld(%rip) sequence generates a the tls_index entry for symbol x1. This index refers to the current module that contains x1 with an offset of zero. The link-editor creates one relocation for the object, R_AMD64_DTMOD64.

The R_AMD64_DTOFF32 relocation is unnecessary, because offsets are loaded separately. The x1@dtpoff expression is used to access the offset of the symbol x1. Using the instruction as address 0x10, the complete offset is loaded and added to the result of the __tls_get_addr() call in %rax to produce the result in %rcx. The x1@dtpoff expression creates the R_AMD64_DTPOFF32 relocation.

Instead of computing the address of the variable, the value of the variable can be loaded using the following instruction. This instruction creates the same relocation as the original leaq instruction.


movq  x1@dtpoff(%rax), %r11

Provided the base address of a TLS block is maintained within a register, loading, storing or computing the address of a protected thread-local variable requires one instruction.

Benefits exist in using the local dynamic model over the general dynamic model. Every additional thread-local variable access only requires three new instructions. In addition, no additional GOT entries, or runtime relocations are required.

x64: Initial Executable (IE)

This code sequence implements the IE model described in Thread-Local Storage Access Models.

Table 8–20 x64: Initial Executable, Thread-Local Variable Access Codes

Code Sequence

Initial Relocations

Symbol

0x00 movq  %fs:0, %rax
0x09 addq  x@gottpoff(%rip), %rax

# %rax - contains address of TLS variable
<none>
R_AMD64_GOTTPOFF
 
x
 

Outstanding Relocations

Symbol

GOT[n]
R_AMD64_TPOFF64
x

The R_AMD64_GOTTPOFF relocation for the symbol x requests the link-editor to generate a GOT entry and an associated R_AMD64_TPOFF64 relocation. The offset of the GOT entry relative to the end of the x@gottpoff(%rip) instruction, is then used by the instruction. The R_AMD64_TPOFF64 relocation uses the value of the symbol x that is determined from the presently loaded modules. The offset is written in the GOT entry and later loaded by the addq instruction.

To load the contents of x, rather than the address of x, the following sequence is available.

Table 8–21 x64: Initial Executable, Thread-Local Variable Access Codes II

Code Sequence

Initial Relocations

Symbol

0x00 movq  x@gottpoff(%rip), %rax
0x06 movq  %fs:(%rax), %rax

# %rax - contains contents of TLS variable
R_AMD64_GOTTPOFF
<none>
x
 

Outstanding Relocations

Symbol

GOT[n]
R_AMD64_TPOFF64
x

x64: Local Executable (LE)

This code sequence implements the LE model described in Thread-Local Storage Access Models.

Table 8–22 x64: Local Executable Thread-Local Variable Access Codes

Code Sequence

Initial Relocations

Symbol

0x00 movq %fs:0, %rax
0x06 leaq  x@tpoff(%rax), %rax

# %rax - contains address of TLS variable
<none>
R_AMD64_TPOFF32
x

To load the contents of a TLS variable instead of the address of a TLS variable, the following sequence can be used.

Table 8–23 x64: Local Executable Thread-Local Variable Access Codes II

Code Sequence

Initial Relocations

Symbol

0x00 movq %fs:0, %rax
0x06 movq  x@tpoff(%rax), %rax

# %rax - contains contents of TLS variable
<none>
R_AMD64_TPOFF32
x

The following sequence is even shorter.

Table 8–24 x64: Local Executable Thread-Local Variable Access Codes III

Code Sequence

Initial Relocations

Symbol

0x00 movq  %fs:x@tpoff, %rax

# %rax - contains contents of TLS variable
R_AMD64_TPOFF32
x