This code sequence implements the GD model described in Thread-Local Storage Access Models.
Table 8–18 x64: General Dynamic Thread-Local Variable Access Codes
The __tls_get_addr() function takes a single parameter, the address of the tls_index structure. The R_AMD64_TLSGD relocation that is associated with the x@tlsgd(%rip) expression instructs the link-editor to allocate a tls_index structure within the GOT. The two elements required for the tls_index structure are maintained in consecutive GOT entries, GOT[n] and GOT[n+1]. These GOT entries are associated with the R_AMD64_DTPMOD64 and R_AMD64_DTPOFF64 relocations.
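The relationship between the two GOT entries and the lookup can be modeled in a few lines of C. This is a minimal sketch, not the runtime's implementation: the field names ti_moduleid and ti_tlsoffset, the sketch_dtv table, and the functions sketch_tls_get_addr() and sketch_demo() are all assumptions introduced for illustration. The sketch assumes that GOT[n] holds the module ID and GOT[n+1] holds the offset of the variable within that module's TLS block.

```c
#include <stddef.h>
#include <string.h>

/* Two-word argument block. Field names and order (module ID, then
 * offset) are assumptions of this sketch. */
typedef struct {
    unsigned long ti_moduleid;   /* assumed to come from R_AMD64_DTPMOD64: GOT[n]   */
    unsigned long ti_tlsoffset;  /* assumed to come from R_AMD64_DTPOFF64: GOT[n+1] */
} tls_index_sketch;

/* Hypothetical stand-in for one thread's per-module TLS block table:
 * sketch_dtv[m] points at the TLS block of module m for this thread. */
#define SKETCH_MAX_MODULES 4
static unsigned char *sketch_dtv[SKETCH_MAX_MODULES];

/* Hypothetical model of the lookup: base of the module's block plus
 * the offset of the variable inside that block. */
void *sketch_tls_get_addr(const tls_index_sketch *ti)
{
    if (ti->ti_moduleid >= SKETCH_MAX_MODULES ||
        sketch_dtv[ti->ti_moduleid] == NULL)
        return NULL;
    return sketch_dtv[ti->ti_moduleid] + ti->ti_tlsoffset;
}

/* Demo: module 1 owns a 16-byte TLS block, and the variable x is an
 * int stored at offset 8 within that block. */
int sketch_demo(void)
{
    static unsigned char module1_block[16];
    int x_value = 42;
    memcpy(module1_block + 8, &x_value, sizeof x_value);
    sketch_dtv[1] = module1_block;

    /* Simulated GOT[n] and GOT[n+1] after the runtime linker has
     * processed the two relocations. */
    unsigned long got[2] = { 1 /* module ID */, 8 /* offset of x */ };

    /* The code sequence passes the address of GOT[n]; the callee
     * reads the two consecutive words as one structure. */
    tls_index_sketch ti;
    memcpy(&ti, got, sizeof ti);

    void *addr = sketch_tls_get_addr(&ti);
    if (addr == NULL)
        return -1;

    int result;
    memcpy(&result, addr, sizeof result);
    return result;
}
```

Calling sketch_demo() reads the variable back through the simulated GOT pair. The point of the model is that the caller only supplies the address of GOT[n]; the module ID and offset are resolved separately by the two relocations.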
The instruction at address 0x00 computes the address of the first of these GOT entries, GOT[n]. This computation adds the PC-relative offset of that entry, which is known at link-edit time, to the current instruction pointer. The result is passed in the %rdi register to the __tls_get_addr() function.
The leaq instruction computes the address of the first GOT entry by adding the PC-relative offset of that entry, which was determined at link-edit time, to the current instruction pointer. The .byte, .word, and .rex64 prefixes ensure that the whole instruction sequence occupies 16 bytes. Prefixes are used for this padding because they have no negative impact on the code.
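The 16-byte total can be checked with simple byte accounting. The following is a sketch under stated assumptions: the prefix values (0x66, 0x6666) and the instruction lengths are the usual x86-64 encodings and are assumed here, and the names gd_piece, gd_sequence, gd_sequence_size(), and gd_piece_offset() are hypothetical.

```c
#include <stddef.h>

/* One element of the padded sequence. Sizes are standard x86-64
 * encoding lengths, assumed for this sketch. */
struct gd_piece {
    const char *what;
    size_t      bytes;
};

static const struct gd_piece gd_sequence[] = {
    { ".byte 0x66 (prefix on leaq)",                    1 },
    { "leaq x@tlsgd(%rip),%rdi (REX.W + 8D + ModRM + disp32)", 7 },
    { ".word 0x6666 (two prefixes on call)",            2 },
    { "rex64 (one more prefix on call)",                1 },
    { "call __tls_get_addr@plt (E8 + rel32)",           5 },
};

#define GD_PIECES (sizeof gd_sequence / sizeof gd_sequence[0])

/* Total length of the sequence in bytes. */
size_t gd_sequence_size(void)
{
    size_t total = 0;
    for (size_t i = 0; i < GD_PIECES; i++)
        total += gd_sequence[i].bytes;
    return total;
}

/* Byte offset at which piece `index` starts; an index of GD_PIECES
 * or more yields the total size. */
size_t gd_piece_offset(size_t index)
{
    size_t offset = 0;
    for (size_t i = 0; i < index && i < GD_PIECES; i++)
        offset += gd_sequence[i].bytes;
    return offset;
}
```

Under these assumptions, four prefix bytes plus the 7-byte leaq and the 5-byte call add up to 16, and the call opcode starts at offset 0x0b. The PC-relative displacement in leaq is measured from the end of that instruction, offset 0x08 in this layout.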