Linker and Libraries Guide

Relocation Processing

After the runtime linker has located and mapped all the dependencies required by an application, it processes each object and performs all necessary relocations.

During the link-editing of an object, any relocation information supplied with the input relocatable objects is applied to the output file. However, when building a dynamic executable or shared object, many of the relocations cannot be completed at link-edit time because they require logical addresses that are known only when the objects are mapped into memory. In these cases the link-editor generates new relocation records as part of the output file image, and it is this information that the runtime linker must now process.

For a more detailed description of the many relocation types, see "Relocation Types (Processor-Specific)". However, for the purposes of this discussion it is convenient to categorize relocations into one of two types:

Non-symbolic relocations.
Symbolic relocations.

The relocation records for an object can be displayed by using dump(1). For example:

$ dump -rvp libbar.so.1

libbar.so.1:

.rela.got:
Offset      Symndx                Type              Addend

0x10438     0                     R_SPARC_RELATIVE  0
0x1043c     foo                   R_SPARC_GLOB_DAT  0

Here, the file libbar.so.1 contains two relocation records that indicate that the global offset table (the .got section) must be updated.

The first relocation is a simple relative relocation that can be seen from its relocation type and the symbol index (Symndx) field being zero. This relocation needs to use the base address at which the object was mapped into memory to update the associated .got offset.

The second relocation requires the address of the symbol foo. To complete this relocation, the runtime linker must locate this symbol from the dynamic executable and its dependencies that have been mapped so far.

Symbol Lookup

When an object requires a symbol, the runtime linker searches for that symbol based upon the requesting objects' symbol search scope, and the symbol visibility offered by each object within the process. These attributes are applied as defaults to an object at the time it is loaded, as specific modes to dlopen(3DL), and in some cases can be recorded within the object at the time it is built.

Typically, an average user becomes familiar with the default symbol search models that are applied to a dynamic executable and its dependencies, and to objects obtained through dlopen(3DL). The former is outlined in this section and the latter, which is also able to exploit the various symbol lookup attributes, is discussed in "Symbol Lookup".

A dynamic executable and all the dependencies loaded with it are assigned world search scope, and global symbol visibility (see "Symbol Lookup"). Thus, when the runtime linker looks up a symbol for a dynamic executable or for any of the dependencies loaded with the executable, it does so by searching each object, starting with the dynamic executable, and progressing through each dependency in the same order in which the objects were mapped.

As discussed in previous sections, ldd(1) lists the dependencies of a dynamic executable in the order in which they are mapped. Therefore, if the shared object libbar.so.1 requires the address of symbol foo to complete its relocation, and this shared object is a dependency of the dynamic executable prog:

$ ldd prog
        libfoo.so.1 =>   /home/me/lib/libfoo.so.1
        libbar.so.1 =>   /home/me/lib/libbar.so.1

Then, the runtime linker first looks for foo in the dynamic executable prog, then in the shared object /home/me/lib/libfoo.so.1, and finally in the shared object /home/me/lib/libbar.so.1.

Note -

Symbol lookup can be an expensive operation, especially when the size of symbol names increases and the number of dependencies increases. This aspect of performance is discussed in more detail in "Performance Considerations".

Interposition

The runtime linker's mechanism of searching for a symbol first in the dynamic executable and then in each of the dependencies means that the first occurrence of the required symbol will satisfy the search. Therefore, if more than one instance of the same symbol exists, the first instance will interpose on all others (see also "Shared Object Processing").

When Relocations Are Performed

Having briefly described the relocation process, together with the simplification of relocations into the two types, non-symbolic and symbolic, it is also useful to distinguish relocations by when they are performed. This distinction arises due to the type of reference being made to the relocated offset, and can be either a:

A data reference.
A function reference.

A data reference refers to an address that is used as a data item by the application code. The runtime linker has no knowledge of the application code, so does not know when this data item will be referenced. Therefore, all data relocations must be carried out during process initialization, before the application gains control.

A function reference refers to the address of a function that will be called by the application code. During the compilation and link-editing of any dynamic module, calls to global functions are relocated to become calls to a procedure linkage table entry (these entries make up the .plt section).

Procedure linkage table entries are constructed so that, when first called, control is passed to the runtime linker (see "Procedure Linkage Table (Processor-Specific)"). The runtime linker looks up the required symbol and rewrites information in the application so that any future calls to this .plt entry go directly to the function. This mechanism allows relocations of this type to be deferred until the first instance of a function being called, a process that is sometimes referred to as lazy binding.

The runtime linker's default mode of performing lazy binding can be overridden by setting the environment variable LD_BIND_NOW to any non-null value. This environment variable setting causes the runtime linker to perform both data reference and function reference relocations during process initialization, before transferring control to the application. For example:

$ LD_BIND_NOW=yes prog

Here, all relocations within the file prog and within its dependencies will be processed before control is transferred to the application.

Individual objects can also be built using the link-editors' -z now option to indicate that they require complete relocation processing at the time they are loaded. This relocation requirement is also propagated to any dependencies of the marked object at runtime.

Relocation Errors

The most common relocation error occurs when a symbol cannot be found. This condition results in an appropriate runtime linker error message and the termination of the application. For example:

$ ldd prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
$ prog
ld.so.1: prog: fatal: relocation error: file ./libfoo.so.1: \
symbol bar: referenced symbol not found

Here the symbol bar, which is referenced in the file libfoo.so.1, cannot be located.

Note -

During the link-edit of a dynamic executable, any potential relocation errors of this sort are flagged as fatal undefined symbols (see "Generating an Executable" for examples). This runtime relocation error can occur if the link-edit of main used a different version of the shared object libbar.so.1 that contained a symbol definition for bar, or if the -z nodefs option was used as part of the link-edit.

If a relocation error of this type occurs because a symbol used as a data reference cannot be located, the error condition will occur immediately during process initialization. However, because of the default mode of lazy binding, if a symbol used as a function reference cannot be found, the error condition will occur after the application has gained control.

This latter case can take minutes or months, or might never occur, depending on the execution paths exercised throughout the code. To guard against errors of this kind, the relocation requirements of any dynamic executable or shared object can be validated using ldd(1).

When the -d option is specified with ldd(1), all dependencies will be printed and all data reference relocations will be processed. If a data reference cannot be resolved, a diagnostic message is produced. From the previous example this reveals:

$ ldd -d prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
        symbol not found: bar           (./libfoo.so.1)

When the -r option is specified with ldd(1), all data and function reference relocations will be processed, and if either cannot be resolved, a diagnostic message is produced.