After the runtime linker has located and loaded all the dependencies required by an application, it processes each object and performs all necessary relocations.
During the link-editing of an object, any relocation information supplied with the input relocatable objects is applied to the output file. However, when creating a dynamic executable or shared object, many of the relocations cannot be completed at link-edit time because they require logical addresses that are known only when the objects are loaded into memory. In these cases the link-editor generates new relocation records as part of the output file image, and it is this information that the runtime linker must now process.
For a more detailed description of the many relocation types, see "Relocation Types (Processor-Specific)". However, for the purposes of this discussion it is convenient to categorize relocations into one of two types: non-symbolic relocations and symbolic relocations.
The relocation records for an object can be displayed by using dump(1). For example:
$ dump -rvp libbar.so.1

libbar.so.1:

.rela.got:
Offset      Symndx      Type                Addend

0x10438     0           R_SPARC_RELATIVE    0
0x1043c     foo         R_SPARC_GLOB_DAT    0
The first relocation is a simple relative relocation that can be seen from its relocation type and the symbol index (Symndx) field being zero. This relocation needs to use the base address at which the object was loaded into memory to update the associated .got offset.
The second relocation requires the address of the symbol foo. To complete this relocation, the runtime linker must locate this symbol from the dynamic executable and its dependencies that have been loaded so far.
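The two relocation records shown above can be modelled as follows. This is a minimal sketch in Python, not the runtime linker's implementation. The names `Reloc` and `apply_relocations`, the load base address, and the address assigned to foo are all hypothetical, and process memory is represented as a plain dictionary. The sketch assumes the usual SPARC ABI calculations: base plus addend for R_SPARC_RELATIVE, and symbol value plus addend for R_SPARC_GLOB_DAT.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reloc:
    offset: int             # link-edit time offset of the location to update
    symbol: Optional[str]   # None models a Symndx of 0 (no symbol needed)
    rtype: str              # relocation type name
    addend: int


def apply_relocations(base: int, relocs: List[Reloc],
                      symbols: Dict[str, int]) -> Dict[int, int]:
    """Return a {address: value} map of the locations that were updated."""
    memory: Dict[int, int] = {}
    for r in relocs:
        where = base + r.offset
        if r.rtype == "R_SPARC_RELATIVE":
            # Non-symbolic: only the load base address is required.
            memory[where] = base + r.addend
        elif r.rtype == "R_SPARC_GLOB_DAT":
            # Symbolic: the address of the named symbol must be located first.
            if r.symbol not in symbols:
                raise LookupError(f"symbol {r.symbol}: referenced symbol not found")
            memory[where] = symbols[r.symbol] + r.addend
        else:
            raise ValueError(f"unsupported relocation type: {r.rtype}")
    return memory


# The two records from the dump(1) output above.
RELOCS = [
    Reloc(0x10438, None, "R_SPARC_RELATIVE", 0),
    Reloc(0x1043c, "foo", "R_SPARC_GLOB_DAT", 0),
]

# Hypothetical load address and symbol address, chosen for illustration only.
BASE = 0xFF000000
SYMBOLS = {"foo": 0xFE001234}

for address, value in apply_relocations(BASE, RELOCS, SYMBOLS).items():
    print(hex(address), hex(value))
```

The relative relocation succeeds with no symbol table at all, whereas the R_SPARC_GLOB_DAT record fails unless foo has been located.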
When an object requires a symbol, the runtime linker searches for that symbol based upon the requesting object's symbol search scope, and the symbol visibility offered by each object within the process. These attributes are applied as defaults to an object at the time it is loaded, as specific modes to dlopen(3DL), and in some cases can be recorded within the object at the time it is built.
Typically, an average user becomes familiar with the default symbol search models that are applied to a dynamic executable and its dependencies, and to objects obtained through dlopen(3DL). The former is outlined in the next section "Default Lookup", and the latter, which is also able to exploit the various symbol lookup attributes, is discussed in "Symbol Lookup".
An alternative model for symbol lookup is provided when a dynamic object is created with the link-editor's -Bdirect option (see "External Bindings"). This model directs the runtime linker to search for a symbol directly in the object that provided the symbol at link-edit time. This model is discussed in more detail in "Direct Binding".
A dynamic executable and all the dependencies loaded with it are assigned world search scope, and global symbol visibility (see "Symbol Lookup"). Thus, when the runtime linker looks up a symbol for a dynamic executable or for any of the dependencies loaded with the executable, it does so by searching each object, starting with the dynamic executable, and progressing through each dependency in the same order in which the objects were loaded.
As discussed in previous sections, ldd(1) lists the dependencies of a dynamic executable in the order in which they are loaded. Therefore, if the shared object libbar.so.1 requires the address of symbol foo to complete its relocation, and this shared object is a dependency of the dynamic executable prog:
$ ldd prog
        libfoo.so.1 =>   /home/me/lib/libfoo.so.1
        libbar.so.1 =>   /home/me/lib/libbar.so.1
Then, the runtime linker first looks for foo in the dynamic executable prog, then in the shared object /home/me/lib/libfoo.so.1, and finally in the shared object /home/me/lib/libbar.so.1.
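The search order just described can be sketched as a simple loop. This is an illustration only: the function name `default_lookup` and the contents of the `EXPORTS` table are assumptions (the text does not say which object defines foo, so libbar.so.1 is used here arbitrarily).

```python
from typing import Dict, List, Optional, Set


def default_lookup(name: str, load_order: List[str],
                   exports: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first object, in load order, that defines `name`."""
    for obj in load_order:
        if name in exports.get(obj, set()):
            return obj
    return None


# The dynamic executable comes first, then its dependencies as listed by ldd(1).
LOAD_ORDER = [
    "prog",
    "/home/me/lib/libfoo.so.1",
    "/home/me/lib/libbar.so.1",
]

# Hypothetical symbol tables for each object.
EXPORTS = {
    "prog": {"main"},
    "/home/me/lib/libfoo.so.1": {"foo_init"},
    "/home/me/lib/libbar.so.1": {"foo", "bar"},
}

print(default_lookup("foo", LOAD_ORDER, EXPORTS))
```

With these tables, all three objects are inspected before foo is found, which is why the cost of lookup grows with the number of dependencies.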
Symbol lookup can be an expensive operation, especially when the size of symbol names increases and the number of dependencies increases. This aspect of performance is discussed in more detail in "Performance Considerations". Also see "Direct Binding" for an alternative lookup model.
The runtime linker's default mechanism of searching for a symbol first in the dynamic executable and then in each of the dependencies means that the first occurrence of the required symbol will satisfy the search. Therefore, if more than one instance of the same symbol exists, the first instance will interpose on all others (see also "Shared Object Processing").
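Interposition under the default model can be illustrated by reporting both the definition that wins and the definitions it hides. This is a sketch under stated assumptions: the function `bind_with_interposition` and the symbol tables are hypothetical, and only the first-match rule described above is modelled.

```python
from typing import Dict, List, Optional, Set, Tuple


def bind_with_interposition(name: str, load_order: List[str],
                            exports: Dict[str, Set[str]]
                            ) -> Tuple[Optional[str], List[str]]:
    """Return (winning object, objects whose definitions are interposed upon)."""
    definers = [obj for obj in load_order if name in exports.get(obj, set())]
    if not definers:
        return None, []
    # The first definition in load order satisfies every reference.
    return definers[0], definers[1:]


LOAD_ORDER = ["prog", "libfoo.so.1", "libbar.so.1"]

# Hypothetical tables in which two dependencies both define foo.
EXPORTS = {
    "prog": {"main"},
    "libfoo.so.1": {"foo"},
    "libbar.so.1": {"foo", "bar"},
}

winner, shadowed = bind_with_interposition("foo", LOAD_ORDER, EXPORTS)
print("bound to:", winner, "interposed upon:", shadowed)
```

Here even a reference to foo made from within libbar.so.1 binds to the definition in libfoo.so.1, because libfoo.so.1 was loaded first.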
When creating an object using the link-editor's -Bdirect option, the relationship between the referenced symbol and the dependency that provided the definition is recorded in the object. The runtime linker uses this information to search directly for the symbol in the associated object, rather than carry out the default symbol search model.
The use of -Bdirect also enables lazy loading, which is equivalent to adding the option -zlazyload to the front of the link-edit command line (see "Lazy Loading of Dynamic Dependencies").
This direct binding model can significantly reduce the symbol lookup overhead within a dynamic process that has many symbolic relocations and many dependencies. This model also allows multiple symbols of the same name to be located from different objects that have been directly bound to.
However, direct bindings can circumvent the traditional use of interposition symbols because they bypass the default search model. The default model ensures that all references to a symbol bind to one definition.
Interposition can still be achieved in a direct binding environment, on a per-object basis, if an object is identified as an interposer. Any object loaded using the environment variable LD_PRELOAD (see "Loading Additional Objects"), or created with the link-editor's -zinterpose option, is identified as an interposer. When the runtime linker searches for a directly bound symbol, it first looks in any object identified as an interposer before it looks in the object that supplies the symbol definition.
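The direct binding search, including the interposer check, can be sketched as follows. The function `direct_lookup` and the object names libA.so.1, libB.so.1, and libtrace.so.1 are hypothetical. The sketch models only the two steps described above and does not model what the runtime linker does when the bound object no longer supplies the symbol.

```python
from typing import Dict, List, Optional, Set


def direct_lookup(name: str, bound_object: str, interposers: List[str],
                  exports: Dict[str, Set[str]]) -> Optional[str]:
    """Resolve `name` for a reference that was directly bound to `bound_object`."""
    # Step 1: objects identified as interposers are inspected first.
    for obj in interposers:
        if name in exports.get(obj, set()):
            return obj
    # Step 2: otherwise go straight to the object recorded at link-edit time.
    if name in exports.get(bound_object, set()):
        return bound_object
    return None


# Hypothetical objects that all define a symbol named init.
EXPORTS = {
    "libA.so.1": {"init"},
    "libB.so.1": {"init"},
    "libtrace.so.1": {"init"},
}

# Without interposers, two callers can bind to two different definitions.
print(direct_lookup("init", "libA.so.1", [], EXPORTS))
print(direct_lookup("init", "libB.so.1", [], EXPORTS))

# With an interposer present, both references bind to the interposer.
print(direct_lookup("init", "libA.so.1", ["libtrace.so.1"], EXPORTS))
```

No other objects are searched in either step, which is where the reduction in lookup overhead comes from.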
Having briefly described the relocation process, together with the simplification of relocations into the two types, non-symbolic and symbolic, it is also useful to distinguish relocations by when they are performed. This distinction arises from the type of reference being made to the relocated offset, which can be either a data reference or a function reference.
A data reference refers to an address that is used as a data item by the application code. The runtime linker has no knowledge of the application code, so does not know when this data item will be referenced. Therefore, all data relocations must be carried out during process initialization, before the application gains control.
A function reference refers to the address of a function that will be called by the application code. During the compilation and link-editing of any dynamic module, calls to global functions are relocated to become calls to a procedure linkage table entry (these entries make up the .plt section).
Procedure linkage table entries are constructed so that, when first called, control is passed to the runtime linker (see "Procedure Linkage Table (Processor-Specific)"). The runtime linker looks up the required symbol and rewrites information in the application so that any future calls to this .plt entry go directly to the function. This mechanism allows relocations of this type to be deferred until the first instance of a function being called, a process that is sometimes referred to as lazy binding.
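The effect of lazy binding can be imitated with a small table that resolves each entry on first use. This is a conceptual sketch only: a real .plt entry is processor-specific machine code, and the class `Plt`, its `resolutions` counter, and the `square` function are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict


class Plt:
    """Toy procedure linkage table: each entry is resolved on its first call."""

    def __init__(self, lookup: Callable[[str], Callable[..., Any]]) -> None:
        self._lookup = lookup                 # stands in for the runtime linker
        self._entries: Dict[str, Callable[..., Any]] = {}
        self.resolutions = 0                  # how many symbol lookups occurred

    def call(self, name: str, *args: Any) -> Any:
        target = self._entries.get(name)
        if target is None:
            # First call: control passes to the "runtime linker", which looks
            # up the symbol and rewrites the entry to point at the function.
            target = self._lookup(name)
            self._entries[name] = target
            self.resolutions += 1
        # Later calls go directly to the function.
        return target(*args)


# Hypothetical symbol table offered by the loaded objects.
SYMBOLS = {"square": lambda x: x * x}

plt = Plt(SYMBOLS.__getitem__)
print(plt.call("square", 3), plt.call("square", 4), plt.resolutions)
```

A function that is never called is never looked up, so its relocation cost is never paid.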
The runtime linker's default mode of performing lazy binding can be overridden by setting the environment variable LD_BIND_NOW to any non-null value. This environment variable setting causes the runtime linker to perform both data reference and function reference relocations during process initialization, before transferring control to the application. For example:
$ LD_BIND_NOW=yes prog
Here, all relocations within the file prog and within its dependencies will be processed before control is transferred to the application.
Individual objects can also be built using the link-editor's -znow option to indicate that they require complete relocation processing at the time they are loaded. This relocation requirement is also propagated to any dependencies of the marked object at runtime.
A relocation error occurs when a required symbol cannot be found. For example:
$ ldd prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
$ prog
ld.so.1: prog: fatal: relocation error: file ./libfoo.so.1: \
        symbol bar: referenced symbol not found
Here the symbol bar, which is referenced in the file libfoo.so.1, cannot be located.
During the link-edit of a dynamic executable, any potential relocation errors of this sort are flagged as fatal undefined symbols (see "Generating an Executable" for examples). This runtime relocation error can occur if the link-edit of prog used a different version of the shared object libbar.so.1 that contained a symbol definition for bar, or if the -znodefs option was used as part of the link-edit.
If a relocation error of this type occurs because a symbol used as a data reference cannot be located, the error condition will occur immediately during process initialization. However, because of the default mode of lazy binding, if a symbol used as a function reference cannot be found, the error condition will occur after the application has gained control.
This latter case can take minutes or months, or might never occur, depending on the execution paths exercised throughout the code. To guard against errors of this kind, the relocation requirements of any dynamic executable or shared object can be validated using ldd(1).
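The difference in when a missing symbol is reported can be sketched as follows. This is a simplified model rather than the runtime linker's logic: the class `Process`, its `bind_now` flag (standing in for LD_BIND_NOW or -znow), and the empty symbol table are hypothetical.

```python
from typing import Dict, Iterable


class Process:
    """Toy process: data references bind at startup, functions bind lazily."""

    def __init__(self, data_refs: Iterable[str], func_refs: Iterable[str],
                 symbols: Dict[str, int], bind_now: bool = False) -> None:
        self.symbols = symbols
        # Data references are always resolved during initialization.
        for name in data_refs:
            self._resolve(name)
        # Function references are resolved now only when bind_now is set.
        if bind_now:
            for name in func_refs:
                self._resolve(name)

    def _resolve(self, name: str) -> int:
        if name not in self.symbols:
            raise LookupError(
                f"relocation error: symbol {name}: referenced symbol not found")
        return self.symbols[name]

    def call(self, name: str) -> int:
        # Under lazy binding, a missing function is detected only here.
        return self._resolve(name)


# Lazy binding: startup succeeds even though bar is undefined.
lazy = Process(data_refs=[], func_refs=["bar"], symbols={})
try:
    lazy.call("bar")
except LookupError as err:
    print("failed at first call:", err)

# Immediate binding: the same problem is reported during initialization.
try:
    Process(data_refs=[], func_refs=["bar"], symbols={}, bind_now=True)
except LookupError as err:
    print("failed at startup:", err)
```

If the code path that calls bar is never exercised, the lazily bound process never reports the error at all.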
When the -d option is specified with ldd(1), all dependencies will be printed and all data reference relocations will be processed. If a data reference cannot be resolved, a diagnostic message is produced. From the previous example this reveals:
$ ldd -d prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
        symbol not found: bar           (./libfoo.so.1)
When the -r option is specified with ldd(1), all data and function reference relocations will be processed, and if either cannot be resolved, a diagnostic message is produced.
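The difference between the two options can be summarized in a short model. This sketch does not invoke ldd(1) and is not how ldd(1) is implemented. The function `check_relocations`, the tuple layout, and the function symbol baz are hypothetical; bar is treated as a data reference because the -d example above reports it.

```python
from typing import List, Set, Tuple

# (symbol, kind, referencing file), where kind is "data" or "func".
Relocation = Tuple[str, str, str]


def check_relocations(relocs: List[Relocation], defined: Set[str],
                      mode: str) -> List[str]:
    """Model -d (data references only) and -r (data and function references)."""
    if mode not in ("d", "r"):
        raise ValueError("mode must be 'd' or 'r'")
    kinds = {"data"} if mode == "d" else {"data", "func"}
    return [
        f"symbol not found: {sym} ({path})"
        for sym, kind, path in relocs
        if kind in kinds and sym not in defined
    ]


RELOCS = [
    ("bar", "data", "./libfoo.so.1"),
    ("baz", "func", "./libfoo.so.1"),   # hypothetical unresolved function
]

print(check_relocations(RELOCS, set(), "d"))
print(check_relocations(RELOCS, set(), "r"))
```

In this model the -d check reports only bar, while the -r check reports both symbols, including the function reference that lazy binding would otherwise leave undetected until it is called.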