Linker and Libraries Guide

Relocation Processing

After the runtime linker has loaded all the dependencies required by an application, the linker processes each object and performs all necessary relocations.

During the link-editing of an object, any relocation information supplied with the input relocatable objects is applied to the output file. However, when creating a dynamic executable or shared object, many of the relocations cannot be completed at link-edit time because they require logical addresses that are known only when the objects are loaded into memory. In these cases the link-editor generates new relocation records as part of the output file image. The runtime linker must then process these new relocation records.

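For a more detailed description of the many relocation types, see Relocation Types (Processor-Specific). There are two basic types of relocation: non-symbolic relocations, which need only the base address at which the object is loaded, and symbolic relocations, which require the address of a symbol.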

The relocation records for an object can be displayed by using dump(1). In the following example, the file libbar.so.1 contains two relocation records that indicate that the global offset table (the .got section) must be updated.


$ dump -rvp libbar.so.1

libbar.so.1:

.rela.got:
Offset      Symndx                Type              Addend

0x10438     0                     R_SPARC_RELATIVE  0
0x1043c     foo                   R_SPARC_GLOB_DAT  0

The first relocation is a simple relative relocation, as can be seen from its relocation type and from the symbol index (Symndx) field being zero. This relocation needs to use the base address at which the object was loaded into memory to update the associated .got offset.

The second relocation requires the address of the symbol foo. To complete this relocation, the runtime linker must locate this symbol from either the dynamic executable or one of its dependencies.
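The handling of these two records can be sketched in a few lines of Python. This is a simplified model rather than the runtime linker's actual implementation: memory is represented as a dictionary, and the load address and the address of foo are hypothetical values chosen for illustration. The sketch assumes the usual SPARC calculations, base plus addend for R_SPARC_RELATIVE and symbol address plus addend for R_SPARC_GLOB_DAT.

```python
def apply_relocations(memory, base, relocs, symbol_addresses):
    """Apply each (offset, symbol, type, addend) record to `memory`.

    `memory` is a dict mapping addresses to words; a real runtime linker
    patches the mapped object image directly.
    """
    for offset, symbol, rtype, addend in relocs:
        target = base + offset  # recorded offset adjusted by the load base
        if rtype == "R_SPARC_RELATIVE":
            # Non-symbolic: only the load base is needed.
            memory[target] = base + addend
        elif rtype == "R_SPARC_GLOB_DAT":
            # Symbolic: needs the runtime address of the named symbol.
            memory[target] = symbol_addresses[symbol] + addend
        else:
            raise ValueError(f"unhandled relocation type: {rtype}")
    return memory


# The two records from the dump(1) output above.
relocs = [
    (0x10438, None, "R_SPARC_RELATIVE", 0),
    (0x1043C, "foo", "R_SPARC_GLOB_DAT", 0),
]
base = 0xFF3A0000              # hypothetical load address of libbar.so.1
symbols = {"foo": 0xFF2B0470}  # hypothetical result of the lookup for foo
memory = apply_relocations({}, base, relocs, symbols)
```

The first .got slot receives a value derived only from the load address, whereas the second cannot be filled until foo has been located.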

Symbol Lookup

When an object requires a symbol, the runtime linker searches for that symbol based upon the requesting object's symbol search scope, and the symbol visibility offered by each object within the process. These attributes are applied as defaults to an object at the time the object is loaded, as specific modes to dlopen(3DL), and in some cases can be recorded within the object at the time it is built.

Typically, users become familiar with the default symbol search models that are applied to a dynamic executable and its dependencies, and to objects obtained through dlopen(3DL). The former is outlined in the next section, Default Lookup. The latter, which is also able to exploit the various symbol lookup attributes, is discussed in Symbol Lookup.

An alternative model for symbol lookup is provided when a dynamic object employs direct bindings. This model directs the runtime linker to search for a symbol directly in the object that provided the symbol at link-edit time. See Direct Binding.

Default Lookup

A dynamic executable and all the dependencies loaded with it are assigned world search scope, and global symbol visibility. See Symbol Lookup. When the runtime linker looks up a symbol for a dynamic executable or for any of the dependencies loaded with the executable, it does so by searching each object. The runtime linker starts with the dynamic executable, and progresses through each dependency in the same order in which the objects were loaded.

As discussed in previous sections, ldd(1) lists the dependencies of a dynamic executable in the order in which they are loaded. Therefore, if the shared object libbar.so.1 requires the address of symbol foo to complete its relocation, and this shared object is a dependency of the dynamic executable prog:


$ ldd prog
        libfoo.so.1 =>   /home/me/lib/libfoo.so.1
        libbar.so.1 =>   /home/me/lib/libbar.so.1

The runtime linker first looks for foo in the dynamic executable prog, then in the shared object /home/me/lib/libfoo.so.1, and finally in the shared object /home/me/lib/libbar.so.1.
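The search order just described can be modeled as a loop over the objects in load order. The following Python sketch is illustrative only. The symbol tables are hypothetical, because the example does not state which object defines foo.

```python
def default_lookup(symbol, load_order, definitions):
    """Return the first object in `load_order` that defines `symbol`.

    Returns None when no loaded object provides a definition.
    """
    for obj in load_order:
        if symbol in definitions.get(obj, set()):
            return obj
    return None


# Load order as reported by ldd(1), with the executable searched first.
load_order = ["prog", "/home/me/lib/libfoo.so.1", "/home/me/lib/libbar.so.1"]

# Hypothetical symbol tables, assumed for illustration.
definitions = {
    "prog": {"main"},
    "/home/me/lib/libfoo.so.1": {"foo"},
    "/home/me/lib/libbar.so.1": {"foo", "bar"},
}

found_in = default_lookup("foo", load_order, definitions)
```

With these assumed tables, the search for foo stops at libfoo.so.1 even though libbar.so.1 also defines the symbol, because libfoo.so.1 was loaded first.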


Note –

Symbol lookup can be an expensive operation, especially as symbol names grow longer and the number of dependencies increases. This aspect of performance is discussed in more detail in Performance Considerations. See Direct Binding for an alternative lookup model.


Interposition

The runtime linker's default mechanism of searching for a symbol first in the dynamic executable and then in each of the dependencies means that the first occurrence of the required symbol will satisfy the search. Therefore, if more than one instance of the same symbol exists, the first instance interposes on all others. See also Shared Object Processing.

Direct Binding

When creating an object to use direct bindings, the relationship between the referenced symbol and the dependency that provided the definition is recorded in the object. The runtime linker uses this information to search directly for the symbol in the associated object, rather than carry out the default symbol search model. Direct binding information can only be established to dependencies specified with the link-edit. Therefore, use of the -z defs option is recommended.

Direct bindings can be established at link-edit time, for example with the link-editor's command line options -B direct or -z direct.

The direct binding model can significantly reduce the symbol lookup overhead within a dynamic process that has many symbolic relocations and many dependencies. This model also enables multiple symbols of the same name to be located from different objects that have been bound to directly.

Direct binding can circumvent the traditional use of interposition symbols because it bypasses the default search model. The default model ensures that all references to a symbol bind to one definition.

Interposition can still be achieved in a direct binding environment, on a per-object basis, if an object is identified as an interposer. Any object loaded using the environment variable LD_PRELOAD, or created with the link-editor's -z interpose option, is identified as an interposer. When the runtime linker searches for a directly bound symbol, it first looks in any object identified as an interposer before it looks in the object that supplies the symbol definition.
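The order in which a directly bound symbol is searched for can be sketched as follows. This Python model is an illustration under stated assumptions: the object names libA.so.1, libB.so.1, and libpre.so.1 are hypothetical, and whatever the runtime linker does when the bound object lacks the symbol is not modeled.

```python
def direct_lookup(symbol, bound_to, interposers, definitions):
    """Resolve a directly bound symbol.

    Interposers are searched first, then the object recorded at
    link-edit time. Returns None if neither provides the symbol
    (fallback behavior is not modeled in this sketch).
    """
    for obj in interposers:
        if symbol in definitions.get(obj, set()):
            return obj
    if symbol in definitions.get(bound_to, set()):
        return bound_to
    return None


# Hypothetical objects: two dependencies define foo, as does a preloaded object.
definitions = {
    "libA.so.1": {"foo"},
    "libB.so.1": {"foo"},
    "libpre.so.1": {"foo"},
}

# Without interposers, each caller reaches the object it was bound to.
from_a = direct_lookup("foo", "libA.so.1", [], definitions)
from_b = direct_lookup("foo", "libB.so.1", [], definitions)

# With libpre.so.1 identified as an interposer, it is searched first.
preloaded = direct_lookup("foo", "libB.so.1", ["libpre.so.1"], definitions)
```

The first two calls show how two definitions of the same name can coexist within one process. The third shows an interposer taking precedence over the recorded binding.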


Note –

Direct bindings can be disabled at runtime by setting the environment variable LD_NODIRECT to a non-null value.


Some interfaces exist that offer alternative implementations of a default technology. These implementations also assume they are the only instance of that technology within a process. An example of this is the malloc(3C) family. Directly binding to interfaces within such a family should be avoided, as it is possible for more than one instance of the technology to be referenced by the same process. For example, one dependency within a process may directly bind against libc.so.1, while another dependency directly binds against libmapmalloc.so.1.

Objects that provide a single implementation for a process should define the interfaces to that implementation using the mapfile directive NODIRECT. This directive ensures that no users directly bind to an implementation, but instead use the default symbol search model.


Note –

NODIRECT mapfile directives can be combined with the command line options -B direct or -z direct. Symbols that are not explicitly defined NODIRECT will follow the command line directive.


When Relocations Are Performed

Relocations can be distinguished by when they are performed. This distinction arises due to the type of reference being made to the relocated offset, which is either an immediate reference or a lazy reference.

An immediate reference refers to a relocation that must be determined immediately when an object is loaded. These references are typically to data items used by the object code, pointers to functions, and even calls to functions made from position-dependent shared objects. These relocations cannot provide the runtime linker with knowledge of when the relocated item is referenced. Therefore, all immediate relocations must be carried out when an object is loaded, and before the application gains, or regains, control.

A lazy reference refers to a relocation that can be determined as an object executes. These references are typically calls to global functions made from position-independent shared objects, or calls to external functions made from a dynamic executable. During the compilation and link-editing of any dynamic module that provides these references, the associated function calls become calls to a procedure linkage table entry. These entries make up the .plt section. Each procedure linkage table entry becomes a lazy reference with a relocation associated with it.

Procedure linkage table entries are constructed so that when they are first called, control is passed to the runtime linker. The runtime linker looks up the required symbol and rewrites information in the associated object so that any future calls to this procedure linkage table entry go directly to the function. This mechanism enables relocations of this type to be deferred until the first instance of a function is called. This process is sometimes referred to as lazy binding.
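The behavior of a procedure linkage table entry can be imitated with a small Python class. This is a conceptual sketch only: a real entry is a short machine code sequence that the runtime linker rewrites, and the names PltEntry, resolver, and add are hypothetical.

```python
class PltEntry:
    """Model of one procedure linkage table entry with lazy binding."""

    def __init__(self, symbol, resolver):
        self.symbol = symbol
        self._resolver = resolver
        self._target = None  # not bound until the first call

    def __call__(self, *args):
        if self._target is None:
            # First call: control passes to the "runtime linker", which
            # looks up the symbol and records the result in the entry.
            self._target = self._resolver(self.symbol)
        # Every later call goes directly to the bound function.
        return self._target(*args)


lookups = []  # records each symbol lookup that is performed


def resolver(symbol):
    """Stand-in for the runtime linker's symbol lookup."""
    lookups.append(symbol)
    table = {"add": lambda a, b: a + b}  # hypothetical symbol table
    return table[symbol]


add = PltEntry("add", resolver)
first = add(1, 2)    # triggers the lookup
second = add(3, 4)   # reuses the binding; no second lookup
```

After both calls, lookups contains a single entry, which reflects the fact that the cost of the symbol search is paid only once per procedure linkage table entry.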

The runtime linker's default mode is to perform lazy binding whenever procedure linkage table relocations are provided. This default can be overridden by setting the environment variable LD_BIND_NOW to any non-null value. This environment variable setting causes the runtime linker to perform both immediate and lazy reference relocations when an object is loaded, and before the application gains, or regains, control. For example, setting the environment variable as follows means that all relocations within the file prog and within its dependencies will be processed before control is transferred to the application.


$ LD_BIND_NOW=1 prog

Objects can also be accessed with dlopen(3DL) with the mode defined as RTLD_NOW. Objects can also be built using the link-editor's -z now option to indicate that they require complete relocation processing at the time they are loaded. This relocation requirement is also propagated to any dependencies of the marked object at runtime.


Note –

Although the preceding examples of immediate and lazy references are typical, the creation of procedure linkage table entries is ultimately controlled by the relocation information provided by the relocatable object files used as input to a link-edit. Relocation records such as R_SPARC_WPLT30 and R_386_PLT32 instruct the link-editor to create a procedure linkage table entry. These relocations are common for position-independent code. However, as a dynamic executable has a fixed location, external function references that can be determined at link-edit time can be converted to procedure linkage table entries regardless of the original relocation records.


Relocation Errors

The most common relocation error occurs when a symbol cannot be found. This condition results in an appropriate runtime linker error message and the termination of the application. For example:


$ ldd prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
$ prog
ld.so.1: prog: fatal: relocation error: file ./libfoo.so.1: \
symbol bar: referenced symbol not found

The symbol bar, which is referenced in the file libfoo.so.1, cannot be located.

During the link-edit of a dynamic executable, any potential relocation errors of this sort are flagged as fatal undefined symbols. See Generating an Executable Output File for examples. This runtime relocation error can occur if the link-edit of prog used a different version of the shared object libbar.so.1 that contained a symbol definition for bar, or if the -z nodefs option was used as part of the link-edit.

If a relocation error of this type occurs because a symbol used as an immediate reference cannot be located, the error condition will occur immediately during process initialization. Because of the default mode of lazy binding, if a symbol used as a lazy reference cannot be found, the error condition will occur after the application has gained control. This latter case can take minutes or months, or might never occur, depending on the execution paths exercised throughout the code.

To guard against errors of this kind, the relocation requirements of any dynamic executable or shared object can be validated using ldd(1).

When the -d option is specified with ldd(1), all dependencies will be printed and all immediate reference relocations will be processed. If a reference cannot be resolved, a diagnostic message is produced. From the previous example this option would result in:


$ ldd -d prog
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
        symbol not found: bar           (./libfoo.so.1)

When the -r option is specified with ldd(1), all immediate and lazy reference relocations are processed. If either type of relocation cannot be resolved, a diagnostic message is produced.
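A build or test script might scan the output of ldd -d or ldd -r for such diagnostics. The following Python sketch assumes that the diagnostic has the form shown in the example above, "symbol not found: name (file)". The function name unresolved_symbols is hypothetical, and other ldd implementations can format this message differently.

```python
import re

# Matches lines such as:
#         symbol not found: bar           (./libfoo.so.1)
_NOT_FOUND = re.compile(r"^\s*symbol not found:\s+(\S+)\s+\((.+)\)\s*$")


def unresolved_symbols(ldd_output):
    """Return (symbol, referencing_file) pairs found in ldd output text."""
    pairs = []
    for line in ldd_output.splitlines():
        match = _NOT_FOUND.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# Output copied from the ldd -d example above.
sample = """\
        libfoo.so.1 =>   ./libfoo.so.1
        libc.so.1 =>     /usr/lib/libc.so.1
        libbar.so.1 =>   ./libbar.so.1
        libdl.so.1 =>    /usr/lib/libdl.so.1
        symbol not found: bar           (./libfoo.so.1)
"""
missing = unresolved_symbols(sample)
```

A script could treat a non-empty result as a failure, which surfaces unresolved lazy references before they are encountered at runtime.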