In Relocation Processing, the mechanisms by which the runtime linker relocates dynamic executables and shared objects to create a runable process was covered. Symbol Lookup and When Relocations Are Performed categorized this relocation processing into two areas to simplify and help illustrate the mechanisms involved. These same two categorizations are also ideally suited for considering the performance impact of relocations.
When the runtime linker needs to look up a symbol, by default it does so by searching in each object. The runtime linker starts with the dynamic executable, and progresses through each shared object in the same order that the objects are loaded. In many instances, the shared object that requires a symbolic relocation turns out to be the provider of the symbol definition.
In this situation, if the symbol used for this relocation is not required as part of the shared object's interface, then this symbol is a strong candidate for conversion to a static or automatic variable. A symbol reduction can also be applied to removed symbols from a shared objects interface. See Reducing Symbol Scope for more details. By making these conversions, the link-editor incurs the expense of processing any symbolic relocation against these symbols during the shared object's creation.
The only global data items that should be visible from a shared object are those that contribute to its user interface. Historically this has been a hard goal to accomplish, because global data are often defined to allow reference from two or more functions located in different source files. By applying symbol reduction, unnecessary global symbols can be removed. See Reducing Symbol Scope. Any reduction in the number of global symbols exported from a shared object results in lower relocation costs and an overall performance improvement.
The use of direct bindings can also significantly reduce the symbol lookup overhead within a dynamic process that has many symbolic relocations any many dependencies. See Direct Binding.
All immediate reference relocations must be carried out during process initialization before the application gains control. However, any lazy reference relocations can be deferred until the first instance of a function being called. Immediate relocations typically result from data references. Therefore, reducing the number of data references also reduces the runtime initialization of a process.
Initialization relocation costs can also be deferred by converting data references into function references. For example, you can return data items by a functional interface. This conversion usually results in a perceived performance improvement because the initialization relocation costs are effectively spread throughout the process's execution. Some of the functional interfaces might never be called by a particular invocation of a process, thus removing their relocation overhead altogether.
The advantage of using a functional interface can be seen in the section, Copy Relocations. This section examines a special, and somewhat expensive, relocation mechanism employed between dynamic executables and shared objects. It also provides an example of how this relocation overhead can be avoided.
Relocations by default are grouped by the sections against which they are to be applied. However, when an object is built with the -z combreloc option, all but the procedure linkage table relocations are placed into a single common section named .SUNW_reloc. See Procedure Linkage Table (Processor-Specific).
Combining relocation records in this manner enables all RELATIVE relocations to be grouped together. All symbolic relocations are sorted by symbol name. The grouping of RELATIVE relocations permits optimized runtime processing using the DT_RELACOUNT/DT_RELCOUNT .dynamic entries. Sorted symbolic entries help reduce runtime symbol lookup.
Shared objects are usually built with position-independent code. References to external data items from code of this type employs indirect addressing through a set of tables. See Position-Independent Code for more details. These tables are updated at runtime with the real address of the data items. These updated tables enable access to the data without the code itself being modified.
Dynamic executables, however, are generally not created from position-independent code. Any references to external data they make can seemingly only be achieved at runtime by modifying the code that makes the reference. Modifying a read-only text segment is to be avoided. The copy relocation technique can solve this reference.
Suppose the link-editor is used to create a dynamic executable, and a reference to a data item is found to reside in one of the dependent shared objects. Space is allocated in the dynamic executable's .bss, equivalent in size to the data item found in the shared object. This space is also assigned the same symbolic name as defined in the shared object. Along with this data allocation, the link-editor generates a special copy relocation record that will instruct the runtime linker to copy the data from the shared object to this allocated space within the dynamic executable.
Because the symbol assigned to this space is global, it is used to satisfy any references from any shared objects. The dynamic executable inherits the data item. Any other objects within the process that make reference to this item are bound to this copy. The original data from which the copy is made effectively becomes unused.
The following example of this mechanism uses an array of system error messages that is maintained within the standard C library. In previous SunOS operating system releases, the interface to this information was provided by two global variables, sys_errlist[], and sys_nerr. The first variable provided the array of error message strings, while the second conveyed the size of the array itself. These variables were commonly used within an application in the following manner:
| $ cat foo.c
extern int      sys_nerr;
extern char *   sys_errlist[];
char *
error(int errnumb)
{
        if ((errnumb < 0) || (errnumb >= sys_nerr))
                return (0);
        return (sys_errlist[errnumb]);
} | 
The application uses the function error to provide a focal point to obtain the system error message associated with the number errnumb.
Examining a dynamic executable built using this code shows the implementation of the copy relocation in more detail:
| $ cc -o prog main.c foo.c
$ nm -x prog | grep sys_
[36]  |0x00020910|0x00000260|OBJT |WEAK |0x0  |16 |sys_errlist
[37]  |0x0002090c|0x00000004|OBJT |WEAK |0x0  |16 |sys_nerr
$ dump -hv prog | grep bss
[16]    NOBI    WA-    0x20908   0x908    0x268   .bss
$ dump -rv prog
    **** RELOCATION INFORMATION ****
.rela.bss:
Offset      Symndx                Type              Addend
0x2090c     sys_nerr              R_SPARC_COPY      0
0x20910     sys_errlist           R_SPARC_COPY      0
.......... | 
The link-editor has allocated space in the dynamic executable's .bss to receive the data represented by sys_errlist and sys_nerr. These data are copied from the C library by the runtime linker at process initialization. Thus, each application that uses these data gets a private copy of the data in its own data segment.
There are two drawbacks to this technique. First, each application pays a performance penalty for the overhead of copying the data at runtime. Second, the size of the data array sys_errlist has now become part of the C library's interface. Suppose the size of this array were to change, prehaps as new error messages are added. Any dynamic executables that reference this array have to undergo a new link-edit to be able to access any of the new error messages. Without this new link-edit, the allocated space within the dynamic executable is insufficient to hold the new data.
These drawbacks can be eliminated if the data required by a dynamic executable are provided by a functional interface. The ANSI C function strerror(3C) returns a pointer to the appropriate error string, based on the error number supplied to it. One implementation of this function might be:
| $ cat strerror.c
static const char * sys_errlist[] = {
        "Error 0",
        "Not owner",
        "No such file or directory",
        ......
};
static const int sys_nerr =
        sizeof (sys_errlist) / sizeof (char *);
char *
strerror(int errnum)
{
        if ((errnum < 0) || (errnum >= sys_nerr))
                return (0);
        return ((char *)sys_errlist[errnum]);
} | 
The error routine in foo.c can now be simplified to use this functional interface. This simplification in turn removes any need to perform the original copy relocations at process initialization.
Additionally, because the data are now local to the shared object, the data are no longer part of its interface. The shared object therefore has the flexibility of changing the data without adversely effecting any dynamic executables that use it. Eliminating data items from a shared object's interface generally improves performance while making the shared object's interface and code easier to maintain.
ldd(1), when used with either the -d or -r options, can verify any copy relocations that exist within a dynamic executable.
For example, suppose the dynamic executable prog had originally been built against the shared object libfoo.so.1 and the following two copy relocations had been recorded:
| $ nm -x prog | grep _size_ [36] |0x000207d8|0x40|OBJT |GLOB |15 |_size_gets_smaller [39] |0x00020818|0x40|OBJT |GLOB |15 |_size_gets_larger $ dump -rv size | grep _size_ 0x207d8 _size_gets_smaller R_SPARC_COPY 0 0x20818 _size_gets_larger R_SPARC_COPY 0 | 
A new version of this shared object is supplied that contains different data sizes for these symbols:
| $ nm -x libfoo.so.1 | grep _size_ [26] |0x00010378|0x10|OBJT |GLOB |8 |_size_gets_smaller [28] |0x00010388|0x80|OBJT |GLOB |8 |_size_gets_larger | 
Running ldd(1) against the dynamic executable reveals:
| $ ldd -d prog
    libfoo.so.1 =>   ./libfoo.so.1
    ...........
    copy relocation sizes differ: _size_gets_smaller
       (file prog size=40; file ./libfoo.so.1 size=10);
       ./libfoo.so.1 size used; possible insufficient data copied
    copy relocation sizes differ: _size_gets_larger
       (file prog size=40; file ./libfoo.so.1 size=80);
       ./prog size used; possible data truncation | 
ldd(1) shows that the dynamic executable will copy as much data as the shared object has to offer, but only accepts as much as its allocated space allows.
Copy relocations can be eliminated by building the application from position-independent code. See Position-Independent Code.