Removing Multiply Defined Symbols of the Same Name (Linker and Libraries Guide)

Linker and Libraries Guide

Removing Multiply Defined Symbols of the Same Name

Multiply defined symbols of the same name can be problematic within a directly bound environment, if the implementation associated with the symbol maintains state. Data symbols are the typical offenders in this regard, however functions that maintain state can also be problematic.

In a directly bound environment, multiple instances of the same symbol can be bound to. Therefore, different binding instances can manipulate different state variables that were originally intended to be a single instance within a process.

For example, suppose that two shared objects contain the same data item errval. Suppose also, that two functions action() and inspect(), exist in different shared objects. These functions expect to write and read the value errval respectively.

With the default search model, one definition of errval would interpose on the other definition. Both functions action() and inspect() would be bound to the same instance of errval. Therefore, if an error code was written to errval by action(), then inspect() could read, and act upon this error condition.

However, suppose the objects containing action() and inspect() were bound to different dependencies that each defined errval. Within a directly bound environment, these functions are bound to different definitions of errval. An error code can be written to one instance of errval by action() while inspect() reads the other, uninitialized definition of errval. The outcome is that inspect() detects no error condition to act upon.

Multiple instances of data symbols typically occur when the symbols are declared in headers.

int bar;

This data declaration results in a data item being produced by each compilation unit that includes the header. The resulting tentative data item can result in multiple instances of the symbol being defined in different dynamic objects.

However, by explicitly defining the data item as external, references to the data item are produced for each compilation unit that includes the header.

extern int bar;

These references can then be resolved to one data instance at runtime.

Occasionally, the interface for a symbol implementation that you want to remove, should be preserved. Multiple instances of the same interface can be vectored to one implementation, while preserving any existing interface. This model can be achieved by creating individual symbol filters by using a FILTER mapfile keyword. This keyword is described in SYMBOL_SCOPE / SYMBOL_VERSION Directives.

Creating individual symbol filters is useful when dependencies expect to find a symbol in an object where the implementation for that symbol has been removed.

For example, suppose the function error() exists in two shared objects, A.so.1 and B.so.1. To remove the symbol duplication, you want to remove the implementation from A.so.1. However, other dependencies are relying on error() being provided from A.so.1. The following example shows the definition of error() in A.so.1. A mapfile is then used to allow the removal of the error() implementation, while leaving a filter for this symbol that is directed to B.so.1.

$ cc -o A.so.1 -G -Kpic error.c a.c b.c ...
$ elfdump -sN.dynsym A.so.1 | fgrep error
    [3]  0x00000300 0x00000014  FUNC GLOB  D    0 .text      error
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                error { TYPE=FUNCTION; FILTER=B.so.1 };
};
$ cc -o A.so.2 -G -Kpic -M mapfile a.c b.c ...
$ elfdump -sN.dynsym A.so.2 | fgrep error
    [3]  0x00000000 0x00000000  FUNC GLOB  D    0 ABS        error
$ elfdump -y A.so.2 | fgrep error
    [3]  F       [0] B.so.1         error

The function error() is global, and remains an exported interface of A.so.2. However, any runtime binding to this symbol is vectored to the filtee B.so.1. The letter “F” indicates the filter nature of this symbol.

This model of preserving existing interfaces, while vectoring to one implementation has been used in several Oracle Solaris libraries. For example, a number of math interfaces that were once defined in libc.so.1 are now vectored to the preferred implementation of the functions in libm.so.2.