Linker and Libraries Guide

Appendix D Direct Bindings

As part of constructing a process from a dynamic executable and a number of dependencies, the runtime linker must bind symbol references to symbol definitions. By default, symbol definitions are discovered using a simple search model. Typically, each object is searched, starting with the dynamic executable, and progressing through each dependency in the same order in which the objects are loaded. This model has been in effect since dynamic linking was first introduced. This simple model typically results in all symbol references being bound to one definition. The bound definition is the first definition that is found in the series of dependencies that have been loaded.

Dynamic executables have evolved into far more complex processes than the executables that were developed when dynamic linking was in its infancy. The number of dependencies has grown from tens to hundreds. The number of symbolic interfaces that are referenced between dynamic objects has also grown substantially. The size of symbol names has increased considerably with techniques such as the name mangling used to support languages such as C++. These factors have contributed to an increase in startup time for many applications, as symbol references are bound to symbol definitions.

The increase in the number of symbols within a process has also led to an increase in namespace pollution. Multiple instances of symbols of the same name are becoming more common. Unanticipated and erroneous bindings that result from multiple instances of the same symbol frequently cause hard-to-diagnose process failures.

In addition, processes now exist where individual objects of the process need to bind to different instances of multiply defined symbols of the same name.

To address the overhead of the default search model while providing greater symbol binding flexibility, an alternative symbol search model has been created. This model is referred to as direct binding.

Direct binding allows for precise binding relationships to be established between the objects of a process. Direct binding relationships can help avoid any accidental namespace clashes, by isolating the associated objects from unintentional bindings. This protection adds to the robustness of the objects within a process, which can help avoid unexpected, hard-to-diagnose binding situations.

Direct bindings can affect interposition. Unintentional interposition can be avoided by employing direct bindings. However, intentional interposition can be circumvented by direct bindings.

This appendix describes the direct binding model, and discusses the interposition issues that should be considered when converting objects to use this model.

Observing Symbol Bindings

To understand the default symbol search model and compare this model with direct bindings, the following components are used to build a process.

$ cat main.c
extern int W(), X();

int main() { return (W() + X()); }

$ cat W.c
extern int b();

int a() { return (1); }
int W() { return (a() - b()); }

$ cat w.c
int b() { return (2); }

$ cat X.c
extern int b();

int a() { return (3); }
int X() { return (a() - b()); }

$ cat x.c
int b() { return (4); }

$ cc -o w.so.1 -G -Kpic w.c
$ cc -o W.so.1 -G -Kpic W.c -R. w.so.1
$ cc -o x.so.1 -G -Kpic x.c
$ cc -o X.so.1 -G -Kpic X.c -R. x.so.1
$ cc -o prog1 -R. main.c W.so.1 X.so.1

The components of the application are loaded in the following order.

$ ldd prog1
        W.so.1 =>        ./W.so.1
        X.so.1 =>        ./X.so.1
        w.so.1 =>        ./w.so.1
        x.so.1 =>        ./x.so.1

Both files W.so.1 and X.so.1 define a function that is named a(). Both files w.so.1 and x.so.1 define a function that is named b(). In addition, both files W.so.1 and X.so.1 reference the functions a() and b().

The runtime symbol search, using the default search model, together with the final binding, can be observed by setting the LD_DEBUG environment variable. From the runtime linker's diagnostics, the bindings to the functions a() and b() can be revealed.

$ LD_DEBUG=symbols,bindings prog1
.....
17375: symbol=a;  lookup in file=prog1  [ ELF ]
17375: symbol=a;  lookup in file=./W.so.1  [ ELF ]
17375: binding file=./W.so.1 to file=./W.so.1: symbol `a'
.....
17375: symbol=b;  lookup in file=prog1  [ ELF ]
17375: symbol=b;  lookup in file=./W.so.1  [ ELF ]
17375: symbol=b;  lookup in file=./X.so.1  [ ELF ]
17375: symbol=b;  lookup in file=./w.so.1  [ ELF ]
17375: binding file=./W.so.1 to file=./w.so.1: symbol `b'
.....
17375: symbol=a;  lookup in file=prog1  [ ELF ]
17375: symbol=a;  lookup in file=./W.so.1  [ ELF ]
17375: binding file=./X.so.1 to file=./W.so.1: symbol `a'
.....
17375: symbol=b;  lookup in file=prog1  [ ELF ]
17375: symbol=b;  lookup in file=./W.so.1  [ ELF ]
17375: symbol=b;  lookup in file=./X.so.1  [ ELF ]
17375: symbol=b;  lookup in file=./w.so.1  [ ELF ]
17375: binding file=./X.so.1 to file=./w.so.1: symbol `b'

Each reference to one of the functions a() or b() results in a search for the associated symbol, starting with the application prog1. Each reference to a() binds to the first instance of the symbol, which is discovered in W.so.1. Each reference to b() binds to the first instance of the symbol, which is discovered in w.so.1. This example reveals how the function definitions in W.so.1 and w.so.1 interpose on the function definitions in X.so.1 and x.so.1. The existence of interposition is an important factor when considering the use of direct bindings. Interposition is covered in detail in the sections that follow.
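The default search described above can be modeled in a few lines of C. The following is only a sketch of the search order, not the runtime linker's implementation. The struct object type, the prog1_objs table, and the default_lookup() function are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a loaded object and the global symbols it defines. */
struct object {
    const char *name;
    const char *defs[4];            /* NULL-terminated list of symbol names */
};

/* The objects of prog1, in the load order reported by ldd. */
const struct object prog1_objs[] = {
    { "prog1",  { "main", NULL } },
    { "W.so.1", { "a", "W", NULL } },
    { "X.so.1", { "a", "X", NULL } },
    { "w.so.1", { "b", NULL } },
    { "x.so.1", { "b", NULL } },
};
const size_t prog1_nobjs = sizeof (prog1_objs) / sizeof (prog1_objs[0]);

/* Return 1 if obj defines sym, otherwise 0. */
static int defines(const struct object *obj, const char *sym)
{
    for (size_t i = 0; obj->defs[i] != NULL; i++)
        if (strcmp(obj->defs[i], sym) == 0)
            return 1;
    return 0;
}

/*
 * Default search model: inspect each object in load order and return the
 * name of the first object that defines sym, or NULL if no object does.
 */
const char *default_lookup(const struct object *objs, size_t n,
    const char *sym)
{
    assert(objs != NULL && sym != NULL);
    for (size_t i = 0; i < n; i++)
        if (defines(&objs[i], sym))
            return objs[i].name;
    return NULL;
}
```

In this model, default_lookup(prog1_objs, prog1_nobjs, "a") yields W.so.1 and a lookup of "b" yields w.so.1, regardless of which object makes the reference. This matches the bindings shown in the LD_DEBUG output.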

This example is concise, and the associated diagnostics are easy to follow. However, most applications are far more complex, being constructed from many dynamic components. These components are frequently delivered asynchronously, having been built from separate source bases.

The analysis of the diagnostics from a complex process can be challenging. Another technique for analyzing the interface requirements of dynamic objects is to use the lari(1) utility. lari analyzes the binding information of a process together with the interface definitions provided by each object. This information allows lari to concisely convey interesting information about the symbol dependencies of a process. This information is very useful when analyzing interposition in conjunction with direct bindings.

By default, lari conveys information that is considered interesting. This information originates from multiple instances of a symbol definition. lari reveals the following information for prog1.

$ lari prog1
[2:2ES]: a(): ./W.so.1
[2:0]: a(): ./X.so.1
[2:2E]: b(): ./w.so.1
[2:0]: b(): ./x.so.1

In this example, the process established from prog1 contains two multiply defined symbols, a() and b(). The initial elements of the output diagnostics, those elements that are enclosed in the brackets, describe the associated symbols.

The first decimal value identifies the number of instances of the associated symbol. Two instances of a() and b() exist. The second decimal value identifies the number of bindings that have been resolved to this symbol. The symbol definition a() from W.so.1 reveals that two bindings have been established to this dependency. Similarly, the symbol definition b() from w.so.1 reveals that two bindings have been established to this dependency. The letters that follow the number of bindings qualify the binding. The letter “E” indicates that a binding has been established from an external object. The letter “S” indicates that a binding has been established from the same object.
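The bracketed element can be decoded mechanically. The following sketch parses a token such as [2:2ES] into its two counts and the “E” and “S” qualifiers described above. The struct lari_info type and the parse_lari_token() function are hypothetical helpers, not part of lari. The sketch does not verify the closing bracket, and it ignores any qualifier letters other than “E” and “S”.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decoded form of the bracketed element of a lari output line. */
struct lari_info {
    int instances;      /* number of instances of the symbol */
    int bindings;       /* number of bindings resolved to this instance */
    int external;       /* nonzero if "E": bound from an external object */
    int self;           /* nonzero if "S": bound from the same object */
};

/*
 * Parse a token such as "[2:2ES]" or "[2:0]".
 * Returns 0 on success, or -1 if the two counts cannot be read.
 */
int parse_lari_token(const char *tok, struct lari_info *out)
{
    char flags[16] = "";
    int n;

    assert(tok != NULL && out != NULL);

    /* The qualifier letters are optional, so two conversions suffice. */
    n = sscanf(tok, "[%d:%d%15[^]]", &out->instances, &out->bindings, flags);
    if (n < 2)
        return -1;

    out->external = strchr(flags, 'E') != NULL;
    out->self = strchr(flags, 'S') != NULL;
    return 0;
}
```

For the token [2:2ES], this sketch reports two instances and two bindings, with both the external and same-object qualifiers set. For [2:0], it reports two instances and no bindings.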

LD_DEBUG, lari, and the process examples built from these components, are used to further investigate direct binding scenarios in the sections that follow.

Enabling Direct Binding

An object that uses direct bindings maintains the relationship between a symbol reference and the dependency that provided the definition. The runtime linker uses this information to search directly for the symbol in the associated object, rather than carry out the default symbol search model.

Direct binding information for a dynamic object is recorded at link-edit time. This information can only be established for the dependencies that are specified with the link-edit of that object. Use the -z defs option to ensure that all of the necessary dependencies are provided as part of the link-edit.

Objects that use direct bindings can exist within a process with objects that do not use direct bindings. Those objects that do not use direct bindings use the default symbol search model.

The direct binding of a symbol reference to a symbol definition can be established with one of the following link-editing mechanisms.

With the -B direct option. This option establishes direct bindings between the object being built and all of the object's dependencies. This option also establishes direct bindings between any symbol reference and symbol definition within the object being built.

The use of the -B direct option also enables lazy loading. This enabling is equivalent to adding the -z lazyload option to the front of the link-edit command line. This attribute was introduced in Lazy Loading of Dynamic Dependencies.
With the -z direct option. This option establishes direct bindings from the object being built to any dependencies that follow the option on the command line. This option can be used together with the -z nodirect option to toggle the use of direct bindings between dependencies. This option does not establish direct bindings between any symbol reference and symbol definition within the object being built.
With the DIRECT mapfile keyword. This keyword provides for directly binding individual symbols. This keyword is described in SYMBOL_SCOPE / SYMBOL_VERSION Directives.

Note –

Direct bindings can be disabled at runtime by setting the environment variable LD_NODIRECT to a non-null value. By setting this environment variable, all symbol binding within a process is carried out using the default search model.

The following sections describe the use of each of the direct binding mechanisms.

Using the `-B direct` Option

The -B direct option provides the simplest mechanism of enabling direct binding for any dynamic object. This option establishes direct bindings to any dependencies, and within the object being built.

From the components used in the previous example, a directly bound object, W.so.2, can be produced.

$ cc -o W.so.2 -G -Kpic W.c -R. -Bdirect w.so.1
$ cc -o prog2 -R. main.c W.so.2 X.so.1

The direct binding information is maintained in a symbol information section, .SUNW_syminfo, within W.so.2. This section can be viewed with elfdump(1).

$ elfdump -y W.so.2
    [6]  DB          <self>         a
    [7]  DBL     [1] w.so.1         b

The letters “DB” indicate that a direct binding has been recorded for the associated symbol. The function a() has been bound to the containing object W.so.2. The function b() has been bound directly to the dependency w.so.1. The letter “L” indicates that the dependency w.so.1 should also be lazily loaded.

The direct bindings that are established for W.so.2 can be observed using the LD_DEBUG environment variable. The detail token adds additional information to the binding diagnostics. For W.so.2, this token indicates the direct nature of the binding. The detail token also provides additional information about the binding addresses. For simplification, this address information has been omitted from the output generated from the following examples.

$ LD_DEBUG=symbols,bindings,detail prog2
.....
18452: symbol=a;  lookup in file=./W.so.2  [ ELF ]
18452: binding file=./W.so.2 to file=./W.so.2: symbol `a'  (direct)
18452: symbol=b;  lookup in file=./w.so.1  [ ELF ]
18452: binding file=./W.so.2 to file=./w.so.1: symbol `b'  (direct)

The lari(1) utility can also reveal the direct binding information.

$ lari prog2
[2:2ESD]: a(): ./W.so.2
[2:0]: a(): ./X.so.1
[2:2ED]: b(): ./w.so.1
[2:0]: b(): ./x.so.1

The letter “D” indicates that the function a() defined by W.so.2 has been bound to directly. Similarly, the function b() defined in w.so.1 has been bound to directly.

Note –

The direct binding of W.so.2 to W.so.2 for the function a() results in a similar effect as would be created had the -B symbolic option been used to build W.so.2. However, the -B symbolic option causes references such as a(), that can be resolved internally, to be finalized at link-edit time. This symbol resolution leaves no binding to resolve at runtime.

Unlike -B symbolic bindings, a -B direct binding is left for resolution at runtime. Therefore, this binding can be overridden by explicit interposition, or disabled by setting the environment variable LD_NODIRECT to a non-null value.

Symbolic bindings have often been employed to reduce the runtime relocation overhead incurred when loading complex objects. Direct bindings can be used to establish exactly the same symbol bindings. However, a runtime relocation is still required to create each direct binding. Direct bindings require more overhead than symbolic bindings, but provide for greater flexibility.

Using the `-z direct` Option

The -z direct option provides a mechanism of establishing direct bindings to any dependencies that follow the option on the link-edit command line. Unlike the -B direct option, no direct bindings are established within the object that is being built.

This option is well suited for building objects that are designed to be interposed upon. For example, shared objects are sometimes designed that contain a number of default, or fall back, interfaces. Applications are free to define their own definitions of these interfaces with the intent that the application definitions are bound to at runtime. To allow an application to interpose on the interfaces of a shared object, build the shared object using the -z direct option rather than the -B direct option.

The -z direct option is also useful if you want to be selective over directly binding to one or more dependencies. The -z nodirect option allows you to toggle the use of direct bindings between the dependencies supplied with a link-edit.

From the components used in the previous example, a directly bound object X.so.2 can be produced.

$ cc -o X.so.2 -G -Kpic X.c -R. -zdirect x.so.1
$ cc -o prog3 -R. main.c W.so.2 X.so.2

The direct binding information can be viewed with elfdump(1).

$ elfdump -y X.so.2
    [6]  D           <self>         a
    [7]  DB      [1] x.so.1         b

The function b() has been bound directly to the dependency x.so.1. The function a() is defined as having a potential direct binding, “D”, with the object X.so.2, but no direct binding is established.

The LD_DEBUG environment variable can be used to observe the runtime bindings.

$ LD_DEBUG=symbols,bindings,detail prog3
.....
06177: symbol=a;  lookup in file=prog3  [ ELF ]
06177: symbol=a;  lookup in file=./W.so.2  [ ELF ]
06177: binding file=./X.so.2 to file=./W.so.2: symbol `a'
06177: symbol=b;  lookup in file=./x.so.1  [ ELF ]
06177: binding file=./X.so.2 to file=./x.so.1: symbol `b'  (direct)

The lari(1) utility can also reveal the direct binding information.

$ lari prog3
[2:2ESD]: a(): ./W.so.2
[2:0]: a(): ./X.so.2
[2:1ED]: b(): ./w.so.1
[2:1ED]: b(): ./x.so.1

The function a() defined by W.so.2 continues to satisfy the default symbol reference made by X.so.2. However, the function b() defined in x.so.1 has now been bound to directly from the reference made by X.so.2.
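The difference between a directly bound reference and a default reference can be sketched as a small C model. The struct object and struct reference types, the prog3_objs and prog3_refs tables, and the resolve() function are hypothetical names. The load order is assumed to mirror that of prog1. The fallback to the default search when a direct lookup finds nothing is an assumption of this model, not a statement about the runtime linker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct object {
    const char *name;
    const char *defs[4];            /* NULL-terminated list of symbol names */
};

/* A symbol reference. direct_to is NULL if no direct binding was recorded. */
struct reference {
    const char *from;               /* referencing object (informational) */
    const char *sym;
    const char *direct_to;
};

/* Objects of prog3, in an assumed load order. */
const struct object prog3_objs[] = {
    { "prog3",  { "main", NULL } },
    { "W.so.2", { "a", "W", NULL } },
    { "X.so.2", { "a", "X", NULL } },
    { "w.so.1", { "b", NULL } },
    { "x.so.1", { "b", NULL } },
};
const size_t prog3_nobjs = sizeof (prog3_objs) / sizeof (prog3_objs[0]);

/* W.so.2 was built with -B direct, X.so.2 with -z direct. */
const struct reference prog3_refs[] = {
    { "W.so.2", "a", "W.so.2" },
    { "W.so.2", "b", "w.so.1" },
    { "X.so.2", "a", NULL },        /* no direct binding within X.so.2 */
    { "X.so.2", "b", "x.so.1" },
};

static int defines(const struct object *obj, const char *sym)
{
    for (size_t i = 0; obj->defs[i] != NULL; i++)
        if (strcmp(obj->defs[i], sym) == 0)
            return 1;
    return 0;
}

/* Return the name of the object that the reference binds to, or NULL. */
const char *resolve(const struct object *objs, size_t n,
    const struct reference *ref)
{
    assert(objs != NULL && ref != NULL);

    /* Direct binding: look only in the recorded dependency. */
    if (ref->direct_to != NULL) {
        for (size_t i = 0; i < n; i++)
            if (strcmp(objs[i].name, ref->direct_to) == 0 &&
                defines(&objs[i], ref->sym))
                return objs[i].name;
    }

    /* Default search model: first definition in load order. */
    for (size_t i = 0; i < n; i++)
        if (defines(&objs[i], ref->sym))
            return objs[i].name;
    return NULL;
}
```

In this model, the reference to a() from X.so.2 resolves to W.so.2 through the default search, while the reference to b() from X.so.2 resolves to x.so.1 without inspecting any other object.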

Using the `DIRECT` `mapfile` Keyword

The DIRECT mapfile keyword provides a means of establishing a direct binding for individual symbols. This mechanism is intended for specialized link-editing scenarios.

From the components used in the previous example, the function main() references the external functions W() and X(). The binding of these functions follows the default search model.

$ LD_DEBUG=symbols,bindings prog3
.....
18754: symbol=W;  lookup in file=prog3  [ ELF ]
18754: symbol=W;  lookup in file=./W.so.2  [ ELF ]
18754: binding file=prog3 to file=./W.so.2: symbol `W'
.....
18754: symbol=X;  lookup in file=prog3  [ ELF ]
18754: symbol=X;  lookup in file=./W.so.2  [ ELF ]
18754: symbol=X;  lookup in file=./X.so.2  [ ELF ]
18754: binding file=prog3 to file=./X.so.2: symbol `X'

prog3 can be rebuilt with DIRECT mapfile keywords so that direct bindings are established to the functions W() and X().

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                W       { FLAGS = EXTERN DIRECT };
                X       { FLAGS = EXTERN DIRECT };
};
$ cc -o prog4 -R. main.c W.so.2 X.so.2 -Mmapfile

The LD_DEBUG environment variable can be used to observe the runtime bindings.

$ LD_DEBUG=symbols,bindings,detail prog4
.....
23432: symbol=W;  lookup in file=./W.so.2  [ ELF ]
23432: binding file=prog4 to file=./W.so.2: symbol `W'  (direct)
23432: symbol=X;  lookup in file=./X.so.2  [ ELF ]
23432: binding file=prog4 to file=./X.so.2: symbol `X'  (direct)

The lari(1) utility can also reveal the direct binding information. However in this case, the functions W() and X() are not multiply defined. Therefore, by default lari does not find these functions interesting. The -a option must be used to display all symbol information.

$ lari -a prog4
....
[1:1ED]: W(): ./W.so.2
.....
[1:1ED]: X(): ./X.so.2
.....

Note –

The same direct bindings to W.so.2 and X.so.2 can be produced by building prog4 with the -B direct option or the -z direct option. The intent of this example is solely to convey how the mapfile keyword can be used.

Direct Bindings and Interposition

Interposition can occur when multiple instances of a symbol, having the same name, exist in different dynamic objects that have been loaded into a process. Under the default search model, symbol references are bound to the first definition that is found in the series of dependencies that have been loaded. This first symbol is said to interpose on the other symbols of the same name.

Direct bindings can circumvent any implicit interposition. As the directly bound reference is searched for in the dependency associated with the reference, the default symbol search model that enables interposition, is bypassed. In a directly bound environment, bindings can be established to different definitions of a symbol that have the same name.

The ability to bind to different definitions of a symbol that have the same name is a feature of direct binding that can be very useful. However, should an application depend upon an instance of interposition, the use of direct bindings can subvert the application's expected execution. Before deciding to use direct bindings with an existing application, the application should be analyzed to determine whether interposition exists.

To determine whether interposition is possible within an application, use lari(1). By default, lari conveys interesting information. This information originates from multiple instances of a symbol definition, which in turn can lead to interposition.

Interposition only occurs when one instance of the symbol is bound to. Multiple instances of a symbol that are called out by lari might not be involved in interposition. Other multiple instance symbols can exist, but might not be referenced. These unreferenced symbols are still candidates for interposition, as future code development might result in references to these symbols. All instances of multiply defined symbols should be analyzed when considering the use of direct bindings.

If multiple instances of a symbol of the same name exist, especially if interposition is observed, one of the following actions should be performed.

Localize symbol instances to remove namespace collision.
Remove the multiple instances to leave one symbol definition.
Define any interposition requirement explicitly.
Identify symbols that can be interposed upon to prevent the symbol from being directly bound to.

The following sections explore these actions in greater detail.

Localizing Symbol Instances

Multiply defined symbols of the same name that provide different implementations should be isolated to avoid accidental interposition. The simplest way to remove a symbol from the interfaces that are exported by an object is to reduce the symbol to local. Demoting a symbol to local can be achieved by defining the symbol “static”, or possibly through the use of symbol attributes provided by the compilers.

A symbol can also be reduced to local by using the link-editor and a mapfile. The following example shows a mapfile that reduces the global function error() to a local symbol by using the local scoping directive.

$ cc -o A.so.1 -G -Kpic error.c a.c b.c ...
$ elfdump -sN.symtab A.so.1 | fgrep error
    [36]  0x000002d0 0x00000014  FUNC GLOB  D    0 .text      error
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        local:
                error;
};
$ cc -o A.so.2 -G -Kpic -M mapfile error.c a.c b.c ...
$ elfdump -sN.symtab A.so.2 | fgrep error
    [24]  0x000002c8 0x00000014  FUNC LOCL  H    0 .text      error

Although individual symbols can be reduced to locals using explicit mapfile definitions, defining the entire interface family through symbol versioning is recommended. See Chapter 5, Application Binary Interfaces and Versioning.

Versioning is a useful technique typically employed to identify the interfaces that are exported from shared objects. Similarly, dynamic executables can be versioned to define their exported interfaces. A dynamic executable need only export the interfaces that must be made available for the dependencies of the object to bind to. Frequently, the code that you add to a dynamic executable need export no interfaces.

The removal of exported interfaces from a dynamic executable should take into account any symbol definitions that have been established by the compiler drivers. These definitions originate from auxiliary files that the compiler drivers add to the final link-edit. See Using a Compiler Driver.

The following example mapfile exports a common set of symbol definitions that a compiler driver might establish, while demoting all other global definitions to local.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                __Argv;
                __environ_lock;
                _environ;
                _lib_version;
                environ;
        local:
                *;
};

You should determine the symbol definitions that your compiler driver establishes. Any of these definitions that are used within the dynamic executable should remain global.

By removing any exported interfaces from a dynamic executable, the executable is protected from future interposition issues that might occur as the object's dependencies evolve.

Removing Multiply Defined Symbols of the Same Name

Multiply defined symbols of the same name can be problematic within a directly bound environment if the implementation associated with the symbol maintains state. Data symbols are the typical offenders in this regard. However, functions that maintain state can also be problematic.

In a directly bound environment, multiple instances of the same symbol can be bound to. Therefore, different binding instances can manipulate different state variables that were originally intended to be a single instance within a process.

For example, suppose that two shared objects contain the same data item errval. Suppose also, that two functions action() and inspect(), exist in different shared objects. These functions expect to write and read the value errval respectively.

With the default search model, one definition of errval would interpose on the other definition. Both functions action() and inspect() would be bound to the same instance of errval. Therefore, if an error code was written to errval by action(), then inspect() could read, and act upon this error condition.

However, suppose the objects containing action() and inspect() were bound to different dependencies that each defined errval. Within a directly bound environment, these functions are bound to different definitions of errval. An error code can be written to one instance of errval by action() while inspect() reads the other, uninitialized definition of errval. The outcome is that inspect() detects no error condition to act upon.
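The effect on state can be illustrated with a single-file C model. Each struct object stands for a shared object that carries its own copy of errval, and the binding chosen for action() and inspect() is passed as a pointer. The object names A.so.1 and B.so.1, the error code 5, and the run_scenario() function are hypothetical. A zero value models the untouched definition of errval.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a shared object that defines its own errval. */
struct object {
    const char *name;
    int errval;
};

/* action() writes an error code to the errval instance it is bound to. */
void action(struct object *bound, int code)
{
    assert(bound != NULL);
    bound->errval = code;
}

/* inspect() reads the errval instance it is bound to. */
int inspect(const struct object *bound)
{
    assert(bound != NULL);
    return bound->errval;
}

/*
 * Run action() followed by inspect() and return what inspect() observes.
 * With direct == 0, both functions bind to the same errval, as occurs
 * when one definition interposes on the other under the default search.
 * With direct != 0, each function binds to a different definition.
 */
int run_scenario(int direct)
{
    struct object A = { "A.so.1", 0 };
    struct object B = { "B.so.1", 0 };
    struct object *action_binding = &A;
    struct object *inspect_binding = direct ? &B : &A;

    action(action_binding, 5);
    return inspect(inspect_binding);
}
```

In this model, run_scenario(0) returns the error code that action() wrote, while run_scenario(1) returns zero, because inspect() reads the instance that action() never touched.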

Multiple instances of data symbols typically occur when the symbols are declared in headers.

int bar;

This data declaration results in a data item being produced by each compilation unit that includes the header. The resulting tentative data item can result in multiple instances of the symbol being defined in different dynamic objects.

However, by explicitly defining the data item as external, references to the data item are produced for each compilation unit that includes the header.

extern int bar;

These references can then be resolved to one data instance at runtime.

Occasionally, the interface for a symbol implementation that you want to remove should be preserved. Multiple instances of the same interface can be vectored to one implementation, while preserving any existing interface. This model can be achieved by creating individual symbol filters by using a FILTER mapfile keyword. This keyword is described in SYMBOL_SCOPE / SYMBOL_VERSION Directives.

Creating individual symbol filters is useful when dependencies expect to find a symbol in an object where the implementation for that symbol has been removed.

For example, suppose the function error() exists in two shared objects, A.so.1 and B.so.1. To remove the symbol duplication, you want to remove the implementation from A.so.1. However, other dependencies are relying on error() being provided from A.so.1. The following example shows the definition of error() in A.so.1. A mapfile is then used to allow the removal of the error() implementation, while leaving a filter for this symbol that is directed to B.so.1.

$ cc -o A.so.1 -G -Kpic error.c a.c b.c ...
$ elfdump -sN.dynsym A.so.1 | fgrep error
    [3]  0x00000300 0x00000014  FUNC GLOB  D    0 .text      error
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                error { TYPE=FUNCTION; FILTER=B.so.1 };
};
$ cc -o A.so.2 -G -Kpic -M mapfile a.c b.c ...
$ elfdump -sN.dynsym A.so.2 | fgrep error
    [3]  0x00000000 0x00000000  FUNC GLOB  D    0 ABS        error
$ elfdump -y A.so.2 | fgrep error
    [3]  F       [0] B.so.1         error

The function error() is global, and remains an exported interface of A.so.2. However, any runtime binding to this symbol is vectored to the filtee B.so.1. The letter “F” indicates the filter nature of this symbol.

This model of preserving existing interfaces, while vectoring to one implementation has been used in several Oracle Solaris libraries. For example, a number of math interfaces that were once defined in libc.so.1 are now vectored to the preferred implementation of the functions in libm.so.2.

Defining Explicit Interposition

The default search model can result in instances of the same named symbol interposing on later instances of the same name. Even without any explicit labelling, interposition still occurs, so that one symbol definition is bound to from all references. This implicit interposition occurs as a consequence of the symbol search, not because of any explicit instruction the runtime linker has been given. This implicit interposition can be circumvented by direct bindings.

Although direct bindings work to resolve a symbol reference directly to an associated symbol definition, explicit interposition is processed prior to any direct binding search. Therefore, even within a direct binding environment, interposers can be designed, and be expected to interpose on any direct binding associations. Interposers can be explicitly defined using the following techniques.

With the LD_PRELOAD environment variable.
With the link-editor's -z interpose option.
With the INTERPOSE mapfile keyword.
As a consequence of a singleton symbol definition.

The interposition facilities of the LD_PRELOAD environment variable, and the -z interpose option, have been available for some time. See Runtime Interposition. As these objects are explicitly defined to be interposers, the runtime linker inspects these objects before processing any direct binding.
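The ordering described above, where explicit interposers are inspected before any direct binding is processed, can be sketched as a C model. The object preload.so.1, the demo_objs table, and the resolve() function are hypothetical. The interposer flag stands for an object that was loaded with LD_PRELOAD or built with the -z interpose option. The fallback to the default search after a failed direct lookup is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct object {
    const char *name;
    int interposer;                 /* nonzero for an explicit interposer */
    const char *defs[4];            /* NULL-terminated list of symbol names */
};

/* A hypothetical process in which preload.so.1 also defines b(). */
struct object demo_objs[] = {
    { "prog",         0, { "main", NULL } },
    { "preload.so.1", 1, { "b", NULL } },
    { "W.so.2",       0, { "a", "W", NULL } },
    { "w.so.1",       0, { "b", NULL } },
};
const size_t demo_nobjs = sizeof (demo_objs) / sizeof (demo_objs[0]);

static int defines(const struct object *obj, const char *sym)
{
    for (size_t i = 0; obj->defs[i] != NULL; i++)
        if (strcmp(obj->defs[i], sym) == 0)
            return 1;
    return 0;
}

/*
 * Resolve sym. direct_to names the dependency recorded for a direct
 * binding, or is NULL if the reference has no direct binding.
 */
const char *resolve(const struct object *objs, size_t n, const char *sym,
    const char *direct_to)
{
    assert(objs != NULL && sym != NULL);

    /* 1. Explicit interposers are inspected first. */
    for (size_t i = 0; i < n; i++)
        if (objs[i].interposer && defines(&objs[i], sym))
            return objs[i].name;

    /* 2. Direct binding: look only in the recorded dependency. */
    if (direct_to != NULL) {
        for (size_t i = 0; i < n; i++)
            if (strcmp(objs[i].name, direct_to) == 0 &&
                defines(&objs[i], sym))
                return objs[i].name;
    }

    /* 3. Default search model: first definition in load order. */
    for (size_t i = 0; i < n; i++)
        if (defines(&objs[i], sym))
            return objs[i].name;
    return NULL;
}
```

With the interposer flag set, the reference to b() that is directly bound to w.so.1 resolves to preload.so.1. If the flag is cleared, the same reference resolves to w.so.1, which shows how a direct binding bypasses implicit interposition.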

Interposition that is established for a shared object applies to all the interfaces of that dynamic object. This object interposition is established when an object is loaded using the LD_PRELOAD environment variable. Object interposition is also established when an object that has been built with the -z interpose option is loaded. This object model is important when techniques such as dlsym(3C) with the special handle RTLD_NEXT are used. An interposing object should always have a consistent view of the next object.

A dynamic executable has additional flexibility, in that the executable can define individual interposing symbols using the INTERPOSE mapfile keyword. Because a dynamic executable is the first object loaded in a process, the executable's view of the next object is always consistent.

The following example shows an application that explicitly wants to interpose on the exit() function.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                exit    { FLAGS = INTERPOSE };
};
$ cc -o prog -M mapfile exit.c a.c b.c ...
$ elfdump -y prog | fgrep exit
    [6]  DI          <self>         exit

The letter “I” indicates the interposing nature of this symbol. Presumably, the implementation of this exit() function directly references the system function _exit(), or calls through to the system function exit() using dlsym() with the RTLD_NEXT handle.

At first, you might consider identifying this object using the -z interpose option. However, this technique is rather heavyweight, because all of the interfaces exported by the application would act as interposers. A better alternative would be to localize all of the symbols provided by the application except for the interposer, together with using the -z interpose option.

However, use of the INTERPOSE mapfile keyword provides greater flexibility. The use of this keyword allows an application to export several interfaces while selecting those interfaces that should act as interposers.

Symbols that are assigned the STV_SINGLETON visibility effectively provide a form of interposition. See Table 7–20. These symbols can be assigned by the compilation system to an implementation that might become multiply instantiated in a number of objects within a process. All references to a singleton symbol are bound to the first occurrence of a singleton symbol within a process.

Preventing a Symbol from being Directly Bound to

Direct bindings can be overridden with explicit interposition. See Defining Explicit Interposition. However, cases can exist where you do not have control over establishing explicit interposition.

For example, you might deliver a family of shared objects that you would like to use direct bindings. Customers are known to be interposing on symbols that are provided by shared objects of this family. If these customers have not explicitly defined their interpositioning requirements, their interpositioning can be compromised by a re-delivery of shared objects that employ direct bindings.

Shared objects can also be designed that provide a number of default interfaces, with an expectation that users provide their own interposing routines.

To prevent disrupting existing applications, shared objects can be delivered that explicitly prevent directly binding to one or more of their interfaces.

Directly binding to a dynamic object can be prevented using one of the following options.

With the -B nodirect option. This option prevents directly binding to any interfaces that are offered by the object being built.
With the NODIRECT mapfile keyword. This keyword provides for preventing direct binding to individual symbols. This keyword is described in SYMBOL_SCOPE / SYMBOL_VERSION Directives.
As a consequence of a singleton symbol definition.

An interface that is labelled as nodirect cannot be directly bound to from an external object. In addition, an interface that is labelled as nodirect cannot be directly bound to from within the same object.

The following sections describe the use of each of the direct binding prevention mechanisms.

Using the `-B nodirect` Option

The -B nodirect option provides the simplest mechanism for preventing direct binding to the interfaces of a dynamic object. This option prevents direct binding from any other object, and from within the object being built.

The following components are used to build three shared objects, A.so.1, O.so.1 and X.so.1. The -B nodirect option is used to prevent A.so.1 from directly binding to O.so.1. However, O.so.1 can continue to establish direct bindings to X.so.1 using the -z direct option.

$ cat a.c
extern int o(), p(), x(), y();

int a() { return (o() + p() - x() - y()); }

$ cat o.c
extern int x(), y();

int o() { return (x()); }
int p() { return (y()); }

$ cat x.c
int x() { return (1); }
int y() { return (2); }

$ cc -o X.so.1 -G -Kpic x.c
$ cc -o O.so.1 -G -Kpic o.c -Bnodirect -zdirect -R. X.so.1
$ cc -o A.so.1 -G -Kpic a.c -Bdirect -R. O.so.1 X.so.1

The symbol information for A.so.1 and O.so.1 can be viewed with elfdump(1).

$ elfdump -y A.so.1
    [1]  DBL     [3] X.so.1            x
    [5]  DBL     [3] X.so.1            y
    [6]  DL      [1] O.so.1            o
    [9]  DL      [1] O.so.1            p
$ elfdump -y O.so.1
    [3]  DB      [0] X.so.1            x
    [4]  DB      [0] X.so.1            y
    [6]  N                             o
    [7]  N                             p

The letter “N” indicates that no direct bindings are allowed to the functions o() and p(). Even though A.so.1 has requested direct bindings by using the -B direct option, direct bindings have not been established to the functions o() and p(). O.so.1 can still request direct bindings to its dependency X.so.1 using the -z direct option.
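When an object exports many interfaces, scanning the listing by eye becomes tedious. The following sketch filters elfdump -y output for entries whose flags include “N”. The function name nodirect_syms is hypothetical, and the filter assumes the column layout shown in the listings above: an index in brackets, a flags field of uppercase letters, and the symbol name as the last field.

```shell
# Print the names of symbols whose syminfo flags contain "N".
# Reads elfdump -y style output on standard input.
nodirect_syms() {
    awk '
        # Only consider entry lines that start with an index such as [6],
        # have a flags field made of uppercase letters, and include N.
        $1 ~ /^\[[0-9]+\]$/ && NF >= 3 && $2 ~ /^[A-Z]+$/ && $2 ~ /N/ {
            print $NF
        }
    '
}
```

For example, elfdump -y O.so.1 | nodirect_syms would be expected to print o and p for the object built above.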

The Oracle Solaris library libproc.so.1 is built with the -B nodirect option. Users of this library are expected to provide their own callback interfaces for many of the libproc functions. References to the libproc functions from any dependencies of libproc should bind to any user definitions when such definitions exist.

Using the `NODIRECT` `mapfile` Keyword

The NODIRECT mapfile keyword provides a means of preventing a direct binding to individual symbols. This keyword allows for more fine-grained control over preventing direct binding than the -B nodirect option.

From the components used in the previous example, O.so.2 can be built to prevent direct binding to the function o().

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                o       { FLAGS = NODIRECT };
};
$ cc -o O.so.2 -G -Kpic o.c -Mmapfile -zdirect -R. X.so.1
$ cc -o A.so.2 -G -Kpic a.c -Bdirect -R. O.so.2 X.so.1

The symbol information for A.so.2 and O.so.2 can be viewed with elfdump(1).

$ elfdump -y A.so.2
    [1]  DBL     [3] X.so.1            x
    [5]  DBL     [3] X.so.1            y
    [6]  DL      [1] O.so.2            o
    [9]  DBL     [1] O.so.2            p
$ elfdump -y O.so.2
    [3]  DB      [0] X.so.1            x
    [4]  DB      [0] X.so.1            y
    [6]  N                             o
    [7]  D           <self>            p

O.so.2 only declares that the function o() cannot be directly bound to. Therefore, A.so.2 is able to directly bind to the function p() in O.so.2.
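The difference between the entries for o() and p() can also be checked from a script. The following sketch prints the object that a named symbol is directly bound to, or prints nothing and returns a non-zero status when no direct binding is recorded. The function name direct_target is hypothetical. The sketch assumes that an entry is directly bound when its flags contain both “D” and “B”, as in the listings above, and that the bound-to object appears in the fourth column.

```shell
# Usage: elfdump -y OBJECT | direct_target SYMBOL
# Prints the bound-to object for SYMBOL when the entry is directly bound.
direct_target() {
    awk -v sym="$1" '
        # Match the entry for the requested symbol whose flags hold both
        # D and B and that names a bound-to object (five fields).
        $1 ~ /^\[[0-9]+\]$/ && NF >= 5 && $NF == sym && $2 ~ /D/ && $2 ~ /B/ {
            print $4
            found = 1
        }
        # Report failure when no directly bound entry was seen.
        END { exit found ? 0 : 1 }
    '
}
```

Given the A.so.2 listing above, direct_target p would print the dependency recorded for p(), while direct_target o would print nothing and return a non-zero status.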

Several individual interfaces within the Oracle Solaris libraries have been defined to not allow direct binding. One example is the data item errno. This data item is defined in libc.so.1. This data item can be referenced by including the header file errno.h. However, many applications were commonly taught to define their own errno. These applications would be compromised if a family of system libraries was delivered that directly bound to the errno that is defined in libc.so.1.

Another family of interfaces that has been defined to prevent direct binding is the malloc(3C) family. The malloc() family is another set of interfaces that is frequently implemented within user applications. These user implementations are intended to interpose upon any system definitions.

Note –

Various system interposing libraries are provided with the Oracle Solaris OS that provide alternative malloc() implementations. Each implementation expects to be the only implementation used within a process. All of the malloc() interposing libraries have been built with the -z interpose option. This option is not strictly necessary, as the malloc() family within libc.so.1 has been labelled to prevent any direct binding.

However, the interposing libraries have been built with -z interpose to set a precedent for building interposers. This explicit interposition has no adverse interaction with the direct binding prevention definitions established within libc.so.1.

Symbols that are assigned the STV_SINGLETON visibility cannot be directly bound to. See Table 7–20. These symbols can be assigned by the compilation system to an implementation that might become multiply instantiated in a number of objects within a process. All references to a singleton symbol are bound to the first occurrence of a singleton symbol within a process.
