JavaScript is required to for searching.
Skip Navigation Links
Exit Print View
Linker and Libraries Guide     Oracle Solaris 11 Express 11/10
search filter icon
search icon

Document Information

Preface

1.  Introduction to the Oracle Solaris Link Editors

2.  Link-Editor

Invoking the Link-Editor

Direct Invocation

Using a Compiler Driver

Cross Link-Editing

Specifying the Link-Editor Options

Input File Processing

Archive Processing

Shared Object Processing

Linking With Additional Libraries

Library Naming Conventions

Linking With a Mix of Shared Objects and Archives

Position of an Archive on the Command Line

Directories Searched by the Link-Editor

Using a Command-Line Option

Using an Environment Variable

Directories Searched by the Runtime Linker

Initialization and Termination Sections

Symbol Processing

Symbol Resolution

Simple Resolutions

Complex Resolutions

Fatal Resolutions

Undefined Symbols

Generating an Executable Output File

Generating a Shared Object Output File

Weak Symbols

Tentative Symbol Order Within the Output File

Defining Additional Symbols

Defining Additional Symbols with the -u option

Defining Symbol References

Defining Absolute Symbols

Defining Tentative Symbols

Augmenting a Symbol Definition

Reducing Symbol Scope

Symbol Elimination

External Bindings

String Table Compression

Generating the Output File

Identifying Capability Requirements

Identifying a Platform Capability

Identifying a Machine Capability

Identifying Hardware Capabilities

Identifying Software Capabilities

Software Capability Frame Pointer Processing

Software Capability Address Space Restriction Processing

Creating a Family of Symbol Capabilities Functions

Creating a Family of Symbol Capabilities Data Items

Converting Object Capabilities to Symbol Capabilities

Exercising a Capability Family

Relocation Processing

Displacement Relocations

Debugging Aids

3.  Runtime Linker

4.  Shared Objects

5.  Application Binary Interfaces and Versioning

6.  Support Interfaces

7.  Object File Format

8.  Thread-Local Storage

9.  Mapfiles

A.  Link-Editor Quick Reference

B.  Versioning Quick Reference

C.  Establishing Dependencies with Dynamic String Tokens

D.  Direct Bindings

E.  System V Release 4 (Version 1) Mapfiles

F.  Linker and Libraries Updates and New Features

Index

Symbol Processing

During input file processing, all local symbols from the input relocatable objects are passed through to the output file image. All global symbols from the input relocatable objects, together with globals symbols from shared object dependencies, are accumulated internally within the link-editor.

Each global symbol supplied by an input file is searched for within this internal symbol table. If a symbol with the same name has already been encountered from a previous input file, a symbol resolution process is called. This resolution process determines which of two entries from relocatable objects are kept. This resolution process also determines how external references to shared object dependencies are established.

On completion of input file processing, and providing no fatal symbol resolution errors have occurred, the link-editor determines if any unresolved symbol references remain. Unresolved symbol references can cause the link-edit to terminate.

Finally, the link-editor's internal symbol table is added to the symbol tables of the image being created.

The following sections expand upon symbol resolution and undefined symbol processing.

Symbol Resolution

Symbol resolution runs the entire spectrum, from simple and intuitive to complex and perplexing. Most resolutions are carried out silently by the link-editor. However, some relocations can be accompanied by warning diagnostics, while others can result in a fatal error condition.

The most common simple resolutions involve binding symbol references from one object to symbol definitions within another object. This binding can occur between two relocatable objects, or between a relocatable object and the first definition found in a shared object dependency. Complex resolutions typically occur between two or more relocatable objects.

The resolution of two symbols depends on their attributes, the type of file that provides the symbol, and the type of file being generated. For a complete description of symbol attributes, see Symbol Table Section. For the following discussions, however, three basic symbol types are identified.

In its simplest form, symbol resolution involves the use of a precedence relationship. This relationship has defined symbols dominate tentative symbols, which in turn dominate undefined symbols.

The following example of C code shows how these symbol types can be generated. Undefined symbols are prefixed with u_. Tentative symbols are prefixed with t_. Defined symbols are prefixed with d_.

$ cat main.c
extern int      u_bar;
extern int      u_foo();

int             t_bar;
int             d_bar = 1;

int d_foo()
{
        return (u_foo(u_bar, t_bar, d_bar));
}
$ cc -o main.o -c main.c
$ elfdump -s main.o

Symbol Table Section:  .symtab
     index    value      size      type bind oth ver shndx          name
     ....
       [7]  0x00000000 0x00000000  FUNC GLOB  D    0 UNDEF          u_foo
       [8]  0x00000010 0x00000040  FUNC GLOB  D    0 .text          d_foo
       [9]  0x00000004 0x00000004  OBJT GLOB  D    0 COMMON         t_bar
      [10]  0x00000000 0x00000004  NOTY GLOB  D    0 UNDEF          u_bar
      [11]  0x00000000 0x00000004  OBJT GLOB  D    0 .data          d_bar
Simple Resolutions

Simple symbol resolutions are by far the most common. In this case, two symbols with similar characteristics are detected, with one symbol taking precedence over the other. This symbol resolution is carried out silently by the link-editor. For example, with symbols of the same binding, a symbol reference from one file is bound to a defined, or tentative symbol definition, from another file. Or, a tentative symbol definition from one file is bound to a defined symbol definition from another file. This resolution can occur between two relocatable objects, or between a relocatable object and the first definition found in a shared object dependency.

Symbols that undergo resolution can have either a global or weak binding. Within relocatable objects, weak bindings have lower precedence than global binding. Relocatable object symbols with different bindings are resolved according to a slight alteration of the basic rules.

Weak symbols can usually be defined through the compiler, either individually or as aliases to global symbols. One mechanism uses a #pragma definition.

$ cat main.c
#pragma weak    bar
#pragma weak    foo = _foo

int             bar = 1;

int _foo()
{
        return (bar);
}
$ cc -o main.o -c main.c
$ elfdump -s main.o
Symbol Table Section:  .symtab
     index    value      size      type bind oth ver shndx          name
     ....
       [7]  0x00000010 0x00000018  FUNC GLOB  D    0 .text          _foo
       [8]  0x00000010 0x00000018  FUNC WEAK  D    0 .text          foo
       [9]  0x00000000 0x00000004  OBJT WEAK  D    0 .data          bar

Notice that the weak alias foo is assigned the same attributes as the global symbol _foo. This relationship is maintained by the link-editor and results in the symbols being assigned the same value in the output image. In symbol resolution, weak defined symbols are silently overridden by any global definition of the same name.

Another form of simple symbol resolution, interposition, occurs between relocatable objects and shared objects, or between multiple shared objects. In these cases, when a symbol is multiply-defined, the relocatable object, or the first definition between multiple shared objects, is silently taken by the link-editor. The relocatable object's definition, or the first shared object's definition, is said to interpose on all other definitions. This interposition can be used to override the functionality provided by another shared object. Multiply-defined symbols that occur between relocatable objects and shared objects, or between multiple shared objects, are treated identically. A symbols weak binding or global binding is irrelevant. By resolving to the first definition, regardless of the symbols binding, both the link-editor and runtime linker behave consistently.

The combination of weak symbols defined within a shared object together with symbol interposition over the same shared object, can provide a useful programming technique. For example, the standard C library provides several services that you are allowed to redefine. However, ANSI C defines a set of standard services that must be present on the system. These services cannot be replaced in a strictly conforming program.

The function fread(3C), for example, is an ANSI C library function. The system function read(2) is not an ANSI C library function. A conforming ANSI C program must be able to redefine read(2) and still use fread(3C) in a predictable way.

The problem here is that read(2) underlies the fread(3C) implementation in the standard C library. Therefore, a program that redefines read(2) might confuse the fread(3C) implementation. To guard against this occurrence, ANSI C states that an implementation cannot use a name that is not reserved for the implementation. Use the following #pragma directive to define just such a reserved name. Use this name to generate an alias for the function read(2).

#pragma weak read = _read

Thus, you can quite freely define your own read() function without compromising the fread(3C) implementation, which in turn is implemented to use the _read() function.

The link-editor has no difficulty with this redefinition of read(), either when linking against the shared object or archive version of the standard C library. In the former case, interposition takes its course. In the latter case, the fact that the C library's definition of read(2) is weak allows that definition to be quietly overridden.

Use the link-editor's -m option to write a list of all interposed symbol references, along with section load address information, to the standard output.

Complex Resolutions

Complex resolutions occur when two symbols of the same name are found with differing attributes. In these cases, the link-editor generates a warning message, while selecting the most appropriate symbol. This message indicates the symbol, the attributes that conflict, and the identity of the file from which the symbol definition is taken. In the following example, two files with a definition of the data item array have different size requirements.

$ cat foo.c
int array[1];

$ cat bar.c
int array[2] = { 1, 2 };

$ ld -r -o temp.o foo.c bar.c
ld: warning: symbol `array' has differing sizes:
        (file foo.o value=0x4; file bar.o value=0x8);
        bar.o definition taken

A similar diagnostic is produced if the symbol's alignment requirements differ. In both of these cases, the diagnostic can be suppressed by using the link-editor's -t option.

Another form of attribute difference is the symbol's type. In the following example, the symbol bar() has been defined as both a data item and a function.

$ cat foo.c
int bar()
{
        return (0);
}
$ cc -o libfoo.so -G -K pic foo.c
$ cat main.c
int     bar = 1;

int main()
{
        return (bar);
}
$ cc -o main main.c -L. -lfoo
ld: warning: symbol `bar' has differing types:
        (file main.o type=OBJT; file ./libfoo.so type=FUNC);
        main.o definition taken

Note - Symbol types in this context are classifications that can be expressed in ELF. These symbol types are not related to the data types as employed by the programming language, except in the crudest fashion.


In cases like the previous example, the relocatable object definition is taken when the resolution occurs between a relocatable object and a shared object. Or, the first definition is taken when the resolution occurs between two shared objects. When such resolutions occur between symbols of weak or global binding, a warning is also produced.

Inconsistencies between symbol types are not suppressed by the link-editor's -t option.

Fatal Resolutions

Symbol conflicts that cannot be resolved result in a fatal error condition and an appropriate error message. This message indicates the symbol name together with the names of the files that provided the symbols. No output file is generated. Although the fatal condition is sufficient to terminate the link-edit, all input file processing is first completed. In this manner, all fatal resolution errors can be identified.

The most common fatal error condition exists when two relocatable objects both define non-weak symbols of the same name.

$ cat foo.c
int bar = 1;

$ cat bar.c
int bar()
{ 
        return (0);
}

$ ld -r -o temp.o foo.c bar.c
ld: fatal: symbol `bar' is multiply-defined:
        (file foo.o and file bar.o);
ld: fatal: File processing errors. No output written to int.o

foo.c and bar.c have conflicting definitions for the symbol bar. Because the link-editor cannot determine which should dominate, the link-edit usually terminates with an error message. You can use the link-editor's -z muldefs option to suppress this error condition. This option allows the first symbol definition to be taken.

Undefined Symbols

After all of the input files have been read and all symbol resolution is complete, the link-editor searches the internal symbol table for any symbol references that have not been bound to symbol definitions. These symbol references are referred to as undefined symbols. Undefined symbols can affect the link-edit process according to the type of symbol, together with the type of output file being generated.

Generating an Executable Output File

When generating an executable output file, the link-editor's default behavior is to terminate with an appropriate error message should any symbols remain undefined. A symbol remains undefined when a symbol reference in a relocatable object is never matched to a symbol definition.

$ cat main.c
extern int foo();

int main()
{
        return (foo());
}
$ cc -o prog main.c
Undefined           first referenced
 symbol                 in file
foo                     main.o
ld: fatal: Symbol referencing errors. No output written to prog

Similarly, if a shared object is used to create a dynamic executable and leaves an unresolved symbol definition, an undefined symbol error results.

$ cat foo.c
extern int bar;
int foo()
{
        return (bar);
}

$ cc -o libfoo.so -G -K pic foo.c
$ cc -o prog main.c -L. -lfoo
Undefined           first referenced
 symbol                 in file
bar                     ./libfoo.so
ld: fatal: Symbol referencing errors. No output written to prog

To allow undefined symbols, as in the previous example, use the link-editor's -z nodefs option to suppress the default error condition.


Note - Take care when using the -z nodefs option. If an unavailable symbol reference is required during the execution of a process, a fatal runtime relocation error occurs. This error might be detected during the initial execution and testing of an application. However, more complex execution paths can result in this error condition taking much longer to detect, which can be time consuming and costly.


Symbols can also remain undefined when a symbol reference in a relocatable object is bound to a symbol definition in an implicitly defined shared object. For example, continuing with the files main.c and foo.c used in the previous example.

$ cat bar.c
int bar = 1;

$ cc -o libbar.so -R. -G -K pic bar.c -L. -lfoo
$ ldd libbar.so
        libfoo.so =>     ./libfoo.so

$ cc -o prog main.c -L. -lbar
Undefined           first referenced
 symbol                 in file
foo                     main.o  (symbol belongs to implicit \
                        dependency ./libfoo.so)
ld: fatal: Symbol referencing errors. No output written to prog

prog is built with an explicit reference to libbar.so. libbar.so has a dependency on libfoo.so. Therefore, an implicit reference to libfoo.so from prog is established.

Because main.c made a specific reference to the interface provided by libfoo.so, prog really has a dependency on libfoo.so. However, only explicit shared object dependencies are recorded in the output file being generated. Thus, prog fails to run if a new version of libbar.so is developed that no longer has a dependency on libfoo.so.

For this reason, bindings of this type are deemed fatal. The implicit reference must be made explicit by referencing the library directly during the link-edit of prog. The required reference is hinted at in the fatal error message that is shown in the preceding example.

Generating a Shared Object Output File

When the link-editor is generating a shared object output file, undefined symbols are allowed to remain at the end of the link-edit. This default behavior allows the shared object to import symbols from a dynamic executable that defines the shared object as a dependency.

The link-editor's -z defs option can be used to force a fatal error if any undefined symbols remain. This option is recommended when creating any shared objects. Shared objects that reference symbols from an application can use the -z defs option, together with defining the symbols by using an extern mapfile directive. See SYMBOL_SCOPE / SYMBOL_VERSION Directives.

A self-contained shared object, in which all references to external symbols are satisfied by named dependencies, provides maximum flexibility. The shared object can be employed by many users without those users having to determine and establish dependencies to satisfy the shared object's requirements.

Weak Symbols

Weak symbol references that remain unresolved, do not result in a fatal error condition, no matter what output file type is being generated.

If a static executable is being generated, the symbol is converted to an absolute symbol with an assigned value of zero.

If a dynamic executable or shared object is being produced, the symbol is left as an undefined weak reference with an assigned value of zero. During process execution, the runtime linker searches for this symbol. If the runtime linker does not find a match, the reference is bound to an address of zero instead of generating a fatal relocation error.

Historically, these undefined weak referenced symbols have been employed as a mechanism to test for the existence of functionality. For example, the following C code fragment might have been used in the shared object libfoo.so.1.

#pragma weak    foo

extern  void    foo(char *);

void bar(char *path)
{
        void (*fptr)(char *);

        if ((fptr = foo) != 0)
                (*fptr)(path);
}

When building an application that references libfoo.so.1, the link-edit completes successfully regardless of whether a definition for the symbol foo is found. If during execution of the application the function address tests nonzero, the function is called. However, if the symbol definition is not found, the function address tests zero and therefore is not called.

Compilation systems view this address comparison technique as having undefined semantics, which can result in the test statement being removed under optimization. In addition, the runtime symbol binding mechanism places other restrictions on the use of this technique. These restrictions prevent a consistent model from being made available for all dynamic objects.


Note - Undefined weak references in this manner are discouraged. Instead, you should use dlsym(3C) with the RTLD_DEFAULT, or RTLD_PROBE handles as a means of testing for a symbol's existence. See Testing for Functionality.


Tentative Symbol Order Within the Output File

Contributions from input files usually appear in the output file in the order of their contribution. Tentative symbols are an exception to this rule, as these symbols are not fully defined until their resolution is complete. The order of tentative symbols within the output file might not follow the order of their contribution.

If you need to control the ordering of a group of symbols, then any tentative definition should be redefined to a zero-initialized data item. For example, the following tentative definitions result in a reordering of the data items within the output file, as compared to the original order described in the source file foo.c.

$ cat foo.c
char One_array[0x10];
char Two_array[0x20];
char Three_array[0x30];

$ cc -o libfoo.so -G -Kpic foo.c
$ elfdump -sN.dynsym libfoo.so | grep array | sort -k 2,2
      [11]  0x00010614 0x00000020  OBJT GLOB  D    0 .bss           Two_array
       [3]  0x00010634 0x00000030  OBJT GLOB  D    0 .bss           Three_array
       [4]  0x00010664 0x00000010  OBJT GLOB  D    0 .bss           One_array

Sorting the symbols based on their address shows that their output order is different than the order they were defined in the source. In contrast, defining these symbols as initialized data items ensures that the relative ordering of these symbols within the input file is carried over to the output file.

$ cat foo.c
char A_array[0x10] = { 0 };
char B_array[0x20] = { 0 };
char C_array[0x30] = { 0 };

$ cc -o libfoo.so -G -Kpic foo.c
$ elfdump -sN.dynsym libfoo.so | grep array | sort -k 2,2
       [4]  0x00010614 0x00000010  OBJT GLOB  D    0 .data          One_array
      [11]  0x00010624 0x00000020  OBJT GLOB  D    0 .data          Two_array
       [3]  0x00010644 0x00000030  OBJT GLOB  D    0 .data          Three_array

Defining Additional Symbols

Besides the symbols provided from input files, you can supply additional global symbol references or global symbol definitions to a link-edit. In the simplest form, symbol references can be generated using the link-editor's -u option. Greater flexibility is provided with the link-editor's -M option and an associated mapfile. This mapfile enables you to define global symbol references and a variety of global symbol definitions. Attributes of the symbol such as visibility and type can be specified, See SYMBOL_SCOPE / SYMBOL_VERSION Directives for a complete description of the available options.

Defining Additional Symbols with the -u option

The -u option provides a mechanism for generating a global symbol reference from the link-edit command line. This option can be used to perform a link-edit entirely from archives. This option can also provide additional flexibility in selecting the objects to extract from multiple archives. See section Archive Processing for an overview of archive extraction.

For example, perhaps you want to generate a dynamic executable from the relocatable object main.o, which refers to the symbols foo and bar. You want to obtain the symbol definition foo from the relocatable object foo.o contained in lib1.a, and the symbol definition bar from the relocatable object bar.o, contained in lib2.a.

However, the archive lib1.a also contains a relocatable object that defines the symbol bar. This relocatable object is presumably of differing functionality to the relocatable object that is provided in lib2.a. To specify the required archive extraction, you can use the following link-edit.

$ cc -o prog -L. -u foo -l1 main.o -l2

The -u option generates a reference to the symbol foo. This reference causes extraction of the relocatable object foo.o from the archive lib1.a. The first reference to the symbol bar occurs in main.o, which is encountered after lib1.a has been processed. Therefore, the relocatable object bar.o is obtained from the archive lib2.a.


Note - This simple example assumes that the relocatable object foo.o from lib1.a does not directly or indirectly reference the symbol bar. If lib1.a does reference bar, then the relocatable object bar.o is also extracted from lib1.a during its processing. See Archive Processing for a discussion of the link-editor's multi-pass processing of an archive.


Defining Symbol References

The following example shows how three symbol references can be defined. These references are then used to extract members of an archive. Although this archive extraction can be achieved by specifying multiple -u options to the link-edit, this example also shows how the eventual scope of a symbol can be reduced to local.

$ cat foo.c
#include <stdio.h>
void foo()
{
        (void) printf("foo: called from lib.a\n");
}
$ cat bar.c
#include <stdio.h>
void bar()
{
        (void) printf("bar: called from lib.a\n");
}
$ cat main.c
extern  void    foo(), bar();

void main()
{
        foo();
        bar();
}
$ cc -c foo.c bar.c main.c
$ ar -rc lib.a foo.o bar.o main.o
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        local:
                foo;
                bar;
        global:
                main;
};
$ cc -o prog -M mapfile lib.a
$ prog
foo: called from lib.a
bar: called from lib.a
$ elfdump -sN.symtab prog | egrep 'main$|foo$|bar$'
      [29]  0x00010f30 0x00000024  FUNC LOCL  H    0 .text          bar
      [30]  0x00010ef8 0x00000024  FUNC LOCL  H    0 .text          foo
      [55]  0x00010f68 0x00000024  FUNC GLOB  D    0 .text          main

The significance of reducing symbol scope from global to local is covered in more detail in the section Reducing Symbol Scope.

Defining Absolute Symbols

The following example shows how two absolute symbol definitions can be defined. These definitions are then used to resolve the references from the input file main.c.

$ cat main.c
#include <stdio.h>
extern  int     foo();
extern  int     bar;

void main()
{
        (void) printf("&foo = 0x%p\n", &foo);
        (void) printf("&bar = 0x%p\n", &bar);
}
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=FUNCTION; VALUE=0x400 };
                bar     { TYPE=DATA;     VALUE=0x800 };
};
$ cc -o prog -M mapfile main.c
$ prog
&foo = 0x400
&bar = 0x800
$ elfdump -sN.symtab prog | egrep 'foo$|bar$'
      [45]  0x00000800 0x00000000  OBJT GLOB  D    0 ABS            bar
      [69]  0x00000400 0x00000000  FUNC GLOB  D    0 ABS            foo

When obtained from an input file, symbol definitions for functions or data items are usually associated with elements of data storage. A mapfile definition is insufficient to be able to construct this data storage, so these symbols must remain as absolute values. A simple mapfile definition that is associated with a size, but no value results in the creation of data storage. In this case, the symbol definition is accompanied with a section index. However, a mapfile definition that is accompanied with a value results in the creation of an absolute symbol. If a symbol is defined in a shared object, an absolute definition should be avoided. See Augmenting a Symbol Definition.

Defining Tentative Symbols

A mapfile can also be used to define a COMMON, or tentative, symbol. Unlike other types of symbol definition, tentative symbols do not occupy storage within a file, but define storage that must be allocated at runtime. Therefore, symbol definitions of this kind can contribute to the storage allocation of the output file being generated.

A feature of tentative symbols that differs from other symbol types is that their value attribute indicates their alignment requirement. A mapfile definition can therefore be used to realign tentative definitions that are obtained from the input files of a link-edit.

The following example shows the definition of two tentative symbols. The symbol foo defines a new storage region whereas the symbol bar is actually used to change the alignment of the same tentative definition within the file main.c.

$ cat main.c
#include <stdio.h>
extern  int     foo;
int             bar[0x10];

void main()
{
        (void) printf("&foo = 0x%p\n", &foo);
        (void) printf("&bar = 0x%p\n", &bar);
}
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=COMMON; VALUE=0x4;   SIZE=0x200 };
                bar     { TYPE=COMMON; VALUE=0x102; SIZE=0x40 };
};
$ cc -o prog -M mapfile main.c
ld: warning: symbol 'bar' has differing alignments:
    (file mapfile value=0x102; file main.o value=0x4);
    largest value applied
$ prog
&foo = 0x21264
&bar = 0x21224
$ elfdump -sN.symtab prog | egrep 'foo$|bar$'
      [45]  0x00021224 0x00000040  OBJT GLOB  D    0 .bss           bar
      [69]  0x00021264 0x00000200  OBJT GLOB  D    0 .bss           foo

Note - This symbol resolution diagnostic can be suppressed by using the link-editor's -t option.


Augmenting a Symbol Definition

The creation of an absolute data symbol within a shared object should be avoided. An external reference from a dynamic executable to a data item within a shared object typically requires the creation of a copy relocation. See Copy Relocations. To provide for this relocation, the data item should be associated with data storage. This association can be produced by defining the symbol within a relocatable object file. This association can also be produced by defining the symbol within a mapfile together with a size declaration and no value declaration. See SYMBOL_SCOPE / SYMBOL_VERSION Directives.

A data symbol can be filtered. See Shared Objects as Filters. To provide this filtering, an object file definition can be augmented with a mapfile definition. The following example creates a filter containing a function and data definition.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=FUNCTION;       FILTER=filtee.so.1 };
                bar     { TYPE=DATA; SIZE=0x4; FILTER=filtee.so.1 };
        local:
                *;
};
$ cc -o filter.so.1 -G -Kpic -h filter.so.1 -M mapfile -R.
$ elfdump -sN.dynsym filter.so.1 | egrep 'foo|bar'
       [1]  0x000105f8 0x00000004  OBJT GLOB  D    1 .data          bar
       [7]  0x00000000 0x00000000  FUNC GLOB  D    1 ABS            foo
$ elfdump -y filter.so.1 | egrep 'foo|bar'
       [1]  F        [0] filtee.so.1        bar
       [7]  F        [0] filtee.so.1        foo

At runtime, a reference from an external object to either of these symbols is resolved to the definition within the filtee.

Reducing Symbol Scope

Symbol definitions that are defined to have local scope within a mapfile can be used to reduce the symbol's eventual binding. This mechanism removes the symbol's visibility to future link-edits which use the generated file as part of their input. In fact, this mechanism can provide for the precise definition of a file's interface, and so restrict the functionality made available to others.

For example, say you want to generate a simple shared object from the files foo.c and bar.c. The file foo.c contains the global symbol foo, which provides the service that you want to make available to others. The file bar.c contains the symbols bar and str, which provide the underlying implementation of the shared object. A shared object created with these files, typically results in the creation of three symbols with global scope.

$ cat foo.c
extern  const char *bar();

const char *foo()
{
        return (bar());
}
$ cat bar.c
const char *str = "returned from bar.c";

const char *bar()
{
        return (str);
}
$ cc -o libfoo.so.1 -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
      [41]  0x00000560 0x00000018  FUNC GLOB  D    0 .text          bar
      [44]  0x00000520 0x0000002c  FUNC GLOB  D    0 .text          foo
      [45]  0x000106b8 0x00000004  OBJT GLOB  D    0 .data          str

You can now use the functionality offered by libfoo.so.1 as part of the link-edit of another application. References to the symbol foo are bound to the implementation provided by the shared object.

Because of their global binding, direct reference to the symbols bar and str is also possible. This visibility can have dangerous consequences, as you might later change the implementation that underlies the function foo. In so doing, you could unintentionally cause an existing application that had bound to bar or str to fail or misbehave.

Another consequence of the global binding of the symbols bar and str is that these symbols can be interposed upon by symbols of the same name. The interposition of symbols within shared objects is covered in section Simple Resolutions. This interposition can be intentional and be used as a means of circumventing the intended functionality offered by the shared object. On the other hand, this interposition can be unintentional, the result of the same common symbol name used for both the application and the shared object.

When developing the shared object, you can protect against these scenarios by reducing the scope of the symbols bar and str to a local binding. In the following example, the symbols bar and str are no longer available as part of the shared object's interface. Thus, these symbols cannot be referenced, or interposed upon, by an external object. You have effectively defined an interface for the shared object. This interface can be managed while hiding the details of the underlying implementation.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        local:
                bar;
                str;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
      [24]  0x00000548 0x00000018  FUNC LOCL  H    0 .text          bar
      [25]  0x000106a0 0x00000004  OBJT LOCL  H    0 .data          str
      [45]  0x00000508 0x0000002c  FUNC GLOB  D    0 .text          foo

This symbol scope reduction has an additional performance advantage. The symbolic relocations against the symbols bar and str that would have been necessary at runtime are now reduced to relative relocations. See When Relocations are Performed for details of symbolic relocation overhead.

As the number of symbols that are processed during a link-edit increases, defining local scope reduction within a mapfile becomes harder to maintain. An alternative and more flexible mechanism enables you to define the shared object's interface in terms of the global symbols that should be maintained. Global symbol definitions allow the link-editor to reduce all other symbols to local binding. This mechanism is achieved using the special auto-reduction directive “*”. For example, the previous mapfile definition can be rewritten to define foo as the only global symbol required in the output file generated.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
        local:
                *;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
      [26]  0x00000570 0x00000018  FUNC LOCL  H    0 .text          bar
      [27]  0x000106d8 0x00000004  OBJT LOCL  H    0 .data          str
      [50]  0x00000530 0x0000002c  FUNC GLOB  D    0 .text          foo

This example also defines a version name, libfoo.so.1.1, as part of the mapfile directive. This version name establishes an internal version definition that defines the file's symbolic interface. The creation of a version definition is recommended. The definition forms the foundation of an internal versioning mechanism that can be used throughout the evolution of the file. See Chapter 5, Application Binary Interfaces and Versioning.


Note - If a version name is not supplied, the output file name is used to label the version definition. The versioning information that is created within the output file can be suppressed using the link-editor's -z noversion option.


Whenever a version name is specified, all global symbols must be assigned to a version definition. If any global symbols remain unassigned to a version definition, the link-editor generates a fatal error condition.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
Undefined           first referenced
 symbol                 in file
str                     bar.o  (symbol has no version assigned)
bar                     bar.o  (symbol has no version assigned)
ld: fatal: Symbol referencing errors. No output written to libfoo.so.1

The -B local option can be used to assert the auto-reduction directive “*” from the command line. The previous example an be compiled successfully as follows.

$ cc -o libfoo.so.1 -M mapfile -B local -G foo.c bar.c

When generating an executable or shared object, any symbol reduction results in the recording of version definitions within the output image. When generating a relocatable object, the version definitions are created but the symbol reductions are not processed. The result is that the symbol entries for any symbol reductions still remain global. For example, using the previous mapfile with the auto-reduction directive and associated relocatable objects, an intermediate relocatable object is created with no symbol reduction.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
        local:
                *;
};
$ ld -o libfoo.o -M mapfile -r foo.o bar.o
$ elfdump -s libfoo.o | egrep 'foo$|bar$|str$'
      [28]  0x00000050 0x00000018  FUNC GLOB  H    0 .text          bar
      [29]  0x00000010 0x0000002c  FUNC GLOB  D    2 .text          foo
      [30]  0x00000000 0x00000004  OBJT GLOB  H    0 .data          str

The version definitions created within this image show that symbol reductions are required. When the relocatable object is used eventually to generate an executable or shared object, the symbol reductions occur. In other words, the link-editor reads and interprets symbol reduction information that is contained in the relocatable objects in the same manner as versioning data is processed from a mapfile.

Thus, the intermediate relocatable object produced in the previous example can now be used to generate a shared object.

$ ld -o libfoo.so.1 -G libfoo.o
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
      [24]  0x00000508 0x00000018  FUNC LOCL  H    0 .text          bar
      [25]  0x00010644 0x00000004  OBJT LOCL  H    0 .data          str
      [42]  0x000004c8 0x0000002c  FUNC GLOB  D    0 .text          foo

Symbol reduction at the point at which an executable or shared object is created is typically the most common requirement. However, symbol reductions can be forced to occur when creating a relocatable object by using the link-editor's -B reduce option.

$ ld -o libfoo.o -M mapfile -B reduce -r foo.o bar.o
$ elfdump -sN.symtab libfoo.o | egrep 'foo$|bar$|str$'
      [20]  0x00000050 0x00000018  FUNC LOCL  H    0 .text          bar
      [21]  0x00000000 0x00000004  OBJT LOCL  H    0 .data          str
      [30]  0x00000010 0x0000002c  FUNC GLOB  D    2 .text          foo
Symbol Elimination

An extension to symbol reduction is the elimination of a symbol entry from an object's symbol table. Local symbols are only maintained in an object's .symtab symbol table. This entire table can be removed from the object by using the link-editor's -s option, or strip(1). On occasion, you might want to maintain the .symtab symbol table but remove selected local symbol definitions.

Symbol elimination can be carried out using the mapfile keyword ELIMINATE. As with the local directive, symbols can be individually defined, or the symbol name can be defined as the special auto-elimination directive “*”. The following example shows the elimination of the symbol bar for the previous symbol reduction example.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
        local:
                str;
        eliminate:
                *;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
      [26]  0x00010690 0x00000004  OBJT LOCL  H    0 .data          str
      [44]  0x000004e8 0x0000002c  FUNC GLOB  D    0 .text          foo

The -B eliminate option can be used to assert the auto-elimination directive “*” from the command line.

External Bindings

When a symbol reference from the object being created is satisfied by a definition within a shared object, the symbol remains undefined. The relocation information that is associated with the symbol provides for its lookup at runtime. The shared object that provided the definition typically becomes a dependency.

The runtime linker employs a default search model to locate this definition at runtime. Typically, each object is searched, starting with the dynamic executable, and progressing through each dependency in the same order in which the objects were loaded.

Objects can also be created to use direct bindings. With this technique, the relationship between the symbol reference and the object that provides the symbol definition is maintained within the object being created. The runtime linker uses this information to directly bind the reference to the object that defines the symbol, thus bypassing the default symbol search model. See Appendix D, Direct Bindings.

String Table Compression

String tables are compressed by the link-editor by removing duplicate entries, together with tail substrings. This compression can significantly reduce the size of any string tables. For example, a compressed .dynstr table results in a smaller text segment and hence reduced runtime paging activity. Because of these benefits, string table compression is enabled by default.

Objects that contribute a very large number of symbols can increase the link-edit time due to the string table compression. To avoid this cost during development use the link-editors -z nocompstrtab option. Any string table compression performed during a link-edit can be displayed using the link-editors debugging tokens -D strtab,detail.