Oracle® Solaris 11.4 Linkers and Libraries Guide

Updated: October 2019
 
 

Symbol Processing

Symbols can be categorized as local or global. See Symbol Visibility.

During input file processing, local symbols are copied from any input relocatable object files to the output object being built, without examination.

The global symbols from all input relocatable objects, and the global symbols from any external dependencies, are analyzed and combined in a process known as symbol resolution. The link-editor places each symbol in an internal symbol table in the order that the symbols are encountered. If a symbol with the same name was contributed by an earlier object, and already exists in the symbol table, the symbol resolution process determines which of the two symbols to keep. As a side effect of this process, the link-editor determines how to establish references to external object dependencies.

On successful completion of input file processing, the link-editor applies any symbol visibility adjustment, and determines if any unresolved symbol references remain. If any fatal symbol resolution errors have occurred, or if any unresolved symbol references remain, the link-edit terminates. Finally, the link-editor's internal symbol table is added to the symbol tables of the image being created.

The following sections expand upon symbol visibilities, symbol resolution, and undefined symbol processing.

Symbol Visibility

Symbols can be categorized as local or global. Local symbols cannot be referenced from an object other than the object that contains the symbol definition. By default, local symbols are copied from any input relocatable object files to the output object being built. Local symbols can instead be eliminated from the output object. See Symbol Elimination.

Global symbols can be referenced from other objects besides the object that contains the symbol definition. After collection and resolution, global symbols are added to the symbol tables being created in the output object. Although all global symbols are processed and resolved together, their final visibility can be adjusted. Global symbols can define additional visibility attributes. See Table 34, ELF Symbol Visibility. In addition, mapfile symbol directives can be used to assign symbol visibilities during a link-edit. See Table 8, Symbol Scope Types. These visibility attributes, and directives, can result in a global symbol having its visibility adjusted when written to the output object.

When creating a relocatable object, all visibility attributes and directives are recorded in the output object. However, the visibility changes implied by these attributes are not applied. Any visibility processing is instead deferred to a subsequent link-edit of a dynamic object that reads these objects as input. In special cases, the –B reduce option can be used to force the immediate interpretation of any visibility attributes or directives.

When creating a dynamic object, symbol visibility attributes and directives are applied before the symbols are written to any symbol tables. Visibility attributes can ensure that symbols remain global, and are not affected by any symbol reduction techniques. Visibility attributes and directives can also result in global symbols being demoted to local. This latter technique is most frequently used to explicitly define an object's exported interface. See Reducing Symbol Scope.

Symbol Resolution

Symbol resolution runs the entire spectrum, from simple and intuitive to complex and perplexing. Most resolutions are carried out silently by the link-editor. However, some resolutions can be accompanied by warning diagnostics, while others can result in a fatal error condition.

The most common simple resolutions involve binding symbol references from one object to symbol definitions within another object. This binding can occur between two relocatable objects, or between a relocatable object and the first definition found in a shared object dependency. Complex resolutions typically occur between two or more relocatable objects.

The resolution of two symbols depends on their attributes, the type of file that provides the symbol, and the type of file being generated. For a complete description of symbol attributes, see Symbol Table Section. For the following discussions, however, three basic symbol types are identified.

  • Undefined – Symbols that have been referenced in a file but have not been assigned a storage address.

  • Tentative – Symbols that have been created within a file but have not yet been sized, or allocated in storage. These symbols appear as uninitialized C symbols, or FORTRAN COMMON blocks within the file.

  • Defined – Symbols that have been created, and assigned storage addresses and space within the file.

In its simplest form, symbol resolution involves the use of a precedence relationship. This relationship has defined symbols dominate tentative symbols, which in turn dominate undefined symbols.
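The precedence relationship can be modelled in a few lines of C. The following is only an illustrative sketch and is not the link-editor's implementation. The type sym_state and the function dominant_state() are hypothetical names, and the sketch ignores bindings, file types, and every other attribute that a real resolution takes into account.

```c
#include <assert.h>

/*
 * Hypothetical model of the three basic symbol types. The numeric
 * order encodes the precedence: defined > tentative > undefined.
 */
enum sym_state {
        SYM_UNDEFINED = 0,
        SYM_TENTATIVE = 1,
        SYM_DEFINED = 2
};

/*
 * Return the state that survives when an incoming symbol meets an
 * existing symbol of the same name. Two states of equal rank leave
 * the existing state in place; whether that pairing is silent, a
 * warning, or fatal is outside the scope of this sketch.
 */
enum sym_state
dominant_state(enum sym_state existing, enum sym_state incoming)
{
        assert(existing <= SYM_DEFINED && incoming <= SYM_DEFINED);

        return (incoming > existing ? incoming : existing);
}
```

Under this model, a reference (SYM_UNDEFINED) that meets a tentative definition resolves to SYM_TENTATIVE, and a tentative definition that meets a full definition resolves to SYM_DEFINED, regardless of the order in which the two symbols are encountered.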

The following example of C code shows how these symbol types can be generated. Undefined symbols are prefixed with u_. Tentative symbols are prefixed with t_. Defined symbols are prefixed with d_.

$ cat main.c
extern int u_bar;
extern int u_foo();

int t_bar;
int d_bar = 1;

int d_foo()
{
        return (u_foo(u_bar, t_bar, d_bar));
}
$ cc -o main.o -c main.c
$ elfdump -s main.o

Symbol Table Section:  .symtab
    index  value   size  type bind oth ver shndx          name
    ....
      [7]      0      0  FUNC GLOB  D    0 UNDEF          u_foo
      [8]   0x10   0x40  FUNC GLOB  D    0 .text          d_foo
      [9]    0x4    0x4  OBJT GLOB  D    0 COMMON         t_bar
     [10]      0      0  NOTY GLOB  D    0 UNDEF          u_bar
     [11]      0    0x4  OBJT GLOB  D    0 .data          d_bar

Simple Resolutions

Simple symbol resolutions are by far the most common. In this case, two symbols with similar characteristics are detected, with one symbol taking precedence over the other. This symbol resolution is carried out silently by the link-editor. For example, with symbols of the same binding, a symbol reference from one file is bound to a defined, or tentative symbol definition, from another file. Or, a tentative symbol definition from one file is bound to a defined symbol definition from another file. This resolution can occur between two relocatable objects, or between a relocatable object and the first definition found in a shared object dependency.

Symbols that undergo resolution can have either a global or weak binding. When processing relocatable objects, weak bindings have lower precedence than global bindings. A weak symbol definition is silently overridden by a global definition of the same name.

Another form of simple symbol resolution, interposition, occurs between relocatable objects and shared objects, or between multiple shared objects. In these cases, when a symbol is multiply-defined, the relocatable object, or the first definition between multiple shared objects, is silently taken by the link-editor. The relocatable object's definition, or the first shared object's definition, is said to interpose on all other definitions. This interposition can be used to override the functionality provided by another shared object. Multiply-defined symbols that occur between relocatable objects and shared objects, or between multiple shared objects, are treated identically. A symbol's weak binding or global binding is irrelevant. By resolving to the first definition, regardless of the symbol's binding, both the link-editor and runtime linker behave consistently.

Use the link-editor's –m option to write a list of all interposed symbol references, along with section load address information, to the standard output.
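The rules for choosing between two definitions of the same name can be summarized as a small decision function. This is a hedged sketch rather than the link-editor's logic. All of the names used (file_kind, binding, outcome, struct definition, resolve_definitions()) are hypothetical. The sketch assumes that both symbols are full definitions rather than tentative ones, and it assumes that the first of two weak definitions from relocatable objects is kept, a case that this section does not describe. Two global definitions from relocatable objects are reported as a conflict, as covered under Fatal Resolutions.

```c
#include <assert.h>
#include <stddef.h>

enum file_kind { FILE_RELOCATABLE, FILE_SHARED };
enum binding { BIND_GLOBAL, BIND_WEAK };
enum outcome { KEEP_FIRST, KEEP_SECOND, MULTIPLY_DEFINED };

/* Hypothetical description of one definition of a symbol. */
struct definition {
        enum file_kind  kind;   /* type of file that supplied it */
        enum binding    bind;   /* global or weak */
};

/*
 * Decide which of two same-named definitions survives. "first" is
 * the definition that was encountered earlier in the link-edit.
 */
enum outcome
resolve_definitions(const struct definition *first,
    const struct definition *second)
{
        assert(first != NULL && second != NULL);

        /* Relocatable object against shared object: the relocatable */
        /* object interposes, and the bindings are irrelevant. */
        if (first->kind != second->kind)
                return (first->kind == FILE_RELOCATABLE ?
                    KEEP_FIRST : KEEP_SECOND);

        /* Two shared objects: the first definition interposes. */
        if (first->kind == FILE_SHARED)
                return (KEEP_FIRST);

        /* Two relocatable objects: a global definition overrides a weak one. */
        if (first->bind == BIND_GLOBAL && second->bind == BIND_GLOBAL)
                return (MULTIPLY_DEFINED);
        if (first->bind == BIND_WEAK && second->bind == BIND_GLOBAL)
                return (KEEP_SECOND);

        /* Global then weak, or weak then weak (assumption: keep first). */
        return (KEEP_FIRST);
}
```

Modelling the file type before the binding mirrors the text: the binding only matters when both definitions come from relocatable objects.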

Complex Resolutions

Complex resolutions occur when two symbols of the same name are found with differing attributes. In these cases, the link-editor generates a warning message, while selecting the most appropriate symbol. This message indicates the symbol, the attributes that conflict, and the identity of the file from which the symbol definition is taken. In the following example, two files with a definition of the data item array have different size requirements.

$ cat foo.c
int array[1];
$ cat bar.c
int array[2] = { 1, 2 };
$ cc -c foo.c bar.c
$ ld -r -o temp.o foo.o bar.o
ld: warning: symbol 'array' has differing sizes:
    (file foo.o value=0x4; file bar.o value=0x8);
    bar.o definition taken

A similar diagnostic is produced if the symbol's alignment requirements differ. In both of these cases, the diagnostic can be suppressed by using the link-editor's –t option.

Another form of attribute difference is the symbol's type. In the following example, the symbol bar() has been defined as both a data item and a function.

$ cat foo.c
int bar()
{
        return (0);
}
$ cc -o libfoo.so -G -K pic foo.c
$ cat main.c
int bar = 1;

int main()
{
        return (bar);
}
$ cc -o main main.c -L. -lfoo
ld: warning: symbol 'bar' has differing types:
    (file main.o type=OBJT; file ./libfoo.so type=FUNC);
    main.o definition taken

Note -  Symbol types in this context are classifications that can be expressed in ELF. These symbol types are not related to the data types as employed by the programming language, except in the crudest fashion.

In cases like the previous example, the relocatable object definition is taken when the resolution occurs between a relocatable object and a shared object. Or, the first definition is taken when the resolution occurs between two shared objects. When such resolutions occur between symbols of weak or global binding, a warning is also produced.

Inconsistencies between symbol types are not suppressed by the link-editor's –t option.

Fatal Resolutions

Symbol conflicts that cannot be resolved result in a fatal error condition and an appropriate error message. This message indicates the symbol name together with the names of the files that provided the symbols. No output file is generated. Although the fatal condition is sufficient to terminate the link-edit, all input file processing is first completed. In this manner, all fatal resolution errors can be identified.

The most common fatal error condition exists when two relocatable objects both define non-weak symbols of the same name.

$ cat foo.c
int bar = 1;
$ cat bar.c
int bar()
{
        return (0);
}
$ cc -c foo.c bar.c
$ ld -r -o temp.o foo.o bar.o
ld: fatal: symbol 'bar' is multiply-defined:
    (file foo.o and file bar.o);

foo.c and bar.c have conflicting definitions for the symbol bar. Because the link-editor cannot determine which should dominate, the link-edit usually terminates with an error message.

Multiple symbol definitions should not occur. In some simple coding scenarios, multiple symbol definition errors can be suppressed using the link-editor's –z muldefs option. This option allows the first definition of a multiply defined symbol to be propagated to the output file, while any other definitions of the multiply defined symbol are discarded. If all references to a multiply defined item use the global symbol name of that item, then all references are resolved to the first instance of the multiply defined symbol.

However, specialized compiler options, or high levels of compiler optimization, can circumvent the use of the –z muldefs option. Under these conditions, the compilers may replace a global symbol reference with a local section symbol reference. As a result, the individual copies of a multiply defined item can continue to be referenced, rather than all references being directed to a single global symbol. The copies can then hold different values, which can cause unexpected program behavior. For this reason, multiple symbol definitions should be avoided.

Undefined Symbols

After all of the input files have been read and all symbol resolution is complete, the link-editor searches the internal symbol table for any symbol references that have not been bound to symbol definitions. These symbol references are referred to as undefined symbols. Undefined symbols can affect the link-edit process according to the type of symbol, together with the type of output file being generated.

Generating an Executable Output File

When generating an executable output file, the link-editor's default behavior is to terminate with an appropriate error message should any symbols remain undefined. A symbol remains undefined when a symbol reference in a relocatable object is never matched to a symbol definition.

$ cat main.c
extern int foo();

int main()
{
        return (foo());
}
$ cc -o prog main.c
Undefined           first referenced
 symbol                 in file
foo                     main.o
ld: fatal: symbol referencing errors

Similarly, if a shared object is used to create an executable and leaves an unresolved symbol reference, an undefined symbol error results.

$ cat foo.c
extern int bar;
int foo()
{
        return (bar);
}
$ cc -o libfoo.so -G -K pic foo.c
$ cc -o prog main.c -L. -lfoo
Undefined           first referenced
 symbol                 in file
bar                     ./libfoo.so
ld: fatal: symbol referencing errors

To allow undefined symbols, as in the previous example, use the link-editor's –z nodefs option to suppress the default error condition.


Note -  Take care when using the –z nodefs option. If an unavailable symbol reference is required during the execution of a process, a fatal runtime relocation error occurs. This error might be detected during the initial execution and testing of an application. However, more complex execution paths can result in this error condition taking much longer to detect, which can be time consuming and costly.

Symbols can also remain undefined when a symbol reference in a relocatable object is bound to a symbol definition in an implicitly defined shared object. The following example continues with the files main.c and foo.c used in the previous example.

$ cat bar.c
int bar = 1;
$ cc -o libbar.so -R. -G -K pic bar.c -L. -lfoo
$ ldd libbar.so
        libfoo.so =>     ./libfoo.so
$ cc -o prog main.c -L. -lbar
Undefined           first referenced
 symbol                 in file
foo                     main.o  (symbol belongs to implicit \
                        dependency ./libfoo.so)
ld: fatal: symbol referencing errors

prog is built with an explicit reference to libbar.so. libbar.so has a dependency on libfoo.so. Therefore, an implicit reference to libfoo.so from prog is established.

Because main.c made a specific reference to the interface provided by libfoo.so, prog really has a dependency on libfoo.so. However, only explicit shared object dependencies are recorded in the output file being generated. Thus, prog fails to run if a new version of libbar.so is developed that no longer has a dependency on libfoo.so.

For this reason, bindings of this type are deemed fatal. The implicit reference must be made explicit by referencing the library directly during the link-edit of prog. The required reference is hinted at in the fatal error message that is shown in the preceding example.

Generating a Shared Object Output File

When the link-editor is generating a shared object output file, undefined symbols are allowed to remain at the end of the link-edit. This default behavior allows the shared object to import symbols from an executable that defines the shared object as a dependency.

The link-editor's –z defs option can be used to force a fatal error if any undefined symbols remain. This option is recommended when creating any shared objects. Shared objects that reference symbols from an application can use the –z defs option, together with defining the symbols by using an extern mapfile directive. See SYMBOL_SCOPE and SYMBOL_VERSION Directives.

A self-contained shared object, in which all references to external symbols are satisfied by named dependencies, provides maximum flexibility. The shared object can be employed by many users without those users having to determine and establish dependencies to satisfy the shared object's requirements.
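The default handling of undefined symbols for the two output types, and the effect of the –z defs and –z nodefs options, can be condensed into a single predicate. The following sketch is illustrative only. The names output_kind, struct link_options, and undefined_is_fatal() are hypothetical. The sketch models only the defaults and the two options named in this section. It does not model the implicit dependency case, and it makes no claim about what happens when both options are given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum output_kind { OUT_EXECUTABLE, OUT_SHARED_OBJECT };

/* Hypothetical record of the options given to the link-edit. */
struct link_options {
        bool    z_defs;         /* -z defs was specified */
        bool    z_nodefs;       /* -z nodefs was specified */
};

/*
 * Report whether undefined symbols that remain at the end of the
 * link-edit terminate it with a fatal error.
 */
bool
undefined_is_fatal(enum output_kind out, const struct link_options *opts)
{
        assert(opts != NULL);

        /* Executable: fatal by default, unless -z nodefs is used. */
        if (out == OUT_EXECUTABLE)
                return (!opts->z_nodefs);

        /* Shared object: allowed by default, unless -z defs is used. */
        return (opts->z_defs);
}
```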

Weak Symbols

Historically, weak symbols have been used to circumvent interposition, or test for optional functionality. However, experience has shown that weak symbols are fragile and unreliable in modern programming environments, and their use is discouraged.

Weak symbol aliases were frequently employed within system shared objects. The intent was to provide an alternative interface name, typically the symbol name with a prefixed "_" character. This alias name could be referenced from other system shared objects to avoid interposition issues due to an application exporting its own implementation of the symbol name. In practice, this technique proved to be overly complex and was used inconsistently. Modern versions of Oracle Solaris establish explicit bindings between system objects with direct bindings. See Direct Bindings.

Weak symbol references were often employed to test for the existence of an interface at runtime. This technique places restrictions on both the build environment and the runtime environment, and can be circumvented by compiler optimizations. The use of dlsym(3C) with the RTLD_DEFAULT, or RTLD_PROBE handles, provides a consistent and robust means of testing for a symbol's existence. See Testing for Functionality.
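A runtime test of this kind might look like the following minimal sketch. The function name symbol_available() is hypothetical. The sketch uses only RTLD_DEFAULT. The _GNU_SOURCE definition is an assumption that applies to builds against the GNU C library, where it is needed to expose RTLD_DEFAULT, and some systems also require linking with a separate dl library. A symbol whose value is legitimately 0 would be reported as unavailable by this simple check.

```c
#define _GNU_SOURCE     /* assumption: exposes RTLD_DEFAULT with glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Report whether the named interface can be found by the default
 * symbol search of the running process.
 */
bool
symbol_available(const char *name)
{
        assert(name != NULL);

        return (dlsym(RTLD_DEFAULT, name) != NULL);
}
```

A caller can then guard the use of an optional interface with a test such as symbol_available("foo"), and fall back to an alternative when the interface is absent, without placing a weak reference to foo in the object.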

Tentative Symbol Order Within the Output File

Contributions from input files usually appear in the output file in the order of their contribution. Tentative symbols are an exception to this rule, as these symbols are not fully defined until their resolution is complete. The order of tentative symbols within the output file might not follow the order of their contribution.

If you need to control the ordering of a group of symbols, then any tentative definition should be redefined to a zero-initialized data item. For example, the following tentative definitions result in a reordering of the data items within the output file, as compared to the original order described in the source file foo.c.

$ cat foo.c
char One_array[0x10];
char Two_array[0x20];
char Three_array[0x30];
$ cc -o libfoo.so -G -Kpic foo.c
$ elfdump -sN.dynsym libfoo.so | grep array | sort -k 2,2
    [11]   0x10614  0x20  OBJT GLOB  D    0 .bss           Two_array
     [3]   0x10634  0x30  OBJT GLOB  D    0 .bss           Three_array
     [4]   0x10664  0x10  OBJT GLOB  D    0 .bss           One_array

Sorting the symbols based on their address shows that their output order differs from the order in which they were defined in the source. In contrast, defining these symbols as initialized data items ensures that the relative ordering of these symbols within the input file is carried over to the output file.

$ cat foo.c
char One_array[0x10] = { 0 };
char Two_array[0x20] = { 0 };
char Three_array[0x30] = { 0 };
$ cc -o libfoo.so -G -Kpic foo.c
$ elfdump -sN.dynsym libfoo.so | grep array | sort -k 2,2
     [4]   0x10614  0x10  OBJT GLOB  D    0 .data          One_array
    [11]   0x10624  0x20  OBJT GLOB  D    0 .data          Two_array
     [3]   0x10644  0x30  OBJT GLOB  D    0 .data          Three_array

Defining Additional Symbols

Besides the symbols provided from input files, you can supply additional global symbol references or global symbol definitions to a link-edit. In the simplest form, symbol references can be generated using the link-editor's –u option. Greater flexibility is provided with the link-editor's –M option and an associated mapfile. This mapfile enables you to define global symbol references and a variety of global symbol definitions. Attributes of the symbol such as visibility and type can be specified. See SYMBOL_SCOPE and SYMBOL_VERSION Directives for a complete description of the available options.

Defining Additional Symbols with the –u Option

The –u option provides a mechanism for generating a global symbol reference from the link-edit command line. This option can be used to perform a link-edit entirely from archives. This option can also provide additional flexibility in selecting the objects to extract from multiple archives. See Archive Processing for an overview of archive extraction.

For example, perhaps you want to generate a dynamic executable from the relocatable object main.o, which refers to the symbols foo and bar. You want to obtain the symbol definition foo from the relocatable object foo.o contained in lib1.a, and the symbol definition bar from the relocatable object bar.o, contained in lib2.a.

However, the archive lib1.a also contains a relocatable object that defines the symbol bar. This relocatable object is presumably of differing functionality to the relocatable object that is provided in lib2.a. To specify the required archive extraction, you can use the following link-edit.

$ cc -o prog -L. -u foo -l1 main.o -l2

The –u option generates a reference to the symbol foo. This reference causes extraction of the relocatable object foo.o from the archive lib1.a. The first reference to the symbol bar occurs in main.o, which is encountered after lib1.a has been processed. Therefore, the relocatable object bar.o is obtained from the archive lib2.a.


Note -  This simple example assumes that the relocatable object foo.o from lib1.a does not directly or indirectly reference the symbol bar. If foo.o does reference bar, then the relocatable object that defines bar is also extracted from lib1.a during its processing. See Archive Processing for a discussion of the link-editor's multi-pass processing of an archive.

Defining Symbol References

The following example shows how three symbol references can be defined. These references are then used to extract members of an archive. Although this archive extraction can be achieved by specifying multiple –u options to the link-edit, this example also shows how the eventual scope of a symbol can be reduced to local.

$ cat foo.c
#include    <stdio.h>

void foo()
{
        (void) printf("foo: called from lib.a\n");
}
$ cat bar.c
#include    <stdio.h>

void bar()
{
        (void) printf("bar: called from lib.a\n");
}
$ cat main.c
extern void foo(), bar();

void main()
{
        foo();
        bar();
}
$ cc -c foo.c bar.c main.c
$ ar -rc lib.a foo.o bar.o main.o
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        local:
                foo;
                bar;
        global:
                main;
};
$ cc -o prog -M mapfile lib.a
$ prog
foo: called from lib.a
bar: called from lib.a
$ elfdump -sN.symtab prog | egrep 'main$|foo$|bar$'
    [29]   0x10f30  0x24  FUNC LOCL  H    0 .text          bar
    [30]   0x10ef8  0x24  FUNC LOCL  H    0 .text          foo
    [55]   0x10f68  0x24  FUNC GLOB  D    0 .text          main

The significance of reducing symbol scope from global to local is covered in more detail in the section Reducing Symbol Scope.

Defining Absolute Symbols

The following example shows how two absolute symbol definitions can be defined. These definitions are then used to resolve the references from the input file main.c.

$ cat main.c
#include    <stdio.h>

extern int foo();
extern int bar;

void main()
{
        (void) printf("&foo = 0x%p\n", &foo);
        (void) printf("&bar = 0x%p\n", &bar);
}
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=FUNCTION; VALUE=0x400 };
                bar     { TYPE=DATA;     VALUE=0x800 };
};
$ cc -o prog -M mapfile main.c
$ prog
&foo = 0x400
&bar = 0x800
$ elfdump -sN.symtab prog | egrep 'foo$|bar$'
    [45]     0x800     0  OBJT GLOB  D    0 ABS            bar
    [69]     0x400     0  FUNC GLOB  D    0 ABS            foo

When obtained from an input file, symbol definitions for functions or data items are usually associated with elements of data storage. A mapfile definition alone cannot construct this data storage, so these symbols must remain as absolute values. A simple mapfile definition that is associated with a size but no value results in the creation of data storage. In this case, the symbol definition is accompanied by a section index. However, a mapfile definition that is accompanied by a value results in the creation of an absolute symbol. If a symbol is defined in a shared object, an absolute definition should be avoided. See Augmenting a Symbol Definition.

Defining Tentative Symbols

A mapfile can also be used to define a COMMON, or tentative, symbol. Unlike other types of symbol definition, tentative symbols do not occupy storage within a file, but define storage that must be allocated at runtime. Therefore, symbol definitions of this kind can contribute to the storage allocation of the output file being generated.

A feature of tentative symbols that differs from other symbol types is that their value attribute indicates their alignment requirement. A mapfile definition can therefore be used to realign tentative definitions that are obtained from the input files of a link-edit.

The following example shows the definition of two tentative symbols. The symbol foo defines a new storage region whereas the symbol bar is actually used to change the alignment of the same tentative definition within the file main.c.

$ cat main.c
#include    <stdio.h>

extern int foo;

int bar[0x10];

void main()
{
        (void) printf("&foo = 0x%p\n", &foo);
        (void) printf("&bar = 0x%p\n", &bar);
}
$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=COMMON; VALUE=0x4;   SIZE=0x200 };
                bar     { TYPE=COMMON; VALUE=0x102; SIZE=0x40 };
};
$ cc -o prog -M mapfile main.c
ld: warning: symbol 'bar' has differing alignments:
    (file mapfile value=0x102; file main.o value=0x4);
    largest value applied
$ prog
&foo = 0x21264
&bar = 0x21224
$ elfdump -sN.symtab prog | egrep 'foo$|bar$'
    [45]   0x21224   0x40  OBJT GLOB  D    0 .bss           bar
    [69]   0x21264  0x200  OBJT GLOB  D    0 .bss           foo

Note -  This symbol resolution diagnostic can be suppressed by using the link-editor's –t option.

Encapsulation Symbols

Encapsulation symbols refer to a pair of symbols that can be generated, during the link-edit of a final object, that identify a unique section. These symbols are assigned the address of the beginning, and the address of the end, of the associated section, and thus encapsulate the section address range. The symbols are named __start_<section_name> and __stop_<section_name> respectively. By default, both symbols are assigned a protected visibility.

Encapsulation symbols are created for a section when the following criteria are met.

  • The section is allocatable, and the section name does not start with the standard period (.) prefix.

  • A reference to the encapsulation symbol exists from the input relocatable objects provided with the link-edit.


Note - There is no special attribute associated with the symbol reference that dictates they must be bound to a section range. If references to the symbols exist but no matching section exists in the input relocatable objects being processed, a fatal symbol resolution error can result. However, if references to the symbols exist, a matching section exists, but the matching section is discarded as unused, the symbols are assigned an address of 0.

The following section is a candidate for encapsulation symbols.

$ elfdump -cN_meta_data bar.o

Section Header[3]:  sh_name: _meta_data
    sh_addr:      0               sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x4             sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0xb8            sh_entsize: 0x4 (1 entry)
    ....

The following references trigger the creation of encapsulation symbols.

$ elfdump -s foo.o | fgrep __meta_
    [25]   0          0    NOTY GLOB  D  0 UNDEF       __start__meta_data
    [26]   0          0    NOTY GLOB  D  0 UNDEF       __stop__meta_data

$ cc -o main foo.o bar.o
$ elfdump -cN_meta_data main

Section Header[21]:  sh_name: _meta_data
    sh_addr:      0x8060e20       sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x4             sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0xe20           sh_entsize: 0x4 (1 entry)
    ....
$ elfdump -sN.symtab main | fgrep __meta_
    [32]   0x8060e20  0x4  OBJT GLOB  P  0 _meta_data  __start__meta_data
    [34]   0x8060e24    0  OBJT GLOB  P  0 _meta_data  __stop__meta_data
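Source code that produces and consumes such a section might resemble the following sketch. The sketch rests on an assumption that is not part of this guide: the compiler accepts the GCC-style section and used attributes to place data in a section named _meta_data. The macro META_ITEM, the three data items, and the functions meta_count() and meta_sum() are hypothetical names chosen for illustration. The array declarations for __start__meta_data and __stop__meta_data are the undefined references that cause the link-editor to create the encapsulation symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC-style attributes select the output section. */
#define META_ITEM       __attribute__((section("_meta_data"), used))

/* Hypothetical items that are collected in the _meta_data section. */
static int meta_one META_ITEM = 1;
static int meta_two META_ITEM = 2;
static int meta_three META_ITEM = 3;

/* References that are bound to the start and end of the section. */
extern int __start__meta_data[];
extern int __stop__meta_data[];

/* Number of int-sized entries between the two symbols. */
size_t
meta_count(void)
{
        assert(__start__meta_data <= __stop__meta_data);

        return ((size_t)(__stop__meta_data - __start__meta_data));
}

/* Walk the section from its first entry to its last. */
int
meta_sum(void)
{
        int sum = 0;

        for (const int *p = __start__meta_data; p < __stop__meta_data; p++)
                sum += *p;
        return (sum);
}
```

Because the order of the items within the section is not something the code should depend on, the sketch only counts and sums the entries rather than indexing a particular one.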

Augmenting a Symbol Definition

The creation of an absolute data symbol within a shared object should be avoided. An external reference from a dynamic executable to a data item within a shared object typically requires the creation of a copy relocation. See Copy Relocations. To provide for this relocation, the data item should be associated with data storage. This association can be produced by defining the symbol within a relocatable object file. This association can also be produced by defining the symbol within a mapfile together with a size declaration and no value declaration. See SYMBOL_SCOPE and SYMBOL_VERSION Directives.

A data symbol can be filtered. See Shared Objects as Filters. To provide this filtering, an object file definition can be augmented with a mapfile definition. The following example creates a filter containing a function and data definition.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        global:
                foo     { TYPE=FUNCTION;       FILTER=filtee.so.1 };
                bar     { TYPE=DATA; SIZE=0x4; FILTER=filtee.so.1 };
        local:
                *;
};
$ cc -o filter.so.1 -G -Kpic -h filter.so.1 -M mapfile -R.
$ elfdump -sN.dynsym filter.so.1 | egrep 'foo|bar'
     [1]   0x105f8     0x4  OBJT GLOB  D    1 .data          bar
     [7]         0       0  FUNC GLOB  D    1 ABS            foo
$ elfdump -y filter.so.1 | egrep 'foo|bar'
     [1]  [ FILTER ]        [0] filtee.so.1        bar
     [7]  [ FILTER ]        [0] filtee.so.1        foo

At runtime, a reference from an external object to either of these symbols is resolved to the definition within the filtee.

Reducing Symbol Scope

Symbol definitions that are defined to have local scope within a mapfile can be used to reduce the symbol's eventual binding. This mechanism removes the symbol's visibility to future link-edits which use the generated file as part of their input. In fact, this mechanism can provide for the precise definition of a file's interface, and so restrict the functionality made available to others.

For example, say you want to generate a simple shared object from the files foo.c and bar.c. The file foo.c contains the global symbol foo, which provides the service that you want to make available to others. The file bar.c contains the symbols bar and str, which provide the underlying implementation of the shared object. A shared object created with these files typically results in the creation of three symbols with global scope.

$ cat foo.c
extern const char *bar();

const char *foo()
{
        return (bar());
}
$ cat bar.c
const char *str = "returned from bar.c";

const char *bar()
{
        return (str);
}
$ cc -o libfoo.so.1 -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
    [41]     0x560    0x18  FUNC GLOB  D    0 .text          bar
    [44]     0x520    0x2c  FUNC GLOB  D    0 .text          foo
    [45]   0x106b8     0x4  OBJT GLOB  D    0 .data          str

You can now use the functionality offered by libfoo.so.1 as part of the link-edit of another application. References to the symbol foo are bound to the implementation provided by the shared object.

Because of their global binding, direct reference to the symbols bar and str is also possible. This visibility can have dangerous consequences, as you might later change the implementation that underlies the function foo. In so doing, you could unintentionally cause an existing application that had bound to bar or str to fail or misbehave.

Another consequence of the global binding of the symbols bar and str is that these symbols can be interposed upon by symbols of the same name. The interposition of symbols within shared objects is covered in section Simple Resolutions. This interposition can be intentional and be used as a means of circumventing the intended functionality offered by the shared object. On the other hand, this interposition can be unintentional, the result of the same common symbol name used for both the application and the shared object.

When developing the shared object, you can protect against these scenarios by reducing the scope of the symbols bar and str to a local binding. In the following example, the symbols bar and str are no longer available as part of the shared object's interface. Thus, these symbols cannot be referenced, or interposed upon, by an external object. You have effectively defined an interface for the shared object. This interface can be managed while hiding the details of the underlying implementation.

$ cat mapfile
$mapfile_version 2
SYMBOL_SCOPE {
        local:
                bar;
                str;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
    [24]     0x548    0x18  FUNC LOCL  H    0 .text          bar
    [25]   0x106a0     0x4  OBJT LOCL  H    0 .data          str
    [45]     0x508    0x2c  FUNC GLOB  D    0 .text          foo

This symbol scope reduction has an additional performance advantage. The symbolic relocations against the symbols bar and str that would have been necessary at runtime are now reduced to relative relocations. See When Relocations are Performed for details of symbolic relocation overhead.
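
Scope can also be reduced in the source code itself. The following is a minimal sketch, assuming a compiler that accepts the GCC-style visibility attribute. It merges foo.c and bar.c into a single file for brevity, and the HIDDEN macro is a name chosen for this illustration only. Marking bar and str as hidden is intended to have an effect comparable to the local entries in the previous mapfile.

```c
/* Single-file variant of foo.c and bar.c, merged here for brevity. */

/* Assumption: GCC-style attribute syntax; other compilers may differ. */
#define HIDDEN __attribute__((visibility("hidden")))

/* Implementation details that are not meant to be part of the interface. */
HIDDEN const char *str = "returned from bar.c";

HIDDEN const char *bar(void)
{
        return (str);
}

/* The one symbol intended for use by other objects. */
const char *foo(void)
{
        return (bar());
}
```

When a file like this is built into a shared object, bar and str would be expected to show a local binding in the elfdump output, much like the mapfile result shown previously. A mapfile remains the place where the whole interface can be reviewed in one file.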

As the number of symbols that are processed during a link-edit increases, defining local scope reduction within a mapfile becomes harder to maintain. An alternative and more flexible mechanism enables you to define the shared object's interface in terms of the global symbols that should be maintained. Global symbol definitions allow the link-editor to reduce all other symbols to local binding. This mechanism is achieved using the special auto-reduction directive "*". For example, the previous mapfile definition can be rewritten to define foo as the only global symbol required in the output file generated.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
               foo;
        local:
               *;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
    [26]     0x570    0x18  FUNC LOCL  H    0 .text          bar
    [27]   0x106d8     0x4  OBJT LOCL  H    0 .data          str
    [50]     0x530    0x2c  FUNC GLOB  D    0 .text          foo

This example also defines a version name, ISV_1.1, as part of the mapfile directive. This version name establishes an internal version definition that defines the file's symbolic interface. The creation of a version definition is recommended. The definition forms the foundation of an internal versioning mechanism that can be used throughout the evolution of the file. See Interfaces and Versioning.


Note -  If a version name is not supplied, the output file name is used to label the version definition. The versioning information that is created within the output file can be suppressed using the link-editor's -z noversion option.

Whenever a version name is specified, all global symbols must be assigned to a version definition. If any global symbols remain unassigned to a version definition, the link-editor generates a fatal error condition.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
Undefined           first referenced
 symbol                 in file
str                     bar.o  (symbol has no version assigned)
bar                     bar.o  (symbol has no version assigned)
ld: fatal: symbol referencing errors

The -B local option can be used to assert the auto-reduction directive "*" from the command line. The previous example can be compiled successfully as follows.

$ cc -o libfoo.so.1 -M mapfile -B local -G foo.c bar.c
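
The scope decisions described in the preceding examples can be summarized with a small model. The following sketch is a simplification written for this discussion, not the link-editor's implementation. The names scope_map, classify, and the SCOPE_ constants are hypothetical. The model assumes that explicit entries take precedence, that the "*" directive or -B local catches every remaining symbol, and that a symbol caught by neither is the unassigned case that produced the fatal error.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not link-editor names. */
enum scope { SCOPE_GLOBAL, SCOPE_LOCAL, SCOPE_UNASSIGNED };

/* A simplified view of one mapfile version definition. */
struct scope_map {
        const char *const *globals;     /* names listed under global: */
        size_t nglobals;
        const char *const *locals;      /* names listed under local: */
        size_t nlocals;
        int local_catch_all;            /* nonzero if local: contains "*" */
};

/* Return nonzero if name appears in the list of n names. */
static int
listed(const char *const *names, size_t n, const char *name)
{
        for (size_t i = 0; i < n; i++)
                if (strcmp(names[i], name) == 0)
                        return (1);
        return (0);
}

/*
 * Classify one global symbol.  Explicit entries win.  Otherwise the
 * catch-all, from the mapfile or from -B local (b_local), reduces the
 * symbol to local.  With neither, the symbol is left unassigned.
 */
enum scope
classify(const struct scope_map *map, const char *name, int b_local)
{
        if (listed(map->globals, map->nglobals, name))
                return (SCOPE_GLOBAL);
        if (listed(map->locals, map->nlocals, name))
                return (SCOPE_LOCAL);
        if (map->local_catch_all || b_local)
                return (SCOPE_LOCAL);
        return (SCOPE_UNASSIGNED);
}
```

With a map that lists only foo as global, classify reports bar and str as unassigned. Setting either local_catch_all or b_local turns both results into SCOPE_LOCAL, which mirrors the two successful link-edits shown previously.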

When generating an executable or shared object, any symbol reduction results in the recording of version definitions within the output image. When generating a relocatable object, the version definitions are created but the symbol reductions are not processed. The result is that the symbol entries for any symbol reductions still remain global. For example, using the previous mapfile with the auto-reduction directive and associated relocatable objects, an intermediate relocatable object is created with no symbol reduction.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
        local:
                *;
};
$ ld -o libfoo.o -M mapfile -r foo.o bar.o
$ elfdump -s libfoo.o | egrep 'foo$|bar$|str$'
    [29]      0x10    0x2c  FUNC GLOB  D    2 .text          foo
    [30]         0     0x4  OBJT GLOB  H    0 .data          str

The version definitions created within this image show that symbol reductions are required. When the relocatable object is eventually used to generate a dynamic object, the symbol reductions occur. In other words, the link-editor reads and interprets symbol reduction information that is contained in the relocatable objects in the same manner as versioning data is processed from a mapfile.

Thus, the intermediate relocatable object produced in the previous example can now be used to generate a shared object.

$ ld -o libfoo.so.1 -G libfoo.o
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
    [24]     0x508    0x18  FUNC LOCL  H    0 .text          bar
    [25]   0x10644     0x4  OBJT LOCL  H    0 .data          str
    [42]     0x4c8    0x2c  FUNC GLOB  D    0 .text          foo

Symbol reduction at the point at which an executable or shared object is created is the most common requirement. However, symbol reductions can be forced to occur when creating a relocatable object by using the link-editor's -B reduce option.

$ ld -o libfoo.o -M mapfile -B reduce -r foo.o bar.o
$ elfdump -sN.symtab libfoo.o | egrep 'foo$|bar$|str$'
    [20]      0x50    0x18  FUNC LOCL  H    0 .text          bar
    [21]         0     0x4  OBJT LOCL  H    0 .data          str
    [30]      0x10    0x2c  FUNC GLOB  D    2 .text          foo

Symbol Elimination

An extension to symbol reduction is the elimination of a symbol entry from an object's symbol table. Local symbols are only maintained in an object's .symtab symbol table. This entire table can be removed from the object by using the link-editor's -z strip-class option, or after a link-edit by using strip(1). On occasion, you might want to maintain the .symtab symbol table but remove selected local symbol definitions.

Symbol elimination can be carried out using the mapfile keyword ELIMINATE. As with the local directive, symbols can be individually defined, or the symbol name can be defined as the special auto-elimination directive "*". The following example shows the elimination of the symbol bar from the previous symbol reduction example. Because foo and str are named explicitly, the auto-elimination directive applies only to bar.

$ cat mapfile
$mapfile_version 2
SYMBOL_VERSION ISV_1.1 {
        global:
                foo;
        local:
                str;
        eliminate:
                *;
};
$ cc -o libfoo.so.1 -M mapfile -G foo.c bar.c
$ elfdump -sN.symtab libfoo.so.1 | egrep 'foo$|bar$|str$'
    [26]   0x10690     0x4  OBJT LOCL  H    0 .data          str
    [44]     0x4e8    0x2c  FUNC GLOB  D    0 .text          foo

The -B eliminate option can be used to assert the auto-elimination directive "*" from the command line.

External Bindings

When a symbol reference from the object being created is satisfied by a definition within a shared object, the symbol remains undefined. The relocation information that is associated with the symbol provides for its lookup at runtime. The shared object that provided the definition typically becomes a dependency.

The runtime linker employs a default search model to locate this definition at runtime. Typically, each object is searched, starting with the executable, and progressing through each dependency in the same order in which the objects were loaded.

Objects can also be created to use direct bindings. With this technique, the relationship between the symbol reference and the object that provides the symbol definition is maintained within the object being created. The runtime linker uses this information to directly bind the reference to the object that defines the symbol, thus bypassing the default symbol search model. See Direct Bindings.
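
The difference between the two lookup models can be sketched as follows. This is a toy model under stated assumptions, not the runtime linker's implementation. The struct object type and the functions default_lookup and direct_lookup are hypothetical names, and the model omits any refinements that the real runtime linker applies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a loaded object and the globals it defines. */
struct object {
        const char *name;
        const char *const *symbols;
        size_t nsymbols;
};

/* Return nonzero if obj defines the global symbol sym. */
static int
defines(const struct object *obj, const char *sym)
{
        for (size_t i = 0; i < obj->nsymbols; i++)
                if (strcmp(obj->symbols[i], sym) == 0)
                        return (1);
        return (0);
}

/*
 * Default model: search the objects in load order, executable first.
 * Returns the index of the first object that defines sym, or -1.
 */
int
default_lookup(const struct object *objs, size_t nobjs, const char *sym)
{
        for (size_t i = 0; i < nobjs; i++)
                if (defines(&objs[i], sym))
                        return ((int)i);
        return (-1);
}

/*
 * Direct binding: the reference recorded which object supplies the
 * definition, so only that object is examined.
 */
int
direct_lookup(const struct object *objs, size_t nobjs, size_t recorded,
    const char *sym)
{
        if (recorded < nobjs && defines(&objs[recorded], sym))
                return ((int)recorded);
        return (-1);
}
```

If an executable at index 0 and libfoo.so.1 at index 1 both define bar, default_lookup returns 0, which is the interposition case discussed earlier. A reference that recorded index 1 resolves to libfoo.so.1 through direct_lookup regardless of load order.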

String Table Compression

The link-editor compresses string tables by removing duplicate entries, together with tail substrings. A tail substring is a string that matches the end of another string, and so can share that string's storage. This compression can significantly reduce the size of any string tables. For example, a compressed .dynstr table results in a smaller text segment and hence reduced runtime paging activity. Because of these benefits, string table compression is enabled by default.
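
The idea behind duplicate and tail removal can be illustrated with a short sketch. The code below is a simple quadratic-time model written for this discussion, and is not the link-editor's algorithm. The function names is_tail, hosted_by, and strtab_build are hypothetical. A string is stored only if no other string can host it, and every other string is given an offset that points into the tail of a stored string.

```c
#include <stddef.h>
#include <string.h>

/* Return nonzero if tail is a suffix of s (equal strings included). */
static int
is_tail(const char *s, const char *tail)
{
        size_t ls = strlen(s), lt = strlen(tail);

        return (lt <= ls && strcmp(s + (ls - lt), tail) == 0);
}

/* Return nonzero if strs[k] can supply the storage for strs[i]. */
static int
hosted_by(const char *const *strs, size_t i, size_t k)
{
        if (i == k || !is_tail(strs[k], strs[i]))
                return (0);
        /* A longer string hosts a tail; the earlier of two duplicates hosts. */
        return (strlen(strs[k]) > strlen(strs[i]) || k < i);
}

/*
 * Place n strings into tab (capacity cap) and record each offset in
 * offs.  Byte 0 holds the empty string, as in an ELF string table.
 * Returns the number of bytes used, or 0 if cap is too small.
 */
size_t
strtab_build(const char *const *strs, size_t n, char *tab, size_t cap,
    size_t *offs)
{
        size_t used = 1;

        if (cap == 0)
                return (0);
        tab[0] = '\0';

        /* Pass 1: store only the strings that no other string can host. */
        for (size_t i = 0; i < n; i++) {
                size_t len = strlen(strs[i]) + 1;
                int hosted = 0;

                for (size_t k = 0; k < n && !hosted; k++)
                        hosted = hosted_by(strs, i, k);
                offs[i] = 0;            /* 0 marks "not yet placed" */
                if (hosted)
                        continue;
                if (len > cap - used)
                        return (0);
                memcpy(tab + used, strs[i], len);
                offs[i] = used;
                used += len;
        }

        /* Pass 2: point each remaining string at the tail of a placed one. */
        for (size_t i = 0; i < n; i++) {
                for (size_t k = 0; k < n && offs[i] == 0; k++) {
                        if (offs[k] != 0 && is_tail(strs[k], strs[i]))
                                offs[i] = offs[k] + strlen(strs[k]) -
                                    strlen(strs[i]);
                }
        }
        return (used);
}
```

For the input strings bar, foobar, foo, and bar, this sketch stores only foobar and foo. Both instances of bar share the last three characters of foobar, so the table occupies 12 bytes instead of the 20 bytes needed to store every string separately.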

Objects that contribute a very large number of symbols can increase the link-edit time due to the string table compression. To avoid this cost during development, use the link-editor's -z nocompstrtab option. Any string table compression performed during a link-edit can be displayed using the link-editor's debugging tokens -D strtab,detail.