Undefined Symbols

After all of the input files have been read and all symbol resolution is complete, the link-editor searches the internal symbol table for any symbol references that have not been bound to symbol definitions. These symbol references are referred to as undefined symbols. Undefined symbols can affect the link-edit process according to the type of symbol, together with the type of output file being generated.

Generating an Executable Output File

When generating an executable output file, the link-editor's default behavior is to terminate with an appropriate error message should any symbols remain undefined. A symbol remains undefined when a symbol reference in a relocatable object is never matched to a symbol definition.

$ cat main.c
extern int foo();

int main()
{
        return (foo());
}
$ cc -o prog main.c
Undefined           first referenced
 symbol                 in file
foo                     main.o
ld: fatal: symbol referencing errors

Similarly, an undefined symbol error results if a shared object that is used to create a dynamic executable leaves a symbol reference unresolved.

$ cat foo.c
extern int bar;
int foo()
{
        return (bar);
}
$ cc -o libfoo.so -G -K pic foo.c
$ cc -o prog main.c -L. -lfoo
Undefined           first referenced
 symbol                 in file
bar                     ./libfoo.so
ld: fatal: symbol referencing errors

To allow undefined symbols, as in the previous example, use the link-editor's -z nodefs option to suppress the default error condition.

Note - Take care when using the -z nodefs option. If an unavailable symbol reference is required during the execution of a process, a fatal runtime relocation error occurs. This error might be detected during the initial execution and testing of an application. However, more complex execution paths can result in this error condition taking much longer to detect, which can be time-consuming and costly.

Symbols can also remain undefined when a symbol reference in a relocatable object is bound to a symbol definition in an implicitly defined shared object. The following example continues with the files main.c and foo.c used in the previous example.

$ cat bar.c
int bar = 1;
$ cc -o libbar.so -R. -G -K pic bar.c -L. -lfoo
$ ldd libbar.so
        libfoo.so =>     ./libfoo.so
$ cc -o prog main.c -L. -lbar
Undefined           first referenced
 symbol                 in file
foo                     main.o  (symbol belongs to implicit \
                        dependency ./libfoo.so)
ld: fatal: symbol referencing errors

prog is built with an explicit reference to libbar.so. libbar.so has a dependency on libfoo.so. Therefore, an implicit reference to libfoo.so from prog is established.

Because main.c made a specific reference to the interface provided by libfoo.so, prog really has a dependency on libfoo.so. However, only explicit shared object dependencies are recorded in the output file being generated. Thus, prog fails to run if a new version of libbar.so is developed that no longer has a dependency on libfoo.so.

For this reason, bindings of this type are deemed fatal. The implicit reference must be made explicit by referencing the library directly during the link-edit of prog. The required reference is hinted at in the fatal error message that is shown in the preceding example.

Generating a Shared Object Output File

When the link-editor is generating a shared object output file, undefined symbols are allowed to remain at the end of the link-edit. This default behavior allows the shared object to import symbols from a dynamic executable that defines the shared object as a dependency.

The link-editor's -z defs option can be used to force a fatal error if any undefined symbols remain. This option is recommended when creating any shared objects. Shared objects that reference symbols from an application can use the -z defs option, together with defining the symbols by using an extern mapfile directive. See SYMBOL_SCOPE / SYMBOL_VERSION Directives.

A self-contained shared object, in which all references to external symbols are satisfied by named dependencies, provides maximum flexibility. The shared object can be employed by many users without those users having to determine and establish dependencies to satisfy the shared object's requirements.

Weak Symbols

Historically, weak symbols have been used to circumvent interposition or to test for optional functionality. However, experience has shown that weak symbols are fragile and unreliable in modern programming environments, and their use is discouraged.

Weak symbol aliases were frequently employed within system shared objects. The intent was to provide an alternative interface name, typically the symbol name with a prefixed "_" character. This alias name could be referenced from other system shared objects to avoid interposition issues caused by an application exporting its own implementation of the symbol name. In practice, this technique proved to be overly complex and was used inconsistently. Modern versions of Oracle Solaris establish explicit bindings between system objects with direct bindings. See Chapter 6, Direct Bindings.

Weak symbol references were often employed to test for the existence of an interface at runtime. This technique places restrictions on both the build environment and the runtime environment, and can be circumvented by compiler optimizations. The use of dlsym(3C) with the RTLD_DEFAULT or RTLD_PROBE handle provides a consistent and robust means of testing for a symbol's existence. See Testing for Functionality.
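
The following is a minimal sketch of how a dlsym(3C) lookup with the RTLD_DEFAULT handle might be used to test for a symbol at runtime. The helper names symbol_available() and call_if_present(), and the symbol name optional_feature, are hypothetical and serve only as illustrations. The _GNU_SOURCE definition is an assumption for building on glibc-based systems, where it exposes RTLD_DEFAULT. RTLD_PROBE is specific to Oracle Solaris and is not used here.

```c
#define _GNU_SOURCE     /* assumption: exposes RTLD_DEFAULT on glibc systems */
#include <dlfcn.h>
#include <stddef.h>

/* Signature assumed for the hypothetical optional interface. */
typedef int (*optional_fn_t)(void);

/*
 * Return 1 if dlsym() finds `name` through the default symbol search
 * order of the process, 0 otherwise. A symbol whose value is 0 would
 * also be reported as missing by this simple check.
 */
int
symbol_available(const char *name)
{
        return (dlsym(RTLD_DEFAULT, name) != NULL);
}

/*
 * Call the hypothetical interface "optional_feature" if it can be
 * found at runtime, otherwise return the caller's fallback value.
 */
int
call_if_present(int fallback)
{
        optional_fn_t fn = (optional_fn_t)dlsym(RTLD_DEFAULT, "optional_feature");

        return (fn != NULL ? fn() : fallback);
}
```

Because the lookup is performed by name at runtime, the caller carries no link-time reference to optional_feature, so no undefined symbol is recorded for it during the link-edit. On some systems the program might need to be linked with -ldl for dlsym() to be available.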