Linker and Libraries Guide

Symbol Processing

During input file processing, all local symbols from the input relocatable objects are passed through to the output file image. All global symbols are accumulated internally within the link-editor. This internal symbol table is searched for each new global symbol entry processed to determine if a symbol with the same name has already been encountered from a previous input file. If so, a symbol resolution process is called to determine which of the two entries is to be kept.

On completion of input file processing, and providing no fatal error conditions have been encountered during symbol resolution, the link-editor determines if any unbound symbol references (undefined symbols) remain that will cause the link-edit to fail.

Finally, the link-editor's internal symbol table is added to the symbol table(s) of the image being created.

The following sections expand upon symbol resolution and undefined symbol processing.

Symbol Resolution

Symbol resolution runs the entire spectrum, from simple and intuitive to complex and perplexing. Resolutions can be carried out silently by the link-editor, be accompanied by warning diagnostics, or result in a fatal error condition.

The resolution of two symbols depends on the symbols' attributes, the type of file providing the symbol and the type of file being generated. For a complete description of symbol attributes see "Symbol Table". For the following discussions, however, it is worth identifying three basic symbol types:

Undefined

These symbols have been referenced in a file but have not been assigned a storage address.

Tentative

These symbols have been created within a file but have not yet been sized or allocated in storage. They appear as uninitialized C symbols, or FORTRAN COMMON blocks within the file.

Defined

These symbols have been created and assigned storage addresses and space within the file.

In its simplest form, symbol resolution involves the use of a precedence relationship that has defined symbols dominating tentative symbols, which in turn dominate undefined symbols.

The following C code example shows how these symbol types can be generated (undefined symbols are prefixed with u_, tentative symbols are prefixed with t_, and defined symbols are prefixed with d_):


$ cat main.c
extern int      u_bar;
extern int      u_foo();

int             t_bar;
int             d_bar = 1;

d_foo()
{
        return (u_foo(u_bar, t_bar, d_bar));
}
$ cc -o main.o -c main.c
$ nm -x main.o

[Index]   Value      Size      Type  Bind  Other Shndx   Name
...............
[8]     |0x00000000|0x00000000|NOTY |GLOB |0x0  |UNDEF  |u_foo
[9]     |0x00000000|0x00000040|FUNC |GLOB |0x0  |2      |d_foo
[10]    |0x00000004|0x00000004|OBJT |GLOB |0x0  |COMMON |t_bar
[11]    |0x00000000|0x00000000|NOTY |GLOB |0x0  |UNDEF  |u_bar
[12]    |0x00000000|0x00000004|OBJT |GLOB |0x0  |3      |d_bar

Simple Resolutions

These symbol resolutions are by far the most common, and result when two symbols with similar characteristics are detected, and one symbol takes precedence over the other. This symbol resolution is carried out silently by the link-editor. For example, for symbols with the same binding, a reference to an undefined symbol from one file will be bound to, or satisfied by, a defined or tentative symbol definition from another file. Or, a tentative symbol definition from one file will be bound to a defined symbol definition from another file.

Symbols that undergo resolution can have either a global or weak binding. Weak bindings have less precedence than global binding, and so symbols with different bindings are resolved according to a slight alteration of the basic rules. But first, it is worth introducing how weak symbols can be produced.

Weak symbols can be defined individually or as aliases to global symbols using a pragma definition:


$ cat main.c
#pragma weak    bar
#pragma weak    foo = _foo

int             bar = 1;

_foo()
{
        return (bar);
}
$ cc -o main.o -c main.c
$ nm -x main.o

[Index]   Value      Size      Type  Bind  Other Shndx   Name
...............
[7]     |0x00000000|0x00000004|OBJT |WEAK |0x0  |3      |bar
[8]     |0x00000000|0x00000028|FUNC |WEAK |0x0  |2      |foo
[9]     |0x00000000|0x00000028|FUNC |GLOB |0x0  |2      |_foo

Notice that the weak alias foo is assigned the same attributes as the global symbol _foo. This relationship will be maintained by the link-editor and will result in the symbols being assigned the same value in the output image.

In symbol resolution, weak defined symbols will be silently overridden by any global definition of the same name.

Another form of simple symbol resolution occurs between relocatable objects and shared objects, or between multiple shared objects, and is termed interposition. In these cases, if a symbol is multiply defined, the relocatable object, or the first definition between multiple shared objects, will be silently taken by the link-editor. The relocatable object's definition, or the first shared object's definition, is said to interpose on all other definitions. This interposition can be used to override the functionality provided by one shared object by a dynamic executable or another shared object.

The combination of weak symbols and interposition provides a very useful programming technique. For example, the standard C library provides several services that you are allowed to redefine. However, ANSI C defines a set of standard services that must be present on the system and cannot be replaced in a strictly conforming program.

The function fread(3S), for example, is an ANSI C library function, whereas the system function read(2) is not. A conforming ANSI C program must be able to redefine read(2), and still use fread(3S) in a predictable way.

The problem here is that read(2) underlies the fread(3S) implementation in the standard C library, and so it would seem that a program that redefines read(2) might confuse the fread(3S) implementation. To guard against this, ANSI C states that an implementation cannot use a name that is not reserved to it, and by using the pragma directive shown below:


#pragma weak read = _read

you can define just such a reserved name, and from it generate an alias for the function read(2). Thus, you can quite freely define your own read() function without compromising the fread(3S) implementation, which in turn is implemented to use the _read() function.

The link-editor will not complain of your redefinition of read(), either when linking against the shared object or archive version of the standard C library. In the former case, interposition will take its course, whereas in the latter case, the fact that the C library's definition of read(2) is weak allows it to be quietly overridden.

By using the link-editor's -m option, a list of all interposed symbol references, along with section load address information, is written to the standard output.

Complex Resolutions

Complex resolutions occur when two symbols of the same name are found with differing attributes. In these cases the link-editor will select the most appropriate symbol and will generate a warning message indicating the symbol, the attributes that conflict, and the identity of the file from which the symbol definition is taken. For example:


$ cat foo.c
int array[1];

$ cat bar.c
int array[2] = { 1, 2 };

$ cc -dn -r -o temp.o foo.c bar.c
ld: warning: symbol `array' has differing sizes:
        (file foo.o value=0x4; file bar.o value=0x8);
        bar.o definition taken

Here, two files with a definition of the data item array have different size requirements. A similar diagnostic is produced if the symbols' alignment requirements differed. In both of these cases the diagnostic can be suppressed by using the link-editor's -t option.

Another form of attribute difference is the symbol's type. For example:


$ cat foo.c
bar()
{
        return (0);
}
$ cc -o libfoo.so -G -K pic foo.c
$ cat main.c
int     bar = 1;

main()
{
        return (bar);
}
$ cc -o main main.c -L. -lfoo
ld: warning: symbol `bar' has differing types:
        (file main.o type=OBJT; file ./libfoo.so type=FUNC);
        main.o definition taken

Here the symbol bar has been defined as both a data item and a function.


Note -

types in this context are the symbol types that can be expressed in ELF. They are not related to the data types as employed by the programming language except in the crudest fashion.


In cases like this, the relocatable object definition will be taken when the resolution occurs between a relocatable object and a shared object, or, the first definition will be taken when the resolution occurs between two shared objects. When such resolutions occur between symbols of different bindings (weak or global), a warning will also be produced.

Inconsistencies between symbol types are not suppressed by the link-editor's -t option.

Fatal Resolutions

Symbol conflicts that cannot be resolved result in a fatal error condition. In this case an appropriate error message is provided indicating the symbol name together with the names of the files that provided the symbols, and no output file will be generated. Although the fatal condition is sufficient to terminate the link-edit, all input file processing will first be completed. In this manner all fatal resolution errors can be identified.

The most common fatal error condition exists when two relocatable objects both define symbols of the same name, and neither symbol is a weak definition:


$ cat foo.c
int bar = 1;

$ cat bar.c
bar()
{
        return (0);
}

$ cc -dn -r -o temp.o foo.c bar.c
ld: fatal: symbol `bar' is multiply defined:
        (file foo.o and file bar.o);
ld: fatal: File processing errors. No output written to int.o

Here foo.c and bar.c have conflicting definitions for the symbol bar. Since the link-editor cannot determine which should dominate, it will usually give up. However, the link-editor's -z muldefs option can be used to suppress this error condition, and allows the first symbol definition to be taken.

Undefined Symbols

After all input files have been read and all symbol resolution is complete, the link-editor will search the internal symbol table for any symbol references that have not been bound to symbol definitions. These symbol references are referred to as undefined symbols. The effect of these undefined symbols on the link-edit process can vary according to the type of output file being generated, and possibly the type of symbol.

Generating an Executable

When the link-editor is generating an executable output file, the link-editor's default behavior is to terminate the link-edit with an appropriate error message should any symbols remain undefined. A symbol remains undefined when a symbol reference in a relocatable object is never matched to a symbol definition:


$ cat main.c
extern int foo();
main()
{
        return (foo());
}

$ cc -o prog main.c
Undefined           first referenced
 symbol                 in file
foo                     main.o
ld: fatal: Symbol referencing errors. No output written to prog

In a similar manner, a symbol reference within a shared object that is never matched to a symbol definition when the shared object is being used to build a dynamic executable, will also result in an undefined symbol:


$ cat foo.c
extern int bar;
foo()
{
        return (bar);
}

$ cc -o libfoo.so -G -K pic foo.c
$ cc -o prog main.c -L. -lfoo
Undefined           first referenced
 symbol                 in file
bar                     ./libfoo.so
ld: fatal: Symbol referencing errors. No output written to prog

If you wish to allow undefined symbols, as in cases like the previous example, then the default fatal error condition can be suppressed by using the link-editor's -z nodefs option.


Note -

Care should be taken when using the -z nodefs option. If an unavailable symbol reference is required during the execution of a process, a fatal runtime relocation error will occur. Although this error can be detected during the initial execution and testing of an application, more complex execution paths can result in this error condition taking much longer to detect, which can be time consuming and costly.


Symbols can also remain undefined when a symbol reference in a relocatable object is bound to a symbol definition in an implicitly defined shared object. For example, continuing with the files main.c and foo.c used in the previous example:


$ cat bar.c
int bar = 1;

$ cc -o libbar.so -R. -G -K pic bar.c -L. -lfoo
$ ldd libbar.so
        libfoo.so =>     ./libfoo.so

$ cc -o prog main.c -L. -lbar
Undefined           first referenced
 symbol                 in file
foo                     main.o  (symbol belongs to implicit \
                        dependency ./libfoo.so)
ld: fatal: Symbol referencing errors. No output written to prog

Here prog is being built with an explicit reference to libbar.so, and because libbar.so has a dependency on libfoo.so, an implicit reference to libfoo.so from prog is established.

Because main.c made a specific reference to the interface provided by libfoo.so, then prog really has a dependency on libfoo.so. However, only explicit shared object dependencies are recorded in the output file being generated. Thus, prog will fail to run if a new version of libbar.so is developed that no longer has a dependency on libfoo.so.

For this reason, bindings of this type are deemed fatal, and the implicit reference must be made explicit by referencing the library directly during the link-edit of prog (the required reference is hinted at in the fatal error message shown in this example).

Generating a Shared Object

When the link-editor is generating a shared object, it will by default allow undefined symbols to remain at the end of the link-edit. This allows the shared object to import symbols from either relocatable objects or other shared objects when it is used to build a dynamic executable. The link-editor's -z defs option can be used to force a fatal error if any undefined symbols remain.

In general a self contained shared object, in which all references to external symbols are satisfied by named dependencies, provides maximum flexibility. In this case the shared object can be employed by many users, without those users having to determine and establish dependencies to satisfy the shared objects requirements.

Weak Symbols

Weak symbol references that are not bound during a link-edit will not result in a fatal error condition, no matter what output file type is being generated.

If a static executable is being generated, the symbol will be converted to an absolute symbol and assigned a value of zero.

If a dynamic executable or shared object is being produced, the symbol will be left as an undefined weak reference. During process execution, the runtime linker will search for this symbol, and if it does not find a match, will bind the reference to an address of zero instead of generating a fatal runtime relocation error.

Historically, these undefined weak referenced symbols have been employed as a mechanism for testing for the existence of functionality. For example, the following C code fragment may have been used in the shared object libfoo.so.1:


#pragma weak    foo
 
extern  void    foo(char *);
 
void
bar(char * path)
{
        void (* fptr)();
 
        if ((fptr = foo) != 0)
                (* fptr)(path);
}

When an application is built that references libfoo.so.1, the link-edit will successfully complete regardless of whether a definition for the symbol foo is found. If, during execution of the application the function address tests nonzero, the function will be called. However, if the symbol definition is not found, the function address will test zero, and so will not be called.

However, compilation systems view this address comparison technique as having undefined semantics, which can result in the test statement being removed under optimization. In addition, the runtime symbol binding mechanism places other restrictions on the use of this technique which prevents a consistent model from being available for all dynamic objects.

Users are discouraged from using undefined weak references in this manner, and instead are encouraged to use dlsym(3X) with the RTLD_DEFAULT flag as a means of testing for a symbols existence (see "Testing for Functionality").

Tentative Symbol Order Within the Output File

Contributions from input files usually appear in the output file in the order of their contribution. An exception occurs when processing tentative symbols and their associated storage. These symbols are not fully defined until their resolution is complete. If the resolution occurs as a result of encountering a defined symbol from a relocatable object, then the order of appearance will be that which would have occurred for the definition.

If it is desirable to control the ordering of a group of symbols, then any tentative definition should be redefined to a zero-initialized data item. For example, the following tentative definitions result in a reordering of the data items within the output file compared to the original order described in the source file foo.c:


$ cat foo.c
char A_array[0x10];
char B_array[0x20];
char C_array[0x30];
$ cc -o prog main.c foo.c
$ nm -vx prog | grep array
[32]    |0x00020754|0x00000010|OBJT |GLOB |0x0  |15  |A_array
[34]    |0x00020764|0x00000030|OBJT |GLOB |0x0  |15  |C_array
[42]    |0x00020794|0x00000020|OBJT |GLOB |0x0  |15  |B_array

By defining these symbols as initialized data items, the relative ordering of these symbols within the input file is carried over to the output file:


$ cat foo.c
char A_array[0x10] = { 0 };
char B_array[0x20] = { 0 };
char C_array[0x30] = { 0 };
$ cc -o prog main.c foo.c
$ nm -vx prog | grep array
[32]    |0x000206bc|0x00000010|OBJT |GLOB |0x0  |12  |A_array
[42]    |0x000206cc|0x00000020|OBJT |GLOB |0x0  |12  |B_array
[34]    |0x000206ec|0x00000030|OBJT |GLOB |0x0  |12  |C_array

Defining Additional Symbols

Besides the symbols provided from any input files, you can also supply additional symbol references or definitions to a link-edit. In the simplest form, symbol references can be generated using the link-editor's -u option. Greater flexibility is provided with the link-editor's -M option and an associated mapfile which allows you to define symbol references and a variety of symbol definitions.

The -u option provides a mechanism for generating a symbol reference from the link-edit command line. This option can be used to perform a link-edit entirely from archives, or to provide additional flexibility in selecting the objects to extract from multiple archives (see section "Archive Processing" for an overview of archive extraction).

For example, let's take the generation of a dynamic executable from the relocatable object main.o which makes reference to the symbols foo and bar. You wish to obtain the symbol definition foo from the relocatable object foo.o contained in lib1.a, and the symbol definition bar from the relocatable object bar.o contained in lib2.a.

However, the archive lib1.a also contains a relocatable object defining the symbol bar (presumably of differing functionality to that provided in lib2.a). To specify the required archive extraction, the following link-edit can be used:


$ cc -o prog -L. -u foo -l1 main.o -l2

Here, the -u option generates a reference to the symbol foo. This reference will cause extraction of the relocatable object foo.o from the archive lib1.a. As the first reference to the symbol bar occurs in main.o, which is encountered after lib1.a has been processed, the relocatable object bar.o will be obtained from the archive lib2.a.


Note -

This simple example assumes that the relocatable object foo.o from lib1.a does not directly, or indirectly, reference the symbol bar. If it did then the relocatable object bar.o will also be extracted from lib1.a during its processing (see section "Archive Processing" for a discussion of the link-editor's multi-pass processing of an archive).


A more extensive set of symbol definitions can be provided using the link-editor's -M option and an associated mapfile. The syntax for these mapfile entries is:


[ name ] {
     scope:
            symbol [ = [ type ] [ value ] [ size ] ];
} [ dependency ];
name

A label for this set of symbol definitions, and if present, identifies a version definition within the image. See Chapter 5, Versioning for more details.

scope

Indicates the visibility of the symbols' binding within the output file being generated. This can have either the value local or global. All symbols defined with a mapfile are treated as global in scope during the link-edit process. That is, they will be resolved against any other symbols of the same name obtained from any of the input files. However, symbols defined as local scope will be reduced to symbols with a local binding within any executable or shared object file being generated.

symbol

The name of the symbol required. If the name is not followed by any symbol attributes then the result will be the creation of a symbol reference. This reference is exactly the same as would be generated using the -u option discussed earlier in this section. If the symbol name is followed by an optional "=" character then a symbol definition will be generated using the associated attributes.

When in local scope, this symbol name can be defined as the special auto-reduction directive "*". This directive results in all global symbols, not explicitly defined to be global in the mapfile, being given a local binding within any executable or shared object file being generated.

type

Indicates the symbols' type attribute and can be either data, function or common. The former two type attributes result in anabsolute symbol definition (see "Symbol Table"). The latter type attribute results in a tentative symbol definition.

value

Indicates the symbols' value attribute and takes the form of Vnumber.

size

Indicates the symbols' size attribute and takes the form of Snumber.

dependency

Represents a version definition that will be inherited by this definition. See Chapter 5, Versioning for more details.

If either a version definition or the auto-reduction directive is specified, then versioning information is recorded in the image created. If this image is an executable or shared object, then any symbol reduction is also applied.

If the image being created is a relocatable object, then by default no symbol reduction is applied. In this case, any symbol reductions are recorded as part of the versioning information, and these reductions will be applied when the relocatable object is finally used to generate an executable or shared object. The link-editor's -B reduce option can be used to force symbol reduction when generating a relocatable objecr.

A more detailed description of the versioning information is provided in Chapter 5, Versioning.


Note -

To insure interface definition stability, no wildcard expansion is provided for defining symbol names.


The remainder of this section presents several examples of using this mapfile syntax.

The following example shows how three symbol references can be defined and used to extract members of an archive. Although this archive extraction can be achieved by specifying multiple -u options to the link-edit, this example also shows how the eventual scope of a symbol can be reduced to local:


$ cat foo.c
foo()
{
        (void) printf("foo: called from lib.a\n");
}
$ cat bar.c
bar()
{
        (void) printf("bar: called from lib.a\n");
}
$ cat main.c
extern  void    foo(), bar();

main()
{
        foo();
        bar();
}
$ ar -rc lib.a foo.o bar.o main.o
$ cat mapfile
{
        local:
                foo;
                bar;
        global:
                main;
};
$ cc -o prog -M mapfile lib.a
$ prog
foo: called from lib.a
bar: called from lib.a
$ nm -x prog | egrep "main$|foo$|bar$"
[28]    |0x00010604|0x00000024|FUNC |LOCL |0x0  |7      |foo
[30]    |0x00010628|0x00000024|FUNC |LOCL |0x0  |7      |bar
[49]    |0x0001064c|0x00000024|FUNC |GLOB |0x0  |7      |main

The significance of reducing symbol scope from global to local is covered in more detail in the section "Reducing Symbol Scope".

The following example shows how two absolute symbol definitions can be defined and used to resolve the references from the input file main.c:


$ cat main.c
extern  int     foo();
extern  int     bar;

main()
{
        (void) printf("&foo = %x\n", &foo);
        (void) printf("&bar = %x\n", &bar);
}
$ cat mapfile
{
        global:
                foo = FUNCTION V0x400;
                bar = DATA V0x800;
};
$ cc -o prog -M mapfile main.c
$ prog
&foo = 400
&bar = 800
$ nm -x prog | egrep "foo$|bar$"
[37]    |0x00000800|0x00000000|OBJT |GLOB |0x0  |ABS    |bar
[42]    |0x00000400|0x00000000|FUNC |GLOB |0x0  |ABS    |foo

When obtained from an input file, symbol definitions for functions or data items are usually associated with elements of data storage. As a mapfile definition is insufficient to be able to construct this data storage, these symbols must remain as absolute values.

However, a mapfile can also be used to define a common, or tentative, symbol. Unlike other types of symbol definition, tentative symbols do not occupy storage within a file, but define storage that must be allocated at runtime. Therefore, symbol definitions of this kind can contribute to the storage allocation of the output file being generated.

A feature of tentative symbols, that differs from other symbol types, is that their value attribute indicates their alignment requirement. A mapfile definition can therefore be used to realign tentative definitions obtained from the input files of a link-edit.

The following example shows the definition of two tentative symbols. The symbol foo defines a new storage region whereas the symbol bar is actually used to change the alignment of the same tentative definition within the file main.c:


$ cat main.c
extern  int     foo;
int             bar[0x10];

main()
{
        (void) printf("&foo = %x\n", &foo);
        (void) printf("&bar = %x\n", &bar);
}
$ cat mapfile
{
        global:
                foo = COMMON V0x4 S0x200;
                bar = COMMON V0x100 S0x40;
};
$ cc -o prog -M mapfile main.c
ld: warning: symbol `bar' has differing alignments:
        (file mapfile value=0x100; file main.o value=0x4);
        largest value applied
$ prog
&foo = 20940
&bar = 20900
$ nm -x prog | egrep "foo$|bar$"
[37]    |0x00020900|0x00000040|OBJT |GLOB |0x0  |16     |bar
[42]    |0x00020940|0x00000200|OBJT |GLOB |0x0  |16     |foo

Note -

This symbol resolution diagnostic can be suppressed by using the link-editor's -t option.


Reducing Symbol Scope

In the previous section it was shown how symbol definitions defined to have local scope within a mapfile can be used to reduce the symbol's eventual binding. This mechanism can play an important role in reducing the symbol's visibility to future link-edits that use the generated file as part of their input. In fact, this mechanism can provide for the precise definition of a file's interface, and so restrict the functionality made available to others.

For example, let's take the generation of a simple shared object from the files foo.c and bar.c. The file foo.c contains the global symbol foo which provides the service that you wish to make available to others. The file bar.c contains the symbols bar and str which provide the underlying implementation of the shared object. A simple build of the shared object will usually result in all three of these symbols having global scope:


$ cat foo.c
extern  const char *    bar();

const char * foo()
{
        return (bar());
}
$ cat bar.c
const char * str = "returned from bar.c";

const char * bar()
{
        return (str);
}
$ cc -o lib.so.1 -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[29]    |0x000104d0|0x00000004|OBJT |GLOB |0x0  |12     |str
[32]    |0x00000418|0x00000028|FUNC |GLOB |0x0  |6      |bar
[33]    |0x000003f0|0x00000028|FUNC |GLOB |0x0  |6      |foo

You can now use the functionality offered by this shared object as part of the link-edit of another application. References to the symbol foo will be bound to the implementation provided by the shared object.

However, because of their global binding, direct reference to the symbols bar and str is also possible. This can have dangerous consequences, as the you might later change the implementation underlying the function foo, and in so doing unintentionally cause an existing application that had bound to bar or str fail or misbehave.

Another consequence of the global binding of the symbols bar and str is that they can be interposed upon by symbols of the same name (the interposition of symbols within shared objects is covered in section "Simple Resolutions"). This interposition can be intentional and be used as a means of circumventing the intended functionality offered by the shared object. On the other hand, this interposition can be unintentional, being the result of the application and the shared object using the same common symbol name.

When developing the shared object you can protect against this type of scenario by reducing the scope of the symbols bar and str to a local binding, for example:


$ cat mapfile
{
        local:
                bar;
                str;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[27]    |0x000003dc|0x00000028|FUNC |LOCL |0x0  |6      |bar
[28]    |0x00010494|0x00000004|OBJT |LOCL |0x0  |12     |str
[33]    |0x000003b4|0x00000028|FUNC |GLOB |0x0  |6      |foo

Here the symbols bar and str are no longer available as part of the shared objects interface. Thus these symbols cannot be referenced, or interposed upon, by an external object. You have effectively defined an interface for the shared object. This interface can be managed while hiding the details of the underlying implementation.

This symbol scope reduction has an additional performance advantage. The symbolic relocations against the symbols bar and str that would have been necessary at runtime are now reduced to relative relocations. This reduces the runtime overhead of initializing and processing the shared object (see section "When Relocations are Performed" for details of symbolic relocation overhead).

As the number of symbols processed during a link-edit gets large, the ability to define each local scope reduction within a mapfile becomes harder to maintain. An alternative, and more flexible mechanism, allows you to define the shared objects interface in terms of the global symbols that should

be maintained, and instructs the link-editor to reduce all other symbols to local binding. This mechanism is achieved using the special auto-reduction directive "*". For example, the previous mapfile definition can be rewritten to define foo as the only global symbol required in the output file generated:


$ cat mapfile
lib.so.1.1 {
        global:
                foo;
        local:
                *;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[30]    |0x00000370|0x00000028|FUNC |LOCL |0x0  |6      |bar
[31]    |0x00010428|0x00000004|OBJT |LOCL |0x0  |12     |str
[35]    |0x00000348|0x00000028|FUNC |GLOB |0x0  |6      |foo

This example also defines a version name, lib.so.1.1, as part of the mapfile directive. This version name establishes an internal version definition that defines the files symbolic interface. The creation of a version definition is recommended, and forms the foundation of an internal versioning mechanism that can be used throughout the evolution of the file. See Chapter 5, Versioning for more details on this topic.


Note -

If a version name is not supplied the output filename will be used to label the version definition. The versioning information created within the output file can be suppressed using the link-editors' -z noversion option.


Whenever a version name is specified, all global symbols must be assigned to a version definition. If any global symbols remain unassigned to a version definition the link-editor will generate a fatal error condition:


$ cat mapfile
lib.so.1.1 {
        global:
                foo;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
Undefined           first referenced
 symbol                 in file
str                     bar.o  (symbol has no version assigned)
bar                     bar.o  (symbol has no version assigned)
ld: fatal: Symbol referencing errors. No output written \
to lib.so.1

When generating an executable or shared object, any symbol reduction results in the recording of version definitions within the output image, together with the reduction of the appropriate symbols. By default, when generating a relocatable object, the version definitions are created, but the symbol reductions are not processed. The result is that the symbol entries for any symbol reductions will still remain global. For example, using the previous mapfile and associated relocatable objects, an intermediate relocatable object is created which shows no symbol reduction:


$ ld -o lib.o -M mapfile -r foo.o bar.o
$ nm -x lib.o | egrep "foo$|bar$|str$"
[17]    |0x00000000|0x00000004|OBJT |GLOB |0x0  |3      |str
[19]    |0x00000028|0x00000028|FUNC |GLOB |0x0  |1      |bar
[20]    |0x00000000|0x00000028|FUNC |GLOB |0x0  |1      |foo

However, the version definitions created within this image record the fact that symbol reductions are required. When the relocatable object is eventually used to generate an executable or shared object, the symbol reductions will occur. In other words, the link-editor reads and interprets symbol reduction information contained in relocatable objects in the same manner as it can process the data from a mapfile.

Thus, the intermediate relocatable object produced in the previous example can now be used to generate a shared object:


$ cc -o lib.so.1 -G lib.o
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[22]    |0x000104a4|0x00000004|OBJT |LOCL |0x0  |14     |str
[24]    |0x000003dc|0x00000028|FUNC |LOCL |0x0  |8      |bar
[36]    |0x000003b4|0x00000028|FUNC |GLOB |0x0  |8      |foo

Symbol reductions can be forced to occur when building a relocatable object by using the link-editor's -B reduce option:


$ ld -o lib.o -M mapfile -B reduce -r foo.o bar.o
$ nm -x lib.o | egrep "foo$|bar$|str$"
[15]    |0x00000000|0x00000004|OBJT |LOCL |0x0  |3      |str
[16]    |0x00000028|0x00000028|FUNC |LOCL |0x0  |1      |bar
[20]    |0x00000000|0x00000028|FUNC |GLOB |0x0  |1      |foo