Linker and Libraries Guide

Reducing Symbol Scope

The previous section showed how symbol definitions that are given local scope within a mapfile can be used to reduce the symbol's eventual binding. This mechanism plays an important role in reducing the symbol's visibility to future link-edits that use the generated file as part of their input. In fact, it lets you define a file's interface precisely, and so restrict the functionality made available to others.

For example, take the generation of a simple shared object from the files foo.c and bar.c. The file foo.c contains the global symbol foo, which provides the service that you want to make available to others. The file bar.c contains the symbols bar and str, which provide the underlying implementation of the shared object. The creation of a simple shared object usually results in all three of these symbols having global scope:


$ cat foo.c
extern  const char *    bar();

const char * foo()
{
         return (bar());
}
$ cat bar.c
const char * str = "returned from bar.c";

const char * bar()
{
         return (str);
}
$ cc -o lib.so.1 -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[29]    |0x000104d0|0x00000004|OBJT |GLOB |0x0  |12     |str
[32]    |0x00000418|0x00000028|FUNC |GLOB |0x0  |6      |bar
[33]    |0x000003f0|0x00000028|FUNC |GLOB |0x0  |6      |foo

You can now use the functionality offered by this shared object as part of the link-edit of another application. References to the symbol foo are bound to the implementation provided by the shared object.

However, because of their global binding, direct reference to the symbols bar and str is also possible. This can have dangerous consequences, because you might later change the implementation underlying the function foo and, in so doing, unintentionally cause an existing application that had bound to bar or str to fail or misbehave.

Another consequence of the global binding of the symbols bar and str is that they can be interposed upon by symbols of the same name (the interposition of symbols within shared objects is covered in section "Simple Resolutions"). This interposition can be intentional, used as a means of circumventing the intended functionality offered by the shared object. It can also be unintentional, the result of the same common symbol name being used by both the application and the shared object.

When developing the shared object you can protect against this type of scenario by reducing the scope of the symbols bar and str to a local binding; for example:


$ cat mapfile
{
         local:
                 bar;
                 str;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[27]    |0x000003dc|0x00000028|FUNC |LOCL |0x0  |6      |bar
[28]    |0x00010494|0x00000004|OBJT |LOCL |0x0  |12     |str
[33]    |0x000003b4|0x00000028|FUNC |GLOB |0x0  |6      |foo

Here the symbols bar and str are no longer available as part of the shared object's interface. Thus these symbols cannot be referenced, or interposed upon, by an external object. You have effectively defined an interface for the shared object. This interface can be managed while hiding the details of the underlying implementation.
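
One way to confirm the reduced interface from outside the shared object is to ask the runtime linker for each name. The following is a minimal sketch, assuming that dlopen() and dlsym() from <dlfcn.h> are available on the system and that the shared object can be reached by the path you pass in. The helper name symbol_is_exported is hypothetical and is not part of any library.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `name` can be found through a handle
 * on the object at `path`, 0 if it cannot, and -1 if the object itself
 * cannot be loaded.
 */
int
symbol_is_exported(const char *path, const char *name)
{
        void *handle;
        int found;

        assert(path != NULL && name != NULL);

        handle = dlopen(path, RTLD_LAZY);
        if (handle == NULL)
                return (-1);

        /* Symbols reduced to local binding are not visible to dlsym(). */
        found = (dlsym(handle, name) != NULL);
        (void) dlclose(handle);
        return (found);
}
```

For a shared object built with bar and str reduced to local binding, a call such as symbol_is_exported("./lib.so.1", "foo") would be expected to return 1, and the same call for bar or str to return 0. On some systems the calling program must also be linked with -ldl.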

This symbol scope reduction has an additional performance advantage. The symbolic relocations against the symbols bar and str that would have been necessary at runtime are now reduced to relative relocations. This reduces the runtime overhead of initializing and processing the shared object (see section "When Relocations are Performed" for details of symbolic relocation overhead).

As the number of symbols processed during a link-edit increases, defining each local scope reduction within a mapfile becomes harder to maintain. An alternative, more flexible mechanism allows you to define the shared object's interface in terms of the global symbols that should be maintained, and instructs the link-editor to reduce all other symbols to local binding. This mechanism uses the special auto-reduction directive "*". For example, the previous mapfile definition can be rewritten to define foo as the only global symbol required in the generated output file:


$ cat mapfile
lib.so.1.1
{
         global:
                 foo;
         local:
                 *;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[30]    |0x00000370|0x00000028|FUNC |LOCL |0x0  |6      |bar
[31]    |0x00010428|0x00000004|OBJT |LOCL |0x0  |12     |str
[35]    |0x00000348|0x00000028|FUNC |GLOB |0x0  |6      |foo

This example also defines a version name, lib.so.1.1, as part of the mapfile directive. This version name establishes an internal version definition that defines the file's symbolic interface. The creation of a version definition is recommended, and forms the foundation of an internal versioning mechanism that can be used throughout the evolution of the file. See Chapter 5, Versioning for more details on this topic.


Note -

If a version name is not supplied, the output filename is used to label the version definition. The versioning information created within the output file can be suppressed using the link-editor's -z noversion option.


Whenever a version name is specified, all global symbols must be assigned to a version definition. If any global symbols remain unassigned to a version definition, the link-editor generates a fatal error condition:


$ cat mapfile
lib.so.1.1 {
         global:
                 foo;
};
$ cc -o lib.so.1 -M mapfile -G foo.c bar.c
Undefined           first referenced
 symbol                 in file
str                     bar.o  (symbol has no version assigned)
bar                     bar.o  (symbol has no version assigned)
ld: fatal: Symbol referencing errors. No output written to lib.so.1
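
The mapfile forms shown in this section apply one decision per global symbol. As a reading aid only, that decision might be modeled in C as follows. The names mapfile_def, scope_result, and resolve_scope are hypothetical and introduced purely for illustration. The sketch summarizes the behavior described in this section and makes no claim about how the link-editor is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Possible outcomes for one global symbol, as described in this section. */
enum scope_result {
        SCOPE_GLOBAL,           /* symbol keeps its global binding */
        SCOPE_LOCAL,            /* symbol is reduced to local binding */
        SCOPE_UNVERSIONED       /* fatal: "symbol has no version assigned" */
};

/* Hypothetical in-memory picture of a single mapfile definition. */
struct mapfile_def {
        const char      *version;       /* version name, or NULL if none given */
        const char      **globals;      /* names listed under "global:" */
        size_t          nglobals;
        const char      **locals;       /* names listed under "local:" */
        size_t          nlocals;
        int             auto_reduce;    /* nonzero if "local:" contains "*" */
};

/* Returns 1 if sym appears in the first n entries of names, else 0. */
static int
listed(const char **names, size_t n, const char *sym)
{
        size_t i;

        for (i = 0; i < n; i++) {
                if (strcmp(names[i], sym) == 0)
                        return (1);
        }
        return (0);
}

/* Decide what happens to the global symbol sym under mapfile m. */
enum scope_result
resolve_scope(const struct mapfile_def *m, const char *sym)
{
        assert(m != NULL && sym != NULL);

        if (listed(m->globals, m->nglobals, sym))
                return (SCOPE_GLOBAL);
        if (listed(m->locals, m->nlocals, sym) || m->auto_reduce)
                return (SCOPE_LOCAL);
        /* A named version requires every global symbol to be assigned. */
        if (m->version != NULL)
                return (SCOPE_UNVERSIONED);
        return (SCOPE_GLOBAL);
}
```

Under this model, a definition named lib.so.1.1 that lists only foo as global and has no auto-reduction yields SCOPE_UNVERSIONED for bar and str, which corresponds to the fatal error above. Setting auto_reduce for the same definition yields SCOPE_LOCAL for both.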

When generating an executable or shared object, any symbol reduction results in the recording of version definitions within the output image, together with the reduction of the appropriate symbols. By default, when generating a relocatable object, the version definitions are created, but the symbol reductions are not processed. The result is that the symbol entries for any symbol reductions still remain global. For example, using the previous mapfile with the auto-reduction directive and associated relocatable objects, an intermediate relocatable object is created that shows no symbol reduction:


$ ld -o lib.o -M mapfile -r foo.o bar.o
$ nm -x lib.o | egrep "foo$|bar$|str$"
[17]    |0x00000000|0x00000004|OBJT |GLOB |0x0  |3      |str
[19]    |0x00000028|0x00000028|FUNC |GLOB |0x0  |1      |bar
[20]    |0x00000000|0x00000028|FUNC |GLOB |0x0  |1      |foo

However, the version definitions created within this image record the fact that symbol reductions are required. When the relocatable object is eventually used to generate an executable or shared object, the symbol reductions occur. In other words, the link-editor reads and interprets symbol reduction information contained in relocatable objects in the same manner as it processes the data from a mapfile.

Thus, the intermediate relocatable object produced in the previous example can now be used to generate a shared object:


$ cc -o lib.so.1 -M mapfile -G lib.o
$ nm -x lib.so.1 | egrep "foo$|bar$|str$"
[22]    |0x000104a4|0x00000004|OBJT |LOCL |0x0  |14     |str
[24]    |0x000003dc|0x00000028|FUNC |LOCL |0x0  |8      |bar
[36]    |0x000003b4|0x00000028|FUNC |GLOB |0x0  |8      |foo

Symbol reductions can be forced to occur when creating a relocatable object by using the link-editor's -B reduce option:


$ ld -o lib.o -M mapfile -B reduce -r foo.o bar.o
$ nm -x lib.o | egrep "foo$|bar$|str$"
[15]    |0x00000000|0x00000004|OBJT |LOCL |0x0  |3      |str
[16]    |0x00000028|0x00000028|FUNC |LOCL |0x0  |1      |bar
[20]    |0x00000000|0x00000028|FUNC |GLOB |0x0  |1      |foo