Stub Objects

A stub object is a shared object, built entirely from mapfiles, that supplies the same linking interface as the real object, while containing no code or data. Stub objects cannot be used at runtime. However, an application can be built against a stub object, where the stub object provides the real object name to be used at runtime.

When building a stub object, the link-editor ignores any object or library files specified on the command line, and these files need not exist in order to build a stub. Since the compilation step can be omitted, and because the link-editor has relatively little work to do, stub objects can be built very quickly.

Stub objects can be used to solve a variety of build problems.

Speed

Modern machines, using a version of the make utility with the ability to parallelize operations, are capable of compiling and linking many objects simultaneously, and doing so offers significant speedups. However, it is typical that a given object will depend on other objects, and that there will be a core set of objects that nearly everything else depends on. It is necessary to order the builds so that all objects are built ahead of their use by other objects. This ordering creates bottlenecks that reduce the amount of parallelization that is possible and limits the overall speed at which the code can be built.

Complexity/Correctness

In a large body of code, there can be a large number of dependencies between the various objects. The makefiles or other build descriptions for these objects can become very complex and difficult to understand or maintain. The dependencies can change as the system evolves. This can cause a given set of makefiles to become slightly incorrect over time, leading to race conditions and mysterious rare build failures.

Dependency Cycles

It might be desirable to organize code as cooperating shared objects, each of which draws on the resources provided by the other. Such cycles cannot be supported in an environment where objects must be built before the objects that use them, even though the runtime linker is fully capable of loading and using such objects if they could be built.

Stub shared objects offer an alternative method for building code that sidesteps the above issues. Stub objects can be quickly built for all the shared objects produced by the build. Then, all the real dynamic objects can be built in parallel, in any order, using the stub objects to stand in for the real objects at link-time. Afterwards, the real dynamic objects are kept, and the stub shared objects are discarded.
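
The effect on build parallelism can be illustrated with a small model. The following sketch is not part of any build tool. It assumes a hypothetical set of four objects (libcore, libutil, libapp, and prog) with an acyclic dependency table, and counts the sequential link phases needed when each object must wait for its dependencies, compared with the two phases needed when all stubs are built first.

```c
#include <assert.h>

#define	NOBJ	4

/*
 * Hypothetical dependency table, for illustration only.
 * dep[i][j] != 0 means that object i links against object j.
 */
static const int dep[NOBJ][NOBJ] = {
	/* libcore */	{ 0, 0, 0, 0 },
	/* libutil */	{ 1, 0, 0, 0 },
	/* libapp  */	{ 1, 1, 0, 0 },
	/* prog    */	{ 1, 1, 1, 0 },
};

/*
 * Length of the longest dependency chain that ends at obj.
 * The table is assumed to be acyclic; a cycle would recurse forever,
 * which mirrors the fact that no valid build order exists for it.
 */
static int
chain_length(int obj)
{
	int j, len, longest = 0;

	assert(obj >= 0 && obj < NOBJ);
	for (j = 0; j < NOBJ; j++) {
		if (dep[obj][j] != 0) {
			len = chain_length(j);
			if (len > longest)
				longest = len;
		}
	}
	return (longest + 1);
}

/* Sequential phases when every dependency must be fully built first. */
int
phases_without_stubs(void)
{
	int i, len, phases = 0;

	for (i = 0; i < NOBJ; i++) {
		len = chain_length(i);
		if (len > phases)
			phases = len;
	}
	return (phases);
}

/* With stubs: one phase for all stubs, one for all real objects. */
int
phases_with_stubs(void)
{
	return (2);
}
```

For this table, phases_without_stubs() returns 4, because prog sits at the end of a four-object chain, while phases_with_stubs() returns 2 regardless of how deep the chain becomes.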

Stub objects are built from one or more mapfiles, which must collectively satisfy the following requirements.

At least one mapfile must specify the STUB_OBJECT directive. See STUB_OBJECT Directive.
All function and data symbols that make up the external interface to the object must be explicitly listed in the mapfile.
The mapfile must use symbol scope reduction ('*'), to remove any symbols not explicitly listed from the external interface. See SYMBOL_SCOPE and SYMBOL_VERSION Directives.
All global data exported from the object must have an ASSERT symbol attribute in the mapfile to specify the symbol type and size. In the case where there are multiple symbols that reference the same data, the ASSERT for one of these symbols must specify the TYPE and SIZE attributes, while the others must use the ALIAS attribute to reference this primary symbol. See ASSERT Attribute.

Given such a mapfile, the stub and real versions of the shared object can be built using the same command line for each. The -z stub option is added to the link-edit of the stub object, and is omitted from the link-edit of the real object.

To demonstrate these ideas, the following code implements a shared object named idx5, which exports data from a 5 element array of integers. Each element is initialized to contain its zero-based array index. This data is made available as a global array, as an alternative alias data symbol with weak binding, and through a functional interface.

$ cat idx5.c
int _idx5[5] = { 0, 1, 2, 3, 4 };
#pragma weak idx5 = _idx5

int
idx5_func(int index)
{
        if ((index < 0) || (index > 4))
                return (-1);
        return (_idx5[index]);
}

A mapfile is required to describe the interface provided by this shared object.

$ cat mapfile
$mapfile_version 2
STUB_OBJECT;
SYMBOL_SCOPE {
        _idx5 {
                ASSERT { TYPE=data; SIZE=4[5] };
        };
        idx5 {
                ASSERT { BINDING=weak; ALIAS=_idx5 };
        };
        idx5_func;
    local:
        *;
};
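
The SIZE=4[5] value in the ASSERT attribute describes five elements of four bytes each, and is expected to match the storage that idx5.c actually defines. As a sketch, assuming a C11 compiler and a 4-byte int, a compile-time check along the following lines can keep the source and the mapfile in agreement. The IDX5_ELEM_SIZE and IDX5_ELEM_COUNT macros and the idx5_mapfile_size() function are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as in idx5.c. */
int _idx5[5] = { 0, 1, 2, 3, 4 };

/* Hypothetical constants mirroring SIZE=4[5] from the mapfile. */
#define	IDX5_ELEM_SIZE	4
#define	IDX5_ELEM_COUNT	5

/* Fail the compilation if the data no longer matches the mapfile. */
static_assert(sizeof (_idx5[0]) == IDX5_ELEM_SIZE,
    "element size differs from mapfile ASSERT");
static_assert(sizeof (_idx5) == IDX5_ELEM_SIZE * IDX5_ELEM_COUNT,
    "total size differs from mapfile ASSERT");

/* Total size, in bytes, that the mapfile ASSERT describes. */
size_t
idx5_mapfile_size(void)
{
	return ((size_t)IDX5_ELEM_SIZE * IDX5_ELEM_COUNT);
}
```

On a platform where int is not four bytes wide, the first assertion fails at compile time, which signals that the SIZE value in the mapfile would also need to change.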

The following main program is used to print all the index values available from the idx5 shared object.

$ cat main.c
#include <stdio.h>

extern int _idx5[5], idx5[5], idx5_func(int);

int
main(int argc, char **argv)
{
        int i;
        for (i = 0; i < 5; i++)
                (void) printf("[%d] %d %d %d\n",
                    i, _idx5[i], idx5[i], idx5_func(i));
        return (0);
}

The following commands create a stub version of this shared object in a subdirectory named stublib. The elfdump command is used to verify that the resulting object is a stub. The command used to build the stub differs from that of the real object only in the addition of the -z stub option, and the use of a different output file name. This demonstrates the ease with which stub generation can be added to existing code.

$ cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -z stub
$ ln -s libidx5.so.1 stublib/libidx5.so
$ elfdump -d stublib/libidx5.so | grep STUB
    [11]  FLAGS_1           0x4000000           [ STUB ]
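
The STUB indication is a single bit within the FLAGS_1 value. As a minimal sketch, the following function tests for that bit using the 0x4000000 value taken from the elfdump output above. The FLAGS1_STUB macro and the flags1_is_stub() function are hypothetical names for illustration. Real code would normally use the flag definition from the system ELF headers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Bit value copied from the elfdump output; the name is hypothetical. */
#define	FLAGS1_STUB	UINT64_C(0x4000000)

/* The value is expected to be exactly one bit (bit 26). */
static_assert(FLAGS1_STUB == (UINT64_C(1) << 26),
    "STUB flag is expected to be a single bit");

/* Return nonzero if the given FLAGS_1 value marks a stub object. */
int
flags1_is_stub(uint64_t flags1)
{
	return ((flags1 & FLAGS1_STUB) != 0);
}
```

Passing the FLAGS_1 value shown above, 0x4000000, yields a nonzero result. A value with that bit clear, such as 0, yields 0.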

The main program can now be built, using the stub object to stand in for the real shared object, and setting a runpath that will find the real object at runtime. However, as the real object has not been built, this program cannot yet be run. Attempts to cause the system to load the stub object are rejected, as the runtime linker knows that stub objects lack the actual code and data found in the real object, and cannot execute.

$ cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc
$ ./a.out
ld.so.1: a.out: fatal: libidx5.so.1: open failed: No such file or directory
Killed
$ LD_PRELOAD=stublib/libidx5.so.1  ./a.out
ld.so.1: a.out: fatal: stublib/libidx5.so.1: stub shared object \
    cannot be used at runtime
Killed

The real object is built using the same command used to build the stub object. The -z stub option is omitted, and the path for the real output file is specified.

$ cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o lib/libidx5.so.1

Once the real object has been built in the lib subdirectory, the program can be run.

$ ./a.out
[0] 0 0 0
[1] 1 1 1
[2] 2 2 2
[3] 3 3 3
[4] 4 4 4

Using Stub Objects to Hide Obsolete Interfaces

Libraries evolve, and sometimes the original functionality proves to be undesirable. It is common for new abilities to be added, and for older ones to be considered obsolete. When backward compatibility is a concern, it is necessary to maintain such older functionality in the library for the benefit of existing objects. However, you may wish to prevent new use of these features. Stub objects can be used to enforce this policy. The mapfile STUB_ELIMINATE flag can be used to mark functions or data from an object that should be eliminated from the stub object, while remaining in the real object. This prevents new code, which links to the stub object, from using these obsolete items, and encourages code to be rewritten to use the preferred interfaces. Since the real objects still contain these items, existing objects are able to use them.

The libidx5 example from the previous section illustrates this. That library demonstrates how to export global data from an object. However, exported global data introduces complexity to dynamic linking, and is best avoided. It is usually a better design to provide a function to access such data, such as the idx5_func() function provided by libidx5. Continuing that example, STUB_ELIMINATE can be used to make the global data unavailable to new code that links to the stub, while providing those old interfaces in the real object for the benefit of existing programs.

The mapfile is rewritten to apply STUB_ELIMINATE to the two global data symbols. A benefit of applying STUB_ELIMINATE to global data is that an ASSERT attribute is no longer needed to specify the data size. In this example, the ASSERT lines are commented out. A real mapfile might omit them entirely.

$ cat better_mapfile
$mapfile_version 2
STUB_OBJECT;
SYMBOL_SCOPE {
        _idx5 {
                FLAGS=STUB_ELIMINATE;
                #ASSERT { TYPE=data; SIZE=4[5] };
        };
        idx5 {
                FLAGS=STUB_ELIMINATE;
                #ASSERT { BINDING=weak; ALIAS=_idx5 };
        };
        idx5_func;
    local:
        *;
};

A new version of the test program only uses the functional interface.

$ cat better_main.c
#include <stdio.h>

extern int idx5_func(int);

int
main(int argc, char **argv)
{
        int i;
        for (i = 0; i < 5; i++)
                (void) printf("[%d] %d\n", i, idx5_func(i));
        return (0);
}

The old test program is saved, the stub object is rebuilt using the new mapfile, and the test program is rebuilt, linking against the new stub object that employs STUB_ELIMINATE:

$ cp a.out original_a.out
$ cc -Kpic -G -M better_mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -z stub
$ cc better_main.c -o better_a.out -L stublib -R '$ORIGIN/lib' -lidx5 -lc
$ ./better_a.out
[0] 0
[1] 1
[2] 2
[3] 3
[4] 4

The original test program can no longer be built, because the stub library lacks the necessary global data symbols. However, the preexisting binary that used them continues to function because the real library still provides the global data symbols.

$ cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc
Undefined                       first referenced
 symbol                             in file
idx5                                main.o
_idx5                               main.o
ld: fatal: symbol referencing errors
$ ./original_a.out
[0] 0 0 0
[1] 1 1 1
[2] 2 2 2
[3] 3 3 3
[4] 4 4 4