Linker and Libraries Guide

Generating the Output File

After input file processing and symbol resolution has completed with no fatal errors, the link-editor generates the output file. The link-editor first generates the additional sections necessary to complete the output file. These sections include the symbol tables, which contain local symbol definitions together with resolved global symbol and weak symbol information, from all the input files.

Also included are any output relocation and dynamic information sections required by the runtime linker. After all the output section information has been established, the total output file size is calculated. The output file image is then created accordingly.

When creating a dynamic executable or shared object, two symbol tables are usually generated. The .dynsym table and its associated string table .dynstr contain register, global, weak, and section symbols. These sections become part of the text segment that is mapped as part of the process image at runtime. See mmapobj(2). This mapping enables the runtime linker to read these sections to perform any necessary relocations.

The .symtab table, and its associated string table .strtab contain all the symbols collected from the input file processing. These sections are not mapped as part of the process image. These sections can even be stripped from the image by using the link-editor's -s option, or after the link-edit by using strip(1).

During the generation of the symbol tables, reserved symbols are created. These symbols have special meaning to the linking process. These symbols should not be defined in your code.

_etext: The first location after all read-only information, typically referred to as the text segment.
_edata: The first location after initialized data.
_end: The first location after all data.
_DYNAMIC: The address of the .dynamic information section.
_END_: The same as _end. The symbol has local scope and, together with the _START_ symbol, provides a simple means of establishing an object's address range.
_GLOBAL_OFFSET_TABLE_: The position-independent reference to a link-editor supplied table of addresses, the .got section. This table is constructed from position-independent data references that occur in objects that have been compiled with the -K pic option. See Position-Independent Code.
_PROCEDURE_LINKAGE_TABLE_: The position-independent reference to a link-editor supplied table of addresses, the .plt section. This table is constructed from position-independent function references that occur in objects that have been compiled with the -K pic option. See Position-Independent Code.
_START_: The first location within the text segment. The symbol has local scope and, together with the _END_ symbol, provides a simple means of establishing an object's address range.

When generating an executable, the link-editor looks for additional symbols to define the executable's entry point. If a symbol was specified using the link-editor's -e option, that symbol is used. Otherwise the link-editor looks for the reserved symbol names _start, and then main.

Identifying Capability Requirements

Capabilities identify the attributes of a system that are required to allow code to execute. The following capabilities, in their order of precedence, are available.

A platform capability - identifies a specific platform by name.
A machine capability - identifies a specific machine hardware by name.
Hardware capabilities - identify instruction set extensions and other hardware details with capabilities flags.
Software capabilities - reflect attributes of the software environment with capabilities flags.

Each of these capabilities can be defined individually, or combined to produce a capabilities group.

Code that can only be executed when certain capabilities are available should identify these requirements by means of a capabilities section within the associated ELF object. Recording capability requirements within an object allows the system to validate the object before attempting to execute the associated code. These requirements can also provide a framework where the system can select the most appropriate object from a family of objects. A family consists of variants of the same object, where each variant requires different capabilities.

Dynamic objects, as well as individual functions or initialized data items within an object, can be associated with capability requirements. Ideally, capability requirements are recorded in the relocatable objects that are produced by the compiler, and reflect the options or optimization that was specified at compile time. The link-editor combines the capabilities of any input relocatable objects to create a final capabilities section for the output file. See Capabilities Section.

In addition, capabilities can be defined when the link-editor creates an output file. These capabilities are identified using a mapfile and the link-editor's -M option. Capabilities that are defined by using a mapfile can augment, or override, the capabilities that are specified within any input relocatable objects. Mapfiles are usually used to augment compilers that do not generate the necessary capability information.

System capabilities are the capabilities that describe a running system. The platform name, and machine hardware name can be displayed with uname(1) using the -i option and -m option respectively. The system hardware capabilities can be displayed with isainfo(1) using the -v option. At runtime, the capability requirements of an object are compared against the system capabilities to determine whether the object can be loaded, or a symbol within the object can be used.

Object capabilities are capabilities that are associated with an object. These capabilities define the requirements of the entire object, and control whether the object can be loaded at runtime. If an object requires capabilities that can not be satisfied by the system, then the object can not be loaded at runtime. Capabilities can be used to provide more than one instance of a given object, each optimized for systems that match the objects requirements. The runtime linker can transparently select the best instance from such a family of object instances by comparing the objects capability requirements to the capabilities provided by the system.

Symbol capabilities are capabilities that are associated with individual functions, or initialized data items, within an object. These capabilities define the requirements of one or more symbols within an object, and control whether the symbol can be used at runtime. Symbol capabilities allow for the presence of multiple instances of a function within a single object. Each instance of the function can be optimized for a system with different capabilities. Symbol capabilities also allow for the presence of multiple instances of an initialized data item within an object. Each instance of the data can define system specific data. If a symbol instance requires capabilities that can not be satisfied by the system, then that symbol instance can not be used at runtime. Instead, an alternative instance of the same symbol name must be used. Symbol capabilities offer the ability to construct a single object that can be used on systems of varying abilities. A family of functions can provide optimized instances for systems that can support the capabilities, and more generic instances for other, less capable systems. A family of initialized data items can provide system specific data. The runtime linker transparently selects the best instance from such a family of symbol instances by comparing the symbols capability requirements to the capabilities provided by the system.

Object and symbol capabilities provide for selecting the best object, and the best symbol within an object, for the currently running system. Object and symbol capabilities are optional features, both independent of each other. However, an object that defines symbol capabilities may also define object capabilities. In this case, any family of capabilities symbols should be accompanied with one instance of the symbol that satisfies the object capabilities. If no object capabilities exist, any family of capability symbols should be accompanied with one instance of the symbol that requires no capabilities. This symbol instance provides the default implementation, should no capability instance be applicable for a given system.

The following x86 example displays the object capabilities of foo.o. These capabilities apply to the entire object. In this example, no symbol capabilities exist.

$ elfdump -H foo.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_HW_1     0x840  [ SSE  MMX ]

The following x86 example displays the symbol capabilities of bar.o. These capabilities apply to the individual functions foo and bar. Two instances of each symbol exist, each instance being assigned to a different set of capabilities. In this example, no object capabilities exist.

$ elfdump -H bar.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_HW_1      0x40  [ MMX ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
      [25]  0x00000000 0x00000021  FUNC LOCL  D    0 .text    foo%mmx
      [26]  0x00000024 0x0000001e  FUNC LOCL  D    0 .text    bar%mmx

 Symbol Capabilities:
     index  tag               value
       [3]  CA_SUNW_HW_1      0x800  [ SSE ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
      [33]  0x00000044 0x00000021  FUNC LOCL  D    0 .text    foo%sse
      [34]  0x00000068 0x0000001e  FUNC LOCL  D    0 .text    bar%sse

Note –

In this example, the capability symbols follow a naming convention that appends a capability identifier to the generic symbol name. This convention can be produced by the link-editor when object capabilities are converted to symbol capabilities, and is discussed later in Converting Object Capabilities to Symbol Capabilities.

Capability definitions provide for many combinations that allow you to identify the requirements of an object, or of individual symbols within an object. Hardware capabilities provide the greatest flexibility. Hardware capabilities define hardware requirements without dictating a specific machine hardware name, or platform name. However, sometimes there are attributes of an underlying system that can only be determined from the machine hardware name, or platform name. Identifying a capability name can allow you to code to very specific system capabilities, but the use of the identified object can be restrictive. Should a new machine hardware name or platform name become applicable for the object, the object must be rebuilt to identify the new capability name.

The following sections describe how capabilities can be defined, and used by the link-editor.

Identifying a Platform Capability

A platform capability of an object identifies the platform name of the systems that the object, or specific symbols within the object, can execute upon. Multiple platform capabilities can be defined. This identification is very specific, and takes precedence over any other capability types.

The platform name of a system can be displayed by the utility uname(1) with the -i option.

A platform capability requirement can be defined using the following mapfile syntax.

        $mapfile_version 2
        CAPABILITY {
                PLATFORM  = platform_name...;
                PLATFORM += platform_name...;
                PLATFORM -= platform_name...;
        };

The PLATFORM attribute is qualified with one or more platform names. The “+=” form of assignment augments the platform capabilities specified by the input objects, while the “=” form overrides them. The “-=" form of assignment is used to exclude platform capabilities from the output object. The following SPARC example identifies the object foo.so.1 as being specific to the SUNW,SPARC-Enterprise platform.

$ cat mapfile
$mapfile_version 2
CAPABILITY {
        PLATFORM = 'SUNW,SPARC-Enterprise';
};
$ cc -o foo.so.1 -G -K pic -Mmapfile foo.c -lc
$ elfdump -H foo.so.1

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_PLAT     SUNW,SPARC-Enterprise

Relocatable objects can define platform capabilities. These capabilities are gathered together to define the final capability requirements of the object being built.

The platform capability of an object can be controlled explicitly from a mapfile by using the “=” form of assignment to override any platform capabilities that might be provided from any input relocatable objects. An empty PLATFORM attribute used with the “=” form of assignment effectively removes any platform capabilities requirement from the object being built.

A platform capability requirement defined in a dynamic object is validated by the runtime linker against the platform name of the system. The object is only used if one of the platform names recorded in the object match the platform name of the system.

Targeting code to a specific platform can be useful in some instances, however the development of a hardware capabilities family can provide greater flexibility, and is recommended. Hardware capabilities families can provide for optimized code to be exercised on a broader range of systems.

Identifying a Machine Capability

A machine capability of an object identifies the machine hardware name of the systems that the object, or specific symbols within the object, can execute upon. Multiple machine capabilities can be defined. This identification carries less precedence than platform capability definitions, but takes precedence over any other capability types.

The machine hardware name of a system can be displayed by the utility uname(1) with the -m option.

A machine capability requirement can be defined using the following mapfile syntax.

        $mapfile_version 2
        CAPABILITY {
                MACHINE  = machine_name...;
                MACHINE += machine_name...;
                MACHINE -= machine_name...;
        };

The MACHINE attribute is qualified with one or more machine hardware names. The “+=” form of assignment augments the machine capabilities specified by the input objects, while the “=” form overrides them. The “-=" form of assignment is used to exclude machine capabilities from the output object. The following SPARC example identifies the object foo.so.1 as being specific to the sun4u machine hardware name.

$ cat mapfile
$mapfile_version 2
CAPABILITY {
        MACHINE = sun4u;
};
$ cc -o foo.so.1 -G -K pic -Mmapfile foo.c -lc
$ elfdump -H foo.so.1

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_MACH     sun4u

Relocatable objects can define machine capabilities. These capabilities are gathered together to define the final capability requirements of the object being built.

The machine capability of an object can be controlled explicitly from a mapfile by using the “=” form of assignment to override any machine capabilities that might be provided from any input relocatable objects. An empty MACHINE attribute used with the “=” form of assignment effectively removes any machine capabilities requirement from the object being built.

A machine capability requirement defined in a dynamic object is validated by the runtime linker against the machine hardware name of the system. The object is only used if one of the machine names recorded in the object match the machine name of the system.

Targeting code to a specific machine can be useful in some instances, however the development of a hardware capabilities family can provide greater flexibility, and is recommended. Hardware capabilities families can provide for optimized code to be exercised on a broader range of systems.

Identifying Hardware Capabilities

The hardware capabilities of an object identify the hardware requirements of a system necessary for the object, or specific symbol, to execute correctly. An example of this requirement might be the identification of code that requires the MMX or SSE features that are available on some x86 architectures.

Hardware capability requirements can be identified using the following mapfile syntax.

        $mapfile_version 2
        CAPABILITY {
                HW  = hwcap_flag...;
                HW += hwcap_flag...;
                HW -= hwcap_flag...;
        };

The HW attribute to the CAPABILITY directive is qualified with one or more tokens, which are symbolic representations of hardware capabilities. The “+=” form of assignment augments the hardware capabilities specified by the input objects, while the “=” form overrides them. The “-=" form of assignment is used to exclude hardware capabilities from the output object.

For SPARC systems, hardware capabilities are defined as AV_ values in sys/auxv_SPARC.h. For x86 systems, hardware capabilities are defined as AV_ values in sys/auxv_386.h.

The following x86 example shows the declaration of MMX and SSE as hardware capabilities required by the object foo.so.1.

$ egrep "MMX|SSE" /usr/include/sys/auxv_386.h
#define AV_386_MMX    0x0040
#define AV_386_SSE    0x0800
$ cat mapfile
$mapfile_version 2
CAPABILITY {
        HW += SSE MMX;
};
$ cc -o foo.so.1 -G -K pic -Mmapfile foo.c -lc
$ elfdump -H foo.so.1

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
      [0]  CA_SUNW_HW_1     0x840  [ SSE  MMX ]

Relocatable objects can contain hardware capabilities values. The link-editor combines any hardware capabilities values from multiple input relocatable objects. The resulting CA_SUNW_HW_1 value is a bitwise-inclusive OR of the associated input values. By default, these values are combined with the hardware capabilities specified by a mapfile.

The hardware capability requirements of an object can be controlled explicitly from a mapfile by using the “=” form of assignment to override any hardware capabilities that might be provided from any input relocatable objects. An empty HW attribute used with the “=” form of assignment effectively removes any hardware capabilities requirement from the object being built.

The following example suppresses any hardware capabilities data defined by the input relocatable object foo.o from being included in the output file, bar.o.

$ elfdump -H foo.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_HW_1     0x840  [ SSE  MMX ]
$ cat mapfile
$mapfile_version 2
CAPABILITY {
        HW = ;
};
$ ld -o bar.o -r -Mmapfile foo.o
$ elfdump -H bar.o
$

Any hardware capability requirements defined by a dynamic object are validated by the runtime linker against the hardware capabilities that are provided by the system. If any of the hardware capability requirements can not be satisfied, the object is not loaded at runtime. For example, if the SSE feature is not available to a process, ldd(1) indicates the following error.

$ ldd prog
        foo.so.1 =>      ./foo.so.1  - hardware capability unsupported: \
                         0x800 [ SSE ]
        ....

Multiple variants of a dynamic object that exploit different hardware capabilities can provide a flexible runtime environment using filters. See Capability Specific Shared Objects.

Hardware capabilities can also be used to identify the capabilities of individual functions within a single object. In this case, the runtime linker can select the most appropriate function instance to use based upon the current system capabilities. See Creating a Family of Symbol Capabilities Functions.

Identifying Software Capabilities

The software capabilities of an object identify characteristics of the software that might be important for debugging or monitoring processes. Software capabilities can also influence process execution. Presently, the only software capabilities that are recognized relate to frame pointer usage by the object, and process address space restrictions.

Objects can indicate that their frame pointer use is known. This state is then qualified by declaring the frame pointer as being used or not.

64–bit objects can indicate that at runtime they must be exercised within a 32–bit address space.

Software capabilities flags are defined in sys/elf.h.

#define  SF1_SUNW_FPKNWN    0x001
#define  SF1_SUNW_FPUSED    0x002
#define  SF1_SUNW_ADDR32    0x004

These software capability requirements can be identified using the following mapfile syntax.

        $mapfile_version 2
        CAPABILITY {
                SF  = sfcap_flags...;
                SF += sfcap_flags...;
                SF -= sfcap_flags...;
        };

The SF attribute to the CAPABILITY directive can be assigned any of the tokens FPKNWN, FPUSED and ADDR32.

Relocatable objects can contain software capabilities values. The link-editor combines the software capabilities values from multiple input relocatable objects. Software capabilities can also be supplied with a mapfile. By default, any mapfile values are combined with the values supplied by relocatable objects.

The software capability requirements of an object can be controlled explicitly from a mapfile by using the “=” form of assignment to override any software capabilities that might be provided from any input relocatable objects. An empty SF attribute used with the “=” form of assignment effectively removes any software capabilities requirement from the object being built.

The following example suppresses any software capabilities data defined by the input relocatable object foo.o from being included in the output file, bar.o.

$ elfdump -H foo.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_SF_1     0x3  [ SF1_SUNW_FPKNWN  SF1_SUNW_FPUSED ]
$ cat mapfile
$mapfile_version 2
CAPABILITY {
        SF = ;
};
$ ld -o bar.o -r -Mmapfile foo.o
$ elfdump -H bar.o
$

Software Capability Frame Pointer Processing

The computation of a CA_SUNW_SF_1 value from two frame pointer input values is as follows.

Table 2–1 CA_SUNW_SF_1 Frame Pointer Flag Combination State Table


Input file 1	Input file 2
	`SF1_SUNW_FPKNWN` `SF1_SUNW_FPUSED`	`SF1_SUNW_FPKNWN`	`<unknown>`
`SF1_SUNW_FPKNWN` `SF1_SUNW_FPUSED`	`SF1_SUNW_FPKNWN` `SF1_SUNW_FPUSED`	`SF1_SUNW_FPKNWN`	`SF1_SUNW_FPKNWN` `SF1_SUNW_FPUSED`
`SF1_SUNW_FPKNWN`	`SF1_SUNW_FPKNWN`	`SF1_SUNW_FPKNWN`	`SF1_SUNW_FPKNWN`
`<unknown>`	`SF1_SUNW_FPKNWN` `SF1_SUNW_FPUSED`	`SF1_SUNW_FPKNWN`	`<unknown>`

This computation is applied to each relocatable object value and mapfile value. The frame pointer software capabilities of an object are unknown if no .SUNW_cap section exists, or if the section contains no CA_SUNW_SF_1 value, or if neither the SF1_SUNW_FPKNW or SF1_SUNW_FPUSED flags are set.

Software Capability Address Space Restriction Processing

64–bit objects that are identified with the SF1_SUNW_ADDR32 software capabilities flag can contain optimized code that requires a 32–bit address space. 64–bit objects that are identified in this manner can interoperate with any other 64–bit objects whether they are identified with the SF1_SUNW_ADDR32 flag or not. An occurrence of the SF1_SUNW_ADDR32 flag within a 64–bit input relocatable object is propagated to the CA_SUNW_SF_1 value that is created for the output file being created by the link-editor.

The existence of the SF1_SUNW_ADDR32 flag within a 64–bit executable ensures that the associated process is restricted to the lower 32–bit address space. This restricted address space includes the process stack and all process dependencies. Within such a process, all objects, whether they are identified with the SF1_SUNW_ADDR32 flag or not, are loaded within the restricted 32–bit address space.

64–bit shared objects can contain the SF1_SUNW_ADDR32 flag. However, the restricted address space requirement can only be established by a 64–bit executable containing the SF1_SUNW_ADDR32 flag. Therefore, a 64–bit SF1_SUNW_ADDR32 shared object must be a dependency of a 64–bit SF1_SUNW_ADDR32 executable.

A 64–bit SF1_SUNW_ADDR32 shared object that is encountered by the link-editor when building an unrestricted 64–bit executable results in a warning.

$ cc -m64 -o main main.c -lfoo
ld: warning: file libfoo.so: section .SUNW_cap: software capability ADDR32: \
    requires executable be built with ADDR32 capability

A 64–bit SF1_SUNW_ADDR32 shared object that is encountered at runtime by a process that is created from an unrestricted 64–bit executable, results in a fatal error.

$ ldd main
        libfoo.so =>     ./libfoo.so  - software capability unsupported: \ 
                         0x4  [ ADDR32 ]
        ....
$ main
ld.so.1: main: fatal: ./libfoo.so: software capability unsupported: 0x4  [ ADDR32 ]

An executable can be seeded with the SF1_SUNW_ADDR32 using a mapfile.

$ cat mapfile
$mapfile_version 2
CAPABILITY {
        SF += ADDR32;
};
$ cc -m64 -o main main.c -Mmapfile -lfoo
$ elfdump -H main

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_SF_1     0x4  [ SF1_SUNW_ADDR32 ]

Creating a Family of Symbol Capabilities Functions

Developers often desire to provide multiple instances of functions, each optimized for a particular set of capabilities, within a single object. It is desirable for the selection and use of these instances to be transparent to any consumers. A generic, front-end function can be created to provide an external interface. This generic instance, together with the optimized instances, can be combined into one object. The generic instance might use getisax(2) to determine the systems capabilities and then call the appropriate optimized function instance to handle a task. Although this model works, it suffers from a lack of generality, and incurs a runtime overhead.

Symbol capabilities offer an alternative mechanism to construct such an object. This mechanism is simpler, more efficient, and does not require you to write additional front-end code. Multiple instances of a function can be created and associated with different capabilities. These instances, together with a default instance of the function that is suitable for any system, can be combined into a single dynamic object. The selection of the most appropriate member from this family of symbols is carried out by the runtime linker using the symbol capabilities information.

In the following example, the x86 objects foobar.mmx.o and foobar.sse.o, contain the same function foo() and bar(), that have been compiled to use the MMX and SSE instructions respectively.

$ elfdump -H foobar.mmx.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID        mmx
       [2]  CA_SUNW_HW_1      0x40  [ MMX ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
      [10]  0x00000000 0x00000021  FUNC LOCL  D    0 .text    foo%mmx
      [16]  0x00000024 0x0000001e  FUNC LOCL  D    0 .text    bar%mmx

$ elfdump -H foobar.sse.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID        sse
       [2]  CA_SUNW_HW_1      0x800  [ SSE ]

  Capabilities symbols:
     index    value      size      type bind oth ver shndx    name
      [16]  0x00000000 0x0000002f  FUNC LOCL  D    0 .text    foo%sse
      [18]  0x00000048 0x00000030  FUNC LOCL  D    0 .text    bar%sse

Each of these objects contain a local symbol identifying the capabilities function foo%*() and bar%*(). In addition, each object also defines a global reference to the function foo() and bar(). Any internal references to foo() or bar() are relocated through these global references, as are any external interfaces.

These two objects can now be combined with a default instance of foo() and bar(). These default instances satisfy the global references, and provide an implementation that is compatible with any object capabilities. These default instances are said to lead each capabilities family. If no object capabilities exist, this default instance should also require no capabilities. Effectively, three instances of foo() and bar() exist, the global instance provides the default, and the local instances provide implementations that are used at runtime if the associated capabilities are available.

$ cc -o libfoobar.so.1 -G foobar.o foobar.sse.o foobar.mmx.o
$ elfdump -sN.dynsym libfoobar.so.1 | egrep "foo|bar"
       [2]  0x00000700 0x00000021  FUNC LOCL  D    0 .text    foo%mmx
       [4]  0x00000750 0x0000002f  FUNC LOCL  D    0 .text    foo%sse
       [8]  0x00000784 0x0000001e  FUNC LOCL  D    0 .text    bar%mmx
       [9]  0x000007b0 0x00000030  FUNC LOCL  D    0 .text    bar%sse
      [15]  0x000007a0 0x00000014  FUNC GLOB  D    1 .text    foo
      [17]  0x000007c0 0x00000014  FUNC GLOB  D    1 .text    bar

The capabilities information for a dynamic object displays the capabilities symbols, and reveals the capabilities families that are available.

$ elfdump -H libfoobar.so.1

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID        mmx
       [2]  CA_SUNW_HW_1      0x40  [ MMX ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
       [2]  0x00000700 0x00000021  FUNC LOCL  D    0 .text    foo%mmx
       [8]  0x00000784 0x0000001e  FUNC LOCL  D    0 .text    bar%mmx

 Symbol Capabilities:
     index  tag               value
       [4]  CA_SUNW_ID        sse
       [5]  CA_SUNW_HW_1      0x800  [ SSE ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
       [4]  0x00000750 0x0000002f  FUNC LOCL  D    0 .text    foo%sse
       [9]  0x000007b0 0x00000030  FUNC LOCL  D    0 .text    bar%sse

Capabilities Chain Section:  .SUNW_capchain

 Capabilities family: foo
  chainndx  symndx      name
         1  [15]        foo
         2  [2]         foo%mmx
         3  [4]         foo%sse

 Capabilities family: bar
  chainndx  symndx      name
         5  [17]        bar
         6  [8]         bar%mmx
         7  [9]         bar%sse

At runtime, all references to foo() and bar() are initially bound to the global symbols. However, the runtime linker recognizes that these functions are the lead instance of a capabilities family. The runtime linker inspects each family member to determine if a better capability function is available. There is a one time cost to this operation, which occurs on the first call to the function. Subsequent calls to foo() and bar() are bound directly to the function instance selected by the first call. This function selection can be observed by using the runtime linkers debugging capabilities.

In the following example, the underlying system does not provide MMX or SSE support. The lead instance of foo() requires no special capabilities support, and thus satisfies any relocation reference.

$ LD_DEBUG=symbols main
....
debug: symbol=foo;  lookup in file=./libfoo.so.1  [ ELF ]
debug:   symbol=foo[15]:  capability family default
debug:   symbol=foo%mmx[2]:  capability specific (CA_SUNW_HW_1):  [ 0x40  [ MMX ] ]
debug:   symbol=foo%mmx[2]:  capability rejected
debug:   symbol=foo%sse[4]:  capability specific (CA_SUNW_HW_1):  [ 0x800  [ SSE ] ]
debug:   symbol=foo%sse[4]:  capability rejected
debug:   symbol=foo[15]:  used

In the following example, MMX is available, but SSE is not. The MMX capable instance of foo() satisfies any relocation reference.

$ LD_DEBUG=symbols main
....
debug: symbol=foo;  lookup in file=./libfoo.so.1  [ ELF ]
debug:   symbol=foo[15]:  capability family default
debug:   symbol=foo[2]:  capability specific (CA_SUNW_HW_1):  [ 0x40  [ MMX ] ]
debug:   symbol=foo[2]:  capability candidate
debug:   symbol=foo[4]:  capability specific (CA_SUNW_HW_1):  [ 0x800  [ SSE ] ]
debug:   symbol=foo[4]:  capability rejected
debug:   symbol=foo[2]:  used

When more than one capability instance can be exercised on the same system, a set of precedent rules are used to select one instance.

A capability group that defines a platform name takes precedent over a group that does not define a platform name.
A capability group that defines a machine hardware name takes precedent over a group that does not define a machine hardware name.
A larger hardware capabilities value takes precedent over a smaller hardware capabilities value.

A family of capabilities function instances must be accessed from a procedure linkage table entry. See Procedure Linkage Table (Processor-Specific). This procedure linkage reference requires the runtime linker to resolve the function. During this process, the runtime linker can process the associated symbol capabilities information, and select the best function from the available family of function instances.

When symbol capabilities are not used, there are cases where the link-editor can resolve references to code without the need of a procedure linkage table entry. For example, within a dynamic executable, a reference to a function that exists within the executable can be bound internally at link-edit time. Hidden and protected functions within shared objects can also be bound internally at link-edit time. In these cases, there is normally no need for the runtime linker to be involved in resolving a reference to these functions.

However, when symbol capabilities are used, the function must be resolved from a procedure linkage table entry. This entry is necessary in order for the runtime linker to be involved in selecting the appropriate function, while maintaining a read-only text segment. This mechanism results in an indirection through a procedure linkage table entry for all calls to a capability function. This indirection might not be necessary if symbol capabilities are not used. Therefore, there is a small trade off between the cost of calling the capability function, and any performance improvement gained from using the capability function over its default counterpart.

Note –

Although a capability function must be accessed through a procedure linkage table entry, the function can still be defined as hidden or protected. The runtime linker honors these visibility states and restricts any binding to these functions. This behavior results in the same bindings as are produced when symbol capabilities are not associated with the function. A hidden function can not be bound to from an external object. A reference to a protected function from within an object will only be bound to within the same object.

Creating a Family of Symbol Capabilities Data Items

Multiple instances of initialized data, where each instance is specific to a system, can be provided within the same object. However, providing such data through functional interfaces if often simpler, and is recommended. See Creating a Family of Symbol Capabilities Functions. Special care is required to provide multiple instances of initialized data within an executable.

The following example initializes a data item foo within foo.c, to point to a machine name string. This file can be compiled for various machines, and each instance is identified with a machine capability. A reference to this data item is made from bar() from the file bar.c. A shared object foobar.so.1 is then created by combining bar() with two capabilities instances of foo.

$ cat foo.c
char *foo = MACHINE;
$ cat bar.c
#include	<stdio.h>

extern char *foo = MACHINE;

void bar()
{
        (void) printf("machine: %s\n", foo);
}

$ elfdump -H foobar.so.1

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID       sun4u
       [2]  CA_SUNW_MACH     sun4u

  Symbols:
     index    value      size      type bind oth ver shndx          name
       [1]  0x000108d4 0x00000004  OBJT LOCL  D    0 .data          foo%sun4u

 Symbol Capabilities:
     index  tag               value
       [4]  CA_SUNW_ID       sun4v
       [5]  CA_SUNW_MACH     sun4v

  Symbols:
     index    value      size      type bind oth ver shndx          name
       [2]  0x000108d8 0x00000004  OBJT LOCL  D    0 .data          foo%sun4v

An application can reference bar(), and the runtime linker binds to the instance of foo that is associated with the underlying system.

$ uname -m
sun4u
$ main
machine: sun4u

The proper operation of this code depends on the code having been compiled to be position-independent, as is normally the case for code in sharable objects. See Position-Independent Code. Position-independent data references are indirect references, which allow the runtime linker to locate the required reference and update elements of the data segment. This relocation update of the data segment preserves the text segment as read-only.

However, the code within an executable is typically position-dependent. In addition, data references within an executable are bound at link-edit time. Within an executable, a symbol capabilities data reference must remain unresolved through a global data item, so that the runtime linker can select from the symbol capabilities family. If the reference from bar() in the previous example bar.c is compiled as position-dependent code, then the text segment of the executable must be relocated at runtime. By default, this condition results in a fatal link-time error.

$ cc -o main main.c bar.c foo.o foo.1.o foo.2.o ...
warning: Text relocation remains                referenced
    against symbol                  offset      in file
foo                                 0x0         bar.o
foo                                 0x8         bar.o

One approach to solve this error condition is to compile bar.c as position-independent. Note however, that all references to any symbol capabilities data items from within the executable must be compiled position-independent for this technique to work.

Although data can be accessed using the symbol capabilities mechanism, making data items a part of the public interface to an object can be problematic. An alternative, and more flexible model, is to encapsulate each data item within a symbol capabilities function. This function provides the sole means of access to the data. Hiding data behind a symbol capabilities function has the important benefit of allowing the data to be defined static and kept private. The previous example can be coded to use symbol capabilities functions.

$ cat foobar.c
cat bar.c
#include	<stdio.h>

static char *foo = MACHINE;

void bar()
{
        (void) printf("machine: %s\n", foo);
}
$ elfdump -H main

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID       sun4u
       [2]  CA_SUNW_MACH     sun4u

  Symbols:
     index    value      size      type bind oth ver shndx          name
       [1]  0x0001111c 0x0000001c  FUNC LOCL  D    0 .text          bar%sun4u

 Symbol Capabilities:
     index  tag               value
       [4]  CA_SUNW_ID       sun4v
       [5]  CA_SUNW_MACH     sun4v

  Symbols:
     index    value      size      type bind oth ver shndx          name
       [2]  0x00011138 0x0000001c  FUNC LOCL  D    0 .text          bar%sun4v

$ uname -m
sun4u
$ main
machine: sun4u

Converting Object Capabilities to Symbol Capabilities

Ideally, the compiler can generate objects that are identified with symbol capabilities. If the compiler can not create symbol capabilities, the link-editor offers a solution.

A relocatable object that defines object capabilities can be transformed into a relocatable object that defines symbol capabilities using the link-editor. Using the link-editor -z symbolcap option, any capability data section is converted to define symbol capabilities. All global functions within the object are converted into local functions, and are associated with symbol capabilities. All global initialized data items are converted to local data items, and are associated with symbol capabilities. These transformed symbols are appended with any capability identifier specified as part of the object capabilities group. If a capability identifier is not defined, a default group name is appended.

For each original global function or initialized data item, a global reference is created. This reference is associated to any relocation requirements, and provides for binding to a default, global symbol when this object is finally combined to create a dynamic executable or shared object.

Note –

The -z symbolcap option applies to objects that contain an object capabilities section. The option has no affect upon relocatable objects that already contain symbol capabilities, or relocatable objects that contain both object and symbol capabilities. This design allows multiple objects to be combined by the link-editor, with only those objects that contain object capabilities being affected by the option.

In the following example, a x86 relocatable object contains two global functions foo() and bar(). This object has been compiled to require the MMX and SSE hardware capabilities. In these examples, the capabilities group has been named with a capabilities identifier entry. This identifier name is appended to the transformed symbol names. Without this explicit identifier, the link-editor appends a default capabilities group name.

$ elfdump -H foo.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_ID        sse,mmx
       [1]  CA_SUNW_HW_1      0x840  [ SSE MMX ]

$ elfdump -s foo.o | egrep "foo|bar"
      [25]  0x00000000 0x00000021  FUNC GLOB  D    0 .text    foo
      [26]  0x00000024 0x0000001e  FUNC GLOB  D    0 .text    bar

$ elfdump -r foo.o | fgrep foo
  R_386_PLT32                0x38       .rel.text             foo

This relocatable object can now be transformed into a symbols capabilities relocatable object.

$ ld -r -o foo.1.o -z symbolcap foo.o
$ elfdump -H foo.1.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID        sse,mmx
       [2]  CA_SUNW_HW_1      0x840  [ SSE MMX ]

  Symbols:
     index    value      size      type bind oth ver shndx    name
      [25]  0x00000000 0x00000021  FUNC LOCL  D    0 .text    foo%sse,mmx
      [26]  0x00000024 0x0000001e  FUNC LOCL  D    0 .text    bar%sse,mmx

$ elfdump -s foo.1.o | egrep "foo|bar"
      [25]  0x00000000 0x00000021  FUNC LOCL  D    0 .text    foo%sse,mmx
      [26]  0x00000024 0x0000001e  FUNC LOCL  D    0 .text    bar%sse,mmx
      [37]  0x00000000 0x00000000  FUNC GLOB  D    0 UNDEF    foo
      [38]  0x00000000 0x00000000  FUNC GLOB  D    0 UNDEF    bar

$ elfdump -r foo.1.o | fgrep foo
  R_386_PLT32                0x38       .rel.text             foo

This object can now be combined with other objects containing instances of the same functions, associated with different symbol capabilities, to produce an executable or shared object. In addition, a default instance of each function, one that is not associated with any symbol capabilities, should be provided to lead each capabilities family. This default instance provides for all external references, and ensures that an instance of the function is available on any system.

At runtime, any references to foo() and bar() are directed to the lead instances. However, the runtime linker selects the best symbol capabilities instance if the system accommodates the appropriate capabilities.

Exercising a Capability Family

Objects are normally designed and built so that they can execute on all systems of a given architecture. However, individual systems, with special capabilities, are often targeted for optimization. Optimized code can be identified with the capabilities that the code requires to execute, using the mechanisms described in the previous sections.

To exercise and test optimized instances it is necessary to use a system that provides the required capabilities. For each system, the runtime linker determines the capabilities that are available, and then chooses the most capable instances. To aid testing and experimentation, the runtime linker can be told to use an alternative set of capabilities than those provided by the system. In addition, you can specify that only specific files should be validated against these alternative capabilities.

An alternative set of capabilities is derived from the system capabilities, and can be re-initialized or have capabilities added or removed.

A family of environment variables is available to create and target the use of an alternative set of capabilities.

LD_PLATCAP={name}: Identifies an alternative platform name.
LD_MACHCAP={name}: Identifies an alternative machine hardware name.
LD_HWCAP=[+-]{token | number},....: Identifies an alternative hardware capabilities value.
LD_SFCAP=[+-]{token | number},....: Identifies an alternative software capabilities value.
LD_CAP_FILES=file,....: Identifies the files that should be validated against the alternative capabilities.

The capabilities environment variables LD_PLATCAP and LD_MACHCAP accept a string that defines the platform name and machine hardware names respectively. See Identifying a Platform Capability, and Identifying a Machine Capability.

The capabilities environment variables LD_HWCAP and LD_SFCAP accept a comma separated list of tokens as a symbolic representation of capabilities. See Identifying Hardware Capabilities, and Identifying Software Capabilities. A token can also be a numeric value.

A “+” prefix results in the capabilities that follow being added to the alternative capabilities. A “-” prefix results in the capabilities that follow being removed from the alternative capabilities. The lack of “+-” result in the capabilities that follow replacing the alternative capabilities.

The removal of a capability results in a more restricted capabilities environment being emulated. Normally, when a family of capabilities instances is available, a generic, non-capabilities specific instance is also provided. A more restricted capabilities environment can therefore be used to force the use of less capable, or generic code instances.

The addition of a capability results in a more enhanced capabilities environment being emulated. This environment should be created with caution, but can be used to exercise the framework of a capabilities family. For example, a family of functions can be created that define their expected capabilities using mapfiles. These functions can use printf(3C) to confirm their execution. The creation of the associated objects can then be validated and exercised with various capability combinations. This prototyping of a capabilities family can prove useful before the real capabilities requirements of the functions are coded. However, if the code within a family instance requires a specific capability to execute correctly, and this capability is not provided by the system, but is set as an alternative capability, the code instance will fail to execute correctly.

Establishing a set of alternative capabilities without also using LD_CAP_FILES results in all of the capabilities specific objects of a process being validated against the alternative capabilities. This approach should also be exercised with caution, as many system objects require system capabilities to execute correctly. Any alteration of capabilities can cause system objects to fail to execute correctly.

A best environment for capabilities experimentation is to use a system that provides all the capabilities your objects are targeted to use. LD_CAP_FILES should also be used to isolate the objects you wish to experiment with. Capabilities can then be disabled, using the “-” syntax, so that the various instances of your capabilities family can be exercised. Each instance is fully supported by the true capabilities of the system.

For example, suppose you have two x86 capabilities objects, libfoo.so and libbar.so. These objects contain capability functions optimized to use SSE2 instructions, functions optimized to use MMX instructions, and generic functions that require no capabilities. The underlying system provides both SSE2 and MMX. By default, the fully optimized SSE2 functions are used.

libfoo.so and libbar.so can be restricted to use the functions optimized for MMX instructions by removing the SSE2 capability by using a LD_HWCAP definition. The most flexible means of defining LD_CAP_FILES is to use the base name of the required files.

$ LD_HWCAP=-sse2 LD_CAP_FILES=libfoo.so,libbar.so ./main

libfoo.so and libbar.so can be further restricted to use only generic functions by removing the SSE2 and MMX capabilities.

$ LD_HWCAP=-sse2,mmx LD_CAP_FILES=libfoo.so,libbar.so ./main

Note –

The capabilities available for an application, and any alternative capabilities that have been set, can be observed using the runtime linkers diagnostics.

$ LD_DEBUG=basic LD_HWCAP=-sse2,mmx,cx8 ./main
....
02328: hardware capabilities (CA_SUNW_HW_1) - 0x5c6f  \
    [ SSE3 SSE2 SSE FXSR MMX CMOV SEP CX8 TSC FPU ]
02328: alternative hardware capabilities (CA_SUNW_HW_1) - 0x4c2b  \
    [ SSE3 SSE FXSR CMOV SEP TSC FPU ]
....