Input File Processing

The link-editor reads input files in the order in which the files appear on the command line. Each file is opened and inspected to determine the files ELF type, and therefore determine how the file must be processed. The file types that apply as input for the link-edit are determined by the binding mode of the link-edit, either static or dynamic.

Under static mode, the link-editor accepts only relocatable objects or archive libraries as input files. Under dynamic mode, the link-editor also accepts shared objects.

Relocatable objects represent the most basic input file type to the link-editing process. The program data sections within these files are concatenated into the output file image being generated. The link-edit information sections are organized for later use. Information sections do not become part of the output file image, as new information sections are generated to take their place. Symbols are gathered into an internal symbol table for verification and resolution. This table is then used to create one or more symbol tables in the output image.

Although input files can be specified directly on the link-edit command line, archive libraries and shared objects are commonly specified using the -l option. See Linking With Additional Libraries. During a link-edit, the interpretation of archive libraries and shared objects are quite different. The next two sections expand upon these differences.

Archive Processing

Archives are built using ar(1). Archives usually consist of a collection of relocatable objects with an archive symbol table. This symbol table provides an association of symbol definitions with the objects that supply these definitions. By default, the link-editor provides selective extraction of archive members. The link-editor uses unresolved symbolic references to select objects from the archive that are required to complete the binding process. You can also explicitly extract all members of an archive.

The link-editor extracts a relocatable object from an archive under the following conditions.

The archive member contains a symbol definition that satisfies a symbol reference, presently held in the link-editor's internal symbol table. This reference is sometimes referred to as an undefined symbol.
The archive member contains a data symbol definition that satisfies a tentative symbol definition presently held in the link-editor's internal symbol table. An example is a FORTRAN COMMON block definition, which causes the extraction of a relocatable object that defines the same DATA symbol.
The archive member contains a symbol definition that matches a reference that requires hidden visibility or protected visibility. See Table 12-20.
The link-editors -z allextract is in effect. This option suspends selective archive extraction and causes all archive members to be extracted from the archive being processed.

Under selective archive extraction, a weak symbol reference does not extract an object from an archive unless the -z weakextract option is in effect. See Simple Resolutions for more information.

Note - The options -z weakextract, -z allextract, and -z defaultextract enable you to toggle the archive extraction mechanism among multiple archives.

With selective archive extraction, the link-editor makes multiple passes through an archive. Relocatable objects are extracted as needed to satisfy the symbol information being accumulated in the link-editor internal symbol table. After the link-editor has made a complete pass through the archive without extracting any relocatable objects, the next input file is processed.

By extracting only the relocatable objects needed when an archive is encountered, the position of the archive on the command line can be significant. See Position of an Archive on the Command Line.

Note - Although the link-editor makes multiple passes through an archive to resolve symbols, this mechanism can be quite costly. Especially, for large archives that contain random organizations of relocatable objects. In these cases, you should use tools like lorder(1) and tsort(1) to order the relocatable objects within the archive. This ordering reduces the number of passes the link-editor must carry out.

Shared Object Processing

Shared objects are indivisible whole units that have been generated by a previous link-edit of one or more input files. When the link-editor processes a shared object, the entire contents of the shared object become a logical part of the resulting output file image. This logical inclusion means that all symbol entries defined in the shared object are made available to the link-editing process.

The shared object's program data sections and most of the link-editing information sections are unused by the link-editor. These sections are interpreted by the runtime linker when the shared object is bound to generate a runnable process. However, the occurrence of a shared object is remembered. Information is stored in the output file image to indicate that this object is a dependency that must be made available at runtime.

By default, all shared objects specified as part of a link-edit are recorded as dependencies in the object being built. This recording is made regardless of whether the object being built actually references symbols offered by the shared object. To minimize the overhead of runtime linking, only specify those dependencies that resolve symbol references from the object being built. The link-editor's debugging facility, and ldd(1) with the -u option, can be used to determine unused dependencies. The link-editor's -z ignore option can be used to suppress the dependency recording of any unused shared objects.

If a shared object has dependencies on other shared objects, these dependencies can also be processed. This processing occurs after all command line input files have been processed, to complete the symbol resolution process. However, the shared object names are not recorded as dependencies in the output file image being generated.

Although the position of a shared object on the command line has less significance than archive processing, the position can have a global effect. Multiple symbols of the same name are allowed to occur between relocatable objects and shared objects, and between multiple shared objects. See Symbol Resolution.

The order of shared objects processed by the link-editor is maintained in the dependency information that is stored in the output file image. In the absence of lazy loading, the runtime linker loads the specified shared objects in the same order. Therefore, the link-editor and the runtime linker select the first occurrence of a symbol of a multiply-defined series of symbols.

Note - Multiple symbol definitions, are reported in the load map output generated using the -m option.

Linking With Additional Libraries

Although the compiler drivers often ensure that appropriate libraries are specified to the link-editor, frequently you must supply your own. Shared objects and archives can be specified by explicitly naming the input files required to the link-editor. However, a more common and more flexible method involves using the link-editor's -l option.

Library Naming Conventions

By convention, shared objects are usually designated by the prefix lib and the suffix .so. Archives are designated by the prefix lib and the suffix .a. For example, libfoo.so is the shared object version of the “foo” implementation that is made available to the compilation environment. libfoo.a is the library's archive version.

These conventions are recognized by the -l option of the link-editor. This option is commonly used to supply additional libraries to a link-edit. The following example directs the link-editor to search for libfoo.so. If the link-editor does not find libfoo.so, a search for libfoo.a is made before moving on to the next directory to be searched.

$ cc -o prog file1.c file2.c -lfoo

Note - A naming convention exists regarding the compilation environment and the runtime environment use of shared objects. The compilation environment uses the simple .so suffix, whereas the runtime environment commonly uses the suffix with an additional version number. See Naming Conventions and Coordination of Versioned Filenames.

When link-editing in dynamic mode, you can choose to link with a mix of shared objects and archives. When link-editing in static mode, only archive libraries are acceptable for input.

In dynamic mode, when using the -l option, the link-editor first searches the given directory for a shared object that matches the specified name. If no match is found, the link-editor looks for an archive library in the same directory. In static mode, when using the -l option, only archive libraries are sought.

Linking With a Mix of Shared Objects and Archives

The library search mechanism in dynamic mode searches a given directory for a shared object, and then searches for an archive library. Finer control of the search is possible through the -B option.

By specifying the -B dynamic and -B static options on the command line, you can toggle the library search between shared objects or archives respectively. For example, to link an application with the archive libfoo.a and the shared object libbar.so, issue the following command.

$ cc -o prog main.o file1.c -Bstatic -lfoo -Bdynamic -lbar

The -B static and -B dynamic options are not exactly symmetrical. When you specify -B static, the link-editor does not accept shared objects as input until the next occurrence of -B dynamic. However, when you specify -B dynamic, the link-editor first looks for shared objects and then archive library's in any given directory.

The precise description of the previous example is that the link-editor first searches for libfoo.a. The link-editor then searches for libbar.so, and if that search fails, searches for libbar.a.

Position of an Archive on the Command Line

The position of an archive on the command line can affect the output file being produced. The link-editor searches an archive only to resolve undefined or tentative external references that have previously been encountered. After this search is completed and any required members have been extracted, the link-editor moves onto the next input file on the command line.

Therefore by default, the archive is not available to resolve any new references from the input files that follow the archive on the command line. For example, the following command directs the link-editor to search libfoo.a only to resolve symbol references that have been obtained from file1.c. The libfoo.a archive is not available to resolve symbol references from file2.c or file3.c.

$ cc -o prog file1.c -Bstatic -lfoo file2.c file3.c -Bdynamic

Interdependencies between archives can exist, such that the extraction of members from one archive must be resolved by extracting members from another archive. If these dependencies are cyclic, the archives must be specified repeatedly on the command line to satisfy previous references.

$ cc -o prog .... -lA -lB -lC -lA -lB -lC -lA

The determination, and maintenance, of repeated archive specifications can be tedious. The -z rescan-now option makes this process simpler. The -z rescan-now option is processed by the link-editor immediately when the option is encountered on the command line. All archives that have been processed from the command line prior to this option are immediately reprocessed. This processing attempts to locate additional archive members that resolve symbol references. This archive rescanning continues until a pass over the archive list occurs in which no new members are extracted. The previous example can be simplified as follows.

$ cc -o prog .... -lA -lB -lC -z rescan-now

Alternatively, the -z rescan-start and -z rescan-end options can be used to group mutually dependent archives together into an archive group. These groups are reprocessed by the link-editor immediately when the closing delimiter is encountered on the command line. Archives found within the group are reprocessed in an attempt to locate additional archive members that resolve symbol references. This archive rescanning continues until a pass over the archive group occurs in which no new members are extracted. Using archive groups, the previous example can be written as follows.

$ cc -o prog .... -z rescan-start -lA -lB -lC -z rescan-end

Note - You should specify any archives at the end of the command line unless multiple-definition conflicts require you to do otherwise.

Directories Searched by the Link-Editor

All previous examples assume the link-editor knows where to search for the libraries listed on the command line. By default, when linking 32–bit objects, the link-editor knows of only three standard directories in which to look for libraries, /usr/ccs/lib, followed by /lib, and finally /usr/lib. When linking 64–bit objects, only two standard directories are used, /lib/64 followed by /usr/lib/64. All other directories to be searched must be added to the link-editor's search path explicitly.

You can change the link-editor search path by using a command line option, or by using an environment variable.

Using a Command-Line Option

You can use the -L option to add a new path name to the library search path. This option alters the search path at the point the option is encountered on the command line. For example, the following command searches path1, followed by /usr/ccs/lib, /lib, and finally /usr/lib, to find libfoo. The command searches path1 and then path2, followed by /usr/ccs/lib, /lib, and /usr/lib, to find libbar.

$ cc -o prog main.o -Lpath1 file1.c -lfoo file2.c -Lpath2 -lbar

Path names that are defined by using the -L option are used only by the link-editor. These path names are not recorded in the output file image being created. Therefore, these path names are not available for use by the runtime linker.

Note - You must specify -L if you want the link-editor to search for libraries in your current directory. You can use a period (.) to represent the current directory.

You can use the -Y option to change the default directories searched by the link-editor. The argument supplied with this option takes the form of a colon separated list of directories. For example, the following command searches for libfoo only in the directories /opt/COMPILER/lib and /home/me/lib.

$ cc -o prog main.c -YP,/opt/COMPILER/lib:/home/me/lib -lfoo

The directories that are specified by using the -Y option can be supplemented by using the -L option. Compiler drivers often use the -Y option to provide compiler specific search paths.

Using an Environment Variable

You can also use the environment variable LD_LIBRARY_PATH to add to the link-editor's library search path. Typically, LD_LIBRARY_PATH takes a colon-separated list of directories. In its most general form, LD_LIBRARY_PATH can also take two directory lists separated by a semicolon. These lists are searched before and after the -Y lists supplied on the command line.

The following example shows the combined effect of setting LD_LIBRARY_PATH and calling the link-editor with several -L occurrences.

$ LD_LIBRARY_PATH=dir1:dir2;dir3
$ export LD_LIBRARY_PATH
$ cc -o prog main.c -Lpath1 .... -Lpath2 .... -Lpathn -lfoo

The effective search path is dir1:dir2:path1:path2.... pathn:dir3:/usr/ccs/lib:/lib:/usr/lib.

If no semicolon is specified as part of the LD_LIBRARY_PATH definition, the specified directory list is interpreted after any -L options. In the following example, the effective search path is path1:path2.... pathn:dir1:dir2:/usr/ccs/lib:/lib:/usr/lib.

$ LD_LIBRARY_PATH=dir1:dir2
$ export LD_LIBRARY_PATH
$ cc -o prog main.c -Lpath1 .... -Lpath2 .... -Lpathn -lfoo

Note - This environment variable can also be used to augment the search path of the runtime linker. See Directories Searched by the Runtime Linker. To prevent this environment variable from influencing the link-editor, use the -i option.

Directories Searched by the Runtime Linker

The runtime linker looks in two default locations for dependencies. When processing 32–bit objects, the default locations are /lib and /usr/lib. When processing 64–bit objects, the default locations are /lib/64 and /usr/lib/64. All other directories to be searched must be added to the runtime linker's search path explicitly.

When a dynamic executable or shared object is linked with additional shared objects, the shared objects are recorded as dependencies. These dependencies must be located during process execution by the runtime linker. When linking a dynamic object, one or more search paths can be recorded in the output file. These search paths are referred to as a runpath. The runtime linker uses the runpath of an object to locate the dependencies of that object.

Specialized objects can be built with the -z nodefaultlib option to suppress any search of the default location at runtime. Use of this option implies that all the dependencies of an object can be located using its runpaths. Without this option, no matter how you augment the runtime linker's search path, the last search paths used are always the default locations.

Note - The default search path can be administrated by using a runtime configuration file. See Configuring the Default Search Paths. However, the creator of a dynamic object should not rely on the existence of this file. You should always ensure that an object can locate its dependencies with only its runpaths or the default locations.

You can use the -R option, which takes a colon-separated list of directories, to record a runpath in a dynamic executable or shared object. The following example records the runpath /home/me/lib:/home/you/lib in the dynamic executable prog.

$ cc -o prog main.c -R/home/me/lib:/home/you/lib -Lpath1 \ 
 -Lpath2 file1.c file2.c -lfoo -lbar

The runtime linker uses these paths, followed by the default location, to obtain any shared object dependencies. In this case, this runpath is used to locate libfoo.so.1 and libbar.so.1.

The link-editor accepts multiple -R options. These multiple specifications are concatenate together, separated by a colon. Thus, the previous example can also be expressed as follows.

$ cc -o prog main.c -R/home/me/lib -Lpath1 -R/home/you/lib \ 
 -Lpath2 file1.c file2.c -lfoo -lbar

For objects that can be installed in various locations, the $ORIGIN dynamic string token provides a flexible means of recording a runpath. See Locating Associated Dependencies.

Note - A historic alternative to specifying the -R option is to set the environment variable LD_RUN_PATH, and make this available to the link-editor. The scope and function of LD_RUN_PATH and -R are identical, but when both are specified, -R supersedes LD_RUN_PATH.

Initialization and Termination Sections

Dynamic objects can supply code that provides for runtime initialization and termination processing. The initialization code of a dynamic object is executed once each time the dynamic object is loaded in a process. The termination code of a dynamic object is executed once each time the dynamic object is unloaded from a process or at process termination. This code can be encapsulated in one of two section types, either an array of function pointers or a single code block. Each of these section types is built from a concatenation of like sections from the input relocatable objects.

The sections .pre_initarray, .init_array and .fini_array provide arrays of runtime pre-initialization, initialization, and termination functions, respectively. When creating a dynamic object, the link-editor identifies these arrays with the .dynamic tag pairs DT_PREINIT_[ARRAY/ARRAYSZ], DT_INIT_[ARRAY/ARRAYSZ], and DT_FINI_[ARRAY/ARRAYSZ] accordingly. These tags identify the associated sections so that the sections can be called by the runtime linker. A pre-initialization array is applicable to dynamic executables only.

Note - Functions that are assigned to these arrays must be provided from the object that is being built.

The sections .init and .fini provide a runtime initialization and termination code block, respectively. The compiler drivers typically supply .init and .fini sections with files they add to the beginning and end of your input file list. These compiler provided files have the effect of encapsulating the .init and .fini code from your relocatable objects into individual functions. These functions are identified by the reserved symbol names _init and _fini respectively. When creating a dynamic object, the link-editor identifies these symbols with the .dynamic tags DT_INIT and DT_FINI accordingly. These tags identify the associated sections so they can be called by the runtime linker.

For more information about the execution of initialization and termination code at runtime see Initialization and Termination Routines.

The registration of initialization and termination functions can be carried out directly by the link-editor by using the -z initarray and -z finiarray options. For example, the following command places the address of foo() in an .init_array element, and the address of bar() in a .fini_array element.

$ cat main.c
#include    <stdio.h>

void foo()
{
        (void) printf("initializing: foo()\n");
}

void bar()
{
        (void) printf("finalizing: bar()\n");
}

void main()
{
        (void) printf("main()\n");
}

$ cc -o main -zinitarray=foo -zfiniarray=bar main.c
$ main
initializing: foo()
main()
finalizing: bar()

The creation of initialization and termination sections can be carried out directly using an assembler. However, most compilers offer special primitives to simplify their declaration. For example, the previous code example can be rewritten using the following #pragma definitions. These definitions result in a call to foo() being placed in an .init section, and a call to bar() being placed in a .fini section.

$ cat main.c
#include    <stdio.h>

#pragma init (foo)
#pragma fini (bar)

....
$ cc -o main main.c
$ main
initializing: foo()
main()
finalizing: bar()

Initialization and termination code, spread throughout several relocatable objects, can result in different behavior when included in an archive library or shared object. The link-edit of an application that uses this archive might extract only a fraction of the objects contained in the archive. These objects might provide only a portion of the initialization and termination code spread throughout the members of the archive. At runtime, only this portion of code is executed. The same application built against the shared object will have all the accumulated initialization and termination code executed when the dependency is loaded at runtime.

To determine the order of executing initialization and termination code within a process at runtime is a complex issue that involves dependency analysis. Limit the content of initialization and termination code to simplify this analysis. Simplified, self contained, initialization and termination code provides predictable runtime behavior. See Initialization and Termination Order for more details.

Data initialization should be independent if the initialization code is involved with a dynamic object whose memory can be dumped using dldump(3C).

Skip Navigation Links
Exit Print View
	Linker and Libraries Guide Oracle Solaris 11 Information Library