The link-editors provide a number of support interfaces that enable the monitoring, and in some cases modification, of link-editor and runtime linker processing. These interfaces typically require a more advanced understanding of link-editing concepts than has been described in previous chapters. The following interfaces are described in this chapter:
ld-support – Link-Editor Support Interface
rtld-audit – Runtime Linker Auditing Interface
rtld-debugger – Runtime Linker Debugger Interface
The link-editor performs many operations including the opening of files and the concatenation of sections from these files. Monitoring, and sometimes modifying, these operations can often be beneficial to components of a compilation system.
This section describes the ld-support interface for input file inspection, and to some degree, input file data modification of those files that compose a link-edit. Two applications that employ this interface are the link-editor itself, which uses it to process debugging information within relocatable objects, and the make(1S) utility, which uses it to save state information.
The ld-support interface is composed of a support library that offers one or more support interface routines. This library is loaded as part of the link-edit process, and any support routines found are called at various stages of link-editing.
You should be familiar with the elf(3ELF) structures and file format when using this interface.
The link-editor accepts one or more support libraries provided by either the SGS_SUPPORT environment variable or with the link-editor's -S option. The environment variable consists of a colon-separated list of support libraries:
$ SGS_SUPPORT=./support.so.1:libldstab.so.1 cc ...
The -S option specifies a single support library. Multiple -S options can be specified:
$ LD_OPTIONS="-S./support.so.1 -Slibldstab.so.1" cc ...
A support library is a shared object. The link-editor opens each support library, in the order they are specified, using dlopen(3DL). If both the environment variable and -S option are encountered, then the support libraries specified with the environment variable are processed first. Each support library is then searched, using dlsym(3DL), for any support interface routines. These support routines are then called at various stages of link-editing.
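The ordering rule described above can be sketched in a few lines of C. This is only an illustration of the list handling, not the link-editor's implementation; the function name order_support_libs and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: fills out[] with support library names in the
 * order the text describes, SGS_SUPPORT entries first, then each -S
 * argument. env_list is split in place on ':' and may be NULL if the
 * variable is unset. Returns the number of names stored (at most max).
 */
size_t
order_support_libs(char *env_list, const char *const sopts[], size_t nsopts,
    const char *out[], size_t max)
{
	size_t n = 0;

	if (env_list != NULL) {
		for (char *tok = strtok(env_list, ":");
		    tok != NULL && n < max; tok = strtok(NULL, ":"))
			out[n++] = tok;
	}
	for (size_t i = 0; i < nsopts && n < max; i++)
		out[n++] = sopts[i];
	return (n);
}
```

Each name collected this way would then be handed to dlopen(3DL) in turn, and the resulting handle searched with dlsym(3DL) for the support routines.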
A support library must be consistent with the ELF class of the link-editor being invoked, either 32–bit or 64–bit. See 32–Bit and 64–Bit Environments for more details.
By default, the Solaris support library libldstab.so.1 is used by the link-editor to process, and compact, compiler-generated debugging information supplied within input relocatable objects. This default processing is suppressed if you invoke the link-editor with any support libraries specified using the -S option. If the default processing of libldstab.so.1 is required in addition to your support library services, add libldstab.so.1 explicitly to the list of support libraries supplied to the link-editor.
As described in 32–Bit and 64–Bit Environments, the 64–bit link-editor (ld(1)) is capable of generating 32–bit objects and the 32–bit link-editor is capable of generating 64–bit objects. Each of these objects has an associated support interface defined.
The support interface for 64–bit objects is similar to that of 32–bit objects, but ends in a 64 suffix, for example ld_start() and ld_start64(). This convention allows both implementations of the support interface to reside in a single shared object libldstab.so.1 of each class, 32–bit and 64–bit.
The SGS_SUPPORT environment variable can be specified with a _32 or _64 suffix, and the link-editor options -z ld32 and -z ld64 can be used to define -S option requirements. These definitions will only be interpreted, respectively, by the 32–bit or 64–bit class of the link-editor. This enables both classes of support library to be specified when the class of the link-editor may not be known.
All ld-support interfaces are defined in the header file link.h. All interface arguments are basic C types or ELF types. The ELF data types can be examined with the ELF access library libelf. See elf(3ELF) for a description of libelf contents. The following interface functions are provided by the ld-support interface, and are described in their expected order of use.
This function provides the initial handshake between the link-editor and the support library.
uint_t ld_version(uint_t version);
The link-editor calls this interface with the highest version of the ld-support interface it is capable of supporting. The support library can verify that this version is sufficient for its use, and return the version it expects to use. This version is normally LD_SUP_VCURRENT.
If the support library does not provide this interface, the initial support level LD_SUP_VERSION1 is assumed.
If the support library returns a version of zero, or a value greater than the ld-support interface the link-editor supports, the support library will not be used.
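A minimal sketch of the support library's side of this handshake follows. It assumes the library needs only the initial interface level; the constant SKETCH_SUP_WANTED is a stand-in (assumed to equal LD_SUP_VERSION1), and plain unsigned int replaces uint_t so that the fragment compiles without link.h. A real support library would include link.h and use the definitions found there.

```c
/*
 * Assumption: 1 stands in for LD_SUP_VERSION1 from link.h. A real
 * support library would include link.h and use the uint_t type.
 */
#define	SKETCH_SUP_WANTED	1u

/* Returns the version this library will use, or 0 to decline. */
unsigned int
ld_version(unsigned int version)
{
	if (version < SKETCH_SUP_WANTED)
		return (0);		/* link-editor is too old, do not use us */
	return (SKETCH_SUP_WANTED);	/* never more than the offered version */
}
```

Returning the lowest version that satisfies the library, rather than echoing the version offered, keeps the library usable with any link-editor that supports at least that level.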
This function is called after initial validation of the link-editor command line, and indicates the start of input file processing.
void ld_start(const char * name, const Elf32_Half type, const char * caller);
void ld_start64(const char * name, const Elf64_Half type, const char * caller);
name is the output file name being created. type is the output file type, which is either ET_DYN, ET_REL, or ET_EXEC, as defined in sys/elf.h. caller is the application calling the interface, which is normally /usr/ccs/bin/ld.
This function is called for each input file before any processing of the file's data is carried out.
void ld_file(const char * name, const Elf_Kind kind, int flags, Elf * elf);
void ld_file64(const char * name, const Elf_Kind kind, int flags, Elf * elf);
name is the input file about to be processed. kind indicates the input file type, which is either ELF_K_AR, or ELF_K_ELF, as defined in libelf.h. flags indicates how the link-editor obtained the file, and can be one or more of the following definitions:
LD_SUP_DERIVED – The file name was not explicitly named on the command line. It was either derived from a -l expansion, or it identifies an extracted archive member.
LD_SUP_EXTRACTED – The file was extracted from an archive.
LD_SUP_INHERITED – The file was obtained as a dependency of a command-line shared object.
If no flags values are specified then the input file has been explicitly named on the command line. elf is a pointer to the file's ELF descriptor.
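As a rough illustration of how an ld_file() routine might interpret flags, the following sketch turns the mask into a readable description. The SKETCH_SUP_* values are assumed stand-ins for the LD_SUP_* definitions in link.h, and describe_origin is a hypothetical helper, not part of the ld-support interface.

```c
#include <stddef.h>
#include <string.h>

/* Assumed stand-in values; a real support library takes these from link.h. */
#define	SKETCH_SUP_DERIVED	0x1
#define	SKETCH_SUP_INHERITED	0x2
#define	SKETCH_SUP_EXTRACTED	0x4

/* Writes a short description of how the file was obtained into buf. */
const char *
describe_origin(int flags, char *buf, size_t len)
{
	size_t n;

	if (len == 0)
		return ("");
	buf[0] = '\0';
	if (flags == 0)		/* no flags: named explicitly by the user */
		(void) strncat(buf, "command line", len - 1);
	if (flags & SKETCH_SUP_DERIVED)
		(void) strncat(buf, "derived ", len - strlen(buf) - 1);
	if (flags & SKETCH_SUP_EXTRACTED)
		(void) strncat(buf, "extracted ", len - strlen(buf) - 1);
	if (flags & SKETCH_SUP_INHERITED)
		(void) strncat(buf, "inherited ", len - strlen(buf) - 1);

	/* Trim the separator left behind by the last flag name. */
	n = strlen(buf);
	if (n > 0 && buf[n - 1] == ' ')
		buf[n - 1] = '\0';
	return (buf);
}
```

Because more than one flag can be set at once, for example for an archive member found through a -l expansion, the sketch tests each bit independently instead of using a switch.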
This function is called for each section of the input file. This function is called before the link-editor has determined whether the section should be propagated to the output file. This function differs from ld_section() processing, which is only called for sections that contribute to the output file.
void ld_input_section(const char * name, Elf32_Shdr ** shdr, Elf32_Word sndx, Elf_Data * data, Elf * elf, uint_t * flags);
void ld_input_section64(const char * name, Elf64_Shdr ** shdr, Elf64_Word sndx, Elf_Data * data, Elf * elf, uint_t * flags);
name is the input section name. shdr is a pointer to the associated section header. sndx is the section index within the input file. data is a pointer to the associated data buffer. elf is a pointer to the file's ELF descriptor. flags is reserved for future use.
Modification of the section header is permitted by reallocating a section header and reassigning the *shdr to the new header. The link-editor uses the section header information that *shdr points to upon return from ld_input_section() to process the section.
You can modify the data by reallocating the data and reassigning the Elf_Data buffer's d_buf pointer. Any modification to the data should ensure the correct setting of the Elf_Data buffer's d_size element. For input sections that become part of the output image, setting the d_size element to zero effectively removes the data from the output image.
The flags field points to a uint_t data field that is initially zero filled. No flags are currently assigned, although the ability to assign flags in future updates, by the link-editor or the support library, is provided.
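The data modification rules can be sketched as follows. To stay self-contained, the fragment uses a stand-in structure carrying only the two Elf_Data members mentioned above; replace_section_data is a hypothetical helper that an ld_input_section() or ld_section() routine might call, and is not part of the interface.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the two Elf_Data members the text mentions. */
struct sketch_data {
	void	*d_buf;
	size_t	d_size;
};

/*
 * Replaces the section data with a private copy of nbuf (nsize bytes),
 * keeping d_size consistent with the new buffer. Passing nsize == 0
 * only zeroes d_size, which drops the data from the output image.
 * Returns 0 on success, or -1 if the allocation fails.
 */
int
replace_section_data(struct sketch_data *data, const void *nbuf, size_t nsize)
{
	void *copy;

	if (nsize == 0) {
		data->d_size = 0;	/* buffer itself is left untouched */
		return (0);
	}
	if ((copy = malloc(nsize)) == NULL)
		return (-1);
	(void) memcpy(copy, nbuf, nsize);
	data->d_buf = copy;		/* reassign the buffer pointer */
	data->d_size = nsize;		/* and keep the size in step */
	return (0);
}
```

The original buffer is deliberately not freed here, on the assumption that it is still owned by whoever supplied the Elf_Data descriptor.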
This function is called for each section of the input file that will be propagated to the output file, but before any processing of the section data is carried out.
void ld_section(const char * name, Elf32_Shdr * shdr, Elf32_Word sndx, Elf_Data * data, Elf * elf);
void ld_section64(const char * name, Elf64_Shdr * shdr, Elf64_Word sndx, Elf_Data * data, Elf * elf);
name is the input section name. shdr is a pointer to the associated section header. sndx is the section index within the input file. data is a pointer to the associated data buffer. elf is a pointer to the file's ELF descriptor.
You can modify the data by reallocating the data itself and reassigning the Elf_Data buffer's d_buf pointer. Any modification to the data should ensure the correct setting of the Elf_Data buffer's d_size element. For input sections that will become part of the output image, setting the d_size element to zero will effectively remove the data from the output image.
Any sections that are stripped by use of the link-editor's -s option, or discarded due to SHT_SUNW_COMDAT processing or SHF_EXCLUDE identification (see Table 7–14), are not reported to ld_section(). See Comdat Section.
This function is called when input file processing is complete but before the output file is laid out.
void ld_input_done(uint_t * flags);
The flags field points to a uint_t data field that is initially zero filled. No flags are currently assigned, although the ability to assign flags in future updates, by the link-editor or the support library, is provided.
This function is called when the link-edit is complete.
void ld_atexit(int status);
void ld_atexit64(int status);
status is the exit(2) code that will be returned by the link-editor and is either EXIT_FAILURE or EXIT_SUCCESS, as defined in stdlib.h.
The following example creates a support library that prints the section name of any relocatable object file processed as part of a 32–bit link-edit.
$ cat support.c
#include <link.h>
#include <stdio.h>

static int indent = 0;

void
ld_start(const char * name, const Elf32_Half type,
    const char * caller)
{
        (void) printf("output image: %s\n", name);
}

void
ld_file(const char * name, const Elf_Kind kind, int flags,
    Elf * elf)
{
        if (flags & LD_SUP_EXTRACTED)
                indent = 4;
        else
                indent = 2;

        (void) printf("%*sfile: %s\n", indent, "", name);
}

void
ld_section(const char * name, Elf32_Shdr * shdr, Elf32_Word sndx,
    Elf_Data * data, Elf * elf)
{
        Elf32_Ehdr * ehdr = elf32_getehdr(elf);

        if (ehdr->e_type == ET_REL)
                (void) printf("%*s section [%ld]: %s\n", indent,
                    "", (long)sndx, name);
}
This support library is dependent upon libelf to provide the ELF access function elf32_getehdr(3ELF) that is used to determine the input file type. The support library is built using:
$ cc -o support.so.1 -G -K pic support.c -lelf -lc
The following example shows the section diagnostics resulting from the construction of a trivial application from a relocatable object and a local archive library. The invocation of the support library, in addition to default debugging information processing, is brought about by the -S option usage.
$ LD_OPTIONS="-S./support.so.1 -Slibldstab.so.1" \
cc -o prog main.c -L. -lfoo

output image: prog
  file: /opt/COMPILER/crti.o
   section [1]: .shstrtab
   section [2]: .text
   .......
  file: /opt/COMPILER/crt1.o
   section [1]: .shstrtab
   section [2]: .text
   .......
  file: /opt/COMPILER/values-xt.o
   section [1]: .shstrtab
   section [2]: .text
   .......
  file: main.o
   section [1]: .shstrtab
   section [2]: .text
   .......
  file: ./libfoo.a
    file: ./libfoo.a(foo.o)
     section [1]: .shstrtab
     section [2]: .text
     .......
  file: /usr/lib/libc.so
  file: /opt/COMPILER/crtn.o
   section [1]: .shstrtab
   section [2]: .text
   .......
  file: /usr/lib/libdl.so.1
The number of sections displayed in this example has been reduced to simplify the output. Also, the files included by the compiler driver can vary.
This section describes the rtld-audit interface that enables a process to access runtime linking information regarding itself. One example of the use of this mechanism is the runtime profiling of shared objects described in Profiling Shared Objects.
The rtld-audit interface is implemented as an audit library that offers one or more auditing interface routines. If this library is loaded as part of a process, then the audit routines are called by the runtime linker at various stages of process execution. These interfaces enable the audit library to access:
The search for dependencies. Search paths may be substituted by the audit library.
Information regarding loaded objects.
Symbol bindings that occur between these loaded objects. These bindings can be altered by the audit library.
Exploitation of the lazy binding mechanism provided by procedure linker table entries to allow auditing of function calls and their return values. The arguments to a function and its return value can be modified by the audit library. See Procedure Linkage Table (Processor-Specific).
Some of these facilities can be achieved by preloading specialized shared objects. A preloaded object exists within the same namespace as the objects of a process. This often restricts or complicates the implementation of the preloaded shared object. The rtld-audit interface offers the user a unique namespace in which to execute their audit libraries. This namespace ensures that the audit library does not intrude upon the normal bindings that occur within the process.
When the runtime linker binds a dynamic executable with its dependencies, it generates a linked list of link-maps to describe the process. The link-map structure describes each object within the process and is defined in /usr/include/sys/link.h. The symbol search mechanism required to bind objects of an application traverses this list of link-maps. This link-map list is said to provide the namespace for process symbol resolution.
The runtime linker itself is also described by a link-map. This link-map is maintained on a different list from that of the application objects. The runtime linker therefore resides in its own unique namespace, which prevents any direct binding of the application to services within the runtime linker. An application can only call upon the public services of the runtime linker through the filter libdl.so.1.
The rtld-audit interface employs its own link-map list on which it maintains any audit libraries. The audit libraries are thus isolated from the symbol binding requirements of the application. Inspection of the application link-map list is possible with dlmopen(3DL). When used with the RTLD_NOLOAD flag, dlmopen(3DL) allows the audit library to query an object's existence without causing its loading.
Two identifiers are defined in /usr/include/link.h to define the application and runtime linker link-map lists:
#define LM_ID_BASE      0     /* application link-map list */
#define LM_ID_LDSO      1     /* runtime linker link-map list */
Each rtld-audit support library is assigned a unique free link-map identifier.
An audit library is built like any other shared object. Its unique namespace within a process requires some additional care. The namespace:
Must provide all dependency requirements.
Should not use system interfaces that do not provide for multiple instances of the interface within a process.
If the audit library calls printf(3C), then the audit library must define a dependency on libc. See Generating a Shared Object Output File. Because the audit library has a unique namespace, symbol references cannot be satisfied by the libc present in the application being audited. If an audit library has a dependency on libc, then two versions of libc.so.1 are loaded into the process. One version satisfies the binding requirements of the application link-map list. The other version satisfies the binding requirements of the audit link-map list.
To ensure that audit libraries are built with all dependencies recorded, use the link-editor's -z defs option.
Some system interfaces assume that they are the only instance of their implementation within a process, for example, threads, signals and malloc(3C). Audit libraries should avoid using such interfaces, as doing so can inadvertently alter the behavior of the application.
An audit library can allocate memory using mapmalloc(3MALLOC), as this allocation method can coexist with any allocation scheme normally employed by the application.
The rtld-audit interface is enabled by one of two means. Each method implies a scope to the objects that are audited.
Global auditing is enabled using the runtime linker environment variable LD_AUDIT. The audit libraries made available by this method are provided with information regarding all dynamic objects used by the process.
Local auditing is enabled through dynamic entries recorded within an object at the time it was built. The audit libraries made available by this method are provided with information regarding those dynamic objects identified for auditing.
Either method of invocation consists of a string that contains a colon-separated list of shared objects that are loaded by dlmopen(3DL). Each object is loaded onto its own audit link-map list. Each object is also searched for audit routines using dlsym(3DL). Audit routines that are found are called at various stages during the application's execution.
The rtld-audit interface enables multiple audit libraries to be supplied. Audit libraries that expect to be employed in this fashion should not alter the bindings that would normally be returned by the runtime linker. Altering these bindings can produce unexpected results from audit libraries that follow.
Secure applications can only obtain audit libraries from trusted directories. Presently, the only trusted directory known to the runtime linker is /usr/lib/secure for 32–bit objects or /usr/lib/secure/64 for 64–bit objects.
Local auditing requirements can be established when an object is built using the link-editor options -p or -P. If you want to audit the use of a shared object libfoo.so.1, with the audit library audit.so.1, record this requirement at link-edit time using the -p option:
$ cc -G -o libfoo.so.1 -Wl,-paudit.so.1 -Kpic foo.c
$ dump -Lv libfoo.so.1 | fgrep AUDIT
[3]    AUDIT      audit.so.1
At runtime, the existence of this audit identifier results in the audit library being loaded and information being passed to it regarding the identifying object.
With this mechanism alone, information such as searching for the identifying object has occurred prior to the audit library being loaded. To provide as much auditing information as possible, the existence of an object requiring local auditing is propagated to users of that object. For example, if an application is built that depends on libfoo.so.1, then the application is identified to indicate its dependencies require auditing:
$ cc -o main main.c libfoo.so.1
$ dump -Lv main | fgrep AUDIT
[5]    DEPAUDIT   audit.so.1
The auditing enabled through this mechanism results in the audit library being loaded and information being passed to it regarding all of the application's explicit dependencies. This dependency auditing can also be recorded directly when creating an object by using the link-editor's -P option:
$ cc -o main main.c -Wl,-Paudit.so.1
$ dump -Lv main | fgrep AUDIT
[5]    DEPAUDIT   audit.so.1
Auditing can be disabled at runtime by setting the environment variable LD_NOAUDIT to a non-null value.
The following functions are provided by the rtld-audit interface and are described in their expected order of use.
References to architecture-specific or object class-specific interfaces are reduced to their generic name to simplify the discussions. For example, a reference to la_symbind32() and la_symbind64() is specified as la_symbind().
This function provides the initial handshake between the runtime linker and the audit library. This interface must be provided by the audit library for it to be loaded.
uint_t la_version(uint_t version);
The runtime linker calls this interface with the highest version of the rtld-audit interface it is capable of supporting. The audit library can verify that this version is sufficient for its use, and return the version it expects to use. This version is normally LAV_CURRENT, which is defined in /usr/include/link.h.
If the audit library returns a version of zero, or a value greater than the rtld-audit interface the runtime linker supports, the audit library will not be used.
This function informs an auditor that link-map activity is occurring.
void la_activity(uintptr_t * cookie, uint_t flags);
cookie identifies the object heading the link-map. flags indicates the type of activity as defined in /usr/include/link.h:
LA_ACT_ADD – Objects are being added to the link-map list.
LA_ACT_DELETE – Objects are being deleted from the link-map list.
LA_ACT_CONSISTENT – Object activity has been completed.
This function informs an auditor that an object is about to be searched for.
char * la_objsearch(const char * name, uintptr_t * cookie, uint_t flags);
name indicates the file or path name being searched for. cookie identifies the object initiating the search. flags identifies the origin and creation of name as defined in /usr/include/link.h:
LA_SER_ORIG – This is the initial search name. Typically this indicates the file name that is recorded as a DT_NEEDED entry, or the argument supplied to dlmopen(3DL).
LA_SER_LIBPATH – The path name has been created from a LD_LIBRARY_PATH component.
LA_SER_RUNPATH – The path name has been created from a runpath component.
LA_SER_DEFAULT – The path name has been created from a default search path component.
LA_SER_CONFIG – The path component originated from a configuration file (see the crle(1) man page).
LA_SER_SECURE – The path component is specific to secure objects.
The return value indicates the search path name that the runtime linker should continue to process. A value of 0 indicates that this path should be ignored. An audit library that simply monitors search paths should return name.
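The return value convention can be illustrated with a small sketch. It assumes a policy, invented here purely for illustration, of refusing candidates that were built from an LD_LIBRARY_PATH component and that live under /tmp/. SKETCH_SER_LIBPATH is an assumed stand-in for LA_SER_LIBPATH, and unsigned int replaces uint_t so that the fragment compiles without link.h.

```c
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for the LA_SER_LIBPATH value defined in link.h. */
#define	SKETCH_SER_LIBPATH	0x2

/* Hypothetical directory whose LD_LIBRARY_PATH candidates are skipped. */
static const char skip_dir[] = "/tmp/";

char *
la_objsearch(const char *name, uintptr_t *cookie, unsigned int flags)
{
	(void) cookie;	/* the searching object is not needed here */

	if ((flags & SKETCH_SER_LIBPATH) != 0 &&
	    strncmp(name, skip_dir, sizeof (skip_dir) - 1) == 0)
		return (NULL);		/* 0: runtime linker ignores this path */

	return ((char *)name);		/* otherwise simply monitor the search */
}
```

An audit library that substitutes a search path would instead return a pointer to its own string, which must remain valid after the function returns.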
This function is called each time a new object is loaded by the runtime linker.
uint_t la_objopen(Link_map * lmp, Lmid_t lmid, uintptr_t * cookie);
lmp provides the link-map structure that describes the new object. lmid identifies the link-map list to which the object has been added. cookie provides a pointer to an identifier. This identifier is initialized to the object's lmp. This identifier can be modified by the audit library to better identify the object to other rtld-audit interface routines.
The la_objopen() function returns a value that indicates the symbol bindings of interest for this object. These values can result in later calls to la_symbind(). The return value is a mask of the following values defined in /usr/include/link.h:
LA_FLG_BINDTO – Audit symbol bindings to this object.
LA_FLG_BINDFROM – Audit symbol bindings from this object.
See the la_symbind() function for more details on the use of these two flags.
A return value of zero indicates that binding information is of no interest for this object.
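One way an la_objopen() routine might choose its return mask is sketched below. The policy, auditing calls made by other application objects into a single target library, is an assumption chosen for illustration. binding_interest is a hypothetical helper, and the SKETCH_* values are assumed stand-ins for LA_FLG_BINDTO, LA_FLG_BINDFROM and LM_ID_BASE.

```c
#include <string.h>

/* Assumed stand-ins for the LA_FLG_* and LM_ID_BASE definitions in link.h. */
#define	SKETCH_FLG_BINDTO	0x1
#define	SKETCH_FLG_BINDFROM	0x2
#define	SKETCH_LM_ID_BASE	0

/*
 * Hypothetical policy: audit bindings from any application object to the
 * library whose path name ends in target. lmid and l_name mirror the
 * lmid argument and the lmp->l_name member that la_objopen() receives.
 */
unsigned int
binding_interest(long lmid, const char *l_name, const char *target)
{
	size_t nlen = strlen(l_name);
	size_t tlen = strlen(target);

	if (lmid != SKETCH_LM_ID_BASE)
		return (0);			/* not an application object */
	if (nlen >= tlen && strcmp(l_name + nlen - tlen, target) == 0)
		return (SKETCH_FLG_BINDTO);	/* bindings to the target */
	return (SKETCH_FLG_BINDFROM);		/* bindings from the others */
}
```

With this mask, la_symbind() would be called only for references that originate in an object tagged for bind-from and resolve to the object tagged for bind-to.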
This function is called once after all objects have been loaded for the application, but before transfer of control to the application occurs.
void la_preinit(uintptr_t * cookie);
cookie identifies the primary object that started the process, normally the dynamic executable.
This function is called when a binding occurs between two objects that have been tagged for binding notification from la_objopen().
uintptr_t la_symbind32(Elf32_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, uint_t * flags);
uintptr_t la_symbind64(Elf64_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, uint_t * flags, const char * sym_name);
sym is a constructed symbol structure (see /usr/include/sys/elf.h), whose sym->st_value indicates the address of the symbol definition being bound. la_symbind32() has the sym->st_name adjusted to point to the actual symbol name, while la_symbind64() leaves sym->st_name to be the index into the bound object's string table.
ndx indicates the symbol index within the bound object's dynamic symbol table. refcook describes the object making reference to this symbol. This identifier is the same as the one that is passed to the la_objopen() that returned LA_FLG_BINDFROM. defcook describes the object defining this symbol. This identifier is the same as passed to the la_objopen() that returned LA_FLG_BINDTO.
flags points to a data item that can convey information regarding the binding and can be used to modify the continued auditing of procedure linkage table symbol entries. This value is a mask of the following flags defined in /usr/include/link.h:
LA_SYMB_NOPLTENTER – The la_pltenter() function is not called for this symbol.
LA_SYMB_NOPLTEXIT – The la_pltexit() function is not called for this symbol.
LA_SYMB_DLSYM – The symbol binding occurred as a result of calling dlsym(3DL).
LA_SYMB_ALTVALUE (LAV_VERSION2) – An alternate value was returned for the symbol value by a previous call to la_symbind().
By default, if the la_pltenter() or la_pltexit() functions exist within the audit library, they are called after la_symbind() for procedure linkage table symbols each time the symbol is referenced. See also Audit Interface Limitations.
The return value indicates the address to which control should be passed following this call. An audit library that simply monitors symbol binding should return the value of sym->st_value so that control is passed to the bound symbol definition. An audit library can intentionally redirect a symbol binding by returning a different value.
sym_name, which is applicable for la_symbind64() only, contains the name of the symbol being processed. This name is available in the sym->st_name field for the 32–bit interface.
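The two uses of the return value, monitoring and redirection, can be sketched as follows. The fragment uses a stand-in structure with only the st_value member. The symbol name foo, the routine traced_stub, and the helper sketch_symbind are all hypothetical and only show the shape of the decision an la_symbind() routine would make.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the one Elf32_Sym or Elf64_Sym member used here. */
struct sketch_sym {
	uintptr_t st_value;
};

/* Hypothetical replacement routine for the redirected symbol. */
static void
traced_stub(void)
{
}

/* Returns the address to which control passes once the binding is made. */
uintptr_t
sketch_symbind(const struct sketch_sym *sym, const char *sym_name)
{
	if (strcmp(sym_name, "foo") == 0)
		return ((uintptr_t)traced_stub);	/* redirect the binding */

	return (sym->st_value);				/* monitor only */
}
```

Because other audit libraries might be called after this one, a redirection of this kind is best reserved for auditors that expect to run alone.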
These functions are called on a SPARC and x86 system respectively, when a procedure linkage symbol entry, between two objects that have been tagged for binding notification, is called.
uintptr_t la_sparcv8_pltenter(Elf32_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, La_sparcv8_regs * regs, uint_t * flags);
uintptr_t la_sparcv9_pltenter(Elf64_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, La_sparcv9_regs * regs, uint_t * flags, const char * sym_name);
uintptr_t la_i86_pltenter(Elf32_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, La_i86_regs * regs, uint_t * flags);
sym, ndx, refcook, defcook and sym_name provide the same information as passed to la_symbind().
regs points to the out registers on a SPARC system, and the stack and frame registers on an x86 system, as defined in /usr/include/link.h.
flags points to a data item that can convey information regarding the binding and can be used to modify the continuing auditing of this procedure linkage table entry. This data item is the same as pointed to by the flags from la_symbind(). This value is a mask of the following flags defined in /usr/include/link.h:
LA_SYMB_NOPLTENTER – la_pltenter() is not called again for this symbol.
LA_SYMB_NOPLTEXIT – la_pltexit() is not called for this symbol.
The return value indicates the address to which control should be passed following this call. An audit library that simply monitors symbol binding should return the value of sym->st_value so that control is passed to the bound symbol definition. An audit library can intentionally redirect a symbol binding by returning a different value.
This function is called when a procedure linkage symbol entry between two objects that have been tagged for binding notification returns, but before control reaches the caller.
uintptr_t la_pltexit(Elf32_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, uintptr_t retval);
uintptr_t la_pltexit64(Elf64_Sym * sym, uint_t ndx, uintptr_t * refcook, uintptr_t * defcook, uintptr_t retval, const char * sym_name);
sym, ndx, refcook, defcook and sym_name provide the same information as passed to la_symbind(). retval is the return code from the bound function. An audit library that simply monitors symbol binding should return retval. An audit library can intentionally return a different value.
The la_pltexit() interface is experimental. See Audit Interface Limitations.
This function is called after any termination code for an object has been executed and prior to the object being unloaded.
uint_t la_objclose(uintptr_t * cookie);
cookie was obtained from a previous la_objopen() and identifies the object. Any return value is presently ignored.
The following simple example creates an audit library that prints the name of each shared object dependency loaded by the dynamic executable date(1).
$ cat audit.c
#include <link.h>
#include <stdio.h>

uint_t
la_version(uint_t version)
{
        return (LAV_CURRENT);
}

uint_t
la_objopen(Link_map * lmp, Lmid_t lmid, uintptr_t * cookie)
{
        if (lmid == LM_ID_BASE)
                (void) printf("file: %s loaded\n", lmp->l_name);
        return (0);
}
$ cc -o audit.so.1 -G -K pic -z defs audit.c -lmapmalloc -lc
$ LD_AUDIT=./audit.so.1 date
file: date loaded
file: /usr/lib/libc.so.1 loaded
file: /usr/lib/libdl.so.1 loaded
file: /usr/lib/locale/en_US/en_US.so.2 loaded
Thur Aug 10 17:03:55 PST 2000
A number of demonstration applications that use the rtld-audit interface are provided in the SUNWosdem package under /usr/demo/link_audit:
sotruss – This demo provides tracing of procedure calls between the dynamic objects of a named application.
whocalls – This demo provides a stack trace for a specified function whenever called by a named application.
perfcnt – This demo traces the amount of time spent in each function for a named application.
symbindrep – This demo reports all symbol bindings performed to load a named application.
sotruss(1) and whocalls(1) are also included in the SUNWtoo package. perfcnt and symbindrep are example programs only and are not intended for use in a production environment.
There are some limitations regarding the use of the la_pltexit() family. These limitations stem from the need to insert an extra stack frame between the caller and callee to provide a means of acquiring the la_pltexit() return value. This requirement is not a problem when calling just the la_pltenter() routines, as any intervening stack can be cleaned up prior to transferring control to the destination function.
Because of these limitations, la_pltexit() should be considered an experimental interface. When in doubt, avoid the use of the la_pltexit() routines.
A small number of functions exist that directly inspect the stack or make assumptions regarding its state. Some examples of these functions are the setjmp(3C) family, vfork(2), and any function that returns a structure, not a pointer to a structure. These functions are compromised by the extra stack created to support la_pltexit().
The runtime linker cannot detect functions of this type, and thus the audit library creator is responsible for disabling la_pltexit() for such routines.
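One way an audit library might meet this responsibility is to keep a list of known stack-sensitive routines and set the no-pltexit flag for them from la_symbind(). The following is a sketch under that assumption. The list is illustrative and incomplete, SKETCH_SYMB_NOPLTEXIT is an assumed stand-in for LA_SYMB_NOPLTEXIT, and disable_pltexit_if_unsafe is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* Assumed stand-in for the LA_SYMB_NOPLTEXIT value defined in link.h. */
#define	SKETCH_SYMB_NOPLTEXIT	0x2

/* Illustrative, incomplete list of stack-sensitive routines. */
static const char *const no_pltexit[] = {
	"setjmp", "sigsetjmp", "longjmp", "siglongjmp", "vfork"
};

/*
 * Intended to be called from la_symbind() with its flags pointer. Sets
 * the no-pltexit bit when sym_name appears in the list above. Returns 1
 * if la_pltexit() was disabled for the symbol, otherwise 0.
 */
int
disable_pltexit_if_unsafe(const char *sym_name, unsigned int *flags)
{
	size_t cnt = sizeof (no_pltexit) / sizeof (no_pltexit[0]);

	for (size_t i = 0; i < cnt; i++) {
		if (strcmp(sym_name, no_pltexit[i]) == 0) {
			*flags |= SKETCH_SYMB_NOPLTEXIT;
			return (1);
		}
	}
	return (0);
}
```

A name list cannot identify functions that return a structure by value, so an audit library that must handle arbitrary applications would need some other source of that knowledge, or would avoid la_pltexit() altogether.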
The runtime linker performs many operations including the mapping of objects into memory and the binding of symbols. Debugging programs often need to access information that describes these runtime linker operations as part of analyzing an application. These debugging programs run as a separate process to the application they are analyzing.
This section describes the rtld-debugger interface for monitoring and modifying a dynamically linked application from another process. The architecture of this interface follows the model used in libthread_db(3THR).
When using the rtld-debugger interface, at least two processes are involved:
One or more target processes. The target processes must be dynamically linked and use the runtime linker /usr/lib/ld.so.1 for 32–bit processes, or /usr/lib/64/ld.so.1 for 64–bit processes.
A controlling process links with the rtld-debugger interface library and uses it to inspect the dynamic aspects of the target processes. A 64–bit controlling process can debug both 64–bit and 32–bit targets. However, a 32–bit controlling process is limited to 32–bit targets.
The most anticipated use of the rtld-debugger interface is when the controlling process is a debugger and its target is a dynamic executable.
The rtld-debugger interface enables the following activities with a target process:
Initial rendezvous with the runtime linker.
Notification of the loading and unloading of dynamic objects.
Retrieval of information regarding any loaded objects.
Stepping over procedure linkage table entries.
Enabling object padding.
To be able to inspect and manipulate a target process, the rtld-debugger interface employs an exported interface, an imported interface, and agents for communicating between these interfaces.
The controlling process is linked with the rtld-debugger interface provided by librtld_db.so.1, and makes requests of the interface exported from this library. This interface is defined in /usr/include/rtld_db.h. In turn, librtld_db.so.1 makes requests of the interface imported from the controlling process. This interaction allows the rtld-debugger interface to:
Look up symbols in a target process.
Read and write memory in the target process.
The imported interface consists of a number of proc_service routines that most debuggers already employ to analyze processes. These routines are described in Debugger Import Interface.
The rtld-debugger interface assumes that the process being analyzed is stopped when requests are made of the rtld-debugger interface. If this halt does not occur, data structures within the runtime linker of the target process might not be in a consistent state for examination.
The flow of information between librtld_db.so.1, the controlling process (debugger) and the target process (dynamic executable) is diagrammed in the following figure.
The rtld-debugger interface is dependent upon the proc_service interface, /usr/include/proc_service.h, which is considered experimental. The rtld-debugger interface might have to track changes in the proc_service interface as it evolves.
A sample implementation of a controlling process that uses the rtld-debugger interface is provided in the SUNWosdem package under /usr/demo/librtld_db. This debugger, rdb, provides an example of using the proc_service imported interface, and shows the required calling sequence for all librtld_db.so.1 exported interfaces. The following sections describe the rtld-debugger interfaces. More detailed information can be obtained by examining the sample debugger.
An agent provides an opaque handle that can describe internal interface structures. The agent also provides a mechanism of communication between the exported and imported interfaces. Because the rtld-debugger interface is intended to be used by a debugger that can manipulate several processes at the same time, these agents are used to identify the process.
struct ps_prochandle – An opaque structure that is created by the controlling process to identify the target process. This structure is passed between the exported and imported interfaces.
struct rd_agent – An opaque structure that is created by the rtld-debugger interface to identify the target process. This structure is passed between the exported and imported interfaces.
This section describes the various interfaces exported by the /usr/lib/librtld_db.so.1 library. It is broken down into functional groups.
This function establishes the rtld-debugger version requirements. The base version is defined as RD_VERSION1. The current version is always defined by RD_VERSION.
rd_err_e rd_init(int version);
Version RD_VERSION2, added in Solaris 8 10/00, extends the rd_loadobj_t structure. See the rl_flags, rl_bend and rl_dynamic fields in Scanning Loadable Objects.
Version RD_VERSION3, added in Solaris 8 01/01, extends the rd_plt_info_t structure. See the pi_baddr and pi_flags fields in Procedure Linkage Table Skipping.
If the version requirement of the controlling process is greater than the rtld-debugger interface available, then RD_NOCAPAB is returned.
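A controlling process that can work with more than one interface version might handle RD_NOCAPAB by retrying with a lower version. The following is a sketch only: the enum mirrors rd_err_e as quoted later in this chapter, the numeric values of the RD_VERSION macros are assumptions, and negotiate_rd_version() and fake_rd_init_v2() are hypothetical helpers. Whether rd_init() may safely be called more than once is not stated here, so treat the retry loop as an illustration of the return codes rather than a recommended sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; a real controlling process includes <rtld_db.h>. */
typedef enum {
    RD_ERR, RD_OK, RD_NOCAPAB, RD_DBERR, RD_NOBASE, RD_NODYNAM, RD_NOMAPS
} rd_err_e;

#define RD_VERSION1 1   /* numeric values are assumptions */
#define RD_VERSION2 2
#define RD_VERSION3 3

typedef rd_err_e (*rd_init_f)(int version);

/*
 * Try the wanted version first, then each lower version while the
 * library answers RD_NOCAPAB. Returns the accepted version, or 0.
 */
int
negotiate_rd_version(rd_init_f init, int wanted)
{
    int v;

    assert(init != NULL);
    for (v = wanted; v >= RD_VERSION1; v--) {
        rd_err_e err = init(v);

        if (err == RD_OK)
            return (v);
        if (err != RD_NOCAPAB)
            break;      /* some other failure: stop retrying */
    }
    return (0);
}

/* Hypothetical library that implements versions up to RD_VERSION2. */
rd_err_e
fake_rd_init_v2(int version)
{
    if (version < RD_VERSION1)
        return (RD_ERR);
    return (version > RD_VERSION2 ? RD_NOCAPAB : RD_OK);
}
```

With the accepted version in hand, the controlling process knows which of the RD_VERSION2 and RD_VERSION3 fields it can rely on.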
This function creates a new exported interface agent.
rd_agent_t * rd_new(struct ps_prochandle * php);
php is a cookie created by the controlling process to identify the target process. This cookie is used by the imported interface offered by the controlling process to maintain context, and is opaque to the rtld-debugger interface.
This function resets the information within the agent, based on the same ps_prochandle structure that was given to rd_new().
rd_err_e rd_reset(struct rd_agent * rdap);
This function deletes an agent and frees any state associated with it.
void rd_delete(struct rd_agent * rdap);
The following error states can be returned by the rtld-debugger interface (defined in rtld_db.h):
typedef enum { RD_ERR, RD_OK, RD_NOCAPAB, RD_DBERR, RD_NOBASE, RD_NODYNAM, RD_NOMAPS } rd_err_e;
The following interfaces can be used to gather the error information.
This function returns a string that describes the error code rderr.
char * rd_errstr(rd_err_e rderr);
This function turns logging on (1) or off (0).
void rd_log(const int onoff);
When logging is turned on, the imported interface function ps_plog(), which is provided by the controlling process, is called with more detailed diagnostic information.
Information for each object maintained on the runtime linker's link-map lists is obtained by using the following structure, defined in rtld_db.h:
typedef struct rd_loadobj {
        psaddr_t        rl_nameaddr;
        unsigned        rl_flags;
        psaddr_t        rl_base;
        psaddr_t        rl_data_base;
        unsigned        rl_lmident;
        psaddr_t        rl_refnameaddr;
        psaddr_t        rl_plt_base;
        unsigned        rl_plt_size;
        psaddr_t        rl_bend;
        psaddr_t        rl_padstart;
        psaddr_t        rl_padend;
        psaddr_t        rl_dynamic;
} rd_loadobj_t;
Notice that all addresses given in this structure, including string pointers, are addresses in the target process and not in the address space of the controlling process itself.
rl_nameaddr – A pointer to a string that contains the name of the dynamic object.
rl_flags – With revision RD_VERSION2, dynamically loaded relocatable objects are identified with RD_FLG_MEM_OBJECT.
rl_base – The base address of the dynamic object.
rl_data_base – The base address of the data segment of the dynamic object.
rl_lmident – The link-map identifier (see Establishing a Namespace).
rl_refnameaddr – If the dynamic object is a filter, then this points to the name of the filtees.
rl_plt_base, rl_plt_size – These elements are present for backward compatibility and are currently unused.
rl_bend – The end address of the object (text + data + bss). With revision RD_VERSION2, a dynamically loaded relocatable object will cause this element to point to the end of the created object, which will include its section headers.
rl_padstart – The base address of the padding before the dynamic object (refer to Dynamic Object Padding).
rl_padend – The base address of the padding after the dynamic object (refer to Dynamic Object Padding).
rl_dynamic – This field, added with RD_VERSION2, provides the base address of the object's dynamic section, which allows reference to such entries as DT_CHECKSUM (see Table 7–43).
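Because rl_nameaddr and rl_refnameaddr are addresses in the target process, the controlling process cannot dereference them directly and has to copy the string out of the target. The sketch below shows one way this might be done. The psaddr_t typedef is a local stand-in, and target_read_f, read_target_string(), struct fake_target and fake_target_read() are hypothetical names used for illustration; in a real debugger the reader would typically be backed by the imported ps_pread() routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long psaddr_t;     /* stand-in for the rtld_db.h type */

/* Hypothetical reader: copy size bytes from target address addr.
 * Returns 0 on success, nonzero on failure. */
typedef int (*target_read_f)(void *ctx, psaddr_t addr, char *buf, int size);

/*
 * Copy a NUL-terminated string from the target, one byte at a time so
 * that no read extends past the terminator. Returns the string length,
 * or -1 if a read fails. The result is truncated to bufsz - 1 bytes.
 */
int
read_target_string(target_read_f rd, void *ctx, psaddr_t addr,
    char *buf, int bufsz)
{
    int i;

    assert(rd != NULL && buf != NULL && bufsz > 0);
    for (i = 0; i < bufsz - 1; i++) {
        if (rd(ctx, addr + (psaddr_t)i, &buf[i], 1) != 0) {
            buf[i] = '\0';
            return (-1);
        }
        if (buf[i] == '\0')
            return (i);
    }
    buf[i] = '\0';
    return (i);
}

/* Hypothetical in-memory "target" used only to exercise the helper. */
struct fake_target {
    psaddr_t    base;
    const char  *bytes;
    size_t      len;
};

int
fake_target_read(void *ctx, psaddr_t addr, char *buf, int size)
{
    const struct fake_target *t = ctx;

    if (size < 0 || addr < t->base ||
        (addr - t->base) + (psaddr_t)size > t->len)
        return (-1);
    memcpy(buf, t->bytes + (addr - t->base), (size_t)size);
    return (0);
}
```

Reading a byte at a time is slow but simple; a production debugger would more likely read in larger chunks and handle reads that cross into unmapped pages.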
The rd_loadobj_iter() routine uses this object data structure to access information from the runtime linker's link-map lists:
This function iterates over all dynamic objects currently loaded in the target process.
typedef int rl_iter_f(const rd_loadobj_t *, void *); rd_err_e rd_loadobj_iter(rd_agent_t * rap, rl_iter_f * cb, void * clnt_data);
On each iteration the imported function specified by cb is called. clnt_data can be used to pass data to the cb call. Information about each object is returned via a pointer to a volatile (stack allocated) rd_loadobj_t structure.
Return codes from the cb routine are examined by rd_loadobj_iter() and have the following meaning:
1 – continue processing link-maps.
0 – stop processing link-maps and return control to the controlling process.
rd_loadobj_iter() returns RD_OK on success. A return of RD_NOMAPS indicates the runtime linker has not yet loaded the initial link-maps.
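Since the rd_loadobj_t structure handed to the callback is stack allocated, a callback that wants to keep the information has to copy it. The following sketch illustrates a callback that copies each object into a fixed-size list and uses the return value to stop the iteration once the list is full. The type definitions are local stand-ins that mirror the declarations above; struct obj_list, collect_obj() and fake_loadobj_iter() are hypothetical, and fake_loadobj_iter() only imitates the calling convention of rd_loadobj_iter() so that the callback can be exercised without a target process.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long psaddr_t;     /* stand-in for the rtld_db.h type */

typedef struct rd_loadobj {
    psaddr_t    rl_nameaddr;
    unsigned    rl_flags;
    psaddr_t    rl_base;
    psaddr_t    rl_data_base;
    unsigned    rl_lmident;
    psaddr_t    rl_refnameaddr;
    psaddr_t    rl_plt_base;
    unsigned    rl_plt_size;
    psaddr_t    rl_bend;
    psaddr_t    rl_padstart;
    psaddr_t    rl_padend;
    psaddr_t    rl_dynamic;
} rd_loadobj_t;

typedef int rl_iter_f(const rd_loadobj_t *, void *);

#define MAX_OBJS 4

struct obj_list {
    rd_loadobj_t    objs[MAX_OBJS];
    int             count;
};

/* Callback: copy the object; return 1 to continue, 0 to stop. */
int
collect_obj(const rd_loadobj_t *lop, void *clnt_data)
{
    struct obj_list *list = clnt_data;

    assert(lop != NULL && list != NULL);
    if (list->count == MAX_OBJS)
        return (0);
    list->objs[list->count++] = *lop;   /* copy: *lop does not persist */
    return (1);
}

/* Hypothetical driver imitating rd_loadobj_iter(); returns the number
 * of objects for which the callback was invoked. */
int
fake_loadobj_iter(const rd_loadobj_t *maps, int nmaps, rl_iter_f *cb,
    void *clnt_data)
{
    int i, visited = 0;

    for (i = 0; i < nmaps; i++) {
        rd_loadobj_t tmp = maps[i];     /* stack copy, as described */

        visited++;
        if (cb(&tmp, clnt_data) == 0)
            break;
    }
    return (visited);
}
```

With the real interface, the same callback would be passed as the cb argument of rd_loadobj_iter(), with a pointer to the list as clnt_data.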
A controlling process can track certain events that occur within the scope of the runtime linker. These events are:
RD_PREINIT – The runtime linker has loaded and relocated all the dynamic objects and is about to start calling the .init sections of each object loaded.
RD_POSTINIT – The runtime linker has finished calling all of the .init sections and is about to transfer control to the primary executable.
RD_DLACTIVITY – The runtime linker has been invoked to either load or unload a dynamic object.
These events can be monitored using the following interface, defined in sys/link.h and rtld_db.h:
typedef enum {
        RD_NONE = 0,
        RD_PREINIT,
        RD_POSTINIT,
        RD_DLACTIVITY
} rd_event_e;

/*
 * Ways that the event notification can take place:
 */
typedef enum {
        RD_NOTIFY_BPT,
        RD_NOTIFY_AUTOBPT,
        RD_NOTIFY_SYSCALL
} rd_notify_e;

/*
 * Information on ways that the event notification can take place:
 */
typedef struct rd_notify {
        rd_notify_e     type;
        union {
                psaddr_t        bptaddr;
                long            syscallno;
        } u;
} rd_notify_t;
The following functions track events:
This function enables (1) or disables (0) event monitoring.
rd_err_e rd_event_enable(struct rd_agent * rdap, int onoff);
Presently, for performance reasons, the runtime linker ignores event disabling. The controlling process should not assume that a given break-point cannot be reached because of the last call to this routine.
This function specifies how the controlling program is notified of a given event.
rd_err_e rd_event_addr(rd_agent_t * rdap, rd_event_e event, rd_notify_t * notify);
Depending on the event type, the notification of the controlling process takes place by calling a benign, cheap system call that is identified by notify->u.syscallno, or by executing a break-point at the address specified by notify->u.bptaddr. The controlling process is responsible for tracing the system call or placing the actual break-point.
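The choice between tracing a system call and placing a break-point can be sketched as a dispatch on the type member of the rd_notify_t structure filled in by rd_event_addr(). The types below are local stand-ins mirroring the declarations in this section; arm_kind_e, arm_plan_t and plan_event_arming() are hypothetical names. RD_NOTIFY_AUTOBPT is not described in this chapter, so the sketch leaves it unhandled rather than guess at its meaning.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long psaddr_t;     /* stand-in for the rtld_db.h type */

typedef enum {
    RD_NOTIFY_BPT, RD_NOTIFY_AUTOBPT, RD_NOTIFY_SYSCALL
} rd_notify_e;

typedef struct rd_notify {
    rd_notify_e type;
    union {
        psaddr_t    bptaddr;
        long        syscallno;
    } u;
} rd_notify_t;

/* Hypothetical description of what the debugger must arm. */
typedef enum { ARM_NOTHING, ARM_BREAKPOINT, ARM_SYSCALL_TRACE } arm_kind_e;

typedef struct {
    arm_kind_e  kind;
    psaddr_t    bptaddr;    /* valid when kind == ARM_BREAKPOINT */
    long        syscallno;  /* valid when kind == ARM_SYSCALL_TRACE */
} arm_plan_t;

/* Translate a notification description into an arming plan. */
arm_plan_t
plan_event_arming(const rd_notify_t *n)
{
    arm_plan_t plan = { ARM_NOTHING, 0, 0 };

    assert(n != NULL);
    switch (n->type) {
    case RD_NOTIFY_BPT:
        plan.kind = ARM_BREAKPOINT;
        plan.bptaddr = n->u.bptaddr;
        break;
    case RD_NOTIFY_SYSCALL:
        plan.kind = ARM_SYSCALL_TRACE;
        plan.syscallno = n->u.syscallno;
        break;
    default:
        /* RD_NOTIFY_AUTOBPT and unknown types: nothing armed here. */
        break;
    }
    return (plan);
}
```

Only one member of the union is meaningful for a given type, which is why the plan copies out just the member that matches.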
When an event has occurred, additional information can be obtained by this interface, defined in rtld_db.h:
typedef enum {
        RD_NOSTATE = 0,
        RD_CONSISTENT,
        RD_ADD,
        RD_DELETE
} rd_state_e;

typedef struct rd_event_msg {
        rd_event_e      type;
        union {
                rd_state_e      state;
        } u;
} rd_event_msg_t;
The rd_state_e values are:
RD_NOSTATE – There is no additional state information available.
RD_CONSISTENT – The link-maps are in a stable state and can be examined.
RD_ADD – A dynamic object is in the process of being loaded and the link-maps are not in a stable state. They should not be examined until the RD_CONSISTENT state is reached.
RD_DELETE – A dynamic object is in the process of being deleted and the link-maps are not in a stable state. They should not be examined until the RD_CONSISTENT state is reached.
The rd_event_getmsg() function is used to obtain this event state information.
This function provides additional information concerning an event.
rd_err_e rd_event_getmsg(struct rd_agent * rdap, rd_event_msg_t * msg);
The following table shows the possible states for each of the different event types.
| RD_PREINIT | RD_POSTINIT | RD_DLACTIVITY |
|---|---|---|
| RD_NOSTATE | RD_NOSTATE | RD_CONSISTENT |
| | | RD_ADD |
| | | RD_DELETE |
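The event and state pairings in this table can be encoded as a small check that a controlling process might apply to the message returned by rd_event_getmsg(). This is a sketch: the types are local stand-ins mirroring the declarations above, and event_state_is_documented() and dlactivity_linkmaps_stable() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    RD_NONE = 0, RD_PREINIT, RD_POSTINIT, RD_DLACTIVITY
} rd_event_e;

typedef enum {
    RD_NOSTATE = 0, RD_CONSISTENT, RD_ADD, RD_DELETE
} rd_state_e;

typedef struct rd_event_msg {
    rd_event_e  type;
    union {
        rd_state_e  state;
    } u;
} rd_event_msg_t;

/* Return 1 if the event/state pair appears in the table, else 0. */
int
event_state_is_documented(const rd_event_msg_t *msg)
{
    assert(msg != NULL);
    switch (msg->type) {
    case RD_PREINIT:
    case RD_POSTINIT:
        return (msg->u.state == RD_NOSTATE);
    case RD_DLACTIVITY:
        return (msg->u.state == RD_CONSISTENT ||
            msg->u.state == RD_ADD ||
            msg->u.state == RD_DELETE);
    default:
        return (0);
    }
}

/* Return 1 only when a load or unload has completed and the
 * link-maps are described as stable. */
int
dlactivity_linkmaps_stable(const rd_event_msg_t *msg)
{
    assert(msg != NULL);
    return (msg->type == RD_DLACTIVITY && msg->u.state == RD_CONSISTENT);
}
```

A debugger would typically defer any rescan of the link-maps after an RD_ADD or RD_DELETE message until a later RD_DLACTIVITY event reports RD_CONSISTENT.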
The rtld-debugger interface enables a controlling process to skip over procedure linkage table entries. When a controlling process, such as a debugger, is asked to step into a function for the first time, the procedure linkage table processing causes control to be passed to the runtime linker to search for the function definition.
The following interface enables a controlling process to step over the runtime linker's procedure linkage table processing. The controlling process can determine when a procedure linkage table entry is encountered based on external information provided in the ELF file.
Once a target process has stepped into a procedure linkage table entry, the controlling process calls the rd_plt_resolution() interface:
This function returns the resolution state of the current procedure linkage table entry and information on how to skip it.
rd_err_e rd_plt_resolution(rd_agent_t * rdap, paddr_t pc, lwpid_t lwpid, paddr_t plt_base, rd_plt_info_t * rpi);
pc represents the first instruction of the procedure linkage table entry. lwpid provides the lwp identifier and plt_base provides the base address of the procedure linkage table. These three variables provide information sufficient for various architectures to process the procedure linkage table.
rpi provides detailed information regarding the procedure linkage table entry as defined in the following data structure, defined in rtld_db.h:
typedef enum {
        RD_RESOLVE_NONE,
        RD_RESOLVE_STEP,
        RD_RESOLVE_TARGET,
        RD_RESOLVE_TARGET_STEP
} rd_skip_e;

typedef struct rd_plt_info {
        rd_skip_e       pi_skip_method;
        long            pi_nstep;
        psaddr_t        pi_target;
        psaddr_t        pi_baddr;
        unsigned int    pi_flags;
} rd_plt_info_t;

#define RD_FLG_PI_PLTBOUND      0x0001
The elements of the rd_plt_info_t structure are:
pi_skip_method – Identifies how the procedure linkage table entry can be traversed. This method is set to one of the rd_skip_e values.
pi_nstep – Identifies how many instructions to step over when RD_RESOLVE_STEP or RD_RESOLVE_TARGET_STEP are returned.
pi_target – Specifies the address at which to set a breakpoint when RD_RESOLVE_TARGET_STEP or RD_RESOLVE_TARGET are returned.
pi_baddr – The procedure linkage table destination address, added with RD_VERSION3. When the RD_FLG_PI_PLTBOUND flag of the pi_flags field is set, this element identifies the resolved (bound) destination address.
pi_flags – A flags field, added with RD_VERSION3. The flag RD_FLG_PI_PLTBOUND identifies the procedure linkage entry as having been resolved (bound) to its destination address, which is available in the pi_baddr field.
The following scenarios are possible from the rd_plt_info_t return values:
The first call through this procedure linkage table must be resolved by the runtime linker. In this case, the rd_plt_info_t contains:
{RD_RESOLVE_TARGET_STEP, M, <BREAK>, 0, 0}
The controlling process sets a breakpoint at BREAK and continues the target process. When the breakpoint is reached, the procedure linkage table entry processing has finished. The controlling process can then step M instructions to the destination function. Notice that the bound address (pi_baddr) has not been set since this is the first call through a procedure linkage table entry.
On the Nth time through this procedure linkage table, rd_plt_info_t contains:
{RD_RESOLVE_STEP, M, 0, <BoundAddr>, RD_FLG_PI_PLTBOUND}
The procedure linkage table entry has already been resolved and the controlling process can step M instructions to the destination function. The address that the procedure linkage table entry is bound to is <BoundAddr> and the RD_FLG_PI_PLTBOUND bit has been set in the flags field.
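The two scenarios above, together with the remaining rd_skip_e values, can be summarized as a function that turns an rd_plt_info_t into the actions a debugger would take. This is a sketch under stated assumptions: the types are local stand-ins mirroring the declarations above, and plt_plan_t and plan_plt_skip() are hypothetical. The handling of RD_RESOLVE_TARGET (break-point only, no stepping) follows the field descriptions for pi_nstep and pi_target rather than an explicit scenario in the text.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long psaddr_t;     /* stand-in for the rtld_db.h type */

typedef enum {
    RD_RESOLVE_NONE, RD_RESOLVE_STEP,
    RD_RESOLVE_TARGET, RD_RESOLVE_TARGET_STEP
} rd_skip_e;

typedef struct rd_plt_info {
    rd_skip_e       pi_skip_method;
    long            pi_nstep;
    psaddr_t        pi_target;
    psaddr_t        pi_baddr;
    unsigned int    pi_flags;
} rd_plt_info_t;

#define RD_FLG_PI_PLTBOUND 0x0001

/* Hypothetical summary of what the debugger should do next. */
typedef struct {
    int         set_breakpoint; /* 1: break at bp_addr, then continue */
    psaddr_t    bp_addr;
    long        nstep;          /* instructions to single-step after */
    int         bound;          /* 1: bound_addr is the destination */
    psaddr_t    bound_addr;
} plt_plan_t;

plt_plan_t
plan_plt_skip(const rd_plt_info_t *rpi)
{
    plt_plan_t plan = { 0, 0, 0, 0, 0 };

    assert(rpi != NULL);
    switch (rpi->pi_skip_method) {
    case RD_RESOLVE_TARGET_STEP:
        plan.set_breakpoint = 1;
        plan.bp_addr = rpi->pi_target;
        plan.nstep = rpi->pi_nstep;
        break;
    case RD_RESOLVE_TARGET:
        plan.set_breakpoint = 1;
        plan.bp_addr = rpi->pi_target;
        break;
    case RD_RESOLVE_STEP:
        plan.nstep = rpi->pi_nstep;
        break;
    case RD_RESOLVE_NONE:
    default:
        break;
    }
    if (rpi->pi_flags & RD_FLG_PI_PLTBOUND) {
        plan.bound = 1;
        plan.bound_addr = rpi->pi_baddr;
    }
    return (plan);
}
```

Note that pi_baddr and pi_flags are only present from RD_VERSION3 onward, so a controlling process that initialized an earlier version should not consult the bound fields.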
The default behavior of the runtime linker relies on the operating system to load dynamic objects where they can be most efficiently referenced. Some controlling processes benefit from the existence of padding around the objects loaded into memory of the target process. This interface enables a controlling process to request this padding.
This function enables or disables the padding of any subsequently loaded objects within the target process. Padding occurs on both sides of the loaded object.
rd_err_e rd_objpad_enable(struct rd_agent * rdap, size_t padsize);
padsize specifies the size of the padding, in bytes, to be preserved both before and after any objects loaded into memory. This padding is reserved as a memory mapping using mmap(2) with PROT_NONE permissions and the MAP_NORESERVE flag. Effectively, the runtime linker reserves areas of the virtual address space of the target process adjacent to any loaded objects. These areas can later be utilized by the controlling process.
A padsize of 0 disables any object padding for later objects.
Reservations obtained using mmap(2) from /dev/zero with MAP_NORESERVE can be reported using the proc(1) facilities and by referring to the link-map information provided in rd_loadobj_t.
The imported interface that a controlling process must provide to librtld_db.so.1 is defined in /usr/include/proc_service.h. A sample implementation of these proc_service functions can be found in the rdb demonstration debugger. The rtld-debugger interface uses only a subset of the proc_service interfaces available. Future versions of the rtld-debugger interface might take advantage of additional proc_service interfaces without creating an incompatible change.
The following interfaces are currently being used by the rtld-debugger interface:
This function returns a pointer to a copy of the auxv vector.
ps_err_e ps_pauxv(const struct ps_prochandle * ph, auxv_t ** aux);
Because the auxv vector information is copied to an allocated structure, the pointer remains valid as long as the ps_prochandle is valid.
This function reads data from the target process.
ps_err_e ps_pread(const struct ps_prochandle * ph, paddr_t addr, char * buf, int size);
From address addr in the target process, size bytes are copied to buf.
This function writes data to the target process.
ps_err_e ps_pwrite(const struct ps_prochandle * ph, paddr_t addr, char * buf, int size);
size bytes from buf are copied into the target process at address addr.
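As an illustration of how a controlling process might supply these two routines, the sketch below implements them over a file descriptor whose file offset corresponds to a target address. It assumes the controlling process has opened the target's address-space file (for example an as file under procfs) and stored the descriptor in its own ps_prochandle; that layout, the as_fd member, the ps_attach_as() helper, and the reduced ps_err_e enum are all assumptions made for this example and are not taken from proc_service.h.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pread() and pwrite() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Reduced stand-in for the proc_service.h error codes. */
typedef enum { PS_OK, PS_ERR, PS_BADADDR } ps_err_e;

#if !defined(__sun)
typedef unsigned long paddr_t;  /* stand-in where the system lacks it */
#endif

/* The controlling process defines this structure; this layout is
 * hypothetical. */
struct ps_prochandle {
    int as_fd;      /* descriptor for the target's address space */
};

/* Hypothetical helper: open the address-space file read/write. */
int
ps_attach_as(struct ps_prochandle *ph, const char *as_path)
{
    assert(ph != NULL && as_path != NULL);
    ph->as_fd = open(as_path, O_RDWR);
    return (ph->as_fd >= 0 ? 0 : -1);
}

/* Copy size bytes from target address addr into buf. */
ps_err_e
ps_pread(const struct ps_prochandle *ph, paddr_t addr, char *buf, int size)
{
    assert(ph != NULL && buf != NULL);
    if (size < 0)
        return (PS_ERR);
    if (pread(ph->as_fd, buf, (size_t)size, (off_t)addr) != (ssize_t)size)
        return (PS_BADADDR);
    return (PS_OK);
}

/* Copy size bytes from buf into the target at address addr. */
ps_err_e
ps_pwrite(const struct ps_prochandle *ph, paddr_t addr, char *buf, int size)
{
    assert(ph != NULL && buf != NULL);
    if (size < 0)
        return (PS_ERR);
    if (pwrite(ph->as_fd, buf, (size_t)size, (off_t)addr) != (ssize_t)size)
        return (PS_BADADDR);
    return (PS_OK);
}
```

Short reads and writes are reported as PS_BADADDR here for simplicity; the rdb demonstration debugger remains the reference for how these routines are actually expected to behave.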
This function is called with additional diagnostic information from the rtld-debugger interface.
void ps_plog(const char * fmt, ...);
The controlling process determines where, or if, to log this diagnostic information. The arguments to ps_plog() follow the printf(3C) format.
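A controlling process could implement ps_plog() as a thin wrapper around vfprintf(3C), with the destination under its own control. In this sketch the static plog_stream variable and the ps_plog_set_stream() helper are hypothetical; a NULL stream means the diagnostics are discarded. No newline is appended, since this chapter does not say whether the messages already carry one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical destination for diagnostics; NULL discards them. */
static FILE *plog_stream = NULL;

/* Hypothetical helper: choose where ps_plog() output goes. */
void
ps_plog_set_stream(FILE *fp)
{
    plog_stream = fp;
}

/* Imported interface routine: printf-style diagnostic logging. */
void
ps_plog(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    if (plog_stream == NULL)
        return;
    va_start(ap, fmt);
    (void) vfprintf(plog_stream, fmt, ap);
    va_end(ap);
}
```

A debugger might call ps_plog_set_stream(stderr) when the user enables verbose output, and pair that with rd_log(1) so that the rtld-debugger interface starts producing the diagnostics.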
This function searches for the symbol in the target process.
ps_err_e ps_pglobal_lookup(const struct ps_prochandle * ph, const char * obj, const char * name, ulong_t * sym_addr);
The symbol named name is searched for within the object named obj within the target process ph. If the symbol is found, the symbol address is stored in sym_addr.
This function searches for the symbol in the target process.
ps_err_e ps_pglobal_sym(const struct ps_prochandle * ph, const char * obj, const char * name, ps_sym_t * sym_desc);
The symbol named name is searched for within the object named obj within the target process ph. If the symbol is found, the symbol descriptor is stored in sym_desc.
In the event that the rtld-debugger interface needs to find symbols within the application or runtime linker prior to any link-map creation, the following reserved values for obj are available:
#define PS_OBJ_EXEC ((const char *)0x0) /* application id */
#define PS_OBJ_LDSO ((const char *)0x1) /* runtime linker id */
The controlling process can use the procfs file system for these objects, using the following pseudo code:
ioctl(.., PIOCNAUXV, ...)       - obtain AUX vectors
ldsoaddr = auxv[AT_BASE];
ldsofd = ioctl(..., PIOCOPENM, &ldsoaddr);
/* process elf information found in ldsofd ... */
execfd = ioctl(.., PIOCOPENM, 0);
/* process elf information found in execfd ... */
Once the file descriptors are found, the ELF files can be examined for their symbol information by the controlling program.