Linker and Libraries Guide

Runtime Linker Debugger Interface

As described in Chapter 3, Runtime Linker, the runtime linker performs many operations, including the mapping of objects into memory and the binding of symbols. Debugging programs often need to access information that describes these runtime linker operations as part of analyzing an application. These debugging programs run as a separate process from the application they are analyzing.

This section describes the supported interface for monitoring and modifying a dynamically linked application from another process. This interface is referred to as the rtld-debugger interface. The architecture of this interface follows the model used in libthread_db(3T).

When using the rtld-debugger interface, at least two processes are involved:

One or more target processes. The target processes must be dynamically linked and use /usr/lib/ld.so.1 as their runtime linker, or /usr/lib/sparcv9/ld.so.1 if they are 64-bit SPARCV9 processes.

A controlling process links with the rtld-debugger interface library and uses it to inspect the dynamic aspects of the target process(es). A 64-bit controlling process can debug both 64-bit and 32-bit targets; however, a 32-bit controlling process is limited to 32-bit targets.

In the most common anticipated use of the rtld-debugger interface, the controlling process is a debugger and its target is a dynamic executable.

The rtld-debugger interface enables the following for a target process:

Initial rendezvous with the runtime linker

Notification of the loading and unloading of dynamic objects

Retrieval of information regarding any loaded objects

Stepping over procedure linkage table entries

Enabling object padding

Interaction Between Controlling and Target Process

To be able to inspect and manipulate a target process, the rtld-debugger interface employs an exported interface, an imported interface, and agents for communicating between these interfaces.

The controlling process is linked with the rtld-debugger interface provided by librtld_db.so.1, and makes requests of the interface exported from this library. This interface is defined in /usr/include/rtld_db.h. In turn, librtld_db.so.1 makes requests of the interface imported from the controlling process. This interaction allows the rtld-debugger interface to:

Look up symbols in a target process

Read and write memory in the target process

The imported interface consists of a number of proc_service routines, which most debuggers already employ to analyze processes. These routines are described in "Debugger Import Interface".

The rtld-debugger interface assumes that the process being analyzed is stopped when requests are made of the rtld-debugger interface. If this is not the case, data structures within the runtime linker of the target process might not be in a consistent state for examination.

The flow of information between librtld_db.so.1, the controlling process (debugger) and the target process (dynamic executable) is diagrammed below:

Figure 6-1 rtld-debugger information flow

Note -

The rtld-debugger interface is dependent upon the proc_service interface (/usr/include/proc_service.h) which is considered experimental. The rtld-debugger interface might have to track changes in the proc_service interface as it evolves.

A sample implementation of a controlling process that uses the rtld-debugger interface is provided in the SUNWosdem package under /usr/demo/librtld_db. This debugger, rdb, provides an example of using the proc_service imported interface, and shows the required calling sequence for all librtld_db.so.1 exported interfaces. The following sections describe the rtld-debugger interfaces, and more detailed information can be obtained by examining the sample debugger.

Debugger Interface Agents

An agent provides an opaque handle that can describe internal interface structures, and provides a mechanism of communication between the exported and imported interfaces. As the rtld-debugger interface is intended to be used by a debugger which can manipulate several processes at the same time, these agents are used to identify the process.

struct ps_prochandle;

struct ps_prochandle: An opaque structure created by the controlling process to identify the target process which is passed between the exported and imported interface.

struct rd_agent;

struct rd_agent: An opaque structure created by the rtld-debugger interface identifying the target process which is passed between the exported and imported interface.

Debugger Exported Interface

This section describes the interfaces exported by the /usr/lib/librtld_db.so.1 library, arranged in functional groups.

Agent Manipulation

rd_err_e rd_init(int version);

rd_init()

This function establishes the rtld-debugger version requirements. The current version is defined by RD_VERSION.

If the version requirement of the controlling process is greater than the rtld-debugger interface available, then RD_NOCAPAB is returned.

rd_agent_t * rd_new(struct ps_prochandle * php);

rd_new(): This function creates a new exported interface agent. php is a cookie created by the controlling process to identify the target process. This cookie is used by the imported interface offered by the controlling process to maintain context, and is opaque to the rtld-debugger interface.

rd_err_e rd_reset(struct rd_agent * rdap);

rd_reset(): This function resets the information within the agent, based on the same ps_prochandle structure given to rd_new(). This function is called when a target process is restarted.

void rd_delete(struct rd_agent * rdap);

rd_delete(): This function deletes an agent and frees any state associated with it.

Error Handling

The following error states can be returned by the rtld-debugger interface (defined in rtld_db.h):

typedef enum {
        RD_ERR,
        RD_OK,
        RD_NOCAPAB,
        RD_DBERR,
        RD_NOBASE,
        RD_NODYNAM,
        RD_NOMAPS
} rd_err_e;

The following interfaces can be used to gather the error information:

char * rd_errstr(rd_err_e rderr);

rd_errstr(): This function returns a string describing the error code rderr.

void rd_log(const int onoff);

rd_log(): This function turns logging on (1) or off (0). When logging is turned on, the imported interface function ps_plog(), provided by the controlling process, is called with more detailed diagnostic information.
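Because RD_ERR is listed first in the enumeration above, it takes the value 0 and RD_OK takes the value 1, so the common C idiom of treating zero as success does not apply to rd_err_e. The following is a minimal sketch of a check that compares against RD_OK explicitly. The enumeration is repeated from the listing above so that the fragment compiles on its own, and rd_succeeded() is a hypothetical helper that is not part of librtld_db.so.1.

```c
#include <assert.h>

/* Repeated from the listing above; a real program includes <rtld_db.h>. */
typedef enum {
        RD_ERR,
        RD_OK,
        RD_NOCAPAB,
        RD_DBERR,
        RD_NOBASE,
        RD_NODYNAM,
        RD_NOMAPS
} rd_err_e;

/*
 * Hypothetical helper (not part of the library): returns nonzero only
 * when err is RD_OK.  Testing "if (err)" would be wrong here, because
 * RD_OK is 1 and RD_ERR is 0.
 */
static int
rd_succeeded(rd_err_e err)
{
        return (err == RD_OK);
}
```

A caller would typically wrap each exported routine in such a check and pass any failing code to rd_errstr() for reporting.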

Scanning Loadable Objects

Information for each object maintained on the runtime linker's link-map (see "Establishing a Name-space") is obtained using the following structure (defined in rtld_db.h):

typedef struct rd_loadobj {
        psaddr_t        rl_nameaddr;
        unsigned        rl_flags;
        psaddr_t        rl_base;
        psaddr_t        rl_data_base;
        unsigned        rl_lmident;
        psaddr_t        rl_refnameaddr;
        psaddr_t        rl_plt_base;
        unsigned        rl_plt_size;
        psaddr_t        rl_bend;
        psaddr_t        rl_padstart;
        psaddr_t        rl_padend;
} rd_loadobj_t;

Notice that all addresses given in this structure, including string pointers, are addresses in the target process and not in the address space of the controlling process itself:

rl_nameaddr: Pointer to a string which contains the name of the dynamic object.
rl_flags: Reserved for future use.
rl_base: Base address of dynamic object.
rl_data_base: Base address of data segment of dynamic object.
rl_lmident: The link-map identifier (see "Establishing a Name-space").
rl_refnameaddr: If the dynamic object is a filter (see "Shared Objects as Filters"), then this points to the name of the filtee(s).
rl_plt_base, rl_plt_size: These elements are present for backward compatibility and are currently unused.
rl_bend: End address of object (text + data + bss).
rl_padstart: Base address of padding before dynamic object (refer to "Dynamic Object Padding").
rl_padend: Base address of padding after dynamic object (refer to "Dynamic Object Padding").

The following routine uses this object data structure to access information from the runtime linker's link-map lists:

typedef int rl_iter_f(const rd_loadobj_t *, void *);
 
rd_err_e rd_loadobj_iter(rd_agent_t * rap, rl_iter_f * cb,
        void * clnt_data);

rd_loadobj_iter()

This function iterates over all dynamic objects currently loaded in the target process. On each iteration the imported function specified by cb is called. clnt_data can be used to pass data to the cb call. Information about each object is returned via a pointer to a volatile (stack allocated) rd_loadobj_t structure.

Return codes from the cb routine are examined by rd_loadobj_iter() and have the following meaning:

1 -- continue processing link-maps.

0 -- stop processing link-maps and return control to the controlling process.

rd_loadobj_iter() returns RD_OK on success. A return of RD_NOMAPS indicates the runtime linker has not yet loaded the initial link-maps.
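The callback contract described above can be sketched as follows. The structure and the rl_iter_f type are repeated from the listings above so that the fragment compiles on its own. The psaddr_t typedef is a stand-in (an assumption; the real type comes from the system headers), and obj_summary_t and summarize_obj() are hypothetical names chosen for illustration. Because the rd_loadobj_t structure is stack allocated, the callback copies the values it needs instead of keeping the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the target address type (assumption, see lead-in). */
typedef unsigned long psaddr_t;

/* Repeated from the listing above. */
typedef struct rd_loadobj {
        psaddr_t        rl_nameaddr;
        unsigned        rl_flags;
        psaddr_t        rl_base;
        psaddr_t        rl_data_base;
        unsigned        rl_lmident;
        psaddr_t        rl_refnameaddr;
        psaddr_t        rl_plt_base;
        unsigned        rl_plt_size;
        psaddr_t        rl_bend;
        psaddr_t        rl_padstart;
        psaddr_t        rl_padend;
} rd_loadobj_t;

typedef int rl_iter_f(const rd_loadobj_t *, void *);

/* Hypothetical client data handed to the callback through clnt_data. */
typedef struct obj_summary {
        unsigned        os_count;       /* objects seen so far */
        unsigned        os_limit;       /* stop after this many objects */
        psaddr_t        os_highest_end; /* largest rl_bend seen */
} obj_summary_t;

/*
 * Hypothetical callback.  Values are copied out of *lop because the
 * structure is only valid for the duration of the call.  rl_nameaddr
 * is an address in the target, so it is not dereferenced here.
 */
static int
summarize_obj(const rd_loadobj_t *lop, void *clnt_data)
{
        obj_summary_t *sp = clnt_data;

        sp->os_count++;
        if (lop->rl_bend > sp->os_highest_end)
                sp->os_highest_end = lop->rl_bend;

        /* 1 continues processing link-maps, 0 stops the iteration. */
        return (sp->os_count < sp->os_limit);
}
```

On a system that provides librtld_db.so.1, the callback would be passed as rd_loadobj_iter(rap, summarize_obj, &summary), and reading the object name would require a ps_pread() of the string at rl_nameaddr.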

Event Notification

There are certain events that occur within the scope of the runtime linker that a controlling process can track. These events are:

RD_PREINIT: The runtime linker has loaded and relocated all the dynamic objects and is about to start calling the .init sections of each object loaded (see "Initialization and Termination Routines").
RD_POSTINIT: The runtime linker has finished calling all of the .init sections and is about to transfer control to the primary executable.
RD_DLACTIVITY: The runtime linker has been invoked to either load or unload a dynamic object (see "Loading Additional Objects").

These events can be monitored using the following interface (defined in sys/link.h and rtld_db.h):

typedef enum {
        RD_NONE = 0,
        RD_PREINIT,
        RD_POSTINIT,
        RD_DLACTIVITY
} rd_event_e;
 
/*
 * ways that the event notification can take place:
 */
typedef enum {
        RD_NOTIFY_BPT,
        RD_NOTIFY_AUTOBPT,
        RD_NOTIFY_SYSCALL
} rd_notify_e;
 
/*
 * information on ways that the event notification can take place:
 */
typedef struct rd_notify {
        rd_notify_e     type;
        union {
                psaddr_t        bptaddr;
                long            syscallno;
        } u;
} rd_notify_t;

rd_err_e rd_event_enable(struct rd_agent * rdap, int onoff);

rd_event_enable(): This function enables (1) or disables (0) event monitoring.

Note -

Presently, for performance reasons, the runtime linker ignores event disabling. The controlling process should not assume that a given break-point will not be reached because of the last call to this routine.

rd_err_e rd_event_addr(rd_agent_t * rdap, rd_event_e event,
        rd_notify_t * notify);

rd_event_addr()

This function specifies how the controlling program will be notified of a given event.

Depending on the event type, the controlling process is notified either by a call to a benign, cheap system call, identified by notify->u.syscallno, or by the execution of a break-point at the address specified by notify->u.bptaddr. It is the responsibility of the controlling process to trace the system call or place the actual break-point.
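The choice between the two notification mechanisms can be sketched as a small dispatch on the type member of rd_notify_t. The notification types are repeated from the listing above. The psaddr_t typedef is a stand-in (an assumption), and arm_kind_e, arm_request_t, and plan_notification() are hypothetical names for the controlling process's own bookkeeping. RD_NOTIFY_AUTOBPT is not described in this section, so the sketch leaves it unhandled.

```c
#include <assert.h>

/* Stand-in for the target address type (assumption, see lead-in). */
typedef unsigned long psaddr_t;

/* Repeated from the listing above. */
typedef enum {
        RD_NOTIFY_BPT,
        RD_NOTIFY_AUTOBPT,
        RD_NOTIFY_SYSCALL
} rd_notify_e;

typedef struct rd_notify {
        rd_notify_e     type;
        union {
                psaddr_t        bptaddr;
                long            syscallno;
        } u;
} rd_notify_t;

/* Hypothetical: what the controlling process must arm for an event. */
typedef enum {
        ARM_NOTHING,
        ARM_BREAKPOINT,
        ARM_SYSCALL_TRACE
} arm_kind_e;

typedef struct arm_request {
        arm_kind_e      kind;
        psaddr_t        bptaddr;        /* valid for ARM_BREAKPOINT */
        long            syscallno;      /* valid for ARM_SYSCALL_TRACE */
} arm_request_t;

/* Hypothetical helper: translate a filled-in rd_notify_t into a request. */
static arm_request_t
plan_notification(const rd_notify_t *np)
{
        arm_request_t req = { ARM_NOTHING, 0, 0 };

        switch (np->type) {
        case RD_NOTIFY_BPT:
                req.kind = ARM_BREAKPOINT;
                req.bptaddr = np->u.bptaddr;
                break;
        case RD_NOTIFY_SYSCALL:
                req.kind = ARM_SYSCALL_TRACE;
                req.syscallno = np->u.syscallno;
                break;
        default:
                /* RD_NOTIFY_AUTOBPT: not covered here, nothing is armed. */
                break;
        }
        return (req);
}
```

The controlling process would fill in the rd_notify_t by calling rd_event_addr() for each event of interest, and then place the break-point or enable system call tracing as the returned request indicates.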

When an event has occurred, additional information can be obtained by this interface (defined in rtld_db.h):

typedef enum {
        RD_NOSTATE = 0,
        RD_CONSISTENT,
        RD_ADD,
        RD_DELETE
} rd_state_e;
 
typedef struct rd_event_msg {
        rd_event_e      type;
        union {
                rd_state_e      state;
        } u;
} rd_event_msg_t;

rd_state_e values have the following meaning:

RD_NOSTATE: There is no additional state information available.
RD_CONSISTENT: The link-maps are in a stable state and can be examined.
RD_ADD: A dynamic object is in the process of being loaded and the link-maps are not in a stable state. They should not be examined until the RD_CONSISTENT state is reached.
RD_DELETE: A dynamic object is in the process of being deleted and the link-maps are not in a stable state. They should not be examined until the RD_CONSISTENT state is reached.

rd_err_e rd_event_getmsg(struct rd_agent * rdap,
        rd_event_msg_t * msg);

rd_event_getmsg(): This function provides additional information concerning an event.

The following table shows the possible state for each of the different event types:

`RD_PREINIT`	`RD_POSTINIT`	`RD_DLACTIVITY`
`RD_NOSTATE`	`RD_NOSTATE`	`RD_CONSISTENT`
		`RD_ADD`
		`RD_DELETE`
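The table above can be expressed as a small check that a controlling process might apply to each message it retrieves. The enumerations and the rd_event_msg_t structure are repeated from the listings above, with the stable state spelled as in the rd_state_e enumeration. msg_matches_table() and dlactivity_is_stable() are hypothetical helpers, not library routines.

```c
#include <assert.h>

/* Repeated from the listings above. */
typedef enum {
        RD_NONE = 0,
        RD_PREINIT,
        RD_POSTINIT,
        RD_DLACTIVITY
} rd_event_e;

typedef enum {
        RD_NOSTATE = 0,
        RD_CONSISTENT,
        RD_ADD,
        RD_DELETE
} rd_state_e;

typedef struct rd_event_msg {
        rd_event_e      type;
        union {
                rd_state_e      state;
        } u;
} rd_event_msg_t;

/* Hypothetical: is this event/state pair one the table allows? */
static int
msg_matches_table(const rd_event_msg_t *mp)
{
        switch (mp->type) {
        case RD_PREINIT:
        case RD_POSTINIT:
                return (mp->u.state == RD_NOSTATE);
        case RD_DLACTIVITY:
                return (mp->u.state == RD_CONSISTENT ||
                    mp->u.state == RD_ADD ||
                    mp->u.state == RD_DELETE);
        default:
                return (0);
        }
}

/*
 * Hypothetical: after an RD_DLACTIVITY event, the link-maps should be
 * examined only once the stable state has been reported.
 */
static int
dlactivity_is_stable(const rd_event_msg_t *mp)
{
        return (mp->type == RD_DLACTIVITY && mp->u.state == RD_CONSISTENT);
}
```

A debugger that receives RD_ADD or RD_DELETE would typically continue the target and wait for the following RD_DLACTIVITY event before calling rd_loadobj_iter().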

Procedure Linkage Table Skipping

The rtld-debugger interface helps a controlling process skip over procedure linkage table entries (refer to "Procedure Linkage Table (Processor-Specific)"). When a controlling process, such as a debugger, is asked to step into a function for the first time, it often needs to skip the actual procedure linkage table processing, because this processing passes control to the runtime linker to search for the function definition.

The following interface allows a controlling process to step over the runtime linker's procedure linkage table processing. It is assumed that the controlling process can determine when a procedure linkage table entry is encountered, based on external information provided in the ELF file.

Once a target process has stepped into a procedure linkage table entry, the controlling process calls the following interface:

rd_err_e rd_plt_resolution(rd_agent_t * rdap, paddr_t pc,
        lwpid_t lwpid, paddr_t plt_base, rd_plt_info_t * rpi);

rd_plt_resolution()

This function returns the resolution state of the current procedure linkage table entry and information on how to skip it.

pc represents the first instruction of the procedure linkage table entry. lwpid provides the lwp identifier and plt_base provides the base address of the procedure linkage table. These three variables provide information sufficient for various architectures to process the procedure linkage table.

rpi provides detailed information regarding the procedure linkage table entry as defined in the following data structure (defined in rtld_db.h):

typedef enum {
    RD_RESOLVE_NONE,
    RD_RESOLVE_STEP,
    RD_RESOLVE_TARGET,
    RD_RESOLVE_TARGET_STEP
} rd_skip_e;
 
typedef struct rd_plt_info {
        rd_skip_e       pi_skip_method;
        long            pi_nstep;
        psaddr_t        pi_target;
} rd_plt_info_t;

The following scenarios are possible from the rd_plt_info_t return values:

This is the first call through this procedure linkage table so it must be resolved by the runtime linker. rd_plt_info_t will contain:
{RD_RESOLVE_TARGET_STEP, M, <BREAK>}
The controlling process sets a break-point at BREAK and continues the target process. When the break-point is reached, the procedure linkage table entry processing has finished, and the controlling process can step M instructions to the destination function.

This is the Nth time through this procedure linkage table. rd_plt_info_t will contain:
{RD_RESOLVE_STEP, M, 0}
The procedure linkage table entry has already been resolved and the controlling process can step M instructions to the destination function.

Note -

Future implementations might employ RD_RESOLVE_TARGET as a means of setting a break point directly in the target function; however, this capability is not yet available in this version of the rtld-debugger interface.
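The two scenarios above can be sketched as a function that turns the returned rd_plt_info_t into a plan for the controlling process. The types are repeated from the listing above. The psaddr_t typedef is a stand-in (an assumption), and plt_plan_t and plan_plt_skip() are hypothetical names. RD_RESOLVE_NONE and RD_RESOLVE_TARGET are not covered by the scenarios, so the sketch reports them as unsupported.

```c
#include <assert.h>

/* Stand-in for the target address type (assumption, see lead-in). */
typedef unsigned long psaddr_t;

/* Repeated from the listing above. */
typedef enum {
        RD_RESOLVE_NONE,
        RD_RESOLVE_STEP,
        RD_RESOLVE_TARGET,
        RD_RESOLVE_TARGET_STEP
} rd_skip_e;

typedef struct rd_plt_info {
        rd_skip_e       pi_skip_method;
        long            pi_nstep;
        psaddr_t        pi_target;
} rd_plt_info_t;

/* Hypothetical: the actions the controlling process should take. */
typedef struct plt_plan {
        int             pp_supported;   /* 0 if the method is not handled */
        int             pp_need_bpt;    /* set a break-point and continue */
        psaddr_t        pp_bptaddr;     /* where to set that break-point */
        long            pp_nstep;       /* then step this many instructions */
} plt_plan_t;

static plt_plan_t
plan_plt_skip(const rd_plt_info_t *rpi)
{
        plt_plan_t plan = { 0, 0, 0, 0 };

        switch (rpi->pi_skip_method) {
        case RD_RESOLVE_TARGET_STEP:
                /* First call: run to pi_target, then step pi_nstep. */
                plan.pp_supported = 1;
                plan.pp_need_bpt = 1;
                plan.pp_bptaddr = rpi->pi_target;
                plan.pp_nstep = rpi->pi_nstep;
                break;
        case RD_RESOLVE_STEP:
                /* Already resolved: only step pi_nstep instructions. */
                plan.pp_supported = 1;
                plan.pp_nstep = rpi->pi_nstep;
                break;
        default:
                /* RD_RESOLVE_NONE, RD_RESOLVE_TARGET: not handled here. */
                break;
        }
        return (plan);
}
```

The structure passed to plan_plt_skip() would be the one filled in by rd_plt_resolution() for the procedure linkage table entry the target has just entered.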

Dynamic Object Padding

The default behavior of the runtime linker relies on the operating system to load dynamic objects where they can be most efficiently referenced. Some controlling processes benefit from the existence of padding around the objects loaded into memory of the target process. This interface allows a controlling process to request this padding.

rd_err_e rd_objpad_enable(struct rd_agent * rdap, size_t padsize);

rd_objpad_enable()

This function enables or disables the padding of any subsequently loaded objects within the target process. Padding occurs on both sides of the loaded object.

padsize specifies the size of the padding, in bytes, to be preserved both before and after any objects loaded into memory. This padding is reserved as a memory mapping using mmap(2) with PROT_NONE permissions and the MAP_NORESERVE flag. Effectively the runtime linker reserves areas of the virtual address space of the target process adjacent to any mapped objects. These areas can later be utilized by the controlling process.

A padsize of 0 disables any object padding for later objects.

Note -

Reservations obtained using mmap(2) from /dev/zero with MAP_NORESERVE can be reported using the proc(1) facilities and by referring to the link-map information provided in rd_loadobj_t.

Debugger Import Interface

The imported interface that a controlling process must provide to librtld_db.so.1 is defined in /usr/include/proc_service.h. A sample implementation of these proc_service functions can be found in the rdb demonstration debugger. The rtld-debugger interface uses only a subset of the proc_service interfaces available. Future versions of the rtld-debugger interface might take advantage of additional proc_service interfaces without creating an incompatible change.

The following interfaces are currently being used by the rtld-debugger interface:

ps_err_e ps_pauxv(const struct ps_prochandle * ph, auxv_t ** aux);

ps_pauxv(): This function returns a pointer to a copy of the auxv vector. Because the auxv vector information is copied to an allocated structure, the pointer remains valid for as long as the prochandle is valid.

ps_err_e ps_pread(const struct ps_prochandle * ph, paddr_t addr,
        char * buf, int size);

ps_pread(): This function reads size bytes from the target process at address addr and copies them into buf.

ps_err_e ps_pwrite(const struct ps_prochandle * ph, paddr_t addr,
          char * buf, int size);

ps_pwrite(): This function writes size bytes from buf into the target process at address addr.
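The read and write contract can be illustrated with a stand-in target. In this sketch the target's memory is a plain byte array owned by the controlling process, which makes the fragment runnable anywhere; a real controlling process would read and write the live target, as the rdb demonstration debugger does. The paddr_t typedef, the three ps_err_e values, and the members of struct ps_prochandle are all assumptions made for illustration, since the structure's layout is left to the controlling process and the real definitions come from the system headers.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for system types (assumptions, see lead-in). */
typedef unsigned long paddr_t;
typedef enum { PS_OK, PS_ERR, PS_BADADDR } ps_err_e;

/*
 * Hypothetical layout: a fake target whose memory is the byte array
 * ph_image, treated as if it were mapped at target address ph_base.
 */
struct ps_prochandle {
        paddr_t         ph_base;        /* target address of ph_image[0] */
        paddr_t         ph_size;        /* number of bytes in ph_image */
        char            *ph_image;      /* backing storage */
};

/* Returns nonzero if [addr, addr + size) lies inside the fake target. */
static int
range_ok(const struct ps_prochandle *ph, paddr_t addr, int size)
{
        paddr_t off;

        if (size < 0 || addr < ph->ph_base)
                return (0);
        off = addr - ph->ph_base;
        if (off > ph->ph_size)
                return (0);
        return ((paddr_t)size <= ph->ph_size - off);
}

/* Copy size bytes from the target at addr into buf. */
ps_err_e
ps_pread(const struct ps_prochandle *ph, paddr_t addr, char *buf, int size)
{
        if (!range_ok(ph, addr, size))
                return (PS_BADADDR);
        (void) memcpy(buf, ph->ph_image + (addr - ph->ph_base), (size_t)size);
        return (PS_OK);
}

/* Copy size bytes from buf into the target at addr. */
ps_err_e
ps_pwrite(const struct ps_prochandle *ph, paddr_t addr, char *buf, int size)
{
        if (!range_ok(ph, addr, size))
                return (PS_BADADDR);
        (void) memcpy(ph->ph_image + (addr - ph->ph_base), buf, (size_t)size);
        return (PS_OK);
}
```

Rejecting out-of-range requests, rather than copying whatever is available, keeps a failed read from leaving the caller with a partly filled buffer.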

void ps_plog(const char * fmt, ...);

ps_plog(): This function is called with additional diagnostic information from the rtld-debugger interface. It is up to the controlling process to decide where (or if) to log this diagnostic information. The arguments to ps_plog() follow the printf(3S) format.
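One possible implementation of ps_plog() formats the message with the standard variable-argument routines and writes it to standard error. This is a minimal sketch: the last_log buffer and the "rtld_db:" prefix are hypothetical choices made for illustration, and a controlling process is free to discard the message or send it elsewhere.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: the most recent message, kept so it can be inspected. */
static char last_log[256];

void
ps_plog(const char *fmt, ...)
{
        va_list ap;

        /* Format the printf-style arguments; long output is truncated. */
        va_start(ap, fmt);
        (void) vsnprintf(last_log, sizeof (last_log), fmt, ap);
        va_end(ap);

        /* Send the diagnostic to standard error, one message per line. */
        (void) fprintf(stderr, "rtld_db: %s\n", last_log);
}
```

For example, a call such as ps_plog("loaded %s at 0x%x", "libc.so.1", 0x1000u) leaves the formatted text in last_log and writes it to standard error.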

ps_err_e ps_pglobal_lookup(const struct ps_prochandle * ph,
        const char * obj, const char * name, ulong_t * sym_addr);

ps_pglobal_lookup(): This function searches for the symbol named name within the object named obj within the target process ph. If the symbol is found, the address of the symbol is stored in sym_addr.

ps_err_e ps_pglobal_sym(const struct ps_prochandle * ph,
        const char * obj, const char * name, ps_sym_t * sym);

ps_pglobal_sym()

This function searches for the symbol named name within the object named obj within the target process ph. If the symbol is found, the descriptor sym is filled in.

In the event that the rtld-debugger interface needs to find symbols within the application or runtime linker prior to any link-map creation, the following reserved values for obj are available:

#define PS_OBJ_EXEC ((const char *)0x0)  /* application id */
#define PS_OBJ_LDSO ((const char *)0x1)  /* runtime linker id */

One mechanism the controlling process can use to find the symbol table for these objects is through the procfs file system using the following pseudo code:

ioctl(..., PIOCNAUXV, ...);     /* obtain AUX vectors */
ldsoaddr = auxv[AT_BASE];
ldsofd = ioctl(..., PIOCOPENM, &ldsoaddr);

/* process ELF information found in ldsofd ... */

execfd = ioctl(..., PIOCOPENM, 0);

/* process ELF information found in execfd ... */

Once the file descriptors are found, the ELF files can be examined for their symbol information by the controlling program.
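The reserved values of obj described above are pointer constants rather than strings, so an implementation of ps_pglobal_lookup() or ps_pglobal_sym() has to test for them before treating obj as an object name. A minimal sketch of that test follows. The two definitions are repeated from the listing above, and obj_kind_e and classify_obj() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Repeated from the listing above. */
#define PS_OBJ_EXEC ((const char *)0x0)  /* application id */
#define PS_OBJ_LDSO ((const char *)0x1)  /* runtime linker id */

/* Hypothetical classification of the obj argument. */
typedef enum {
        OBJ_IS_EXEC,    /* search the application's symbol table */
        OBJ_IS_LDSO,    /* search the runtime linker's symbol table */
        OBJ_IS_NAMED    /* obj points to an object name string */
} obj_kind_e;

static obj_kind_e
classify_obj(const char *obj)
{
        /*
         * Compare pointer values only.  The reserved values are not
         * valid strings and must never be dereferenced.
         */
        if (obj == PS_OBJ_EXEC)
                return (OBJ_IS_EXEC);
        if (obj == PS_OBJ_LDSO)
                return (OBJ_IS_LDSO);
        return (OBJ_IS_NAMED);
}
```

Only in the OBJ_IS_NAMED case is it safe to pass obj to string routines such as strcmp().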