Solaris Dynamic Tracing Guide

Chapter 40 Translators

In Chapter 39, Stability, we learned how DTrace computes and reports program stability attributes. Ideally, we would like to construct our DTrace programs by consuming only Stable or Evolving interfaces. Unfortunately, when debugging a low-level problem or measuring system performance, you may need to enable probes that are associated with internal operating system routines such as functions in the kernel, rather than probes associated with more stable interfaces such as system calls. The data available at probe locations deep within the software stack is often a collection of implementation artifacts rather than more stable data structures such as those associated with the Solaris system call interfaces. To help you write stable D programs, DTrace provides a facility to translate implementation artifacts into stable data structures accessible from your D program statements.

Translator Declarations

A translator is a collection of D assignment statements provided by the supplier of an interface that can be used to translate an input expression into an object of struct type. To understand the need for and use of translators, we'll consider as an example the ANSI-C standard library routines defined in stdio.h. These routines operate on a data structure named FILE whose implementation artifacts are abstracted away from C programmers. A standard technique for creating a data structure abstraction is to provide only a forward declaration of a data structure in public header files, while keeping the corresponding struct definition in a separate private header file.

If you are writing a C program and wish to know the file descriptor corresponding to a FILE struct, you can use the fileno(3C) function to obtain the descriptor rather than dereferencing a member of the FILE struct directly. The Solaris header files enforce this rule by defining FILE as an opaque forward declaration tag so it cannot be dereferenced directly by C programs that include <stdio.h>. Inside the libc.so.1 library, you can imagine that fileno() is implemented in C something like this:

int
fileno(FILE *fp)
{
	struct file_impl *ip = (struct file_impl *)fp;

	return (ip->fd);
}

Our hypothetical fileno() takes a FILE pointer as an argument and casts it to a pointer to a corresponding internal libc structure, struct file_impl, and then returns the value of the fd member of the implementation structure. Why does Solaris implement interfaces like this? By abstracting the details of the current libc implementation away from client programs, Sun is able to maintain a commitment to strong binary compatibility while continuing to evolve and change the internal implementation details of libc. In our example, the fd member could change size or position within struct file_impl, even in a patch, and existing binaries calling fileno(3C) would not be affected by this change because they do not depend on these artifacts.

Unfortunately, observability software such as DTrace needs to peer inside the implementation in order to provide useful results, and does not have the luxury of calling arbitrary C functions defined in Solaris libraries or in the kernel. You could declare a copy of struct file_impl in your D program in order to instrument the routines declared in stdio.h, but then your D program would rely on Private implementation artifacts of the library that might break in a future micro or minor release, or even in a patch. Ideally, we want to provide a construct for use in D programs that is bound to the implementation of the library and is updated accordingly, but still provides an additional layer of abstraction associated with greater stability.

A new translator is created using a declaration of the form:

translator output-type < input-type input-identifier > {
	member-name = expression ;
	member-name = expression ;
	...
};

The output-type names a struct that will be the result type for the translation. The input-type specifies the type of the input expression, and is surrounded by angle brackets < > and followed by an input-identifier that can be used in the translator expressions as an alias for the input expression. The body of the translator is surrounded by braces { } and terminated with a semicolon (;), and consists of a list of member-names and their corresponding translation expressions. Each member declaration must name a unique member of the output-type and must be assigned an expression of a type compatible with the member type, according to the rules for the D assignment (=) operator.

For example, we could define a struct of stable information about stdio files based on some of the available libc interfaces:

struct file_info {
	int file_fd;   /* file descriptor from fileno(3C) */
	int file_eof;  /* eof flag from feof(3C) */
};

A hypothetical D translator from FILE to file_info could then be declared in D as follows:

translator struct file_info < FILE *F > {
	file_fd = ((struct file_impl *)F)->fd;
	file_eof = ((struct file_impl *)F)->eof;
};

In our hypothetical translator, the input expression is of type FILE * and is assigned the input-identifier F. The identifier F can then be used in the translator member expressions as a variable of type FILE * that is only visible within the body of the translator declaration. To determine the value of the output file_fd member, the translator performs a cast and dereference similar to the hypothetical implementation of fileno(3C) shown above. A similar translation is performed to obtain the value of the EOF indicator.

Sun provides a set of translators for use with Solaris interfaces that you can invoke from your D programs, and promises to maintain these translators according to the rules for interface stability defined earlier as the implementation of the corresponding interface changes. We'll learn about these translators later in the chapter, after we learn how to invoke translators from D. The translator facility itself is also provided for use by application and library developers who wish to offer their own translators that D programmers can use to observe the state of their software packages.

Translate Operator

The D operator xlate is used to perform a translation from an input expression to one of the defined translation output structures. The xlate operator is used in an expression of the form:

xlate < output-type > ( input-expression )

For example, to invoke the hypothetical translator for FILE structs defined above and access the file_fd member, you would write the expression:

xlate <struct file_info *>(f)->file_fd;

where f is a D variable of type FILE *. The xlate expression itself is assigned the type defined by the output-type. Once a translator is defined, it can be used to translate input expressions to either the translator output struct type, or to a pointer to that struct.
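
Because the FILE translator above is hypothetical, the following sketch shows the same declare-then-invoke pattern against a kernel type that D programs can already name. The struct thread_summary type and its ts_ members are invented for this illustration and are not part of Solaris. The sketch also assumes that kthread_t has t_tid and t_pri members, which are Private implementation details that may differ between releases.

```d
#!/usr/sbin/dtrace -qs

/* Hypothetical output type for this example; not a Solaris interface. */
struct thread_summary {
	int ts_tid;	/* thread (LWP) ID */
	int ts_pri;	/* current priority */
};

/*
 * Translate a kernel thread pointer into struct thread_summary.
 * Assumption: kthread_t provides the Private members t_tid and t_pri.
 */
translator struct thread_summary < kthread_t *T > {
	ts_tid = T->t_tid;
	ts_pri = T->t_pri;
};

BEGIN
{
	/* Each xlate expression evaluates only the member it selects. */
	printf("tid=%d pri=%d\n",
	    xlate <struct thread_summary *>(curthread)->ts_tid,
	    xlate <struct thread_summary *>(curthread)->ts_pri);
	exit(0);
}
```

If the kernel members were later renamed, only the translator body would need to change; the clause that consumes ts_tid and ts_pri could stay as written.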

If you translate an input expression to a struct, you can either dereference a particular member of the output immediately using the “.” operator, or you can assign the entire translated struct to another D variable to make a copy of the values of all the members. If you dereference a single member, the D compiler will only generate code corresponding to the expression for that member. You may not apply the & operator to a translated struct to obtain its address, as the data object itself does not exist until it is copied or one of its members is referenced.

If you translate an input expression to a pointer to a struct, you can either dereference a particular member of the output immediately using the -> operator, or you can dereference the pointer using the unary * operator, in which case the result behaves as if you translated the expression to a struct. If you dereference a single member, the D compiler will only generate code corresponding to the expression for that member. You may not assign a translated pointer to another D variable as the data object itself does not exist until it is copied or one of its members is referenced, and therefore cannot be addressed.
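
The two member-access forms described above can be compared side by side. This is a minimal sketch that assumes the lwpsinfo_t translator for kthread_t * supplied by the procfs.d library (described later in this chapter) is available, and that it defines a translation for the pr_lwpid member. The forms that the text says are not permitted appear only inside comments.

```d
#!/usr/sbin/dtrace -qs

BEGIN
{
	/* Struct output type: select a single member with ".". */
	printf("lwpid via struct form:  %d\n",
	    xlate <lwpsinfo_t>(curthread).pr_lwpid);

	/* Pointer output type: select a single member with "->". */
	printf("lwpid via pointer form: %d\n",
	    xlate <lwpsinfo_t *>(curthread)->pr_lwpid);

	/*
	 * Not permitted, according to the rules above:
	 *   &xlate <lwpsinfo_t>(curthread)             (address of a translated struct)
	 *   this->p = xlate <lwpsinfo_t *>(curthread)  (assigning a translated pointer)
	 */
	exit(0);
}
```

Both printf() statements should report the same LWP ID, because they run the same translation expression and differ only in the output type named in the xlate operator.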

A translator declaration may omit expressions for one or more members of the output type. If an xlate expression is used to access a member for which no translation expression is defined, the D compiler will produce an appropriate error message and abort the program compilation. If the entire output type is copied by means of a structure assignment, any members for which no translation expressions are defined will be filled with zeroes.
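
As an illustration of a partially defined translator, the sketch below declares an output struct with two members but supplies a translation expression for only one of them. The struct proc_brief type and its pb_ members are hypothetical names chosen for this example. The expression P->p_pidp->pid_id assumes the Private layout of proc_t and may not hold on every release.

```d
#!/usr/sbin/dtrace -qs

/* Hypothetical output type for this example; not a Solaris interface. */
struct proc_brief {
	int pb_pid;	/* process ID */
	int pb_spare;	/* deliberately given no translation expression */
};

/* Assumption: proc_t exposes the process ID as p_pidp->pid_id. */
translator struct proc_brief < proc_t *P > {
	pb_pid = P->p_pidp->pid_id;
};

BEGIN
{
	/* pb_pid has a translation expression, so this access compiles. */
	printf("pid=%d\n",
	    xlate <struct proc_brief *>(curthread->t_procp)->pb_pid);

	/*
	 * pb_spare has no translation expression, so selecting it with
	 * xlate in place of pb_pid above is expected to be rejected by
	 * the D compiler.
	 */
	exit(0);
}
```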

In order to find a matching translator for an xlate operation, the D compiler examines the set of available translators in the following order:

First, the compiler looks for a translation from the exact input expression type to the exact output type.
Second, the compiler resolves the input and output types by following any typedef aliases to the underlying type names, and then looks for a translation from the resolved input type to the resolved output type.
Third, the compiler looks for a translation from a compatible input type to the resolved output type. The compiler uses the same rules as it does for determining compatibility of function call arguments with function prototypes in order to determine if an input expression type is compatible with a translator's input type.

If no matching translator can be found according to these rules, the D compiler produces an appropriate error message and program compilation fails.
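
The first two lookup steps can be exercised with a small script. This is a sketch only: struct lwp_brief and its typedef alias lwp_brief_t are hypothetical names, kthread_t is assumed to have a t_tid member, and curthread is assumed to have the type kthread_t *.

```d
#!/usr/sbin/dtrace -qs

/* Hypothetical output type and a typedef alias for it. */
struct lwp_brief {
	int lb_tid;	/* thread (LWP) ID */
};
typedef struct lwp_brief lwp_brief_t;

/* The translator is declared against the struct name, not the alias. */
translator struct lwp_brief < kthread_t *T > {
	lb_tid = T->t_tid;
};

BEGIN
{
	/* First step: the input and output type names match exactly. */
	printf("exact match:   %d\n",
	    xlate <struct lwp_brief>(curthread).lb_tid);

	/*
	 * Second step: the output type is named through the typedef
	 * alias, which the compiler resolves to struct lwp_brief
	 * before looking for a translator.
	 */
	printf("typedef alias: %d\n",
	    xlate <lwp_brief_t>(curthread).lb_tid);
	exit(0);
}
```

If the lookup behaves as described, both lines should print the same thread ID, because both xlate expressions select the same translator.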

Process Model Translators

The DTrace library file /usr/lib/dtrace/procfs.d provides a set of translators for use in your D programs to translate from the operating system kernel implementation structures for processes and threads to the stable proc(4) structures psinfo and lwpsinfo. These structures are also used in the Solaris /proc filesystem files /proc/pid/psinfo and /proc/pid/lwps/lwpid/lwpsinfo, and are defined in the system header file /usr/include/sys/procfs.h. These structures define useful Stable information about processes and threads such as the process ID, LWP ID, initial arguments, and other data displayed by the ps(1) command. Refer to proc(4) for a complete description of the struct members and semantics.

Table 40–1 procfs.d Translators


Input Type	Input Type Attributes	Output Type	Output Type Attributes
`proc_t *`	Private/Private/Common	`psinfo_t *`	Stable/Stable/Common
`kthread_t *`	Private/Private/Common	`lwpsinfo_t *`	Stable/Stable/Common
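
Both translators in the table can be invoked explicitly. The following sketch assumes that procfs.d defines translation expressions for the pr_pid, pr_psargs, and pr_lwpid members, and that the current process is reachable through the Private kernel member curthread->t_procp. Because those input expressions are Private, the stability of a program written this way is reduced accordingly.

```d
#!/usr/sbin/dtrace -qs

BEGIN
{
	/* proc_t * to psinfo_t *: process-level information. */
	printf("pid   = %d\n",
	    xlate <psinfo_t *>(curthread->t_procp)->pr_pid);
	printf("args  = %s\n",
	    xlate <psinfo_t *>(curthread->t_procp)->pr_psargs);

	/* kthread_t * to lwpsinfo_t *: thread-level information. */
	printf("lwpid = %d\n",
	    xlate <lwpsinfo_t *>(curthread)->pr_lwpid);
	exit(0);
}
```

In the BEGIN probe, the current process is the dtrace command itself, so the values printed describe the tracing process.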

Stable Translations

While a translator provides the ability to convert information into a stable data structure, it does not necessarily resolve all stability issues that can arise in translating data. For example, if the input expression for an xlate operation itself references Unstable data, the resulting D program is also Unstable because program stability is always computed as the minimum stability of the accumulated D program statements and expressions. Therefore, it is sometimes necessary to define a specific stable input expression for a translator in order to permit stable programs to be constructed. The D inline mechanism can be used to facilitate such stable translations.

The DTrace procfs.d library provides the curlwpsinfo and curpsinfo variables described earlier as stable translations. For example, the curlwpsinfo variable is actually an inline declared as follows:

inline lwpsinfo_t *curlwpsinfo = xlate <lwpsinfo_t *> (curthread);
#pragma D attributes Stable/Stable/Common curlwpsinfo

The curlwpsinfo variable is defined as an inlined translation from the curthread variable, a pointer to the kernel's Private data structure representing a thread, to the Stable lwpsinfo_t type. The D compiler processes this library file and caches the inline declaration, making curlwpsinfo appear as any other D variable. The #pragma statement following the declaration is used to explicitly reset the attributes of the curlwpsinfo identifier to Stable/Stable/Common, masking the reference to curthread in the inlined expression. This combination of D features permits D programmers to use curthread as the source of a translation in a safe fashion that can be updated by Sun coincident with corresponding changes in the Solaris implementation.
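
A program that uses the inlines never names curthread or any other Private kernel type. The sketch below assumes that the pr_pid, pr_psargs, and pr_lwpid members have translation expressions in procfs.d.

```d
#!/usr/sbin/dtrace -qs

BEGIN
{
	/*
	 * curpsinfo and curlwpsinfo are the inlines supplied by
	 * procfs.d, so no explicit xlate operator is needed here.
	 */
	printf("pid=%d lwpid=%d args=%s\n",
	    curpsinfo->pr_pid,
	    curlwpsinfo->pr_lwpid,
	    curpsinfo->pr_psargs);
	exit(0);
}
```

Running the script with the dtrace -v option asks DTrace to report the stability attributes it computed for the program, which is one way to check that the inlines keep references to Private kernel data out of the result.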