Chapter 17 Translators

Chapter 16, DTrace Stability Features, describes how DTrace computes and reports program stability attributes. Ideally, you should construct your DTrace programs by consuming only Stable or Evolving interfaces. Unfortunately, when debugging a low-level problem or measuring system performance, you might need to enable probes that are associated with internal operating system routines, such as functions in the kernel, rather than probes that are associated with more stable interfaces, such as system calls. The data available at probe locations deep within the software stack is often a collection of implementation artifacts rather than more stable data structures, such as those associated with Oracle Linux system call interfaces. To assist you with writing stable D programs, DTrace provides a facility for translating implementation artifacts into stable data structures that are accessible from your D program statements.

17.1 Translator Declarations

A translator is a collection of D assignment statements provided by the supplier of an interface. A translator converts an input expression into an object of a struct type. To understand the need for translators, consider as an example the ANSI C standard library routines that are defined in stdio.h. These routines operate on a data structure named FILE, which contains implementation artifacts that are abstracted away from C programmers. A standard technique for creating a data structure abstraction is to provide only a forward declaration of a data structure in public header files, while keeping the corresponding struct definition in a separate and private header file.

If you are writing a C program and want to know the file descriptor corresponding to a FILE struct, use the fileno() function to obtain the descriptor rather than dereferencing a member of the FILE struct directly. The Oracle Linux header files enforce this rule by defining FILE as an opaque forward declaration tag so that it cannot be dereferenced directly by C programs that include <stdio.h>.
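As a brief illustration of this rule, the following sketch obtains descriptors through the public fileno() interface only. The helper name stream_descriptor is hypothetical. fileno(), stdin, stdout, and stderr are the standard POSIX names.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fileno() in strict ISO C modes */
#include <stdio.h>

/*
 * Report the descriptor behind a stdio stream by using only the public
 * interface. stream_descriptor is a hypothetical helper name.
 */
int
stream_descriptor(const char *label, FILE *fp)
{
  int fd = fileno(fp);   /* FILE is opaque: never dereference fp directly */

  printf("%s uses file descriptor %d\n", label, fd);
  return (fd);
}
```

Calling stream_descriptor("stderr", stderr) normally reports descriptor 2, because the three standard streams are conventionally bound to descriptors 0, 1, and 2.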

Consider the following hypothetical implementation of fileno in C, as it might appear inside the /lib/libc.so.6 library. Note that a real implementation would not resemble this example:

int
fileno(FILE *fp)
{
  struct file_impl *ip = (struct file_impl *)fp;
 
  return (ip->fd);
}

In the example, the hypothetical fileno takes a FILE pointer as an argument and casts it to a pointer that corresponds to the internal libc structure, struct file_impl, then returns the value of the fd member of the implementation structure.
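The forward-declaration technique can be sketched in a single file as follows. This is a minimal illustration, not how libc is organized: the names MY_FILE, my_open, my_fileno, and my_close are hypothetical, and the two commented sections stand in for a public header and a private implementation file.

```c
#include <stdlib.h>

/* ---- Public header (hypothetical): only an opaque tag is visible ---- */
typedef struct my_file MY_FILE;   /* forward declaration, no members */

MY_FILE *my_open(int fd);
int my_fileno(MY_FILE *fp);
void my_close(MY_FILE *fp);

/* ---- Private implementation (hypothetical): the real layout ---- */
struct file_impl {
  int fd;    /* underlying file descriptor */
  int eof;   /* end-of-file indicator */
};

MY_FILE *
my_open(int fd)
{
  struct file_impl *ip = malloc(sizeof (*ip));

  if (ip == NULL)
    return (NULL);
  ip->fd = fd;
  ip->eof = 0;
  return ((MY_FILE *)ip);   /* callers see only the opaque type */
}

int
my_fileno(MY_FILE *fp)
{
  struct file_impl *ip = (struct file_impl *)fp;

  return (ip->fd);
}

void
my_close(MY_FILE *fp)
{
  free(fp);
}
```

Code that sees only the public section can pass a MY_FILE pointer around but cannot name its members, so the layout of struct file_impl can change without breaking callers. That same opacity is what prevents a tracing tool from reading the fd member without knowledge of the private definition.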

Unfortunately, observability software like DTrace requires the ability to peer inside the implementation in order to provide useful results. DTrace cannot call arbitrary C functions that are defined in Oracle Linux libraries or in the kernel. You could declare a copy of struct file_impl in your D program to instrument the routines that are declared in stdio.h, but then your D program would rely on Private implementation artifacts of the library that might break in a future micro or minor release, or even in a patch. Ideally, you want to provide a construct for use in D programs that is bound to the implementation of the library and is updated accordingly, yet still provides an additional layer of abstraction associated with greater stability.

A new translator is created by using a declaration of the following form:

translator output-type < input-type input-identifier > {
  member-name = expression ;
  member-name = expression ;
  ...
};

The output-type names a struct that is the result type for the translation. The input-type specifies the type of the input expression, is enclosed in angle brackets (<>), and is followed by an input-identifier that can be used in the translator expressions as an alias for the input expression. The body of the translator is enclosed in braces ({}) and terminated with a semicolon (;), and consists of a list of member-names, each assigned its corresponding translation expression. Each member declaration must name a unique member of the output-type and must be assigned an expression of a type that is compatible with the member type, according to the rules for the D assignment (=) operator.

For example, you could define a struct of stable information about stdio files based on some of the available libc interfaces:

struct file_info {
  int file_fd;   /* file descriptor from fileno() */
  int file_eof;  /* eof flag from feof() */
};

Then, you could define a hypothetical D translator from FILE to file_info:

translator struct file_info < FILE *F > {
  file_fd = ((struct file_impl *)F)->fd;
  file_eof = ((struct file_impl *)F)->eof;
};

In this hypothetical translator, the input expression is of type FILE * and is assigned the input-identifier F. The identifier F can then be used in the translator member expressions as a variable of type FILE * that is only visible within the body of the translator declaration. To determine the value of the output file_fd member, the translator performs a cast and dereference similar to the hypothetical implementation of fileno() shown in the previous example. A similar translation is performed to obtain the value of the EOF indicator.
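For readers who think in C, the effect of this translator can be approximated by an ordinary function that fills the stable structure from the private one. The sketch below is only an analogy, not D semantics: translate_file is a hypothetical name, struct file_impl is the hypothetical layout used in the text, and a const void * parameter stands in for the opaque FILE * input.

```c
/* Private implementation structure (hypothetical, as in the text). */
struct file_impl {
  int fd;
  int eof;
};

/* Stable output structure, as declared in the text. */
struct file_info {
  int file_fd;   /* file descriptor from fileno() */
  int file_eof;  /* eof flag from feof() */
};

/*
 * C analogy of "translator struct file_info < FILE *F >": one assignment
 * per output member, each computed from the input pointer F. The
 * const void * parameter stands in for the opaque FILE * input.
 */
struct file_info
translate_file(const void *F)
{
  struct file_info out;

  out.file_fd = ((const struct file_impl *)F)->fd;
  out.file_eof = ((const struct file_impl *)F)->eof;
  return (out);
}
```

Unlike this function, which always computes every member, a D translator is a declaration from which the compiler generates code only for the members that a program actually references.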

17.2 xlate D Operator

The xlate D operator is used to perform a translation from an input expression to one of the defined translation output structures. The xlate operator is used in an expression of the following form:

xlate <output-type> ( input-expression )

For example, to invoke the hypothetical translator for FILE structs that was defined previously and access the file_fd member, you would write the expression as follows:

xlate <struct file_info *>(f)->file_fd;

where f is a D variable of type FILE *. The xlate expression itself is assigned the type that is defined by the output-type. After a translator is defined, it can be used to translate input expressions either to the translator output struct type or to a pointer to that struct.

If you translate an input expression to a struct, you can either dereference a particular member of the output immediately by using the “.” operator, or you can assign the entire translated struct to another D variable to make a copy of the values of all the members. If you dereference a single member, the D compiler only generates code that corresponds to the expression for that member. You may not apply the & operator to a translated struct to obtain its address, as the data object itself does not exist until it is copied or one of its members is referenced.

If you translate an input expression to a pointer to a struct, you can either dereference a particular member of the output immediately by using the -> operator, or you can dereference the pointer by using the unary * operator. In the latter case, the result behaves as though you translated the expression to a struct. If you dereference a single member, the D compiler only generates code corresponding to the expression for that member. You may not assign a translated pointer to another D variable, as the data object does not exist until it is copied or one of its members is referenced, and therefore cannot be addressed.

A translator declaration may omit expressions for one or more members of the output type. If an xlate expression is used to access a member for which no translation expression is defined, the D compiler produces an appropriate error message and aborts the program compilation. If the entire output type is copied by means of a structure assignment, any members for which no translation expressions are defined are filled with zeroes.

To find a matching translator for an xlate operation, the D compiler examines the set of available translators in the following order:

  • The compiler checks for a translation from the exact input expression type to the exact output type.

  • The compiler resolves the input and output types by following any typedef aliases to the underlying type names, and then checks for a translation from the resolved input type to the resolved output type.

  • The compiler checks for a translation from a compatible input type to the resolved output type. The compiler uses the same rules as those used for determining compatibility of function call arguments with function prototypes in order to determine if an input expression type is compatible with a translator's input type.

If no matching translator can be found according to these rules, the D compiler produces an appropriate error message and the program compilation fails.
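The lookup order above can be modeled with a small table-driven sketch. This is a toy model under stated assumptions, not the D compiler's implementation: types are represented as strings, every table entry (stream_t, file_info_t, struct count_info, the int-to-long pair) is hypothetical, translators are assumed to be stored under their resolved type names, and the compatibility check in the third rule is reduced to a lookup table applied to the resolved input type.

```c
#include <stddef.h>
#include <string.h>

/* All type names and table entries below are hypothetical. */
struct alias      { const char *name;  const char *target; };
struct compat     { const char *from;  const char *to;     };
struct translator { const char *input; const char *output; };

static const struct alias aliases[] = {
  { "stream_t *",  "FILE *" },
  { "file_info_t", "struct file_info" },
};
static const struct compat compats[] = {
  { "int", "long" },   /* an int argument is accepted for a long parameter */
};
static const struct translator translators[] = {
  { "FILE *", "struct file_info" },
  { "long",   "struct count_info" },
};

#define COUNT(a) (sizeof (a) / sizeof ((a)[0]))

/* Follow typedef aliases until the underlying type name is reached. */
const char *
resolve_type(const char *type)
{
  for (size_t i = 0; i < COUNT(aliases); i++) {
    if (strcmp(aliases[i].name, type) == 0)
      return (resolve_type(aliases[i].target));
  }
  return (type);
}

/* Return 1 if a translator exists for exactly this input/output pair. */
int
has_translator(const char *in, const char *out)
{
  for (size_t i = 0; i < COUNT(translators); i++) {
    if (strcmp(translators[i].input, in) == 0 &&
        strcmp(translators[i].output, out) == 0)
      return (1);
  }
  return (0);
}

/* Return 1 if a value of type "from" may be passed where "to" is expected. */
int
is_compatible(const char *from, const char *to)
{
  for (size_t i = 0; i < COUNT(compats); i++) {
    if (strcmp(compats[i].from, from) == 0 &&
        strcmp(compats[i].to, to) == 0)
      return (1);
  }
  return (0);
}

/* Return the rule (1, 2, or 3) that finds a match, or 0 if none does. */
int
match_rule(const char *in, const char *out)
{
  const char *rin = resolve_type(in);
  const char *rout = resolve_type(out);

  if (has_translator(in, out))       /* rule 1: exact types */
    return (1);
  if (has_translator(rin, rout))     /* rule 2: typedefs resolved */
    return (2);
  for (size_t i = 0; i < COUNT(translators); i++) {
    if (strcmp(translators[i].output, rout) == 0 &&
        is_compatible(rin, translators[i].input))
      return (3);                    /* rule 3: compatible input type */
  }
  return (0);                        /* no match: compilation would fail */
}
```

In this model, a request from the hypothetical stream_t * to file_info_t is satisfied by the second rule, and a request from int to struct count_info is satisfied only by the third rule.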

17.3 Process Model Translators

The DTrace library file, /usr/lib64/dtrace/version/procfs.d, provides a set of translators for use in your D programs to translate from the operating system kernel implementation structure for a process descriptor (struct task_struct) to the stable structures psinfo_t and lwpsinfo_t. These structures define useful Stable information about processes and threads, such as the process ID, process priority, command name, initial arguments, and other data that is displayed by the ps command. The following table describes the procfs.d translators.

Table 17.1 procfs.d Translators

Input Type             Input Type Attributes    Output Type    Output Type Attributes
--------------------   ----------------------   ------------   ----------------------
struct task_struct *   Private/Private/Common   psinfo_t *     Stable/Stable/Common
struct task_struct *   Private/Private/Common   lwpsinfo_t *   Stable/Stable/Common

17.4 Stable Translations

Although a translator provides the ability to convert information into a stable data structure, it does not necessarily resolve all stability issues that can arise in translating data. For example, if the input expression for an xlate operation references Unstable data, the resulting D program is also Unstable because program stability is always computed as the minimum stability of the accumulated D program statements and expressions. Therefore, it is sometimes necessary to define a specific stable input expression for a translator to permit stable programs to be constructed. To facilitate such stable translations, you can use the D inline mechanism.
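The minimum rule can be sketched as a fold over attribute triples. This is an illustrative model only: the type and function names are hypothetical, and the enumerations assume that the stability levels and dependency classes are ordered from lowest to highest as they are listed in Chapter 16.

```c
#include <stddef.h>

/* Stability levels, lowest to highest (ordering assumed from Chapter 16). */
enum stability {
  STAB_INTERNAL, STAB_PRIVATE, STAB_OBSOLETE, STAB_EXTERNAL,
  STAB_UNSTABLE, STAB_EVOLVING, STAB_STABLE, STAB_STANDARD
};

/* Dependency classes, lowest to highest (ordering assumed from Chapter 16). */
enum dep_class {
  CLASS_UNKNOWN, CLASS_CPU, CLASS_PLATFORM,
  CLASS_GROUP, CLASS_ISA, CLASS_COMMON
};

/* One Name/Data/Class attribute triple, such as Stable/Stable/Common. */
struct attr {
  enum stability name;
  enum stability data;
  enum dep_class cls;
};

/* Take the component-wise minimum of two attribute triples. */
struct attr
attr_min(struct attr a, struct attr b)
{
  struct attr r;

  r.name = a.name < b.name ? a.name : b.name;
  r.data = a.data < b.data ? a.data : b.data;
  r.cls = a.cls < b.cls ? a.cls : b.cls;
  return (r);
}

/* Accumulate the minimum over every statement or expression in a program. */
struct attr
program_attr(const struct attr *parts, size_t n)
{
  struct attr acc = { STAB_STANDARD, STAB_STANDARD, CLASS_COMMON };

  for (size_t i = 0; i < n; i++)
    acc = attr_min(acc, parts[i]);
  return (acc);
}
```

Under this model, combining a Stable/Stable/Common translator output with an Unstable/Unstable/Common input expression yields Unstable/Unstable/Common, which is why a stable output type alone does not make a program stable.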

The DTrace procfs.d library provides the curlwpsinfo and curpsinfo variables, which were described previously, as stable translations. These variables are actually inline declarations, defined as follows:

inline psinfo_t *curpsinfo = xlate <psinfo_t *> (curthread);
#pragma D attributes Stable/Stable/Common curpsinfo

inline lwpsinfo_t *curlwpsinfo = xlate <lwpsinfo_t *> (curthread);
#pragma D attributes Stable/Stable/Common curlwpsinfo

The curpsinfo and curlwpsinfo variables are both defined as inline translations from the curthread variable, a pointer to the kernel's Private data structure representing a process descriptor, to the Stable psinfo_t and lwpsinfo_t types, respectively. The D compiler processes this library file and caches the inline declarations, making curpsinfo and curlwpsinfo appear as any other D variable. The #pragma statement following each declaration explicitly resets the attributes of the curpsinfo and curlwpsinfo identifiers to Stable/Stable/Common, masking the reference to curthread in the inline expressions.