Chapter 2 The D Programming Language

The D systems programming language enables you to interface with operating system APIs and with the hardware. This chapter formally describes the overall structure of a D program and the various features for constructing probe descriptions that match more than one probe. The chapter also discusses the use of the C preprocessor, cpp, with D programs.

2.1 D Program Structure

A D program, also known as a script, consists of a set of clauses that describe the probes to enable and the predicates and actions to bind to these probes. D programs can also contain declarations of variables and definitions of new types. See Section 2.9, “Variables” and Section 2.13, “Type and Constant Definitions” for more details.

2.1.1 Probe Clauses and Declarations

As shown in the examples in this guide thus far, a D program source file consists of one or more probe clauses that describe the instrumentation to be enabled by DTrace. Each probe clause uses the following general form:

probe descriptions 
/ predicate / 
{
  action statements
}

Note that the predicate and the list of action statements may be omitted. Any directives that appear outside of probe clauses are referred to as declarations. Declarations may only be used outside of probe clauses: no declarations are permitted inside the enclosing braces ({}), and declarations may not be interspersed between the elements of the probe clause shown in the previous example. You can use white space to separate any D program elements and to indent action statements.

Declarations can be used to declare D variables and external C symbols or to define new types for use in D. For more details, see Section 2.9, “Variables” and Section 2.13, “Type and Constant Definitions”. Special D compiler directives, called pragmas, may also appear anywhere in a D program, including outside of probe clauses. D pragmas are specified on lines beginning with a # character. For example, D pragmas are used to set DTrace runtime options. See Chapter 10, Options and Tunables for more details.
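
As a minimal sketch of these rules, the following script places a pragma and a variable declaration outside of the probe clause. The variable name total is hypothetical and chosen only for illustration, and the quiet option is used here as one example of a runtime option that a pragma can set:

```d
/* Pragma: sets the DTrace runtime option "quiet". */
#pragma D option quiet

/* Declaration: must appear outside of any probe clause. */
int total;

dtrace:::BEGIN
{
  total = 10;
  printf("total = %d\n", total);
  exit(0);
}
```

Moving the declaration of total inside the braces would violate the rule that declarations are not permitted within a probe clause.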

2.1.2 Probe Descriptions

Every program clause begins with a list of one or more probe descriptions, each taking the following usual form:

provider:module:function:name

If one or more fields of the probe description are omitted, the specified fields are interpreted from right to left by the D compiler. For example, the probe description foo:bar would match a probe with the function foo and name bar, regardless of the value of the probe's provider and module fields. Therefore, a probe description is really more accurately viewed as a pattern that can be used to match one or more probes based on their names.

You should write your D probe descriptions specifying all four field delimiters so that you can specify the desired provider on the left-hand side. If you don't specify the provider, you might obtain unexpected results if multiple providers publish probes with the same name. Similarly, subsequent versions of DTrace might include new providers with probes that unintentionally match your partially specified probe descriptions. You can specify a provider but match any of its probes by leaving any of the module, function, and name fields blank. For example, the description syscall::: can be used to match every probe that is published by the DTrace syscall provider.

Probe descriptions also support a pattern-matching syntax similar to the shell globbing pattern matching syntax that is described in the sh(1) manual page. Before matching a probe to a description, DTrace scans each description field for the characters *, ?, and [. If one of these characters appears in a probe description field and is not preceded by a \, the field is regarded as a pattern. The description pattern must match the entire corresponding field of a given probe. To successfully match and enable a probe, the complete probe description must match on every field. A probe description field that is not a pattern must exactly match the corresponding field of the probe. Note that a description field that is empty matches any probe.

The special characters in the following table are recognized in probe name patterns.

Table 2.1 Probe Name Pattern Matching Characters

Symbol    Description
*         Matches any string, including the null string.
?         Matches any single character.
[...]     Matches any one of the enclosed characters. A pair of characters separated by - matches any character between the pair, inclusive. If the first character after the [ is !, any character not enclosed in the set is matched.
\         Interprets the next character as itself, without any special meaning.


Pattern match characters can be used in any or all of the four fields of your probe descriptions. You can also use patterns to list matching probes by specifying them on the command line with the dtrace -l command. For example, the dtrace -l -f kmem_* command lists all of the DTrace probes in functions with names that begin with the prefix kmem_.
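
The following sketch uses a bracket expression and the * character in a single clause. It assumes that the system calls of interest are published by the syscall provider under their usual names; which probes actually match depends on your system:

```d
/*
 * The function field is a pattern: [rw] matches a leading r or w,
 * and the trailing star matches the rest of the name, so functions
 * such as read and write are matched. The module field is empty,
 * so it matches any module. The name field is not a pattern and
 * must match "entry" exactly.
 */
syscall::[rw]*:entry
{
  trace(timestamp);
}
```

Because the pattern must match the entire function field, a function such as pread would not be matched by this description.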

If you want to specify the same predicate and actions for more than one probe description, or description pattern, you can place the descriptions in a comma-separated list. For example, the following D program would trace a timestamp each time probes associated with entry to system calls containing the strings “read” or “write” fire:

syscall::*read*:entry, syscall::*write*:entry
{
  trace(timestamp);
}

A probe description can also specify a probe by using its integer probe ID. For example, the following clause could be used to enable probe ID 12345, as reported by dtrace -l -i 12345:

12345
{
  trace(timestamp);
}

Note

You should always write your D programs using human-readable probe descriptions. Integer probe IDs are not guaranteed to remain consistent as DTrace provider kernel modules are loaded and unloaded or following a reboot.

2.1.3 Clause Predicates

Predicates are expressions enclosed in a pair of slashes (//) that are evaluated at probe firing time to determine whether the associated actions should be executed. Predicates are the primary conditional construct used for building more complex control flow in a D program. You can omit the predicate section of the probe clause entirely for any probe, in which case the actions are always executed when the probe fires.

Predicate expressions can use any of the D operators and can refer to any D data objects such as variables and constants. The predicate expression must evaluate to a value of integer or pointer type so that it can be considered as true or false. As with all D expressions, a zero value is interpreted as false and any non-zero value is interpreted as true.
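
As an illustration of the zero and non-zero rule, the predicate in the following sketch is a bare integer expression. It assumes that arg0 holds the first argument of the read() system call, the file descriptor, so the clause is skipped for descriptor 0 and executed for every other descriptor:

```d
/* The predicate is false when arg0 is zero and true otherwise. */
syscall::read:entry
/arg0/
{
  trace(arg0);
}
```

Writing the predicate as /arg0 != 0/ would be equivalent and states the intent more explicitly.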

2.1.4 Probe Actions

Probe actions are described by a list of statements that are separated by semicolons (;) and enclosed in braces ({}). An empty set of braces with no statements leads to the default action, which is to print the CPU and the probe.

2.1.5 Order of Execution

The actions for a probe are executed in program order, regardless of whether those actions are in the same clause or in different clauses.

No other ordering constraints are imposed. It is not uncommon for the output from two distinct probes to appear interspersed or in the opposite order from that in which the probes fired. Also, output might appear misordered if it came from different CPUs.
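
To illustrate program order, the following sketch binds two clauses to the same probe. The variable name x is arbitrary:

```d
dtrace:::BEGIN
{
  x = 1;
}

/* This clause appears second, so it runs after the clause above. */
dtrace:::BEGIN
{
  x = x * 10;
  trace(x);
  exit(0);
}
```

Because the clauses are executed in program order, the second clause sees x already set to 1 and traces the value 10.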

2.1.6 Use of the C Preprocessor

The C programming language that is used for defining Linux system interfaces includes a preprocessor that performs a set of initial steps in C program compilation. The C preprocessor is commonly used to define macro substitutions, where one token in a C program is replaced with another predefined set of tokens, or to include copies of system header files. You can use the C preprocessor in conjunction with your D programs by specifying the dtrace command with the -C option. This option causes the dtrace command to execute the cpp preprocessor on your program source file and then pass the results to the D compiler. The C preprocessor is described in more detail in The C Programming Language by Kernighan and Ritchie, details of which are referenced in the Preface.

The D compiler automatically loads the set of C type descriptions that is associated with the operating system implementation. However, you can use the preprocessor to include other type definitions such as the types that are used in your own C programs. You can also use the preprocessor to perform other tasks such as creating macros that expand to chunks of D code and other program elements. If you use the preprocessor with your D program, you may only include files that contain valid D declarations. The D compiler can correctly interpret C header files that include only external declarations of types and symbols. However, the D compiler cannot parse C header files that include additional program elements, such as C function source code. In that case, the compiler produces an appropriate error message.
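
A minimal sketch of macro substitution follows. The macro name TARGET_PID and the value 2860 are hypothetical; substitute the ID of a process on your system. Because the file contains a preprocessor directive, it must be passed through cpp, for example by running dtrace with the -C and -s options followed by the file name:

```d
/* Hypothetical macro: replaced by cpp before the D compiler runs. */
#define TARGET_PID 2860

syscall::read:entry
/pid == TARGET_PID/
{
  trace(timestamp);
}
```

Defining the process ID once as a macro keeps the value in a single place if several clauses need the same predicate.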

2.2 Compilation and Instrumentation

When you write traditional programs, you often use a compiler to convert your program from source code into object code that you can execute. When you use the dtrace command, you are invoking the compiler for the D language, the language that was used in a previous example to write the hello.d program. After your program is compiled, it is sent into the operating system kernel for execution by DTrace. There, the probes named in your program are enabled and the corresponding provider performs whatever instrumentation is required in order to activate them.

All of the instrumentation in DTrace is completely dynamic: probes are enabled discretely only when you are using them. No instrumented code is present for inactive probes, so your system does not experience any kind of performance degradation when you are not using DTrace. After your experiment is complete and the dtrace command exits, all of the probes that you used are automatically disabled and their instrumentation is removed, returning your system to its exact original state. No effective difference exists between a system where DTrace is not active and a system where the DTrace software is not installed, other than a few megabytes of disk space that is required for type information and for DTrace itself.

The instrumentation for each probe is performed dynamically on the live, running operating system or on user processes that you select. The system is not quiesced or paused in any way and instrumentation code is added only for the probes that you enable. As a result, the probe effect of using DTrace is limited to exactly what you direct DTrace to do: no extraneous data is traced and no one, big “tracing switch” is turned on in the system. All of the DTrace instrumentation is designed to be as efficient as possible. These features enable you to use DTrace in production to solve real problems in real time.

The DTrace framework also provides support for an arbitrary number of virtual clients. You can run as many simultaneous DTrace experiments and commands as you like, limited only by your system's memory capacity. The commands all operate independently using the same underlying instrumentation. This same capability also permits any number of distinct users on the system to take advantage of DTrace simultaneously: developers, administrators, and service personnel can all work together, or on distinct problems, using DTrace on the same system without interfering with one another.

Unlike programs that are written in C and C++, and similar to programs that are written in the Java programming language, DTrace D programs are compiled into a safe, intermediate form that is used for execution when your probes fire. This intermediate form is validated for safety when your program is first examined by the DTrace kernel software. The DTrace execution environment also handles any runtime errors that might occur during your D program's execution, including dividing by zero, dereferencing invalid memory, and so on, and reports them to you. As a result, you can never construct an unsafe program that would cause DTrace to inadvertently damage the operating system kernel or one of the processes running on your system. These safety features enable you to use DTrace in a production environment without being concerned about crashing or corrupting your system. If you make a programming mistake, DTrace reports the error to you and disables your instrumentation, enabling you to correct the mistake and try again. The DTrace error reporting and debugging features are described later in this guide.

Figure 2.1, “Overview of the DTrace Architecture and Components” shows the different components of the DTrace architecture.

Figure 2.1 Overview of the DTrace Architecture and Components
The diagram illustrates the different components of the DTrace architecture, including probe providers that are loaded into kernel space and which communicate with the DTrace driver, the DTrace library in user space, and the dtrace command, which makes calls into the DTrace library.


Now that you understand how DTrace works, let us return to the tour of the D programming language and start writing some more interesting programs.

2.3 Variables and Arithmetic Expressions

Our next example program makes use of the DTrace profile provider to implement a simple time-based counter. The profile provider is able to create new probes based on the descriptions found in your D program. If you create a probe named profile:::tick-nsec for some integer n, the profile provider creates a probe that fires every n seconds. Type the following source code and save it in a file named counter.d:

/* 
 * Count off and report the number of seconds elapsed
 */

dtrace:::BEGIN
{ 
  i = 0; 
} 

profile:::tick-1sec
{
  i = i + 1;
  trace(i);
}

dtrace:::END 
{
  trace(i);
}

When executed, the program counts off the number of elapsed seconds until you press Ctrl-C, and then prints the total at the end:

# dtrace -s counter.d
dtrace: script 'counter.d' matched 3 probes
CPU     ID                    FUNCTION:NAME
  1    638                       :tick-1sec         1
  1    638                       :tick-1sec         2
  1    638                       :tick-1sec         3
  1    638                       :tick-1sec         4
  1    638                       :tick-1sec         5
  1    638                       :tick-1sec         6
  1    638                       :tick-1sec         7
^C
  1    638                       :tick-1sec         8
  0      2                             :END         8

The first three lines of the program are a comment that explains what the program does. As in C, C++, and the Java programming language, the D compiler ignores any characters between the /* and */ symbols. Comments can be used anywhere in a D program, including both inside and outside your probe clauses.

The BEGIN probe clause defines a new variable named i and assigns it the integer value zero using the statement:

i = 0;

Unlike in C, C++, and the Java programming language, D variables can be created simply by using them in a program statement; explicit variable declarations are not required. When a variable is used for the first time in a program, the type of the variable is set based on the type of its first assignment. Each variable has only one type over the lifetime of the program, so subsequent references must conform to the same type as the initial assignment. In counter.d, the variable i is first assigned the integer constant zero, so its type is set to int. D provides the same basic integer data types as C, including those in the following table.

Data Type    Description
char         Character or single byte integer
int          Default integer
short        Short integer
long         Long integer
long long    Extended long integer

The sizes of these types are dependent on the operating system kernel's data model, described in Section 2.8, “Types, Operators, and Expressions”. D also provides built-in friendly names for signed and unsigned integer types of various fixed sizes, as well as thousands of other types that are defined by the operating system.

The central part of counter.d is the probe clause that increments the counter i:

profile:::tick-1sec
{
  i = i + 1;
  trace(i);
}

This clause names the probe profile:::tick-1sec, which tells the profile provider to create a new probe that fires once per second on an available processor. The clause contains two statements, the first incrementing i, and the second tracing (printing) the new value of i. All the usual C arithmetic operators are available in D. For the complete list, see Section 2.8, “Types, Operators, and Expressions”. The trace function takes any D expression as its argument, so you could write counter.d more concisely as follows:

profile:::tick-1sec
{
  trace(++i);
}

If you want to explicitly control the type of the variable i, you can surround the desired type in parentheses when you assign it in order to cast the integer zero to a specific type. For example, if you wanted to determine the maximum size of a char in D, you could change the BEGIN clause as follows:

dtrace:::BEGIN
{
  i = (char)0;
}

After running counter.d for a while, you should see the traced value grow and then wrap around back to zero. If you grow impatient waiting for the value to wrap, try changing the profile probe name to profile:::tick-100msec to make a counter that increments once every 100 milliseconds, or 10 times per second.

2.4 Predicate Examples

For runtime safety, one major difference between D and other programming languages such as C, C++, and the Java programming language is the absence of control-flow constructs such as if-statements and loops. D program clauses are written as single straight-line statement lists that trace an optional, fixed amount of data. D does provide the ability to conditionally trace data and modify control flow using logical expressions called predicates. A predicate expression is evaluated at probe firing time prior to executing any of the statements associated with the corresponding clause. If the predicate evaluates to true, represented by any non-zero value, the statement list is executed. If the predicate is false, represented by a zero value, none of the statements are executed and the probe firing is ignored.

Type the following source code for the next example and save it in a file named countdown.d:

dtrace:::BEGIN 
{
  i = 10;
}

profile:::tick-1sec
/i > 0/
{
  trace(i--);
}

profile:::tick-1sec
/i == 0/
{
  trace("blastoff!");
  exit(0);
}

This D program implements a 10-second countdown timer using predicates. When executed, countdown.d counts down from 10 and then prints a message and exits:

# dtrace -s countdown.d
dtrace: script 'countdown.d' matched 3 probes
CPU     ID                    FUNCTION:NAME
  0    638                       :tick-1sec        10
  0    638                       :tick-1sec         9
  0    638                       :tick-1sec         8
  0    638                       :tick-1sec         7
  0    638                       :tick-1sec         6
  0    638                       :tick-1sec         5
  0    638                       :tick-1sec         4
  0    638                       :tick-1sec         3
  0    638                       :tick-1sec         2
  0    638                       :tick-1sec         1
  0    638                       :tick-1sec   blastoff!       
#

This example uses the BEGIN probe to initialize an integer i to 10 to begin the countdown. Next, as in the previous example, the program uses the tick-1sec probe to implement a timer that fires once per second. Notice that in countdown.d, the tick-1sec probe description is used in two different clauses, each with a different predicate and action list. The predicate is a logical expression surrounded by enclosing slashes // that appears after the probe name and before the braces {} that surround the clause statement list.

The first predicate tests whether i is greater than zero, indicating that the timer is still running:

profile:::tick-1sec
/i > 0/
{
  trace(i--);
}

The relational operator > means greater than and returns the integer value zero for false and one for true. All of the C relational operators are supported in D. For the complete list, see Section 2.8, “Types, Operators, and Expressions”. If i is not yet zero, the script traces i and then decrements it by one using the -- operator.

The second predicate uses the == operator to return true when i is exactly equal to zero, indicating that the countdown is complete:

profile:::tick-1sec
/i == 0/
{
  trace("blastoff!");
  exit(0);
}

Similar to the first example, hello.d, countdown.d uses a sequence of characters enclosed in double quotes, called a string constant, to print a final message when the countdown is complete. The exit function is then used to exit dtrace and return to the shell prompt.

If you look back at the structure of countdown.d, you will see that by creating two clauses with the same probe description but different predicates and actions, we effectively created the logical flow:

i = 10
once per second,
  if i is greater than zero
    trace(i--);
  if i is equal to zero
    trace("blastoff!");
    exit(0);

When you wish to write complex programs using predicates, try to first visualize your algorithm in this manner, and then transform each path of your conditional constructs into a separate clause and predicate.
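
As another sketch of this transformation, consider the two-way branch "if the requested size is at most 4096 bytes, report a small read; otherwise, report a large read". Each path becomes its own clause with a complementary predicate. The example assumes that arg2 holds the third argument to read(), the requested byte count, and the 4096-byte threshold is arbitrary:

```d
/* First path: the "if" branch. */
syscall::read:entry
/arg2 <= 4096/
{
  trace("small read");
}

/* Second path: the "else" branch, with the opposite predicate. */
syscall::read:entry
/arg2 > 4096/
{
  trace("large read");
}
```

Exactly one of the two predicates is true for any given probe firing, so the pair behaves like a single if-else statement.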

Now let us combine predicates with a new provider, the syscall provider, and create our first real D tracing program. The syscall provider permits you to enable probes on entry to or return from any Oracle Linux system call. The next example uses DTrace to observe every time your shell performs a read() or write() system call. First, open two windows, one to use for DTrace and the other containing the shell process that you are going to watch. In the second window, type the following command to obtain the process ID of this shell:

# echo $$
2860

Now go back to your first window and type the following D program and save it in a file named rw.d. As you type in the program, replace the integer constant 2860 with the process ID of the shell that was printed in response to your echo command.

syscall::read:entry,
syscall::write:entry
/pid == 2860/
{
}

Notice that the body of rw.d's probe clause is left empty because the program is only intended to trace notification of probe firings and not to trace any additional data. Once you have typed in rw.d, use dtrace to start your experiment and then go to your second shell window and type a few commands, pressing return after each command. As you type, you should see dtrace report probe firings in your first window, similar to the following example:

# dtrace -s rw.d
dtrace: script 'rw.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1      7                      write:entry 
  1      5                       read:entry 
  0      7                      write:entry 
  0      5                       read:entry 
  0      7                      write:entry 
  0      5                       read:entry 
  0      7                      write:entry 
  0      5                       read:entry  
  0      7                      write:entry 
  1      7                      write:entry 
  1      7                      write:entry 
  1      5                       read:entry
...^C

You are now watching your shell perform read() and write() system calls to read a character from your terminal window and echo back the result. This example includes many of the concepts described so far and a few new ones as well. First, to instrument read() and write() in the same manner, the script uses a single probe clause with multiple probe descriptions by separating the descriptions with commas like this:

syscall::read:entry,
syscall::write:entry

Each probe description appears on its own line. This arrangement is not strictly required, but it makes for a more readable script. Next, the script defines a predicate that matches only those system calls that are executed by your shell process:

/pid == 2860/

The predicate uses the predefined DTrace variable pid, which always evaluates to the process ID associated with the thread that fired the corresponding probe. DTrace provides many built-in variable definitions for useful things like the process ID. The following table lists a few DTrace variables you can use to write your first D programs.

Variable Name    Data Type    Meaning
errno            int          Current errno value for system calls
execname         string       Name of the current process's executable file
pid              pid_t        Process ID of the current process
tid              id_t         Thread ID of the current thread
probeprov        string       Current probe description's provider field
probemod         string       Current probe description's module field
probefunc        string       Current probe description's function field
probename        string       Current probe description's name field
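
As a sketch of how these variables can be combined, the following clause uses execname in a predicate and traces pid and probefunc. The executable name bash is an assumption; substitute the name of a program that is running on your system:

```d
/* Match system call entry probes fired by any process named bash. */
syscall:::entry
/execname == "bash"/
{
  trace(pid);
  trace(probefunc);
}
```

Selecting by execname rather than by pid matches every process that runs the named executable, which is convenient when the process ID is not known in advance.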

Now that you've written a real instrumentation program, try experimenting with it on different processes running on your system by changing the process ID and the system call probes that are instrumented. Then, you can make one more simple change and turn rw.d into a very simple version of a system call tracing tool like strace. An empty probe description field acts as a wildcard, matching any probe, so change your program to the following new source code to trace any system call executed by your shell:

syscall:::entry
/pid == 2860/
{
}

Try typing a few commands in the shell such as cd, ls, and date and see what your DTrace program reports.

2.5 Output Formatting Examples

System call tracing is a powerful way to observe the behavior of many user processes. The following example improves upon the earlier rw.d program by formatting its output so that you can understand it more easily. Type the following program and save it in a file called stracerw.d:

syscall::read:entry,
syscall::write:entry
/pid == $1/
{
  printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);
}

syscall::read:return,
syscall::write:return
/pid == $1/
{
  printf("\tt = %d\n", arg1);
}

In this example, the constant 2860 is replaced with the label $1 in each predicate. This label enables you to specify the process of interest as an argument to the script: $1 is replaced by the value of the first argument when the script is compiled. To execute stracerw.d, use the dtrace options -q and -s, followed by the process ID of your shell as the final argument. The -q option indicates that dtrace should be quiet and suppress the header line and the CPU and ID columns shown in the preceding examples. As a result, you only see the output for the data that you explicitly trace. Type the following command, replacing 2860 with the process ID of a shell process, and then press return a few times in the specified shell:

# dtrace -q -s stracerw.d 2860
        t = 1
write(2, 0x7fa621b9b000,    1)  t = 1
write(1, 0x7fa621b9c000,   22)  t = 22
write(2, 0x7fa621b9b000,   20)  t = 20
read(0, 0x7fff60f74b8f,    1)   t = 1
write(2, 0x7fa621b9b000,    1)  t = 1
write(1, 0x7fa621b9c000,   22)  t = 22
write(2, 0x7fa621b9b000,   20)  t = 20
read(0, 0x7fff60f74b8f,    1)   t = 1
write(2, 0x7fa621b9b000,    1)  t = 1
write(1, 0x7fa621b9c000,   22)  t = 22
write(2, 0x7fa621b9b000,   20)  t = 20
read(0, 0x7fff60f74b8f,    1)^C
#

Now let us examine your D program and its output in more detail. First, a clause similar to the earlier program instruments each of the shell's calls to read() and write(). But for this example, we use a new function, printf, to trace the data and print it out in a specific format:

syscall::read:entry,
syscall::write:entry
/pid == $1/
{
  printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);
}

The printf function combines the ability to trace data, as if by the trace function used earlier, with the ability to output the data and other text in a specific format that you describe. The printf function tells DTrace to trace the data associated with each argument after the first argument, and then to format the results using the rules described by the first printf argument, known as a format string.

The format string is a regular string that contains any number of format conversions, each beginning with the % character, that describe how to format the corresponding argument. The first conversion in the format string corresponds to the second printf argument, the second conversion to the third argument, and so on. All of the text between conversions is printed verbatim. The character following the % conversion character describes the format to use for the corresponding argument. Here are the meanings of the three format conversions used in stracerw.d.

Format Conversion    Description
%d                   Print the corresponding value as a decimal integer
%s                   Print the corresponding value as a string
%x                   Print the corresponding value as a hexadecimal integer

DTrace printf works just like the C printf() library routine or the shell printf utility. If you have never seen printf before, the formats and options are explained in detail in Chapter 6, Output Formatting. You should read that chapter carefully even if you are already familiar with printf from another language. In D, printf is provided as a built-in, and some new format conversions designed specifically for DTrace are available to you.

To help you write correct programs, the D compiler validates each printf format string against its argument list. Try changing probefunc in the clause above to the integer 123. If you run the modified program, you will see an error message telling you that the string format conversion %s is not appropriate for use with an integer argument:

# dtrace -q -s stracerw.d
dtrace: failed to compile script stracerw.d: line 5: printf( )
argument #2 is incompatible with conversion #1 prototype:
	conversion: %s
	 prototype: char [] or string (or use stringof)
	  argument: int
#

To print the name of the read or write system call and its arguments, use the printf statement:

printf("%s(%d, 0x%x, %4d)", probefunc, arg0, arg1, arg2);

to trace the name of the current probe function and the first three integer arguments to the system call, available in the DTrace variables arg0, arg1, and arg2. For more information about probe arguments, see Section 2.9.5, “Built-In Variables”. The first argument to read() and write() is a file descriptor, printed in decimal. The second argument is a buffer address, formatted as a hexadecimal value. The final argument is the buffer size, formatted as a decimal value. The format specifier %4d is used for the third argument to indicate that the value should be printed using the %d format conversion with a minimum field width of 4 characters. If the integer is less than 4 characters wide, printf inserts extra blanks to align the output.
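
To see these conversions and the field width in isolation, the following sketch prints constant values from the BEGIN probe. The values are arbitrary and chosen only for illustration:

```d
dtrace:::BEGIN
{
  /* %4d pads 42 with blanks to a minimum width of four characters. */
  printf("[%4d] [0x%x] [%s]\n", 42, 255, "done");
  exit(0);
}
```

When run with the -q and -s options, this script should print [  42] [0xff] [done], with two blanks inserted before 42 to fill the four-character field.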

To print the result of the system call and complete each line of output, use the following clause:

syscall::read:return,
syscall::write:return
/pid == $1/
{
  printf("\tt = %d\n", arg1);
}

Notice that the syscall provider also publishes a probe named return for each system call in addition to entry. The DTrace variable arg1 for the syscall return probes evaluates to the system call's return value. The return value is formatted as a decimal integer. The character sequences beginning with backslashes in the format string expand to tab (\t) and newline (\n) respectively. These escape sequences help you print or record characters that are difficult to type. D supports the same set of escape sequences as C, C++, and the Java programming language. For a complete list of escape sequences, see Section 2.8.3, “Constants”.

2.6 Array Overview

D permits you to define variables that are integers, as well as other types to represent strings and composite types called structs and unions. If you are familiar with C programming, you will be happy to know you can use any type in D that you can in C. If you are not a C expert, do not worry: the different kinds of data types are all described in Section 2.8, “Types, Operators, and Expressions”.

D also supports arrays. Linearly indexed scalar arrays, familiar to C programmers, are discussed in Section 2.10.3, “Array Declarations and Storage”.

More powerful and commonly used are associative arrays, which are indexed with tuples. Each associative array has a particular type signature: its tuples all have the same number of elements, with elements of consistent type in the same order, and its values are all of the same type. D associative arrays are described further in Section 2.9.2, “Associative Arrays”.

2.6.1 Associative Array Example

For example, the following D statements access an associative array, whose values must all be type int and whose tuples must all have signature string,int, setting an element to 456 and then incrementing it to 457:

a["hello", 123] = 456;
a["hello", 123]++;

Now let us use an associative array in a D program. Type the following program and save it in a file named rwtime.d:

syscall::read:entry,
syscall::write:entry
/pid == $1/
{
  ts[probefunc] = timestamp;
}
syscall::read:return,
syscall::write:return
/pid == $1 && ts[probefunc] != 0/
{
  printf("%d nsecs", timestamp - ts[probefunc]);
}

As with stracerw.d, specify the ID of the shell process when you execute rwtime.d. If you type a few shell commands, you will see the time elapsed during each system call. Type in the following command and then press return a few times in your other shell:

# dtrace -s rwtime.d `/usr/bin/pgrep -n bash`
dtrace: script 'rwtime.d' matched 4 probes
CPU     ID                    FUNCTION:NAME
  0      8                     write:return 51962 nsecs
  0      8                     write:return 45257 nsecs
  0      8                     write:return 40787 nsecs
  1      6                      read:return 925959305 nsecs
  1      8                     write:return 46934 nsecs
  1      8                     write:return 41626 nsecs
  1      8                     write:return 176839 nsecs
...
^C
#

To trace the elapsed time for each system call, you must instrument both the entry to and return from read() and write() and measure the time at each point. Then, on return from a given system call, you must compute the difference between the first and second timestamps. You could use separate variables for each system call, but this would make the program annoying to extend to additional system calls. Instead, it is easier to use an associative array indexed by the probe function name. The following is the first probe clause:

syscall::read:entry,
syscall::write:entry
/pid == $1/
{
  ts[probefunc] = timestamp;
}

This clause defines an array named ts and assigns the appropriate member the value of the DTrace variable timestamp. This variable returns the value of an always-incrementing nanosecond counter. After the entry timestamp is saved, the corresponding return probe samples timestamp again and reports the difference between the current time and the saved value:

syscall::read:return,
syscall::write:return
/pid == $1 && ts[probefunc] != 0/
{
  printf("%d nsecs", timestamp - ts[probefunc]);
}

The predicate on the return probe requires that DTrace is tracing the appropriate process and that the corresponding entry probe has already fired and assigned ts[probefunc] a non-zero value. This trick eliminates invalid output when DTrace first starts. If your shell is already waiting in a read() system call for input when you execute dtrace, the read:return probe fires without a preceding read:entry for this first read() and ts[probefunc] will evaluate to zero because it has not yet been assigned.

2.7 External Symbols and Types

DTrace instrumentation executes inside the Oracle Linux operating system kernel. So, in addition to accessing special DTrace variables and probe arguments, you can also access kernel data structures, symbols, and types. These capabilities enable advanced DTrace users, administrators, service personnel, and driver developers to examine low-level behavior of the operating system kernel and device drivers. The reading list at the start of this guide includes books that can help you learn more about Oracle Linux operating system internals.

D uses the back quote character (`) as a special scoping operator for accessing symbols that are defined in the operating system and not in your D program. For example, the Oracle Linux kernel contains a C declaration of a system variable named max_pfn. This variable is declared in C in the kernel source code as follows:

unsigned long max_pfn;

To trace the value of this variable in a D program, you can write the following D statement:

trace(`max_pfn);

DTrace associates each kernel symbol with the type that is used for the symbol in the corresponding operating system C code, which provides easy source-based access to the native operating system data structures.

To use external operating system variables, you will need access to the corresponding operating system source code.

Kernel symbol names are kept in a separate namespace from D variable and function identifiers, so you do not need to be concerned about these names conflicting with your D variables. When you prefix a variable with a back quote, the D compiler searches the known kernel symbols and uses the list of loaded modules to find a matching variable definition. Because the Oracle Linux kernel supports dynamically loaded modules with separate symbol namespaces, the same variable name might be used more than once in the active operating system kernel. You can resolve these name conflicts by specifying the name of the kernel module that contains the variable to be accessed prior to the back quote in the symbol name. For example, you would refer to the address of the _bar function that is provided by a kernel module named foo as follows:

foo`_bar

You can apply any of the D operators to external variables, subject to the usual rules for operand types, except for operators that modify values, such as = or +=. When required, the D compiler loads the variable names that correspond to active kernel modules, so you do not need to declare these variables. For safety reasons, DTrace prevents you from damaging or corrupting the state of the software that you are observing.

When you access external variables from a D program, you are accessing the internal implementation details of another program, such as the operating system kernel or its device drivers. These implementation details do not form a stable interface upon which you can rely. Any D programs you write that depend on these details might cease to work when you next upgrade the corresponding piece of software. For this reason, external variables are typically used to debug performance or functionality problems by using DTrace. To learn more about the stability of your D programs, see Chapter 16, DTrace Stability Features.

You have now completed a whirlwind tour of DTrace and have learned many of the basic DTrace building blocks that are necessary to build larger and more complex D programs. The remaining portions of this chapter describe the complete set of rules for D and demonstrate how DTrace can make complex performance measurements and functional analysis of the system easy. Later, you will learn how to use DTrace to connect user application behavior to system behavior, which provides you with the capability to analyze your entire software stack.

2.8 Types, Operators, and Expressions

D provides the ability to access and manipulate a variety of data objects: variables and data structures can be created and modified, data objects that are defined in the operating system kernel and user processes can be accessed, and integer, floating-point, and string constants can be declared. D provides a superset of the ANSI C operators that are used to manipulate objects and create complex expressions. This section describes the detailed set of rules for types, operators, and expressions.

2.8.1 Identifier Names and Keywords

D identifier names are composed of uppercase and lowercase letters, digits, and underscores, where the first character must be a letter or underscore. All identifier names beginning with an underscore (_) are reserved for use by the D system libraries. You should avoid using these names in your D programs. By convention, D programmers typically use mixed-case names for variables and all uppercase names for constants.

D language keywords are special identifiers that are reserved for use in the programming language syntax itself. These names are always specified in lowercase and must not be used for the names of D variables. The following table lists the keywords that are reserved for use by the D language.

Table 2.2 D Keywords

auto*        do*       if*           register*    string+        unsigned
break*       double    import*+      restrict*    stringof+      void
case*        else*     inline        return*      struct         volatile
char         enum      int           self+        switch*        while*
const        extern    long          short        this+          xlate+
continue*    float     offsetof+     signed       translator+
counter*+    for*      probe*+       sizeof       typedef
default*     goto*     provider*+    static*      union

D reserves for use as keywords a superset of the ANSI C keywords. The keywords reserved for future use by the D language are marked with “*”. The D compiler produces a syntax error if you attempt to use a keyword that is reserved for future use. The keywords that are defined by D but not defined by ANSI C are marked with “+”. D provides the complete set of types and operators found in ANSI C. The major difference in D programming is the absence of control-flow constructs. Note that keywords associated with control-flow in ANSI C are reserved for future use in D.

2.8.2 Data Types and Sizes

D provides fundamental data types for integers and floating-point constants. Arithmetic may only be performed on integers in D programs. Floating-point constants may be used to initialize data structures, but floating-point arithmetic is not permitted in D. In Oracle Linux, D provides a 64-bit data model for use in writing programs; a 32-bit data model is not supported. The data model used when executing your program is the native data model that is associated with the active operating system kernel, which must also be 64-bit.

The names of the integer types and their sizes in the 64-bit data model are shown in the following table. Integers are always represented in two's-complement form in the native byte-encoding order of your system.

Table 2.3 D Integer Data Types

Type Name    64-bit Size
char         1 byte
short        2 bytes
int          4 bytes
long         8 bytes
long long    8 bytes


Integer types can be prefixed with the signed or unsigned qualifier. If no sign qualifier is present, it is assumed that the type is signed. The D compiler also provides the type aliases that are listed in the following table.

Table 2.4 D Integer Type Aliases

Type Name    Description
int8_t       1-byte signed integer
int16_t      2-byte signed integer
int32_t      4-byte signed integer
int64_t      8-byte signed integer
intptr_t     Signed integer of size equal to a pointer
uint8_t      1-byte unsigned integer
uint16_t     2-byte unsigned integer
uint32_t     4-byte unsigned integer
uint64_t     8-byte unsigned integer
uintptr_t    Unsigned integer of size equal to a pointer


These type aliases are equivalent to using the name of the corresponding base type listed in the previous table and are appropriately defined for each data model. For example, the uint8_t type name is an alias for the type unsigned char. See Section 2.13, “Type and Constant Definitions” for information about how to define your own type aliases for use in D programs.

Note

The predefined type aliases cannot be used in files that are included by the preprocessor.

D provides floating-point types for compatibility with ANSI C declarations and types. Floating-point operators are not supported in D, but floating-point data objects can be traced and formatted with the printf function. You can use the floating-point types that are listed in the following table.

Table 2.5 D Floating-Point Data Types

Type Name      64-bit Size
float          4 bytes
double         8 bytes
long double    16 bytes


D also provides the special type string to represent ASCII strings. Strings are discussed in more detail in Section 2.11, “DTrace Support for Strings”.

2.8.3 Constants

Integer constants can be written in decimal (12345), octal (012345), or hexadecimal (0x12345) format. Octal (base 8) constants must be prefixed with a leading zero. Hexadecimal (base 16) constants must be prefixed with either 0x or 0X. Integer constants are assigned the smallest type among int, long, and long long that can represent their value. If the value is negative, the signed version of the type is used. If the value is positive and too large to fit in the signed type representation, the unsigned type representation is used. You can apply one of the suffixes listed in the following table to any integer constant to explicitly specify its D type.

Suffix        D type
u or U        unsigned version of the type selected by the compiler
l or L        long
ul or UL      unsigned long
ll or LL      long long
ull or ULL    unsigned long long
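
The suffix rules closely follow those of C, so a short C sketch can make them concrete. The following is an illustration only: it is written in C11 rather than D, it assumes an LP64 system such as 64-bit Linux, and the TYPE_NAME and HAS_TYPE macros are hypothetical helpers that are not part of DTrace. It deliberately avoids unsuffixed constants that are too large for a signed type, because the C rule for that case is not identical to the D rule described above.

```c
#include <string.h>

/* Map an expression to the name of its C type (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    default: "other")

/* Evaluates to 1 if expression x has the type called name, else 0. */
#define HAS_TYPE(x, name) (strcmp(TYPE_NAME(x), (name)) == 0)
```

On such a system, HAS_TYPE(12345, "int") and HAS_TYPE(4294967296, "long") both evaluate to 1: the second constant does not fit in an int, so the next larger type is selected.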

Floating-point constants are always written in decimal format and must contain either a decimal point (12.345), an exponent (123e45), or both (123.34e-5). Floating-point constants are assigned the type double by default. You can apply one of the suffixes listed in the following table to any floating-point constant to explicitly specify its D type.

Suffix    D type
f or F    float
l or L    long double

Character constants are written as a single character or escape sequence that is enclosed in a pair of single quotes ('a'). Character constants are assigned the int type rather than char and are equivalent to an integer constant with a value that is determined by that character's value in the ASCII character set. See the ascii(7) manual page for a list of characters and their values. You can also use any of the special escape sequences that are listed in the following table in your character constants. D supports the same escape sequences as those found in ANSI C.

Table 2.6 Character Escape Sequences

Escape Sequence    Represents         Escape Sequence    Represents
\a                 alert              \\                 backslash
\b                 backspace          \?                 question mark
\f                 form feed          \'                 single quote
\n                 newline            \"                 double quote
\r                 carriage return    \0oo               octal value 0oo
\t                 horizontal tab     \xhh               hexadecimal value 0xhh
\v                 vertical tab       \0                 null character
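
Because D shares its escape sequences with ANSI C, you can explore them from a small C program. The following sketch is written in C, not D, and assumes an ASCII execution environment; the escape_name function is a hypothetical helper introduced only for this illustration.

```c
#include <stddef.h>

/* Return the Table 2.6 description of an escape character, or NULL. */
const char *escape_name(int c)
{
    switch (c) {
    case '\a': return "alert";
    case '\b': return "backspace";
    case '\f': return "form feed";
    case '\n': return "newline";
    case '\r': return "carriage return";
    case '\t': return "horizontal tab";
    case '\v': return "vertical tab";
    case '\\': return "backslash";
    case '\?': return "question mark";
    case '\'': return "single quote";
    case '\"': return "double quote";
    case '\0': return "null character";
    default:   return NULL;   /* not an escape listed in the table */
    }
}
```

In C, as in D, a character constant has type int, so sizeof('a') equals sizeof(int). The octal and hexadecimal forms name a character by value: in ASCII, both '\101' and '\x41' are equal to 'A'.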


You can include more than one character specifier inside single quotes to create integers with individual bytes that are initialized according to the corresponding character specifiers. The bytes are read left-to-right from your character constant and assigned to the resulting integer in the order corresponding to the native endianness of your operating environment. Up to eight character specifiers can be included in a single character constant.

String constants of any length can be composed by enclosing them in a pair of double quotes ("hello"). A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants can contain any of the special character escape sequences that are shown for character constants previously. Similar to ANSI C, strings are represented as arrays of characters terminated by a null character (\0) that is implicitly added to each string constant you declare. String constants are assigned the special D type string. The D compiler provides a set of special features for comparing and tracing character arrays that are declared as strings. See Section 2.11, “DTrace Support for Strings” for more information.

2.8.4 Arithmetic Operators

D provides the binary arithmetic operators that are described in the following table for use in your programs. These operators all have the same meaning for integers that they do in ANSI C.

Table 2.7 Binary Arithmetic Operators

Operator    Description
+           Integer addition
-           Integer subtraction
*           Integer multiplication
/           Integer division
%           Integer modulus


Arithmetic in D may only be performed on integer operands or on pointers. See Section 2.10, “Pointers and Scalar Arrays”. Arithmetic may not be performed on floating-point operands in D programs. The DTrace execution environment does not take any action on integer overflow or underflow. You must specifically check for these conditions in situations where overflow and underflow can occur.

However, the DTrace execution environment does automatically check for and report division by zero errors resulting from improper use of the / and % operators. If a D program executes an invalid division operation, DTrace automatically disables the affected instrumentation and reports the error. Errors that are detected by DTrace have no effect on other DTrace users or on the operating system kernel. You therefore do not need to be concerned about causing any damage if your D program inadvertently contains one of these errors.
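
Because these operators have their ANSI C meaning for integers, their behavior can be previewed in C. The following sketch is C, not D, and relies on C99 or later for the results with negative operands; the function names are hypothetical. Unlike DTrace, C does not detect division by zero for you, so the sketch includes an explicit guard, which also covers the one quotient that overflows.

```c
#include <stdint.h>

/* Returns 1 when a / b and a % b are safe to evaluate in C, else 0. */
int can_divide(int64_t a, int64_t b)
{
    if (b == 0)
        return 0;                 /* division by zero */
    if (a == INT64_MIN && b == -1)
        return 0;                 /* quotient would overflow int64_t */
    return 1;
}

/* Integer division truncates toward zero (C99 and later). */
int64_t quotient(int64_t a, int64_t b) { return a / b; }

/* The remainder takes the sign of the left-hand operand (C99 and later). */
int64_t modulus(int64_t a, int64_t b) { return a % b; }
```

For example, quotient(-7, 2) is -3 and modulus(-7, 2) is -1, so the identity (a / b) * b + a % b == a holds for both signs.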

In addition to these binary operators, the + and - operators can also be used as unary operators, and these operators have higher precedence than any of the binary arithmetic operators. The order of precedence and associativity properties for all of the D operators is presented in Table 2.12, “D Operator Precedence and Associativity”. You can control precedence by grouping expressions in parentheses (()).

2.8.5 Relational Operators

D provides the binary relational operators that are described in the following table for use in your programs. These operators all have the same meaning that they do in ANSI C.

Table 2.8 D Relational Operators

Operator    Description
<           Left-hand operand is less than right-hand operand
<=          Left-hand operand is less than or equal to right-hand operand
>           Left-hand operand is greater than right-hand operand
>=          Left-hand operand is greater than or equal to right-hand operand
==          Left-hand operand is equal to right-hand operand
!=          Left-hand operand is not equal to right-hand operand


Relational operators are most frequently used to write D predicates. Each operator evaluates to a value of type int, which is equal to one if the condition is true, or zero if it is false.

Relational operators can be applied to pairs of integers, pointers, or strings. If pointers are compared, the result is equivalent to an integer comparison of the two pointers interpreted as unsigned integers. If strings are compared, the result is determined as if by performing a strcmp() on the two operands. The following table shows some example D string comparisons and their results.

D string comparison      Result
"coffee" < "espresso"    Returns 1 (true)
"coffee" == "coffee"     Returns 1 (true)
"coffee" >= "mocha"      Returns 0 (false)
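
Because string comparison in D is determined as if by strcmp(), the comparisons in the preceding table can be reproduced in C. The following sketch is C, not D, and the helper names are hypothetical; each helper mirrors one D relational operator applied to two strings.

```c
#include <string.h>

/* D: a < b   (strings) */
int str_lt(const char *a, const char *b) { return strcmp(a, b) < 0; }

/* D: a == b  (strings) */
int str_eq(const char *a, const char *b) { return strcmp(a, b) == 0; }

/* D: a >= b  (strings) */
int str_ge(const char *a, const char *b) { return strcmp(a, b) >= 0; }
```

The comparison is byte by byte, so "coffee" sorts before both "espresso" and "mocha" because 'c' has a lower value than 'e' and 'm'.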

Relational operators can also be used to compare a data object associated with an enumeration type with any of the enumerator tags defined by the enumeration. Enumerations are a facility for creating named integer constants and are described in more detail in Section 2.13, “Type and Constant Definitions”.

2.8.6 Logical Operators

D provides the binary logical operators that are listed in the following table for use in your programs. The first two operators are equivalent to the corresponding ANSI C operators.

Table 2.9 D Logical Operators

Operator    Description
&&          Logical AND: true if both operands are true
||          Logical OR: true if one or both operands are true
^^          Logical XOR: true if exactly one operand is true


Logical operators are most frequently used in writing D predicates. The logical AND operator performs the following short-circuit evaluation: if the left-hand operand is false, the right-hand expression is not evaluated. The logical OR operator also performs the following short-circuit evaluation: if the left-hand operand is true, the right-hand expression is not evaluated. The logical XOR operator does not short-circuit. Both expression operands are always evaluated.
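
The evaluation rules can be observed in C, where && and || behave the same way. C has no ^^ operator, so the following sketch, which is C and not D, emulates it with a hypothetical logical_xor function; a function call always evaluates both arguments, which matches the rule that XOR does not short-circuit. All names in the sketch are illustrative assumptions.

```c
static int calls;                       /* times rhs() has been evaluated */

/* Right-hand operand with a visible side effect. */
static int rhs(int v) { calls++; return v; }

/* Logical XOR: true if exactly one operand is non-zero. */
int logical_xor(long a, long b) { return (a != 0) != (b != 0); }

/* Each function returns how many times the right-hand operand ran. */
int rhs_evals_and(int lhs) { calls = 0; (void)(lhs && rhs(1)); return calls; }
int rhs_evals_or(int lhs)  { calls = 0; (void)(lhs || rhs(1)); return calls; }
int rhs_evals_xor(int lhs) { calls = 0; (void)logical_xor(lhs, rhs(1)); return calls; }
```

Here rhs_evals_and(0) and rhs_evals_or(1) both return 0, because the left-hand operand alone decides the result, whereas rhs_evals_xor returns 1 for any left-hand value.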

In addition to the binary logical operators, the unary ! operator can be used to perform a logical negation of a single operand: it converts a zero operand into a one and a non-zero operand into a zero. By convention, D programmers use ! when working with integers that are meant to represent boolean values and == 0 when working with non-boolean integers, although the expressions are equivalent.

The logical operators may be applied to operands of integer or pointer types. The logical operators interpret pointer operands as unsigned integer values. As with all logical and relational operators in D, operands are true if they have a non-zero integer value and false if they have a zero integer value.

2.8.7 Bitwise Operators

D provides the binary operators that are listed in the following table for manipulating individual bits inside of integer operands. These operators all have the same meaning as in ANSI C.

Table 2.10 D Bitwise Operators

Operator    Description
&           Bitwise AND
|           Bitwise OR
^           Bitwise XOR
<<          Shift the left-hand operand left by the number of bits specified by the right-hand operand
>>          Shift the left-hand operand right by the number of bits specified by the right-hand operand


The binary & operator is used to clear bits from an integer operand. The binary | operator is used to set bits in an integer operand. The binary ^ operator returns one in each bit position where exactly one of the corresponding operand bits is set.

The shift operators are used to move bits left or right in a given integer operand. Shifting left fills empty bit positions on the right-hand side of the result with zeroes. Shifting right using an unsigned integer operand fills empty bit positions on the left-hand side of the result with zeroes. Shifting right using a signed integer operand fills empty bit positions on the left-hand side with the value of the sign bit, also known as an arithmetic shift operation.

Shifting an integer value by a negative number of bits or by a number of bits larger than the number of bits in the left-hand operand itself produces an undefined result. The D compiler produces an error message if the compiler can detect this condition when you compile your D program.

In addition to the binary bitwise operators, the unary ~ operator may be used to perform a bitwise negation of a single operand: it converts each zero bit in the operand into a one bit, and each one bit in the operand into a zero bit.
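
The common idioms built from these operators can be sketched in C, where the operators have the same meaning. The following is C, not D, and the function names are hypothetical. One caveat is labeled in the code: ISO C leaves the right shift of a negative value implementation-defined, so the arithmetic-shift result shown for shr_signed is an assumption that holds for GCC and Clang.

```c
#include <stdint.h>

/* | sets every bit that is set in mask. */
uint32_t set_bits(uint32_t v, uint32_t mask)    { return v | mask; }

/* & with the complement of mask clears every bit that is set in mask. */
uint32_t clear_bits(uint32_t v, uint32_t mask)  { return v & ~mask; }

/* ^ flips every bit that is set in mask. */
uint32_t toggle_bits(uint32_t v, uint32_t mask) { return v ^ mask; }

/* Unsigned right shift: vacated high-order bits are zero. Requires n < 32. */
uint32_t shr_unsigned(uint32_t v, unsigned n)   { return v >> n; }

/*
 * Signed right shift. Implementation-defined in ISO C for negative values;
 * GCC and Clang copy the sign bit (arithmetic shift). Requires n < 32.
 */
int32_t shr_signed(int32_t v, unsigned n)       { return v >> n; }
```

For example, clear_bits(0xFF, 0x0F) yields 0xF0, and shr_unsigned(0x80000000, 31) yields 1 because the vacated positions are filled with zeroes.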

2.8.8 Assignment Operators

D provides the binary assignment operators that are listed in the following table for modifying D variables. You can only modify D variables and arrays. Kernel data objects and constants may not be modified using the D assignment operators. The assignment operators have the same meaning as they do in ANSI C.

Table 2.11 D Assignment Operators

Operator    Description
=           Set the left-hand operand equal to the right-hand expression value.
+=          Increment the left-hand operand by the right-hand expression value.
-=          Decrement the left-hand operand by the right-hand expression value.
*=          Multiply the left-hand operand by the right-hand expression value.
/=          Divide the left-hand operand by the right-hand expression value.
%=          Modulo the left-hand operand by the right-hand expression value.
|=          Bitwise OR the left-hand operand with the right-hand expression value.
&=          Bitwise AND the left-hand operand with the right-hand expression value.
^=          Bitwise XOR the left-hand operand with the right-hand expression value.
<<=         Shift the left-hand operand left by the number of bits specified by the right-hand expression value.
>>=         Shift the left-hand operand right by the number of bits specified by the right-hand expression value.


Aside from the assignment operator =, the other assignment operators are provided as shorthand for using the = operator with one of the other operators that were described earlier. For example, the expression x = x + 1 is equivalent to the expression x += 1, except that the expression x is evaluated only once. These assignment operators adhere to the same rules for operand types as the binary forms described earlier.
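
The single-evaluation rule only matters when the left-hand expression has a side effect. The following sketch shows the difference in C, which defines the compound operators the same way; it is C, not D, and the slot function and its counter are hypothetical devices that make each evaluation of the left-hand expression visible.

```c
static int evals;        /* times slot() has been evaluated */
static int cell;         /* the object that slot() designates */

/* Left-hand expression with a visible side effect. */
static int *slot(void) { evals++; return &cell; }

/* Compound form: the left-hand expression is evaluated once. */
int evals_compound(void) { evals = 0; *slot() += 1; return evals; }

/* Spelled-out form: the expression appears twice, so it runs twice. */
int evals_expanded(void) { evals = 0; *slot() = *slot() + 1; return evals; }
```

Both forms add one to the same object, but evals_compound() returns 1 and evals_expanded() returns 2.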

The result of any assignment operator is an expression equal to the new value of the left-hand expression. You can use the assignment operators or any of the operators described thus far in combination to form expressions of arbitrary complexity. You can use parentheses () to group terms in complex expressions.

2.8.9 Increment and Decrement Operators

D provides the special unary ++ and -- operators for incrementing and decrementing pointers and integers. These operators have the same meaning as they do in ANSI C. These operators can only be applied to variables and they may be applied either before or after the variable name. If the operator appears before the variable name, the variable is first modified and then the resulting expression is equal to the new value of the variable. For example, the following two code fragments produce identical results:

x += 1; y = x;

y = ++x;

If the operator appears after the variable name, then the variable is modified after its current value is returned for use in the expression. For example, the following two code fragments produce identical results:

y = x; x -= 1;

y = x--;

You can use the increment and decrement operators to create new variables without declaring them. If a variable declaration is omitted and the increment or decrement operator is applied to a variable, the variable is implicitly declared to be of type int64_t.

The increment and decrement operators can be applied to integer or pointer variables. When applied to integer variables, the operators increment or decrement the corresponding value by one. When applied to pointer variables, the operators increment or decrement the pointer address by the size of the data type that is referenced by the pointer. Pointers and pointer arithmetic in D are discussed in Section 2.10, “Pointers and Scalar Arrays”.
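
The prefix and postfix forms, and the way a pointer steps by the size of its referenced type, can be checked in C, where ++ and -- have the same meaning. The following sketch is C, not D, and its function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* y = ++x: x is modified first, and y receives the new value. */
int pre_increment_result(int x)  { int y = ++x; return y; }

/* y = x--: y receives the old value, and x is modified afterward. */
int post_decrement_result(int x) { int y = x--; return y; }

/* Number of bytes by which ++ advances a pointer to int64_t. */
ptrdiff_t pointer_step_bytes(void)
{
    int64_t buf[2] = { 0, 0 };
    int64_t *p = buf;
    char *before = (char *)p;

    p++;                               /* advance by one int64_t element */
    return (char *)p - before;
}
```

With x equal to 5, the first function returns 6 and the second returns 5, and pointer_step_bytes() returns 8, the size of an int64_t.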

2.8.10 Conditional Expressions

Although D does not provide support for if-then-else constructs, it does provide support for simple conditional expressions by using the ? and : operators. These operators enable a triplet of expressions to be associated, where the first expression is used to conditionally evaluate one of the other two.

For example, the following D statement could be used to set a variable x to one of two strings, depending on the value of i:

x = i == 0 ? "zero" : "non-zero";

In the previous example, the expression i == 0 is first evaluated to determine whether it is true or false. If the expression is true, the second expression is evaluated and its value is returned. If the expression is false, the third expression is evaluated and its value is returned.

As with any D operator, you can use multiple ?: operators in a single expression to create more complex expressions. For example, the following expression would take a char variable c containing one of the characters 0-9, a-f, or A-F, and return the value of this character when interpreted as a digit in a hexadecimal (base 16) integer:

hexval = (c >= '0' && c <= '9') ? c - '0' : (c >= 'a' && c <= 'f') ? c + 10 - 'a' : c + 10 - 'A';

To be evaluated for its truth value, the first expression that is used with ?: must be a pointer or integer. The second and third expressions can be of any compatible types. You may not construct a conditional expression where, for example, one path returns a string and another path returns an integer. The second and third expressions also may not invoke a tracing function such as trace or printf. If you want to conditionally trace data, use a predicate instead. See Section 2.4, “Predicate Examples” for more information.
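
The chained conditional expression shown earlier for hexadecimal digits is also valid C, so it can be tried outside of DTrace. The following sketch wraps it in a hypothetical C function named hexval; like the D expression, it assumes that c is one of the characters 0-9, a-f, or A-F and does not validate its input.

```c
/* Value of c as a hexadecimal digit; c must be 0-9, a-f, or A-F. */
int hexval(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' :
           (c >= 'a' && c <= 'f') ? c + 10 - 'a' : c + 10 - 'A';
}
```

Because ?: associates right to left, the second conditional is evaluated only when the first test fails. For example, hexval('7') is 7, and hexval('f') and hexval('F') are both 15.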

2.8.11 Type Conversions

When expressions are constructed by using operands of different but compatible types, type conversions are performed to determine the type of the resulting expression. The D rules for type conversions are the same as the arithmetic conversion rules for integers in ANSI C. These rules are sometimes referred to as the usual arithmetic conversions.

A simple way to describe the conversion rules is as follows: each integer type is ranked in the order char, short, int, long, long long, with each unsigned type assigned a rank higher than its signed equivalent, but below the next integer type. When you construct an expression using two integer operands such as x + y and the operands are of different integer types, the operand type with the highest rank is used as the result type.

If a conversion is required, the operand with the lower rank is first promoted to the type of the higher rank. Promotion does not actually change the value of the operand: it simply extends the value to a larger container according to its sign. If an unsigned operand is promoted, the unused high-order bits of the resulting integer are filled with zeroes. If a signed operand is promoted, the unused high-order bits are filled by performing sign extension. If a signed type is converted to an unsigned type, the signed type is first sign-extended and then assigned the new, unsigned type that is determined by the conversion.
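
Because D uses the ANSI C conversion rules for integers, promotion can be observed directly in C. The following sketch is C, not D, and its function names are hypothetical. It shows sign extension, zero fill, the signed-to-unsigned case, and the result type of a mixed expression.

```c
#include <stddef.h>
#include <stdint.h>

/* Promote a signed 8-bit value: the high-order bits copy the sign bit. */
int64_t widen_signed(int8_t v)        { return v; }

/* Promote an unsigned 8-bit value: the high-order bits are zero. */
uint64_t widen_unsigned(uint8_t v)    { return v; }

/* Signed to wider unsigned: sign-extend, then treat the bits as unsigned. */
uint64_t signed_to_unsigned(int8_t v) { return (uint64_t)v; }

/* Size of int + long: the result has the type of higher rank, long. */
size_t mixed_sum_size(void)
{
    int x = 1;
    long y = 2;

    return sizeof(x + y);
}
```

For example, widen_signed(-1) is still -1, widen_unsigned(0xFF) is 255, and signed_to_unsigned(-1) is the largest 64-bit unsigned value because every high-order bit is set by sign extension.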

Integers and other types can also be explicitly cast from one type to another. In D, pointers and integers can be cast to any integer or pointer types, but not to other types. Rules for casting and promoting strings and character arrays are discussed in Section 2.11, “DTrace Support for Strings”.

An integer or pointer cast is formed using an expression such as the following:

y = (int)x;

In this example, the destination type is enclosed in parentheses and used to prefix the source expression. Integers are cast to types of higher rank by performing promotion. Integers are cast to types of lower rank by zeroing the excess high-order bits of the integer.
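
Casting to a type of lower rank can be sketched in C, where a cast to a narrower unsigned type likewise keeps only the low-order bits. The following is C, not D, and the function names are hypothetical. The sketch uses unsigned destination types on purpose, because narrowing to a signed type is implementation-defined in ISO C.

```c
#include <stdint.h>

/* Cast to a lower rank: only the low-order 8 bits survive. */
uint8_t low_byte(uint32_t v)  { return (uint8_t)v; }

/* Cast to a lower rank: only the low-order 16 bits survive. */
uint16_t low_half(uint64_t v) { return (uint16_t)v; }
```

For example, low_byte(0x1234) is 0x34 and low_half(0x123456789) is 0x6789.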

Because D does not permit floating-point arithmetic, no floating-point operand conversion or casting is permitted and no rules for implicit floating-point conversion are defined.

2.8.12 Operator Precedence

Table 2.12, “D Operator Precedence and Associativity” lists the D rules for operator precedence and associativity. These rules are somewhat complex, but they are necessary to provide precise compatibility with the ANSI C operator precedence rules. The entries in the following table are in order from highest precedence to lowest precedence.

Table 2.12 D Operator Precedence and Associativity

Operators                                                  Associativity
() [] -> .                                                 Left to right
! ~ ++ -- + - * & (type) sizeof stringof offsetof xlate    Right to left
* / %                                                      Left to right
+ -                                                        Left to right
<< >>                                                      Left to right
< <= > >=                                                  Left to right
== !=                                                      Left to right
&                                                          Left to right
^                                                          Left to right
|                                                          Left to right
&&                                                         Left to right
^^                                                         Left to right
||                                                         Left to right
?:                                                         Right to left
= += -= *= /= %= &= ^= |= <<= >>=                          Right to left
,                                                          Left to right


Several operators listed in the previous table have not been discussed yet. These operators are described in subsequent chapters. The following table lists several miscellaneous operators that are provided by the D language.

Operators    Description                                        For More Information
sizeof       Computes the size of an object.                    Section 2.12, “Structs and Unions”
offsetof     Computes the offset of a type member.              Section 2.12, “Structs and Unions”
stringof     Converts the operand to a string.                  Section 2.11, “DTrace Support for Strings”
xlate        Translates a data type.                            Chapter 17, Translators
unary &      Computes the address of an object.                 Section 2.10, “Pointers and Scalar Arrays”
unary *      Dereferences a pointer to an object.               Section 2.10, “Pointers and Scalar Arrays”
-> and .     Accesses a member of a structure or union type.    Section 2.12, “Structs and Unions”

The comma (,) operator that is listed in the table is for compatibility with the ANSI C comma operator. It can be used to evaluate a set of expressions in left-to-right order and return the value of the right most expression. This operator is provided strictly for compatibility with C and should generally not be used.

The () entry listed in the table of operator precedence represents a function call. For examples of calls to functions, such as printf and trace, see Chapter 6, Output Formatting. A comma is also used in D to list arguments to functions and to form lists of associative array keys. Note that this comma is not the same as the comma operator and does not guarantee left-to-right evaluation. The D compiler provides no guarantee regarding the order of evaluation of arguments to a function or keys to an associative array. Be careful when using expressions with interacting side effects, such as the pair of expressions i and i++, in these contexts.
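The following minimal sketch shows one way to avoid depending on argument evaluation order. The variable names i and this->before are chosen only for illustration. The side effect is moved into its own statement so that each printf argument has a single, well-defined value:

```d
BEGIN
{
  i = 5;

  /*
   * Avoid: printf("%d %d\n", i, i++);
   * The order in which the two arguments are evaluated is not guaranteed.
   */

  /* Instead, perform the increment in a separate statement. */
  this->before = i;
  i++;
  printf("%d %d\n", this->before, i);   /* prints "5 6" */
  exit(0);
}
```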

The [] entry listed in the table of operator precedence represents an array or associative array reference. Examples of associative arrays are presented in Section 2.9.2, “Associative Arrays”. A special kind of associative array, called an aggregation, is described in Chapter 3, Aggregations. The [] operator can also be used to index into fixed-size C arrays. See Section 2.10, “Pointers and Scalar Arrays”.
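To illustrate how the rules in Table 2.12 affect the way an expression is grouped, consider the following sketch. It uses only integer constants, so the values in the comments follow directly from the precedence table. The file name prec.d is hypothetical.

```d
BEGIN
{
  /* * binds more tightly than +: parsed as 2 + (3 * 4), which is 14 */
  printf("%d\n", 2 + 3 * 4);

  /* == binds more tightly than &: parsed as 6 & (2 == 2), which is 6 & 1, or 0 */
  printf("%d\n", 6 & 2 == 2);

  /* Parentheses force the intended bit test: (6 & 2) == 2 is true, or 1 */
  printf("%d\n", (6 & 2) == 2);

  /* ^^ binds more tightly than ||: parsed as 1 || (1 ^^ 1), which is 1 */
  printf("%d\n", 1 || 1 ^^ 1);

  exit(0);
}
```

If you save the clause as prec.d, you could run it with dtrace -q -s prec.d. When in doubt about how an expression is grouped, add parentheses rather than relying on the table.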

2.9 Variables

D provides two basic types of variables for use in your tracing programs: scalar variables and associative arrays. An aggregation is a special kind of array variable. See Chapter 3, Aggregations for more information about aggregations.

To understand the scope of variables, consider the following figure.

Figure 2.2 Scope of Variables
The figure illustrates global, thread-local, and clause-local scopes.


In the figure, system execution is illustrated, showing elapsed time along the horizontal axis and thread number along the vertical axis. D probes fire at different times on different threads, and each time a probe fires, the D script is run. Every D variable has one of the scopes that are described in the following table.

Scope         Syntax         Initial Value  Thread-safe?  Description
Global        myname         0              No            Any probe that fires on any thread accesses the same instance of the variable.
Thread-local  self->myname   0              Yes           Any probe that fires on a thread accesses the thread-specific instance of the variable.
Clause-local  this->myname   Not defined    Yes           Any probe that fires accesses an instance of the variable specific to that particular firing of the probe.

Note

Note the following additional information:

  • Scalar variables and associative arrays have a global scope and are not multi-processor safe (MP-safe). Because the value of such variables can be changed by more than one processor, there is a chance that a variable can become corrupted if more than one probe modifies it.

  • Aggregations are MP-safe even though they have a global scope because independent copies are updated locally before a final aggregation produces the global result.
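As a sketch of the difference described in this note, the following script counts read() system calls in two ways. The names nreads and @reads are chosen only for illustration. On a busy multiprocessor system, the global counter can end up lower than the aggregation total because concurrent updates to the global variable might be lost.

```d
syscall::read:entry
{
  nreads++;          /* global scalar: updates from different CPUs can collide */
  @reads = count();  /* aggregation: per-CPU copies are combined when reported */
}

END
{
  printf("global counter: %d\n", nreads);
  /* The @reads aggregation is printed automatically when dtrace exits. */
}
```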

2.9.1 Scalar Variables

Scalar variables are used to represent individual, fixed-size data objects, such as integers and pointers. Scalar variables can also be used for fixed-size objects that are composed of one or more primitive or composite types. D provides the ability to create arrays of objects, as well as composite structures. DTrace also represents strings as fixed-size scalars by permitting them to grow to a predefined maximum length. Control over string length in your D program is discussed further in Section 2.11, “DTrace Support for Strings”.

Scalar variables are created automatically the first time you assign a value to a previously undefined identifier in your D program. For example, to create a scalar variable named x of type int, you can simply assign it a value of type int in any probe clause:

BEGIN
{
  x = 123;
}

Scalar variables that are created in this manner are global variables: each one is defined once and is visible in every clause of your D program. Any time that you reference the x identifier, you are referring to a single storage location associated with this variable.

Unlike ANSI C, D does not require explicit variable declarations. If you do want to declare a global variable and assign its name and type explicitly before using it, you can place a declaration outside of the probe clauses in your program, as shown in the following example:

int x; /* declare an integer x for later use */
BEGIN
{
  x = 123;
  ...
}

Explicit variable declarations are not necessary in most D programs, but sometimes are useful when you want to carefully control your variable types or when you want to begin your program with a set of declarations and comments documenting your program's variables and their meanings.

Unlike ANSI C declarations, D variable declarations may not assign initial values. You must use a BEGIN probe clause to assign any initial values. All global variable storage is filled with zeroes by DTrace before you first reference the variable.
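The following sketch illustrates both points: the declared variable reads as zero before it is assigned, and its initial value is set in a BEGIN clause rather than in the declaration. The name limit is hypothetical.

```d
int limit;   /* a declaration such as "int limit = 10;" is not permitted */

BEGIN
{
  printf("before assignment: %d\n", limit);   /* zero-filled storage: prints 0 */
  limit = 10;
  printf("after assignment: %d\n", limit);    /* prints 10 */
  exit(0);
}
```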

The D language definition places no limit on the size and number of D variables. Limits are defined by the DTrace implementation and by the memory that is available on your system. The D compiler enforces any of the limitations that can be applied at the time you compile your program. See Chapter 10, Options and Tunables for more about how to tune options related to program limits.

2.9.2 Associative Arrays

Associative arrays are used to represent collections of data elements that can be retrieved by specifying a name, which is called a key. D associative array keys are formed by a list of scalar expression values, called a tuple. You can think of the array tuple as an imaginary parameter list to a function that is called to retrieve the corresponding array value when you reference the array. Each D associative array has a fixed key signature consisting of a fixed number of tuple elements, where each element has a given, fixed type. You can define different key signatures for each array in your D program.

Associative arrays differ from normal, fixed-size arrays in that they have no predefined limit on the number of elements: the elements can be indexed by any tuple, as opposed to just using integers as keys, and the elements are not stored in preallocated, consecutive storage locations. Associative arrays are useful in situations where you would use a hash table or other simple dictionary data structure in a C, C++, or Java language program. Associative arrays provide the ability to create a dynamic history of events and state captured in your D program, which you can use to create more complex control flows.

To define an associative array, you write an assignment expression of the following form:

name [ key ] = expression ;

where name is any valid D identifier and key is a comma-separated list of one or more expressions.

For example, the following statement defines an associative array a with key signature [ int, string ] and stores the integer value 456 in a location named by the tuple [123, "hello"]:

a[123, "hello"] = 456;

The type of each object that is contained in the array is also fixed for all elements in a given array. Because it was first assigned by using the integer 456, every subsequent value that is stored in the array will also be of type int. You can use any of the assignment operators that are defined in Section 2.8, “Types, Operators, and Expressions” to modify associative array elements, subject to the operand rules defined for each operator. The D compiler produces an appropriate error message if you attempt an incompatible assignment. You can use any type with an associative array key or value that can be used with a scalar variable.

You can reference an associative array by using any tuple that is compatible with the array key signature. The rules for tuple compatibility are similar to those for function calls and variable assignments. That is, the tuple must be of the same length, and each type in the list of actual parameters must be compatible with the corresponding type in the formal key signature. For example, consider an associative array x that is defined as follows:

x[123ull] = 0;

The key signature is of type unsigned long long and the values are of type int. This array can also be referenced by using the expression x['a'] because the tuple consisting of the character constant 'a', of type int and length one, is compatible with the key signature unsigned long long, according to the arithmetic conversion rules. These rules are described in Section 2.8.11, “Type Conversions”.

If you need to explicitly declare a D associative array before using it, you can create a declaration of the array name and key signature outside of the probe clauses in your program source code, for example:

int x[unsigned long long, char];
BEGIN
{
  x[123ull, 'a'] = 456;
}

Storage is allocated only for array elements with a nonzero value.

Note

When an associative array is defined, references to any tuple of a compatible key signature are permitted, even if the tuple in question has not been previously assigned. Accessing an unassigned associative array element is defined to return a zero-filled object. A consequence of this definition is that underlying storage is not allocated for an associative array element until a non-zero value is assigned to that element. Conversely, assigning zero to an associative array element causes DTrace to deallocate the underlying storage.

This behavior is important because the dynamic variable space out of which associative array elements are allocated is finite; if it is exhausted when an allocation is attempted, the allocation fails and an error message indicating a dynamic variable drop is generated. Always assign zero to associative array elements that are no longer in use. See Chapter 10, Options and Tunables for information about techniques that you can use to eliminate dynamic variable drops.
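The following sketch shows the pattern of releasing an associative array element as soon as it is no longer needed. The array name start and the choice of the read() system call are assumptions made for illustration. The element is keyed by the built-in variables pid and tid, and is set back to zero in the return clause so that DTrace can reclaim its storage.

```d
syscall::read:entry
{
  /* Allocates one element for this process and thread. */
  start[pid, tid] = timestamp;
}

syscall::read:return
/start[pid, tid] != 0/
{
  printf("%d/%d spent %d nsecs in read()\n",
      pid, tid, timestamp - start[pid, tid]);

  /* Assign zero so that the underlying storage is deallocated. */
  start[pid, tid] = 0;
}
```

Without the final assignment, an element would remain allocated for every thread that ever called read(), and the dynamic variable space could eventually be exhausted.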

2.9.3 Thread-Local Variables

DTrace provides the ability to declare variable storage that is local to each operating system thread, as opposed to the global variables demonstrated earlier in this chapter. Thread-local variables are useful in situations where you want to enable a probe and mark every thread that fires the probe with some tag or other data. Creating a program to solve this problem is easy in D because thread-local variables share a common name in your D code, but refer to separate data storage that is associated with each thread.

Thread-local variables are referenced by applying the -> operator to the special identifier self, for example:

syscall::read:entry
{
  self->read = 1;
}

This D fragment enables the probe on the read() system call and associates a thread-local variable named read with each thread that fires the probe. Similar to global variables, thread-local variables are created automatically on their first assignment and assume the type that is used on the right-hand side of the first assignment statement, which is int in this example.

Each time the self->read variable is referenced in your D program, the data object that is referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired. You can think of a thread-local variable as an associative array that is implicitly indexed by a tuple that describes the thread's identity in the system. A thread's identity is unique over the lifetime of the system: if the thread exits and the same operating system data structure is used to create a new thread, this thread does not reuse the same DTrace thread-local storage identity.

When you have defined a thread-local variable, you can reference it for any thread in the system, even if the variable in question has not been previously assigned for that particular thread. If a thread's copy of the thread-local variable has not yet been assigned, the data storage for the copy is defined to be filled with zeroes. As with associative array elements, underlying storage is not allocated for a thread-local variable until a non-zero value is assigned to it. Also, as with associative array elements, assigning zero to a thread-local variable causes DTrace to deallocate the underlying storage. Always assign zero to thread-local variables that are no longer in use. For other techniques to fine-tune the dynamic variable space from which thread-local variables are allocated, see Chapter 10, Options and Tunables.

Thread-local variables of any type can be defined in your D program, including associative arrays. The following are some example thread-local variable definitions:

self->x = 123; /* integer value */

self->s = "hello"; /* string value */

self->a[123, 'a'] = 456; /* associative array */

As with any D variable, you do not need to explicitly declare thread-local variables prior to using them. If you want to create a declaration anyway, you can place one outside of your program clauses by prepending the keyword self, for example:

self int x; /* declare int x as a thread-local variable */ 
syscall::read:entry
{
  self->x = 123;
}

Thread-local variables are kept in a separate namespace from global variables so that you can reuse names. Remember that x and self->x are not the same variable if you overload names in your program.
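As a small sketch of the separate namespaces, the following clause assigns different values to a global variable x and a thread-local variable self->x and then prints both:

```d
BEGIN
{
  x = 1;          /* global variable */
  self->x = 2;    /* thread-local variable with the same name */

  printf("x = %d, self->x = %d\n", x, self->x);   /* prints "x = 1, self->x = 2" */

  self->x = 0;    /* release the thread-local storage */
  exit(0);
}
```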

The following example shows how to use thread-local variables. In an editor, type the following program and save it in a file named rtime.d:

syscall::read:entry
{
  self->t = timestamp;
}

syscall::read:return
/self->t != 0/
{
  printf("%d/%d spent %d nsecs in read()\n", pid, tid, timestamp - self->t);
  /* 
   * We are done with this thread-local variable; assign zero to it
   * to allow the DTrace runtime to reclaim the underlying storage.
   */ 
  self->t = 0;
}

Next, in your shell, start the program running. Wait a few seconds and you should begin to see some output. If no output appears, try running a few commands:

# dtrace -q -s rtime.d
3987/3987 spent 12786263 nsecs in read()
2183/2183 spent 13410 nsecs in read()
2183/2183 spent 12850 nsecs in read()
2183/2183 spent 10057 nsecs in read()
3583/3583 spent 14527 nsecs in read()
3583/3583 spent 12571 nsecs in read()
3583/3583 spent 9778 nsecs in read()
3583/3583 spent 9498 nsecs in read()
3583/3583 spent 9778 nsecs in read()
2183/2183 spent 13968 nsecs in read()
2183/2183 spent 72076 nsecs in read()
...
^C
#

The rtime.d program uses a thread-local variable that is named t to capture a timestamp on entry to read() by any thread. Then, in the return clause, the program prints the amount of time spent in read() by subtracting self->t from the current timestamp. The built-in D variables pid and tid report the process ID and thread ID of the thread that is performing the read(). Because self->t is no longer needed after this information is reported, it is then assigned 0 to enable DTrace to reuse the underlying storage that is associated with t for the current thread.

Typically, you see many lines of output without doing anything because server processes and daemons are executing read() all the time behind the scenes. Try changing the second clause of rtime.d to use the execname variable to print out the name of the process performing a read(), for example:

printf("%s/%d spent %d nsecs in read()\n", execname, tid, timestamp - self->t);

If you find a process that is of particular interest, add a predicate to learn more about its read() behavior, as shown in the following example:

syscall::read:entry
/execname == "Xorg"/
{
  self->t = timestamp;
}

2.9.4 Clause-Local Variables

The value of a D variable can be accessed whenever a probe fires. Section 2.9, “Variables” describes the different scopes that a variable can have. For a global variable, the same instance of the variable is accessed from every thread. For a thread-local variable, the instance of the variable is thread-specific.

Meanwhile, for a clause-local variable, the instance of the variable is specific to that particular firing of the probe. Clause-local is the narrowest scope. When a probe fires on a CPU, the D script is executed in program order. Each clause-local variable is instantiated with an undefined value the first time it is used in the script. The same instance of the variable is used in all clauses until the D script has completed execution for that particular firing of the probe.

Clause-local variables can be referenced and assigned by prefixing with this->:

BEGIN
{
  this->secs = timestamp / 1000000000;
  ...
}

If you want to declare a clause-local variable explicitly before using it, you can do so by using the this keyword:

this int x;  /* an integer clause-local variable */
this char c; /* a character clause-local variable */

BEGIN
{
  this->x = 123;
  this->c = 'D';
}

Note that if your program contains multiple clauses for a single probe, any clause-local variables remain intact as the clauses are executed, as shown in the following example. Type the following source code and save it in a file named clause.d:

int me;       /* an integer global variable */
this int foo; /* an integer clause-local variable */

tick-1sec
{ 
  /*
   * Set foo to be 10 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 10 : this->foo;
  printf("Clause 1 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
  /*
   * Set foo to be 20 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 20 : this->foo;
  printf("Clause 2 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
  /*
   * Set foo to be 30 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 30 : this->foo;
  printf("Clause 3 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

Because the clauses are always executed in program order, and because clause-local variables are persistent across different clauses that are enabling the same probe, running the preceding program always produces the same output:

# dtrace -q -s clause.d
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
^C

While clause-local variables are persistent across clauses that are enabling the same probe, their values are undefined in the first clause executed for a given probe. Be sure to assign each clause-local variable an appropriate value before using it or your program might have unexpected results.

Clause-local variables can be defined using any scalar variable type, but associative arrays may not be defined using clause-local scope. The scope of clause-local variables only applies to the corresponding variable data, not to the name and type identity defined for the variable. When a clause-local variable is defined, this name and type signature can be used in any subsequent D program clause.

You can use clause-local variables to accumulate intermediate results of calculations or as temporary copies of other variables. Access to a clause-local variable is much faster than access to an associative array. Therefore, if you need to reference an associative array value multiple times in the same D program clause, it is more efficient to copy it into a clause-local variable first and then reference the local variable repeatedly.
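The following sketch applies this advice. The associative array name start and the clause-local names this->begin and this->elapsed are hypothetical. The array element is read once inside the clause body, copied into a clause-local variable, and the clause-local copies are used for all further calculations.

```d
syscall::read:entry
{
  start[pid, tid] = timestamp;
}

syscall::read:return
/start[pid, tid] != 0/
{
  /* Look up the array element once and keep a clause-local copy. */
  this->begin = start[pid, tid];
  start[pid, tid] = 0;

  /* Reuse the clause-local values instead of repeating the lookup. */
  this->elapsed = timestamp - this->begin;
  printf("%s read() took %d nsecs (%d usecs)\n",
      execname, this->elapsed, this->elapsed / 1000);
}
```

Both clause-local variables are assigned before they are read, so their initially undefined values are never used.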

2.9.5 Built-In Variables

The following table provides a complete list of built-in D variables. All of these variables are scalar global variables.

Table 2.13 DTrace Built-In Variables

Variable

Description

args[]

The typed arguments, if any, to the current probe. The args[] array is accessed using an integer index, but each element is defined to be the type corresponding to the given probe argument. For information about any typed arguments, use dtrace -l with the verbose option -v and check Argument Types.

int64_t arg0, ..., arg9

The first ten input arguments to a probe, represented as raw 64-bit integers. Values are meaningful only for arguments defined for the current probe.

uintptr_t caller

The program counter location of the current kernel thread at the time the probe fired.

chipid_t chip

The CPU chip identifier for the current physical chip.

processorid_t cpu

The CPU identifier for the current CPU. See Section 11.8, “sched Provider” for more information.

cpuinfo_t *curcpu

The CPU information for the current CPU. See Section 11.8, “sched Provider”.

lwpsinfo_t *curlwpsinfo

The process state of the current thread. See Section 11.7, “proc Provider”.

psinfo_t *curpsinfo

The process state of the process associated with the current thread. See Section 11.7, “proc Provider”.

task_struct *curthread

The task_struct of the current thread. task_struct is a vmlinux data type; to find its members, search for "task_struct" on the Internet.

string cwd

The name of the current working directory of the process associated with the current thread.

uint_t epid

The enabled probe ID (EPID) for the current probe. This integer uniquely identifies a particular probe that is enabled with a specific predicate and set of actions.

int errno

The error value returned by the last system call executed by this thread.

string execname

The name that was passed to execve() to execute the current process.

fileinfo_t fds[]

The files that the current process has opened in a fileinfo_t array, indexed by file descriptor number. See Section 11.9.2.3, “fileinfo_t”.

Note

You must load the sdt kernel module for fds[] to be available.

gid_t gid

The real group ID of the current process.

uint_t id

The probe ID for the current probe. This ID is the system-wide unique identifier for the probe, as published by DTrace and listed in the output of dtrace -l.

uint_t ipl

The interrupt priority level (IPL) on the current CPU at probe firing time.

Note

This value is non-zero if interrupts are firing and zero otherwise. The non-zero value depends on whether preemption is active, as well as other factors, and can vary between kernel releases and kernel configurations.

lgrp_id_t lgrp

The latency group ID for the latency group of which the current CPU is a member. This value is always zero.

pid_t pid

The process ID of the current process.

pid_t ppid

The parent process ID of the current process.

string probefunc

The function name portion of the current probe's description.

string probemod

The module name portion of the current probe's description.

string probename

The name portion of the current probe's description.

string probeprov

The provider name portion of the current probe's description.

psetid_t pset

The processor set ID for the processor set containing the current CPU. This value is always zero.

string root

The name of the root directory of the process that is associated with the current thread.

uint_t stackdepth

The current thread's stack frame depth at probe firing time.

id_t tid

The task ID of the current thread.

uint64_t timestamp

The current value of a nanosecond timestamp counter. This counter increments from an arbitrary point in the past and should only be used for relative computations.

uintptr_t ucaller

The program counter location of the current user thread at the time the probe fired.

uid_t uid

The real user ID of the current process.

uint64_t uregs[]

The current thread's saved user-mode register values at probe firing time. Use of the uregs[] array is discussed in Section 12.5, “uregs[] Array”.

uint64_t vtimestamp

The current value of a nanosecond timestamp counter that is virtualized to the amount of time that the current thread has been running on a CPU, minus the time spent in DTrace predicates and actions. This counter increments from an arbitrary point in the past and should only be used for relative time computations.

int64_t walltimestamp

The current number of nanoseconds since 00:00 Coordinated Universal Time (UTC), January 1, 1970.


Functions that are built into the D language such as trace are discussed in Chapter 4, Actions and Subroutines.
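As an illustration of how several built-in variables can be combined, the following sketch prints a line each time a process calls openat(). The choice of the syscall::openat:entry probe is an assumption made for this example; any probe that is available on your system could be used instead.

```d
syscall::openat:entry
{
  /* %Y formats walltimestamp as a date and time. */
  printf("%Y %s:%s:%s:%s cpu %d: %s (pid %d, ppid %d, uid %d)\n",
      walltimestamp, probeprov, probemod, probefunc, probename,
      cpu, execname, pid, ppid, uid);
}
```

The four probe* variables together reproduce the full description of the probe that fired, which is useful when a single clause is bound to more than one probe.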

2.9.6 External Variables

The D language uses the back quote character (`) as a special scoping operator for accessing variables that are defined in the operating system and not in your D program. For more information, see Section 2.7, “External Symbols and Types”.

2.10 Pointers and Scalar Arrays

Pointers are memory addresses of data objects in the operating system kernel or in the address space of a user process. D provides the ability to create and manipulate pointers and store them in variables and associative arrays. This section describes the D syntax for pointers, operators that can be applied to create or access pointers, and the relationship between pointers and fixed-size scalar arrays. Also discussed are issues relating to the use of pointers in different address spaces.

Note

If you are an experienced C or C++ programmer, you can skim most of this section as the D pointer syntax is the same as the corresponding ANSI C syntax. However, you should read Section 2.10.1, “Pointers and Addresses” and Section 2.10.8, “Pointers to DTrace Objects”, as these sections describe features and issues that are specific to DTrace.

2.10.1 Pointers and Addresses

The Linux operating system uses a technique called virtual memory to provide each user process with its own virtual view of the memory resources on your system. A virtual view of memory resources is referred to as an address space. An address space associates a range of address values, either [0 ... 0xffffffff] for a 32-bit address space or [0 ... 0xffffffffffffffff] for a 64-bit address space, with a set of translations that the operating system and hardware use to convert each virtual address to a corresponding physical memory location. Pointers in D are data objects that store an integer virtual address value and associate it with a D type that describes the format of the data stored at the corresponding memory location.

You can explicitly declare a D variable to be of pointer type by first specifying the type of the referenced data and then appending an asterisk (*) to the type name. Doing so indicates you want to declare a pointer type, as shown in the following statement:

int *p;

This statement declares a D global variable named p that is a pointer to an integer. The declaration means that p is a 64-bit integer with a value that is the address of another integer located somewhere in memory. Because the compiled form of your D code is executed at probe firing time inside the operating system kernel itself, D pointers are typically pointers associated with the kernel's address space. You can use the arch command to determine the number of bits that are used for pointers by the active operating system kernel.

If you want to create a pointer to a data object inside of the kernel, you can compute its address by using the & operator. For example, the operating system kernel source code declares an unsigned long max_pfn variable. You could trace the address of this variable by tracing the result of applying the & operator to the name of that object in D:

trace(&`max_pfn);

The * operator can be used to refer to the object addressed by the pointer, and acts as the inverse of the & operator. For example, the following two D code fragments are equivalent in meaning:

q = &`max_pfn; trace(*q);

trace(`max_pfn); 

In this example, the first fragment creates a D global variable pointer q. Because the max_pfn object is of type unsigned long, the type of &`max_pfn is unsigned long * (that is, pointer to unsigned long), implicitly setting the type of q. Tracing the value of *q follows the pointer back to the data object max_pfn. This fragment is therefore the same as the second fragment, which directly traces the value of the data object by using its name.
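The two fragments can be placed in a complete clause, as in the following sketch. It assumes, as the text above does, that the running kernel exports the max_pfn symbol. The address that is printed depends on your system.

```d
BEGIN
{
  q = &`max_pfn;   /* q has the type unsigned long * */

  /* Print the address that is held in q and the value that it points to. */
  printf("max_pfn is stored at %p and holds %d\n", q, *q);

  /* The same value, obtained directly by name. */
  printf("max_pfn read by name is %d\n", `max_pfn);
  exit(0);
}
```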

2.10.2 Pointer Safety

If you are a C or C++ programmer, you might be a bit apprehensive after reading the previous section because you know that misuse of pointers in your programs can cause your programs to crash. DTrace, however, is a robust, safe environment for executing your D programs, and these types of mistakes cannot cause crashes. You might write a buggy D program, but invalid D pointer accesses do not cause DTrace or the operating system kernel to fail or crash in any way. Instead, the DTrace software detects any invalid pointer accesses, disables your instrumentation, and reports the problem back to you for debugging.

If you have previously programmed in the Java programming language, you are probably aware that the Java language does not support pointers for precisely the same reasons of safety. Pointers are needed in D because they are an intrinsic part of the operating system's implementation in C, but DTrace implements the same kind of safety mechanisms that are found in the Java programming language to prevent buggy programs from damaging themselves or each other. DTrace's error reporting is similar to the runtime environment for the Java programming language that detects a programming error and reports an exception.

To observe DTrace's error handling and reporting, you could write a deliberately bad D program using pointers. For example, in an editor, type the following D program and save it in a file named badptr.d:

BEGIN
{
  x = (int *)NULL;
  y = *x;
  trace(y);
}

The badptr.d program creates a D pointer named x that is a pointer to int. The program assigns this pointer the special invalid pointer value NULL, which is a built-in alias for address 0. By convention, address 0 is always defined as invalid so that NULL can be used as a sentinel value in C and D programs. The program uses a cast expression to convert NULL to be a pointer to an integer. The program then dereferences the pointer by using the expression *x, assigns the result to another variable y, and then attempts to trace y. When the D program is executed, DTrace detects an invalid pointer access when the statement y = *x is executed and reports the following error:

# dtrace -s badptr.d
dtrace: script 'badptr.d' matched 1 probe
dtrace: error on enabled probe ID 1 (ID 1: dtrace:::BEGIN):
invalid address (0x0) in action #2 at DIF offset 4
^C
#

Notice that the D program moves past the error and continues to execute; the system and all observed processes remain unperturbed. You can also add an ERROR probe to your script to handle D errors. For details about the DTrace error mechanism, see Section 11.1.3, “ERROR Probe”.
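The following sketch extends badptr.d with an ERROR clause. It assumes that the ERROR probe arguments are laid out as described in Section 11.1.3, “ERROR Probe”, with arg1 holding the enabled probe ID, arg2 the action index, arg4 the fault type, and arg5 a fault-specific value such as the invalid address. Check that section before relying on these argument positions.

```d
BEGIN
{
  x = (int *)NULL;
  y = *x;          /* invalid access: triggers the ERROR probe */
  trace(y);
}

ERROR
{
  /* Argument meanings are assumptions; see Section 11.1.3. */
  printf("D error: EPID %d, action %d, fault type %d, value 0x%x\n",
      arg1, arg2, arg4, arg5);
}
```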

2.10.3 Array Declarations and Storage

In addition to the dynamic associative arrays that are described in Section 2.9, “Variables”, D supports scalar arrays. Scalar arrays are a fixed-length group of consecutive memory locations that each store a value of the same type. Scalar arrays are accessed by referring to each location with an integer, starting from zero. Scalar arrays correspond directly in concept and syntax with arrays in C and C++. Scalar arrays are not used as frequently in D as associative arrays and their more advanced counterparts, aggregations. You might, however, need to use scalar arrays to access existing operating system array data structures that are declared in C. Aggregations are described in Chapter 3, Aggregations.

A D scalar array of 5 integers is declared by using the type int and suffixing the declaration with the number of elements in square brackets, for example:

int a[5];

Figure 2.3, “Scalar Array Representation” shows a visual representation of the array storage:

Figure 2.3 Scalar Array Representation
The figure illustrates the elements a[0] through a[4] of the array, declared as a[5], arranged side-by-side in memory.


The D expression a[0] refers to the first array element, a[1] refers to the second, and so on. From a syntactic perspective, scalar arrays and associative arrays are very similar. You can declare an associative array of integers referenced by an integer key as follows:

int a[int];

You can also reference this array using the expression a[0]. But, from a storage and implementation perspective, the two arrays are very different. The scalar array a consists of five consecutive memory locations numbered from zero, and the index refers to an offset in the storage that is allocated for the array. On the other hand, an associative array has no predefined size and does not store elements in consecutive memory locations. Associative array keys are abstract names for the corresponding values and have no relationship to the value storage locations. For example, if you access associative array elements a[0] and a[-5], DTrace allocates only two words of storage, and these might or might not be consecutive.

If you create an array using an initial assignment and use a single integer expression as the array index, for example, a[0] = 2, the D compiler always creates a new associative array, even though in this expression a could also be interpreted as an assignment to a scalar array. Scalar arrays must be predeclared in this situation so that the D compiler can recognize the definition of the array size and infer that the array is a scalar array.
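The distinction can be sketched with two predeclared arrays. This is a minimal illustration, not a listing from the guide; the names s and m are hypothetical, and the script assumes it is run with dtrace -q -s.

```d
int s[5];     /* scalar array: predeclared, five consecutive int slots */
int m[int];   /* associative array: values of type int, keyed by an int */

BEGIN
{
  s[0] = 2;   /* index 0 is an offset into the storage reserved for s */
  m[0] = 2;   /* key 0 is an abstract name; storage is allocated on assignment */
  m[-5] = 7;  /* any int is a valid key, including a negative one */
  printf("%d %d %d\n", s[0], m[0], m[-5]);
  exit(0);
}
```

Without the two declarations, the assignment s[0] = 2 would make s an associative array as well, as described above.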

2.10.4 Pointer and Array Relationship

Pointers and scalar arrays have a special relationship in D, just as they do in ANSI C. A scalar array is represented by a variable that is associated with the address of its first storage location. A pointer is also the address of a storage location with a defined type. Thus, D permits the use of the array [] index notation with both pointer variables and array variables. For example, the following two D fragments are equivalent in meaning:

p = &a[0]; trace(p[2]);

trace(a[2]); 

In the first fragment, the pointer p is assigned to the address of the first element in scalar array a by applying the & operator to the expression a[0]. The expression p[2] traces the value of the third array element (index 2). Because p now contains the same address associated with a, this expression yields the same value as a[2], shown in the second fragment. One consequence of this equivalence is that C and D permit you to access any index of any pointer or array. Array bounds checking is not performed for you by the compiler or the DTrace runtime environment. If you access memory beyond the end of a scalar array's predefined size, you either get an unexpected result or DTrace reports an invalid address error, as shown in the previous example. As always, you cannot damage DTrace itself or your operating system, but you do need to debug your D program.

The difference between pointers and arrays is that a pointer variable refers to a separate piece of storage that contains the integer address of some other storage, whereas an array variable names the array storage itself, not the location of an integer that in turn contains the location of the array. Figure 2.4, “Pointer and Array Storage” illustrates this difference.

Figure 2.4 Pointer and Array Storage
The diagram illustrates the pointer p with the value 0x12345678, which is the address of the first element (a[0]) of the array declared as a[5].


This difference is manifested in the D syntax if you attempt to assign pointers and scalar arrays. If x and y are pointer variables, the expression x = y is legal; it copies the pointer address in y to the storage location that is named by x. If x and y are scalar array variables, the expression x = y is not legal. Arrays may not be assigned as a whole in D. However, an array variable or symbol name can be used in any context where a pointer is permitted. If p is a pointer and a is a scalar array, the statement p = a is permitted. This statement is equivalent to the statement p = &a[0].

2.10.5 Pointer Arithmetic

Because pointers are just integers that are used as addresses of other objects in memory, D provides a set of features for performing arithmetic on pointers. However, pointer arithmetic is not identical to integer arithmetic. Pointer arithmetic implicitly adjusts the underlying address by multiplying or dividing the operands by the size of the type referenced by the pointer.

The following D fragment illustrates this property:

int *x;

BEGIN
{
  trace(x);
  trace(x + 1);
  trace(x + 2);
}

This fragment creates an integer pointer x and then traces its value, its value incremented by one, and its value incremented by two. If you create and execute this program, DTrace reports the integer values 0, 4, and 8.

Since x is a pointer to an int (size 4 bytes), incrementing x adds 4 to the underlying pointer value. This property is useful when using pointers to refer to consecutive storage locations such as arrays. For example, if x were assigned to the address of an array a, similar to what is shown in Figure 2.4, “Pointer and Array Storage”, the expression x + 1 would be equivalent to the expression &a[1]. Similarly, the expression *(x + 1) would refer to the value a[1]. Pointer arithmetic is implemented by the D compiler whenever a pointer value is incremented by using the +, ++, or += operators. Pointer arithmetic is also applied when an integer is subtracted from a pointer on the left-hand side, when a pointer is subtracted from another pointer, or when the -- operator is applied to a pointer.

For example, the following D program would trace the result 2:

int *x, *y;
int a[5];

BEGIN
{
  x = &a[0];
  y = &a[2];
  trace(y - x);
}
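The scaling by the size of the referenced type can be seen by repeating the earlier experiment with pointers to types of different sizes. The following is a small sketch, not part of the original guide; the variable names h, w, and q are hypothetical, and the pointers are left unassigned so that each starts at address 0, as x does in the earlier fragment.

```d
int16_t *h;   /* referenced type is 2 bytes */
int32_t *w;   /* referenced type is 4 bytes */
int64_t *q;   /* referenced type is 8 bytes */

BEGIN
{
  trace(h + 1);   /* address advances by sizeof (int16_t) */
  trace(w + 1);   /* address advances by sizeof (int32_t) */
  trace(q + 1);   /* address advances by sizeof (int64_t) */
  exit(0);
}
```

Each expression adds the integer 1, yet the three resulting addresses differ because the compiler multiplies the operand by the size of the type that each pointer references.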

2.10.6 Generic Pointers

Sometimes it is useful to represent or manipulate a generic pointer address in a D program without specifying the type of data referred to by the pointer. Generic pointers can be specified by using the type void *, where the keyword void represents the absence of specific type information, or by using the built-in type alias uintptr_t, which is aliased to an unsigned integer type of size that is appropriate for a pointer in the current data model. You may not apply pointer arithmetic to an object of type void *, and these pointers cannot be dereferenced without casting them to another type first. You can cast a pointer to the uintptr_t type when you need to perform integer arithmetic on the pointer value.

Pointers to void can be used in any context where a pointer to another data type is required, such as an associative array tuple expression or the right-hand side of an assignment statement. Similarly, a pointer to any data type can be used in a context where a pointer to void is required. To use a pointer to a non-void type in place of another non-void pointer type, an explicit cast is required. You must always use explicit casts to convert pointers to integer types, such as uintptr_t, or to convert these integers back to the appropriate pointer type.
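The difference between pointer arithmetic and integer arithmetic on a pointer value can be sketched as follows. This fragment is illustrative only; the variable x is a hypothetical unassigned pointer, used here so that the arithmetic starts from address 0.

```d
int *x;

BEGIN
{
  trace(x + 1);                      /* pointer arithmetic: adds sizeof (int) */
  trace((uintptr_t)x + 1);           /* integer arithmetic: adds exactly 1 */
  trace((int *)((uintptr_t)x + 8));  /* explicit cast back to a typed pointer */
  exit(0);
}
```

Both conversions, from int * to uintptr_t and back, are written as explicit casts, which matches the rule that pointer-to-integer conversions are never implicit.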

2.10.7 Multi-Dimensional Arrays

Multi-dimensional scalar arrays are used infrequently in D, but are provided for compatibility with ANSI C and for observing and accessing operating system data structures that are created by using this capability in C. A multi-dimensional array is declared as a consecutive series of scalar array sizes enclosed in square brackets [] following the base type. For example, to declare a fixed-size, two-dimensional rectangular array of integers that has 12 rows and 34 columns, you would write the following declaration:

int a[12][34];

A multi-dimensional scalar array is accessed by using similar notation. For example, to access the value stored at row 0 and column 1, you would write the D expression as follows:

a[0][1]

Storage locations for multi-dimensional scalar array values are computed by multiplying the row number by the total number of columns declared and then adding the column number.
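As a worked instance of this rule, consider the element a[2][5] of the array declared as int a[12][34]. The following sketch simply performs the arithmetic; the clause-local names elem and off are hypothetical.

```d
BEGIN
{
  /* a[2][5] in int a[12][34]: row 2, column 5, 34 columns per row */
  this->elem = 2 * 34 + 5;                 /* element index: 73 */
  this->off = this->elem * sizeof (int);   /* byte offset: 73 * 4 = 292 */
  printf("element %d, byte offset %d\n", this->elem, this->off);
  exit(0);
}
```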

Be careful not to confuse the multi-dimensional array syntax with the D syntax for associative array accesses (that is, a[0][1] is not the same as a[0,1]). If you use an incompatible tuple with an associative array or attempt an associative array access of a scalar array, the D compiler reports an appropriate error message and refuses to compile your program.

2.10.8 Pointers to DTrace Objects

The D compiler prohibits you from using the & operator to obtain pointers to DTrace objects such as associative arrays, built-in functions, and variables. You are prohibited from obtaining the address of these variables so that the DTrace runtime environment is free to relocate them as needed between probe firings. In this way, DTrace can more efficiently manage the memory required for your programs. If you create composite structures, it is possible to construct expressions that do retrieve the kernel address of your DTrace object storage. You should avoid creating such expressions in your D programs. If you need to use such an expression, do not rely on the address being the same across probe firings.

In ANSI C, pointers can also be used to perform indirect function calls or to perform assignments, such as placing an expression using the unary * dereference operator on the left-hand side of an assignment operator. In D, these types of expressions using pointers are not permitted. You may only assign values directly to D variables by specifying their name or by applying the array index operator [] to a D scalar or associative array. You may only call functions that are defined by the DTrace environment by name, as specified in Chapter 4, Actions and Subroutines. Indirect function calls using pointers are not permitted in D.

2.10.9 Pointers and Address Spaces

A pointer is an address that provides a translation within some virtual address space to a piece of physical memory. DTrace executes your D programs within the address space of the operating system kernel itself. The Linux system manages many address spaces: one for the operating system kernel and one for each user process. Because each address space provides the illusion that it can access all of the memory on the system, the same virtual address pointer value can be reused across address spaces, but translate to different physical memory. Therefore, when writing D programs that use pointers, you must be aware of the address space corresponding to the pointers you intend to use.

For example, if you use the syscall provider to instrument entry to a system call that takes a pointer to an integer or array of integers as an argument, for example, pipe(), it would not be valid to dereference that pointer or array using the * or [] operators because the address in question is an address in the address space of the user process that performed the system call. Applying the * or [] operators to this address in D would result in kernel address space access, which would result in an invalid address error or in returning unexpected data to your D program, depending on whether the address happened to match a valid kernel address.

To access user-process memory from a DTrace probe, you must apply one of the copyin, copyinstr, or copyinto functions that are described in Chapter 4, Actions and Subroutines to the user address space pointer. To avoid confusion, take care when writing your D programs to name and comment variables storing user addresses appropriately. You can also store user addresses as uintptr_t so that you do not accidentally compile D code that dereferences them. Techniques for using DTrace on user processes are described in Chapter 12, User Process Tracing.
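A minimal sketch of this technique follows. It assumes that the syscall provider exposes an openat probe on your system and that arg1 of that probe is the user-space address of the path name; both are assumptions about the target system rather than statements from this section.

```d
syscall::openat:entry
/pid == $target/
{
  /*
   * arg1 is an address in the user process, not in the kernel,
   * so copy the string into DTrace before formatting it.
   */
  printf("%s: %s\n", execname, copyinstr(arg1));
}
```

You might run such a script with a command of the form dtrace -q -s script.d -c command, so that $target refers to the process that is created for the command.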

2.11 DTrace Support for Strings

DTrace provides support for tracing and manipulating strings. This section describes the complete set of D language features for declaring and manipulating strings. Unlike ANSI C, strings in D have their own built-in type and operator support to enable you to easily and unambiguously use them in your tracing programs.

2.11.1 String Representation

In DTrace, strings are represented as an array of characters terminated by a null byte (that is, a byte whose value is zero, usually written as '\0'). The visible part of the string is of variable length, depending on the location of the null byte, but DTrace stores each string in a fixed-size array so that each probe traces a consistent amount of data. Strings cannot exceed the length of the predefined string limit. However, the limit can be modified in your D program or on the dtrace command line by tuning the strsize option. See Chapter 10, Options and Tunables for more information about tunable DTrace options. The default string limit is 256 bytes.
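For example, the string limit could be raised from within a D program by using a pragma such as the following. The value 1024 is an arbitrary illustration.

```d
/* Raise the string limit for this program from the default to 1024 bytes. */
#pragma D option strsize=1024
```

The equivalent setting on the command line is the -x strsize=1024 option of the dtrace command.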

The D language provides an explicit string type rather than using the type char * to refer to strings. The string type is equivalent to char *, in that it is the address of a sequence of characters, but the D compiler and D functions such as trace provide enhanced capabilities when applied to expressions of type string. For example, the string type removes the ambiguity of type char * when you need to trace the actual bytes of a string.

In the following D statement, if s is of type char *, DTrace traces the value of the pointer s, which means it traces an integer address value:

trace(s);

In the following D statement, by the definition of the * operator, the D compiler dereferences the pointer s and traces the single character at that location:

trace(*s);

These behaviors enable you to manipulate character pointers that refer to either single characters, or to arrays of byte-sized integers that are not strings and do not end with a null byte.

In the next D statement, if s is of type string, the string type indicates to the D compiler that you want DTrace to trace a null terminated string of characters whose address is stored in the variable s:

trace(s);

You can also perform lexical comparison of expressions of type string. See Section 2.11.5, “String Comparison”.

2.11.2 String Constants

String constants are enclosed in pairs of double quotes ("") and are automatically assigned the type string by the D compiler. You can define string constants of any length, limited only by the amount of memory DTrace is permitted to consume on your system. The terminating null byte (\0) is added automatically by the D compiler to any string constants that you declare. The size of a string constant object is the number of bytes associated with the string, plus one additional byte for the terminating null byte.

A string constant may not contain a literal newline character. To create strings containing newlines, use the \n escape sequence instead of a literal newline. String constants can also contain any of the special character escape sequences that are defined for character constants. See Table 2.6, “Character Escape Sequences”.

2.11.3 String Assignment

Unlike the assignment of char * variables, strings are copied by value and not by reference. The string assignment operator = copies the actual bytes of the string from the source operand up to and including the null byte to the variable on the left-hand side, which must be of type string. You can create a new string variable by assigning it an expression of type string.

For example, the D statement:

s = "hello";

would create a new variable s of type string and copy the six bytes of the string "hello" into it (five printable characters, plus the null byte). String assignment is analogous to the C library function strcpy(), with the exception that if the source string exceeds the limit of the storage of the destination string, the resulting string is automatically truncated by a null byte at this limit.

You can also assign to a string variable an expression of a type that is compatible with strings. In this case, the D compiler automatically promotes the source expression to the string type and performs a string assignment. The D compiler permits any expression of type char * or of type char[n], that is, a scalar array of char of any size, to be promoted to a string.
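The copy-by-value behavior can be sketched as follows. The variable names s and t are hypothetical, and the script assumes it is run with dtrace -q -s.

```d
BEGIN
{
  s = "hello";   /* creates s and copies six bytes into it */
  t = s;         /* copies the bytes of s; t does not refer to s */
  s = "world";   /* overwrites s only; t keeps its own copy */
  printf("%s %s\n", t, s);
  exit(0);
}
```

Because t received its own copy of the bytes, the later assignment to s does not change it, so the program prints hello followed by world.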

2.11.4 String Conversion

Expressions of other types can be explicitly converted to type string by using a cast expression or by applying the special stringof operator. The following two forms are equivalent in meaning:

s = (string) expression;

s = stringof (expression);

The expression is interpreted as an address to the string.

The stringof operator binds very tightly to the operand on its right-hand side. Typically, parentheses are used to surround the expression for clarity, although they are not strictly necessary.

Any expression that is a scalar type, such as a pointer or integer, or a scalar array address may be converted to string. Expressions of other types such as void may not be converted to string. If you erroneously convert an invalid address to a string, the DTrace safety features prevent you from damaging the system or DTrace, but you might end up tracing a sequence of undecipherable characters.
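Both conversion forms can be sketched by using the pr_fname member of the built-in curpsinfo variable, which is an array of char. The variable names a and b are hypothetical.

```d
BEGIN
{
  a = (string)curpsinfo->pr_fname;     /* cast form */
  b = stringof(curpsinfo->pr_fname);   /* stringof form, same meaning */
  printf("%s %s\n", a, b);
  exit(0);
}
```

Because the BEGIN probe fires in the context of the dtrace command itself, both variables would be expected to hold the name of that command.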

2.11.5 String Comparison

D overloads the binary relational operators and permits them to be used for string comparisons, as well as integer comparisons. The relational operators perform string comparison whenever both operands are of type string or when one operand is of type string and the other operand can be promoted to type string. See Section 2.11.3, “String Assignment” for a detailed description. See also Table 2.14, “D Relational Operators for Strings”, which lists the relational operators that can be used to compare strings.

Table 2.14 D Relational Operators for Strings

Operator  Description
<         Left-hand operand is less than right-hand operand.
<=        Left-hand operand is less than or equal to right-hand operand.
>         Left-hand operand is greater than right-hand operand.
>=        Left-hand operand is greater than or equal to right-hand operand.
==        Left-hand operand is equal to right-hand operand.
!=        Left-hand operand is not equal to right-hand operand.


As with integers, each operator evaluates to a value of type int, which is equal to one if the condition is true or zero if it is false.

The relational operators compare the two input strings byte-by-byte, similarly to the C library routine strcmp(). Each byte is compared by using its corresponding integer value in the ASCII character set until a null byte is read or the maximum string length is reached. See the ascii(7) manual page for more information. Some example D string comparisons and their results are shown in the following table.

D string comparison      Result
"coffee" < "espresso"    Returns 1 (true)
"coffee" == "coffee"     Returns 1 (true)
"coffee" >= "mocha"      Returns 0 (false)

Note

Seemingly identical Unicode strings might compare as being different if one or the other of the strings is not normalized.
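The example comparisons in this section can be checked with a short script such as the following sketch, which prints the int result of each relational expression. It assumes that it is run with dtrace -q -s.

```d
BEGIN
{
  printf("%d\n", "coffee" < "espresso");   /* 'c' sorts before 'e' */
  printf("%d\n", "coffee" == "coffee");    /* identical bytes */
  printf("%d\n", "coffee" >= "mocha");     /* 'c' sorts before 'm' */
  exit(0);
}
```

In practice, string comparisons appear most often in predicates, for example /execname == "date"/.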

2.12 Structs and Unions

Collections of related variables can be grouped together into composite data objects called structs and unions. You define these objects in D by creating new type definitions for them. You can use your new types for any D variables, including associative array values. This section explores the syntax and semantics for creating and manipulating these composite types and the D operators that interact with them.

2.12.1 Structs

The D keyword struct, short for structure, is used to introduce a new type that is composed of a group of other types. The new struct type can be used as the type for D variables and arrays, enabling you to define groups of related variables under a single name. D structs are the same as the corresponding construct in C and C++. If you have programmed in the Java programming language previously, think of a D struct as a class that contains only data members and no methods.

Suppose you want to create a more sophisticated system call tracing program in D that records a number of things about each read() and write() system call that is executed by your shell, for example, the elapsed time, number of calls, and the largest byte count passed as an argument.

You could write a D clause to record these properties in three separate associative arrays, as shown in the following example:

int maxbytes[string]; /* declare maxbytes */ 
syscall::read:entry, syscall::write:entry
/pid == 12345/
{
  ts[probefunc] = timestamp;
  calls[probefunc]++;
  maxbytes[probefunc] = arg2 > maxbytes[probefunc] ?
        arg2 : maxbytes[probefunc];
}

This clause, however, is inefficient because DTrace must create three separate associative arrays and store separate copies of the identical tuple values corresponding to probefunc for each one. Instead, you can conserve space and make your program easier to read and maintain by using a struct.

First, declare a new struct type at the top of the D program source file:

struct callinfo {
  uint64_t ts;       /* timestamp of last syscall entry */
  uint64_t elapsed;  /* total elapsed time in nanoseconds */
  uint64_t calls;    /* number of calls made */
  size_t maxbytes;   /* maximum byte count argument */
};

The struct keyword is followed by an optional identifier that is used to refer back to the new type, which is now known as struct callinfo. The struct members are then enclosed in a set of braces {} and the entire declaration is terminated by a semicolon (;). Each struct member is defined by using the same syntax as a D variable declaration, with the type of the member listed first followed by an identifier naming the member and another semicolon (;).

The struct declaration simply defines the new type. It does not create any variables or allocate any storage in DTrace. After the type is declared, you can use struct callinfo as a type throughout the remainder of your D program. Each variable of type struct callinfo stores a copy of the four variables that are described by our structure template. The members are arranged in memory in order, according to the member list, with padding space introduced between members, as required for data object alignment purposes.

You can use the member identifier names to access the individual member values using the “.” operator by writing an expression of the following form:

variable-name.member-name

The following example is an improved program that uses the new structure type. In a text editor, type the following D program and save it in a file named rwinfo.d:

struct callinfo {
  uint64_t ts; /* timestamp of last syscall entry */
  uint64_t elapsed; /* total elapsed time in nanoseconds */
  uint64_t calls; /* number of calls made */
  size_t maxbytes; /* maximum byte count argument */
};

struct callinfo i[string]; /* declare i as an associative array */

syscall::read:entry, syscall::write:entry
/pid == $1/
{
  i[probefunc].ts = timestamp;
  i[probefunc].calls++;
  i[probefunc].maxbytes = arg2 > i[probefunc].maxbytes ?
        arg2 : i[probefunc].maxbytes;
}

syscall::read:return, syscall::write:return
/i[probefunc].ts != 0 && pid == $1/
{
  i[probefunc].elapsed += timestamp - i[probefunc].ts;
}

END
{
  printf("       calls max bytes elapsed nsecs\n");
  printf("------ ----- --------- -------------\n");
  printf("  read %5d %9d %d\n",
      i["read"].calls, i["read"].maxbytes, i["read"].elapsed);
  printf(" write %5d %9d %d\n",
      i["write"].calls, i["write"].maxbytes, i["write"].elapsed);
}

When you have typed the program, run the dtrace -q -s rwinfo.d command, specifying one of your shell processes. Then, type a few commands in your shell. When you have finished typing the shell commands, type Ctrl-C to fire the END probe and print the results:

# dtrace -q -s rwinfo.d `pgrep -n bash`
^C
       calls max bytes elapsed nsecs
------ ----- --------- -------------
  read    25      1024 8775036488
 write    33        22 1859173

2.12.2 Pointers to Structs

Referring to structs by using pointers is very common in C and D. You can use the operator -> to access struct members through a pointer. If struct s has a member m, and you have a pointer to this struct named sp, where sp is a variable of type struct s *, you can either use the * operator to first dereference the sp pointer to access the member:

struct s *sp;
(*sp).m

Or, you can use the -> operator as shorthand for this notation. The following two D fragments are equivalent if sp is a pointer to a struct:

(*sp).m 
sp->m

DTrace provides several built-in variables that are pointers to structs. For example, the pointer curpsinfo refers to struct psinfo and its content provides a snapshot of information about the state of the process associated with the thread that fired the current probe. The following table lists a few example expressions that use curpsinfo, including their types and their meanings.

Example Expression     Type     Meaning
curpsinfo->pr_pid      pid_t    Current process ID
curpsinfo->pr_fname    char []  Executable file name
curpsinfo->pr_psargs   char []  Initial command-line arguments

For more information, see Section 11.7.2.2, “psinfo_t”.

The next example uses the pr_fname member to identify a process of interest. In an editor, type the following script and save it in a file named procfs.d:

syscall::write:entry
/ curpsinfo->pr_fname == "date" /
{
  printf("%s run by UID %d\n", curpsinfo->pr_psargs, curpsinfo->pr_uid);
}

This clause uses the expression curpsinfo->pr_fname to access and match the command name so that the script selects the correct write() requests before tracing the arguments. Notice that by using operator == with a left-hand argument that is an array of char and a right-hand argument that is a string, the D compiler infers that the left-hand argument should be promoted to a string and a string comparison should be performed. Type the command dtrace -q -s procfs.d in one shell and then type the date command several times in another shell. The output that is displayed by DTrace is similar to the following:

# dtrace -q -s procfs.d 
date  run by UID 500
/bin/date  run by UID 500
date -R  run by UID 500
...
^C
#

Complex data structures are used frequently in C programs, so the ability to describe and reference structs from D also provides a powerful capability for observing the inner workings of the Oracle Linux operating system kernel and its system interfaces.

2.12.3 Unions

Unions are another kind of composite type that is supported by ANSI C and D and are closely related to structs. A union is a composite type where a set of members of different types are defined and the member objects all occupy the same region of storage. A union is therefore an object of variant type, where only one member is valid at any given time, depending on how the union has been assigned. Typically, some other variable or piece of state is used to indicate which union member is currently valid. The size of a union is the size of its largest member. The memory alignment that is used for the union is the maximum alignment required by the union members.
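As an illustrative sketch, the following declaration defines a union whose three members share the same storage. The type name union reading and its member names are hypothetical.

```d
union reading {
  uint64_t raw;       /* the storage viewed as one 64-bit integer */
  uint8_t bytes[8];   /* the same storage viewed as eight bytes */
  int32_t code;       /* the same storage viewed as a 32-bit integer */
};

BEGIN
{
  /* The union is as large as its largest member, here 8 bytes. */
  printf("%d\n", sizeof (union reading));
  exit(0);
}
```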

2.12.4 Member Sizes and Offsets

You can determine the size in bytes of any D type or expression, including a struct or union, by using the sizeof operator. The sizeof operator can be applied either to an expression or to the name of a type surrounded by parentheses, as illustrated in the following two examples:

sizeof expression 
sizeof (type-name)

For example, the expression sizeof (uint64_t) would return the value 8, and the expression sizeof (callinfo.ts) would also return 8, if inserted into the source code of the previous example program. The formal return type of the sizeof operator is the type alias size_t, which is defined as an unsigned integer that is the same size as a pointer in the current data model and is used to represent byte counts. When the sizeof operator is applied to an expression, the expression is validated by the D compiler, but the resulting object size is computed at compile time and no code for the expression is generated. You can use sizeof anywhere an integer constant is required.

You can use the companion operator offsetof to determine the offset in bytes of a struct or union member from the start of the storage that is associated with any object of the struct or union type. The offsetof operator is used in an expression of the following form:

offsetof (type-name, member-name)

Here, type-name is the name of any struct or union type or type alias, and member-name is the identifier naming a member of that struct or union. Similar to sizeof, offsetof returns a size_t and you can use it anywhere in a D program that an integer constant can be used.
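Both operators can be applied to the struct callinfo type from the earlier example, as in the following sketch. The offsets noted in the comments assume a 64-bit data model, in which size_t is 8 bytes.

```d
struct callinfo {
  uint64_t ts;        /* timestamp of last syscall entry */
  uint64_t elapsed;   /* total elapsed time in nanoseconds */
  uint64_t calls;     /* number of calls made */
  size_t maxbytes;    /* maximum byte count argument */
};

BEGIN
{
  /* Four 8-byte members with no padding: 32 bytes in a 64-bit data model. */
  printf("size %d\n", sizeof (struct callinfo));

  /* Members are laid out in declaration order: offsets 0, 8, 16, and 24. */
  printf("offsets %d %d %d %d\n",
      offsetof (struct callinfo, ts),
      offsetof (struct callinfo, elapsed),
      offsetof (struct callinfo, calls),
      offsetof (struct callinfo, maxbytes));
  exit(0);
}
```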

2.12.5 Bit-Fields

D also permits the definition of integer struct and union members of arbitrary numbers of bits, known as bit-fields. A bit-field is declared by specifying a signed or unsigned integer base type, a member name, and a suffix indicating the number of bits to be assigned for the field, as shown in the following example:

struct s 
{
  int a : 1;
  int b : 3;
  int c : 12;
};

The bit-field width is an integer constant that is separated from the member name by a trailing colon. The bit-field width must be positive and must not exceed the width in bits of the corresponding integer base type. Bit-fields that are larger than 64 bits may not be declared in D. D bit-fields provide compatibility with and access to the corresponding ANSI C capability. Bit-fields are typically used in situations when memory storage is at a premium or when a struct layout must match a hardware register layout.

A bit-field is a compiler construct that automates the layout of an integer and a set of masks to extract the member values. The same result can be achieved by simply defining the masks yourself and using the & operator. The C and D compilers attempt to pack bits as efficiently as possible, but they are free to do so in any order or fashion they desire. Therefore, bit-fields are not guaranteed to produce identical bit layouts across differing compilers or architectures. If you require stable bit layout, you should construct the bit masks yourself and extract the values by using the & operator.

A bit-field member is accessed by simply specifying its name in combination with the “.” or -> operators, like any other struct or union member. The bit-field is automatically promoted to the next largest integer type for use in any expressions. Because bit-field storage might not be aligned on a byte boundary or be a round number of bytes in size, you may not apply the sizeof or offsetof operators to a bit-field member. The D compiler also prohibits you from taking the address of a bit-field member by using the & operator.
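The manual alternative, in which you construct the masks yourself, can be sketched as follows. The layout is an assumption chosen for illustration: a 1-bit field in bit 0, a 3-bit field in bits 1 through 3, and a 12-bit field in bits 4 through 15, mirroring the widths of members a, b, and c in struct s. The value 0xABCD and the name reg are hypothetical.

```d
BEGIN
{
  this->reg = 0xABCD;   /* sample 16-bit value holding three packed fields */

  printf("a=%d b=%d c=0x%x\n",
      this->reg & 0x1,              /* bit 0 */
      (this->reg >> 1) & 0x7,       /* bits 1-3 */
      (this->reg >> 4) & 0xfff);    /* bits 4-15 */
  exit(0);
}
```

For this value, the program prints a=1 b=6 c=0xabc. Because the shifts and masks are written out explicitly, the layout does not depend on how a particular compiler packs bit-fields.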

2.13 Type and Constant Definitions

This section describes how to declare type aliases and named constants in D. It also discusses D type and namespace management for program and operating system types and identifiers.

2.13.1 typedefs

The typedef keyword is used to declare an identifier as an alias for an existing type. Like all D type declarations, typedef is used outside of probe clauses in a declaration of the following form:

typedef existing-type new-type ;

where existing-type is any type declaration and new-type is an identifier to be used as the alias for this type. For example, the D compiler uses the following declaration internally to create the uint8_t type alias:

typedef unsigned char uint8_t;

You can use type aliases anywhere that a normal type can be used, such as the type of a variable or associative array value or tuple member. You can also combine typedef with more elaborate declarations such as the definition of a new struct, as shown in the following example:

typedef struct foo {
  int x;
  int y;
} foo_t;

In the previous example, struct foo is defined as the same type as its alias, foo_t. Linux C system headers often use the suffix _t to denote a typedef alias.

2.13.2 Enumerations

Defining symbolic names for constants in a program eases readability and simplifies the process of maintaining the program in the future. One method is to define an enumeration, which associates a set of integers with a set of identifiers called enumerators that the compiler recognizes and replaces with the corresponding integer value. An enumeration is defined by using a declaration such as the following:

enum colors {
  RED,
  GREEN,
  BLUE
};

The first enumerator in the enumeration, RED, is assigned the value zero and each subsequent identifier is assigned the next integer value.

You can also specify an explicit integer value for any enumerator by suffixing it with an equal sign and an integer constant, as shown in the following example:

enum colors {
  RED = 7,
  GREEN = 9,
  BLUE
};

The enumerator BLUE is assigned the value 10 by the compiler because it has no value specified and the previous enumerator is set to 9. When an enumeration is defined, the enumerators can be used anywhere in a D program that an integer constant is used. In addition, the enumeration enum colors is also defined as a type that is equivalent to an int. The D compiler allows a variable of enum type to be used anywhere an int can be used and will allow any integer value to be assigned to a variable of enum type. You can also omit the enum name in the declaration, if the type name is not needed.

Enumerators are visible in all subsequent clauses and declarations in your program. Therefore, you cannot define the same enumerator identifier in more than one enumeration. However, you can define more than one enumerator with the same value in either the same or different enumerations. You may also assign integers that have no corresponding enumerator to a variable of the enumeration type.
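
The following sketch exercises the rules described above, assuming the enum colors declaration with explicit values. The global variable name c is illustrative:

enum colors {
  RED = 7,
  GREEN = 9,
  BLUE
};

enum colors c;   /* global variable of the enumeration type */

BEGIN
{
  c = BLUE;      /* enumerator used where an integer constant is expected */
  printf("BLUE = %d, c = %d\n", BLUE, c);

  c = 42;        /* no enumerator has this value, but the assignment is permitted */
  printf("c = %d\n", c);
  exit(0);
}

According to the numbering rule described earlier, BLUE should be reported as 10 in this program.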

The D enumeration syntax is the same as the corresponding syntax in ANSI C. D also provides access to enumerations that are defined in the operating system kernel and its loadable modules. Note that these enumerators are not globally visible in your D program. Kernel enumerators are only visible if you specify one as an argument in a comparison with an object of the corresponding enumeration type. This feature protects your D programs against inadvertent identifier name conflicts with the large collection of enumerations that are defined in the operating system kernel.

The following example D program displays information about I/O requests. The program uses the enumerators B_READ and B_WRITE to differentiate between read and write operations:

io:::done,
io:::start,
io:::wait-done,
io:::wait-start
{
    printf("%8s %10s: %d %16s (%s size %d @ sect %d)\n",
        args[1]->dev_statname, probename,
        timestamp, execname,
        args[0]->b_flags & B_READ ? "R" :
        args[0]->b_flags & B_WRITE ? "W" : "?",
        args[0]->b_bcount, args[0]->b_blkno);
}

2.13.3 Inlines

D named constants can also be defined by using inline directives, which provide a more general means of creating identifiers that are replaced by predefined values or expressions during compilation. Inline directives are a more powerful form of lexical replacement than the #define directive provided by the C preprocessor because the replacement is assigned an actual type and is performed by using the compiled syntax tree and not simply a set of lexical tokens. An inline directive is specified by using a declaration of the following form:

inline type name = expression;

where type is a type declaration of an existing type, name is any valid D identifier that is not previously defined as an inline or global variable, and expression is any valid D expression. After the inline directive is processed, the D compiler substitutes the compiled form of expression for each subsequent instance of name in the program source.

For example, the following D program traces the string "hello" and the integer value 123:

inline string hello = "hello";
inline int number = 100 + 23;

BEGIN
{
  trace(hello);
  trace(number);
}

An inline name can be used anywhere a global variable of the corresponding type is used. If the inline expression can be evaluated to an integer or string constant at compile time, then the inline name can also be used in contexts that require constant expressions, such as scalar array dimensions.

The inline expression is validated for syntax errors as part of evaluating the directive. The expression result type must be compatible with the type that is defined by the inline, according to the same rules used for the D assignment operator (=). An inline expression may not reference the inline identifier itself: recursive definitions are not permitted.
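
An inline expression does not have to be a constant. The following sketch, which assumes the illustrative names start, NSEC_PER_MSEC, and elapsed_ms, defines one inline as a constant and a second inline as an expression that refers to a global variable and to the first inline. Because the compiled expression is substituted at each use, elapsed_ms is expected to be re-evaluated every time the tick probe fires:

uint64_t start;   /* global variable referenced by the inline below */

inline uint64_t NSEC_PER_MSEC = 1000000;
inline uint64_t elapsed_ms = (timestamp - start) / NSEC_PER_MSEC;

BEGIN
{
  start = timestamp;
}

tick-1sec
{
  printf("%d ms since BEGIN\n", elapsed_ms);
}

tick-5sec
{
  exit(0);
}

Unlike a #define macro, each inline here carries the declared type uint64_t, and the expression result is checked against that type when the directive is processed.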

The DTrace software packages install a number of D source files in the system directory /usr/lib64/dtrace/installed-version, which contain inline directives that you can use in your D programs.

For example, the signal.d library includes directives of the following form:

inline int SIGHUP = 1;
inline int SIGINT = 2;
inline int SIGQUIT = 3;
...

These inline definitions provide you with access to the current set of Oracle Linux signal names, as described in the sigaction(2) manual page. Similarly, the errno.d library contains inline directives for the C errno constants that are described in the errno(3) manual page.

By default, the D compiler includes all of the provided D library files automatically so that you can use these definitions in any D program.
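
As a brief sketch of how these library definitions might be used, the following program counts system calls that fail with ENOENT, grouped by executable name and system call. It assumes that ENOENT is among the constants provided by the errno.d library and that the default libraries are loaded. The aggregation name fails is illustrative:

syscall:::return
/errno == ENOENT/
{
  /* ENOENT is resolved through the inline directives in errno.d */
  @fails[execname, probefunc] = count();
}

The program runs until you interrupt it, at which point the aggregation is printed.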

2.13.4 Type Namespaces

In traditional languages such as ANSI C, type visibility is determined by whether a type is nested inside of a function or other declaration. Types declared at the outer scope of a C program are associated with a single global namespace and are visible throughout the entire program. Types that are defined in C header files are typically included in this outer scope. Unlike these languages, D provides access to types from multiple outer scopes.

D is a language that facilitates dynamic observability across multiple layers of a software stack, including the operating system kernel, an associated set of loadable kernel modules, and user processes that are running on the system. A single D program can instantiate probes to gather data from multiple kernel modules or other software entities that are compiled into independent binary objects. Therefore, more than one data type of the same name, perhaps with different definitions, might be present in the universe of types that are available to DTrace and the D compiler. To manage this situation, the D compiler associates each type with a namespace, which is identified by the containing program object. Types from a particular program object can be accessed by specifying the object name and the back quote (`) scoping operator in any type name.

For example, suppose that a kernel module named foo contains the following C type declaration:

typedef struct bar {
  int x;
} bar_t;

The types struct bar and bar_t can then be accessed from D by using the following type names:

struct foo`bar
foo`bar_t

The back quote operator can be used in any context where a type name is appropriate, including when specifying the type for D variable declarations or cast expressions in D probe clauses.
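
The following sketch shows a scoped type name in a cast expression. It assumes that the core kernel object is named vmlinux on your system and that its debugging information exposes struct task_struct. Both are assumptions about a typical Linux installation rather than guarantees, and the clause-local variable name t is illustrative:

BEGIN
{
  /* Explicitly scope the type to the (assumed) vmlinux namespace. */
  this->t = (struct vmlinux`task_struct *)curthread;
  printf("pid %d\n", this->t->pid);
  exit(0);
}

The cast is redundant for curthread, which already has a kernel type, but it illustrates where the object name and back quote appear in a type name.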

The D compiler also provides two special, built-in type namespaces that use the names C and D, respectively. The C type namespace is initially populated with the standard ANSI C intrinsic types, such as int. In addition, type definitions that are acquired through the C preprocessor (cpp) when you run dtrace with the -C option are processed and added to the C namespace. As a result, you can include C header files containing type declarations that are already visible in another type namespace without causing a compilation error.

The D type namespace is initially populated with the D type intrinsics, such as int and string, as well as the built-in D type aliases, such as uint64_t. Any new type declarations that appear in the D program source are automatically added to the D type namespace. If you create a complex type such as a struct in a D program consisting of member types from other namespaces, the member types are copied into the D namespace by the declaration.

When the D compiler encounters a type declaration that does not specify an explicit namespace using the back quote operator, the compiler searches the set of active type namespaces to find a match by using the specified type name. The C namespace is always searched first, followed by the D namespace. If the type name is not found in either the C or D namespace, the type namespaces of the active kernel modules are searched in load address order, which does not guarantee any ordering properties among the loadable modules. To avoid type name conflicts with other kernel modules, you should use the scoping operator when accessing types that are defined in loadable kernel modules.

The D compiler uses the compressed ANSI C debugging information that is provided with the core Linux kernel modules to automatically access the types that are associated with the operating system source code, without the need to access the corresponding C include files. Note that this symbolic debugging information might not be available for all kernel modules on your system. The D compiler reports an error if you attempt to access a type within the namespace of a module that lacks the compressed C debugging information that is intended for use with DTrace.