2.9 Variables

2.9.1 Scalar Variables
2.9.2 Associative Arrays
2.9.3 Thread-Local Variables
2.9.4 Clause-Local Variables
2.9.5 Built-in Variables
2.9.6 External Variables

D provides two basic types of variables for use in your tracing programs: scalar variables and associative arrays. This section explores the rules for D variables in more detail and describes how variables can be associated with different scopes. A special kind of array variable, called an aggregation, is discussed in Chapter 3, Aggregations.

Note

  • Scalar variables and associative arrays have a global scope and are not multi-processor safe (MP-safe). As the value of such variables can be changed by more than one processor, there is a chance that a variable can become corrupted if more than one probe modifies it.

  • Aggregations are MP-safe even though they have a global scope.

2.9.1 Scalar Variables

Scalar variables are used to represent individual fixed-size data objects, such as integers and pointers. Scalar variables can also be used for fixed-size objects that are composed of one or more primitive or composite types. D provides the ability to create arrays of objects as well as composite structures. DTrace also represents strings as fixed-size scalars by permitting them to grow up to a predefined maximum length. Control over string length in your D program is discussed further in Section 2.11, “Strings”.

Scalar variables are created automatically the first time you assign a value to a previously undefined identifier in your D program. For example, to create a scalar variable named x of type int, you can simply assign it a value of type int in any probe clause:

BEGIN
{
  x = 123;
}

Scalar variables created in this manner are global variables: their name and data storage location is defined once and is visible in every clause of your D program. Any time you reference the identifier x, you are referring to a single storage location associated with this variable.

Unlike ANSI C, D does not require explicit variable declarations. If you do want to declare a global variable to assign its name and type explicitly before using it, you can place a declaration outside of the probe clauses in your program as shown in the following example. Explicit variable declarations are not necessary in most D programs, but are sometimes useful when you want to carefully control your variable types or when you want to begin your program with a set of declarations and comments documenting your program's variables and their meanings.

int x; /* declare an integer x for later use */
BEGIN
{
  x = 123;
  ...
}

Unlike ANSI C declarations, D variable declarations may not assign initial values. You must use a BEGIN probe clause to assign any initial values. All global variable storage is filled with zeroes by DTrace before you first reference the variable.

The D language definition places no limit on the size and number of D variables, but limits are defined by the DTrace implementation and by the memory available on your system. The D compiler will enforce any of the limitations that can be applied at the time you compile your program. You can learn more about how to tune options related to program limits in Chapter 10, Options and Tunables.

2.9.2 Associative Arrays

Associative arrays are used to represent collections of data elements that can be retrieved by specifying a name called a key. D associative array keys are formed by a list of scalar expression values called a tuple. You can think of the array tuple itself as an imaginary parameter list to a function that is called to retrieve the corresponding array value when you reference the array. Each D associative array has a fixed key signature consisting of a fixed number of tuple elements where each element has a given, fixed type. You can define different key signatures for each array in your D program.

Associative arrays differ from normal, fixed-size arrays in that they have no predefined limit on the number of elements, the elements can be indexed by any tuple as opposed to just using integers as keys, and the elements are not stored in preallocated consecutive storage locations. Associative arrays are useful in situations where you would use a hash table or other simple dictionary data structure in a C, C++, or Java language program. Associative arrays give you the ability to create a dynamic history of events and state captured in your D program that you can use to create more complex control flows.

To define an associative array, you write an assignment expression of the form:

name [ key ] = expression ;

where name is any valid D identifier and key is a comma-separated list of one or more expressions. For example, the following statement defines an associative array a with key signature [ int, string ] and stores the integer value 456 in a location named by the tuple [123, "hello"]:

a[123, "hello"] = 456;

The type of each object contained in the array is also fixed for all elements in a given array. Because a was first assigned using the integer 456, every subsequent value stored in the array will also be of type int. You can use any of the assignment operators defined in Section 2.8, “Types, Operators, and Expressions” to modify associative array elements, subject to the operand rules defined for each operator. The D compiler will produce an appropriate error message if you attempt an incompatible assignment. You can use any type with an associative array key or value that you can use with a scalar variable. You cannot nest an associative array within another associative array as a key or value.

You can reference an associative array using any tuple that is compatible with the array key signature. The rules for tuple compatibility are similar to those for function calls and variable assignments: the tuple must be of the same length and each type in the list of actual parameters must be compatible with the corresponding type in the formal key signature. For example, if an associative array x is defined as follows:

x[123ull] = 0;

then the key signature is of type unsigned long long and the values are of type int. This array can also be referenced using the expression x['a'] because the tuple consisting of the character constant 'a' of type int and length one is compatible with the key signature unsigned long long according to the arithmetic conversion rules described in Section 2.8.11, “Type Conversions”.

If you need to explicitly declare a D associative array before using it, you can create a declaration of the array name and key signature outside of the probe clauses in your program source code:

int x[unsigned long long, char];
BEGIN
{
  x[123ull, 'a'] = 456;
}

Once an associative array is defined, references to any tuple of a compatible key signature are permitted, even if the tuple in question has not been previously assigned. Accessing an unassigned associative array element is defined to return a zero-filled object. A consequence of this definition is that underlying storage is not allocated for an associative array element until a non-zero value is assigned to that element. Conversely, assigning an associative array element to zero causes DTrace to deallocate the underlying storage. This behavior is important because the dynamic variable space out of which associative array elements are allocated is finite; if it is exhausted when an allocation is attempted, the allocation fails and an error message is generated indicating a dynamic variable drop. Always assign zero to associative array elements that are no longer in use. See Chapter 10, Options and Tunables for other techniques to eliminate dynamic variable drops.

2.9.3 Thread-Local Variables

DTrace provides the ability to declare variable storage that is local to each operating system thread, as opposed to the global variables demonstrated earlier in this chapter. Thread-local variables are useful in situations where you want to enable a probe and mark every thread that fires the probe with some tag or other data. Creating a program to solve this problem is easy in D because thread-local variables share a common name in your D code but refer to separate data storage associated with each thread. Thread-local variables are referenced by applying the -> operator to the special identifier self:

syscall::read:entry
{
  self->read = 1;
}

This D fragment example enables the probe on the read() system call and associates a thread-local variable named read with each thread that fires the probe. Similar to global variables, thread-local variables are created automatically on their first assignment and assume the type used on the right-hand side of the first assignment statement (in this example, int).

Each time the variable self->read is referenced in your D program, the data object referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired. You can think of a thread-local variable as an associative array that is implicitly indexed by a tuple that describes the thread's identity in the system. A thread's identity is unique over the lifetime of the system: if the thread exits and the same operating system data structure is used to create a new thread, this thread does not reuse the same DTrace thread-local storage identity.

Once you have defined a thread-local variable, you can reference it for any thread in the system even if the variable in question has not been previously assigned for that particular thread. If a thread's copy of the thread-local variable has not yet been assigned, the data storage for the copy is defined to be filled with zeroes. As with associative array elements, underlying storage is not allocated for a thread-local variable until a non-zero value is assigned to it. Also as with associative array elements, assigning zero to a thread-local variable causes DTrace to deallocate the underlying storage. Always assign zero to thread-local variables that are no longer in use. See Chapter 10, Options and Tunables for other techniques to fine-tune the dynamic variable space from which thread-local variables are allocated.

Thread-local variables of any type can be defined in your D program, including associative arrays. Some example thread-local variable definitions are:

self->x = 123; /* integer value */

self->s = "hello"; /* string value */

self->a[123, 'a'] = 456; /* associative array */

Like any D variable, you do not need to explicitly declare thread-local variables before using them. If you want to create a declaration anyway, you can place one outside of your program clauses by prepending the keyword self:

self int x; /* declare int x as a thread-local variable */ 
syscall::read:entry
{
  self->x = 123;
}

Thread-local variables are kept in a separate namespace from global variables so you can reuse names. Remember that x and self->x are not the same variable if you overload names in your program.

The following example shows how to use thread-local variables. In a text editor, type in the following program and save it in a file named rtime.d:

syscall::read:entry
{
  self->t = timestamp;
}

syscall::read:return
/self->t != 0/
{
  printf("%d/%d spent %d nsecs in read()\n", pid, tid, timestamp - self->t);
  /* 
   * We are done with this thread-local variable; assign zero to it
   * to allow the DTrace runtime to reclaim the underlying storage.
   */ 
  self->t = 0;
}

Now go to your shell and start the program running. Wait a few seconds and you should start to see some output. If no output appears, try running a few commands.

# dtrace -q -s rtime.d
3987/3987 spent 12786263 nsecs in read()
2183/2183 spent 13410 nsecs in read()
2183/2183 spent 12850 nsecs in read()
2183/2183 spent 10057 nsecs in read()
3583/3583 spent 14527 nsecs in read()
3583/3583 spent 12571 nsecs in read()
3583/3583 spent 9778 nsecs in read()
3583/3583 spent 9498 nsecs in read()
3583/3583 spent 9778 nsecs in read()
2183/2183 spent 13968 nsecs in read()
2183/2183 spent 72076 nsecs in read()
...
^C
#

rtime.d uses a thread-local variable named to capture a timestamp on entry to read() by any thread. Then, in the return clause, the program prints out the amount of time spent in read() by subtracting self->t from the current timestamp. The built-in D variables pid and tid report the process ID and thread ID of the thread performing the read(). Because self->t is no longer needed once this information is reported, it is then assigned 0 to allow DTrace to reuse the underlying storage associated with t for the current thread.

Typically, you see many lines of output without doing anything because, behind the scenes, server processes and daemons are executing read() all the time. Try changing the second clause of rtime.d to use the execname variable to print out the name of the process performing a read():

printf("%s/%d spent %d nsecs in read()\n", execname, tid, timestamp - self->t);

If you find a process that is of particular interest, add a predicate to learn more about its read() behavior, for example:

syscall::read:entry
/execname == "Xorg"/
{
  self->t = timestamp;
}

2.9.4 Clause-Local Variables

You can also define D variables whose storage is reused for each D program clause that relates to a probe. Clause-local variables are similar to automatic variables in a C, C++, or Java language program that are active during each invocation of a function. Like all D program variables, clause-local variables are created on their first assignment. These variables can be referenced and assigned by applying the -> operator to the special identifier this:

BEGIN
{
  this->secs = timestamp / 1000000000;
  ...
}

If you want to explicitly declare a clause-local variable before using it, you can do so using the this keyword:

this int x;  /* an integer clause-local variable */
this char c; /* a character clause-local variable */

BEGIN
{
  this->x = 123;
  this->c = 'D';
}

Clause-local variables are only active for the lifetime of a given probe clause. After DTrace performs the actions associated with your clauses for a given probe, the storage for all clause-local variables is reclaimed and reused for the next clause. For this reason, clause-local variables are the only D variables that are not initially filled with zeroes. Note that if your program contains multiple clauses for a single probe, any clause-local variables will remain intact as the clauses are executed, as shown in the following example:

int me;       /* an integer global variable */
this int foo; /* an integer clause-local variable */

tick-1sec
{ 
  /*
   * Set foo to be 10 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 10 : this->foo;
  printf("Clause 1 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
  /*
   * Set foo to be 20 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 20 : this->foo;
  printf("Clause 2 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
  /*
   * Set foo to be 30 if and only if this is the first clause executed.
   */
  this->foo = (me % 3 == 0) ? 30 : this->foo;
  printf("Clause 3 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

Because the clauses are always executed in program order, and because clause-local variables are persistent across different clauses enabling the same probe, running the above program will always produce the same output:

# dtrace -q -s clause.d
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
^C

While clause-local variables are persistent across clauses enabling the same probe, their values are undefined in the first clause executed for a given probe. Be sure to assign each clause-local variable an appropriate value before using it, or your program may have unexpected results.

Clause-local variables can be defined using any scalar variable type, but associative arrays may not be defined using clause-local scope. The scope of clause-local variables only applies to the corresponding variable data, not to the name and type identity defined for the variable. Once a clause-local variable is defined, this name and type signature may be used in any subsequent D program clause. You cannot rely on the storage location to be the same across different clauses.

You can use clause-local variables to accumulate intermediate results of calculations or as temporary copies of other variables. Access to a clause-local variable is much faster than access to an associative array. Therefore, if you need to reference an associative array value multiple times in the same D program clause, it is more efficient to copy it into a clause-local variable first and then reference the local variable repeatedly.

2.9.5 Built-in Variables

The following table provides a complete list of D built-in variables. All of these variables are scalar global variables; no thread-local or clause-local variables or built-in associative arrays are currently defined by D.

Table 2.13 DTrace Built-in Variables

Variable

Description

args[]

The typed arguments to the current probe, if any. The args[] array is accessed using an integer index, but each element is defined to be the type corresponding to the given probe argument. For example, if args[] is referenced by a read() system call probe, args[0] is of type int, args[1] is of type void *, and args[2] is of type size_t.

int64_t arg0, ..., arg9

The first ten input arguments to a probe represented as raw 64-bit integers. If fewer than ten arguments are passed to the current probe, the remaining variables return zero.

uintptr_t caller

The program counter location of the current kernel thread at the time the probe fired.

chipid_t chip

The CPU chip identifier for the current physical chip. See Section 11.6, “sched Provider” for more information.

processorid_t cpu

The CPU identifier for the current CPU. See Section 11.6, “sched Provider” for more information.

cpuinfo_t *curcpu

The CPU information for the current CPU. See Section 11.6, “sched Provider” for more information.

lwpsinfo_t *curlwpsinfo

The process state of the current thread. See Section 11.5, “proc Provider” for more information.

psinfo_t *curpsinfo

The process state of the process associated with the current thread. See Section 11.5, “proc Provider” for more information.

kthread_t *curthread

The address of the kernel's process descriptor (with type struct task_struct) for the current thread.

string cwd

The name of the current working directory of the process associated with the current thread.

uint_t epid

The enabled probe ID (EPID) for the current probe. This integer uniquely identifies a particular probe that is enabled with a specific predicate and set of actions.

int errno

The error value returned by the last system call executed by this thread.

string execname

The name that was passed to execve() to execute the current process.

fileinfo_t fds[]

The files that the current process has opened in an fileinfo_t array indexed by file descriptor number. See Section 11.7.5, “fileinfo_t”.

Note

You must load the sdt kernel module for fds[] to be available.

gid_t gid

The real group ID of the current process.

uint_t id

The probe ID for the current probe. This ID is the system-wide unique identifier for the probe as published by DTrace and listed in the output of dtrace -l.

uint_t ipl

The interrupt priority level (IPL) on the current CPU at probe firing time.

Note

This value is non-zero if interrupts are firing and zero otherwise. The non-zero value depends on whether preemption is active (as well as other things), and can vary between kernel releases and kernel configurations.

lgrp_id_t lgrp

The latency group ID for the latency group of which the current CPU is a member. See Section 11.6, “sched Provider” for more information.

This value is always zero.

pid_t pid

The process ID of the current process.

pid_t ppid

The parent process ID of the current process.

string probefunc

The function name portion of the current probe's description.

string probemod

The module name portion of the current probe's description.

string probename

The name portion of the current probe's description.

string probeprov

The provider name portion of the current probe's description.

psetid_t pset

The processor set ID for the processor set containing the current CPU. See Section 11.6, “sched Provider” for more information.

This value is always zero.

string root

The name of the root directory of the process associated with the current thread.

uint_t stackdepth

The current thread's stack frame depth at probe firing time.

id_t tid

The task ID of the current thread.

uint64_t timestamp

The current value of a nanosecond timestamp counter. This counter increments from an arbitrary point in the past and should only be used for relative computations.

uintptr_t ucaller

The program counter location of the current user thread at the time the probe fired.

uid_t uid

The real user ID of the current process.

uint64_t vtimestamp

The current value of a nanosecond timestamp counter that is virtualized to the amount of time that the current thread has been running on a CPU, minus the time spent in DTrace predicates and actions. This counter increments from an arbitrary point in the past and should only be used for relative time computations.

int64_t walltimestamp

The current number of nanoseconds since 00:00 Universal Coordinated Time, January 1, 1970.


Functions built into the D language such as trace are discussed in Chapter 4, Actions and Subroutines.

2.9.6 External Variables

D uses the backquote character (`) as a special scoping operator for accessing variables that are defined in the operating system and not in your D program. For example, the Oracle Linux kernel contains a C declaration of a system variable named max_pfn. This variable is declared in C in the kernel source code as follows:

unsigned long max_pfn

To access the value of this variable in a D program, use the D notation:

`max_pfn

DTrace associates each kernel symbol with the type used for the symbol in the corresponding operating system C code, providing easy source-based access to the native operating system data structures. In order to use external operating system variables, you will need access to the corresponding operating system source code.

When you access external variables from a D program, you are accessing the internal implementation details of another program such as the operating system kernel or its device drivers. These implementation details do not form a stable interface upon which you can rely. Any D programs you write that depend on these details might cease to work when you next upgrade the corresponding piece of software. For this reason, external variables are typically used to debug performance or functionality problems using DTrace. To learn more about the stability of your D programs, refer to Chapter 15, Stability.

Kernel symbol names are kept in a separate namespace from D variable and function identifiers, so you need not worry about these names conflicting with your D variables. When you prefix a variable with a backquote, the D compiler searches the known kernel symbols and uses the list of loaded modules to find a matching variable definition. Because the Oracle Linux kernel supports dynamically loaded modules with separate symbol namespaces, the same variable name might be used more than once in the active operating system kernel. You can resolve these name conflicts by specifying the name of the kernel module whose variable should be accessed prior to the backquote in the symbol name. For example, to refer to the address of the _bar function provided by a kernel module named foo, you would write:

foo`_bar

You can apply any of the D operators to external variables, except those that modify values, subject to the usual rules for operand types. When required, the D compiler loads the variable names that correspond to active kernel modules, so you do not need to declare these variables. You may not apply any operator to an external variable that modifies its value, such as = or +=. For safety reasons, DTrace prevents you from damaging or corrupting the state of the software you are observing.