Variables in DTrace

D provides two basic types of variables for use in your tracing programs: scalar variables and associative arrays. This section explores the rules for D variables in more detail and how variables can be associated with different scopes. A special kind of array variable, called an aggregation, is discussed in DTrace Aggregations.

Note -

Scalar variables and associative arrays have a global scope and are not multi-processor safe (MP-safe). Because more than one processor can change the value of such a variable at the same time, the variable can become corrupted.
Aggregations are MP-safe even though they have a global scope.
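
The difference can be illustrated with a minimal sketch that counts system calls in two ways. The names total and @calls are chosen here for illustration only. On a multiprocessor system the global counter may lose updates when several CPUs fire the probe at the same time, whereas the aggregation is not affected.

```d
int total;      /* global scalar: not MP-safe */

/* Updates to total from different CPUs can race, so the final value may be too low. */
syscall:::entry
{
        total++;
}

/* The aggregation is MP-safe even though it also has global scope. */
syscall:::entry
{
        @calls = count();
}

END
{
        printf("global counter: %d\n", total);
        /* The aggregation @calls is printed automatically when the program ends. */
}
```

If the two totals differ on a busy system, the aggregation result is the one to trust.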

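This section covers the following types of variables: scalar variables, associative arrays, thread-local variables, clause-local variables, built-in variables, and external variables.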

Scalar Variables

Scalar variables are used to represent individual fixed-size data objects, such as integers and pointers. Scalar variables can also be used for fixed-size objects that are composed of one or more primitive or composite types. D provides the ability to create both arrays of objects as well as composite structures. DTrace also represents strings as fixed-size scalars by permitting them to grow up to a predefined maximum length. For more information about control over string length, see Strings in DTrace.

Scalar variables are created automatically the first time you assign a value to a previously undefined identifier in your D program. For example, to create a scalar variable named x of type int, you can simply assign it a value of type int in any probe clause:

BEGIN
{
        x = 123;
}

Scalar variables created in this manner are global variables: each variable's name and data storage location are defined once and are visible in every clause of your D program. Any time you reference the identifier x, you are referring to a single storage location associated with this variable.

Unlike ANSI-C, D does not require explicit variable declarations. If you do want to declare a global variable to assign its name and type explicitly before using it, you can place a declaration outside of the probe clauses in your program as shown in the following example. Explicit variable declarations are not necessary in most D programs, but are sometimes useful when you want to carefully control your variable types or when you want to begin your program with a set of declarations and comments documenting your program's variables and their meanings.

int x; /* declare an integer x for later use */

BEGIN
{
        x = 123;
        ...
}

Unlike ANSI-C declarations, D variable declarations may not assign initial values. You must use a BEGIN probe clause to assign any initial values. All global variable storage is filled with zeroes by DTrace before you first reference the variable.
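
As a minimal sketch of these rules, the following program declares two global variables, relies on zero-filled storage for one of them, and assigns the other its initial value in a BEGIN clause. The variable names count and limit are hypothetical and chosen only for this illustration.

```d
int count;      /* declared outside any clause; its storage starts out as zero */
int limit;      /* a declaration such as "int limit = 5;" is not permitted */

BEGIN
{
        limit = 5;      /* initial values are assigned in a BEGIN clause */
}

tick-1sec
/count < limit/
{
        count++;
        printf("tick %d of %d\n", count, limit);
}

tick-1sec
/count >= limit/
{
        exit(0);
}
```

Because count is never assigned before its first use in the predicate, the program depends on DTrace filling global variable storage with zeroes.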

The D language definition places no limit on the size and number of D variables, but limits are defined by the DTrace implementation and by the memory available on your system. The D compiler will enforce any of the limitations that can be applied at the time you compile your program. You can learn more about how to tune options related to program limits in DTrace Options and Tunables.

Associative Arrays

Associative arrays are used to represent collections of data elements that can be retrieved by specifying a name called a key. D associative array keys are formed by a list of scalar expression values called a tuple. You can think of the array tuple itself as an imaginary parameter list to a function that is called to retrieve the corresponding array value when you reference the array. Each D associative array has a fixed key signature consisting of a fixed number of tuple elements where each element has a given, fixed type. You can define different key signatures for each array in your D program.

Associative arrays differ from normal, fixed-size arrays in that they have no predefined limit on the number of elements, the elements can be indexed by any tuple as opposed to just using integers as keys, and the elements are not stored in preallocated consecutive storage locations. Associative arrays are useful in situations where you would use a hash table or other simple dictionary data structure in a C, C++, or Java language program. Associative arrays give you the ability to create a dynamic history of events and state captured in your D program that you can use to create more complex control flows.

To define an associative array, you write an assignment expression of the form:

name [key] = expression;

where name is any valid D identifier and key is a comma-separated list of one or more expressions. For example, the following statement defines an associative array a with key signature [ int, string ] and stores the integer value 456 in a location named by the tuple [ 123, "hello" ]:

a[123, "hello"] = 456;

The type of each object contained in the array is also fixed for all elements in a given array. Because a was first assigned using the integer 456, every subsequent value stored in the array will also be of type int. You can use any of the assignment operators to modify associative array elements, subject to the operand rules defined for each operator. The D compiler will produce an appropriate error message if you attempt an incompatible assignment. You can use any type with an associative array key or value that you can use with a scalar variable. You cannot nest an associative array within another associative array as a key or value. For more information, see Types, Operators, and Expressions in DTrace.

You can reference an associative array using any tuple that is compatible with the array key signature. The rules for tuple compatibility are similar to those for function calls and variable assignments: the tuple must be of the same length and each type in the list of actual parameters must be compatible with the corresponding type in the formal key signature. For example, suppose that an associative array x is defined as follows:

x[123ull] = 0;

The key signature is of type unsigned long long and the values are of type int. This array can also be referenced using the expression x['a'] because the tuple consisting of the character constant 'a' of type int and length one is compatible with the key signature unsigned long long according to the arithmetic conversion rules. For more information about arithmetic conversion rules, see Type Conversions.

If you need to explicitly declare a D associative array before using it, you can create a declaration of the array name and key signature outside of the probe clauses in your program source code:

int x[unsigned long long, char];

BEGIN
{
        x[123ull, 'a'] = 456;
}

Once an associative array is defined, references to any tuple of a compatible key signature are enabled, even if the tuple in question has not been previously assigned. Accessing an unassigned associative array element is defined to return a zero-filled object. A consequence of this definition is that underlying storage is not allocated for an associative array element until a non-zero value is assigned to that element. Conversely, assigning an associative array element to zero causes DTrace to deallocate the underlying storage. This behavior is important because the dynamic variable space out of which associative array elements are allocated is finite; if it is exhausted when an allocation is attempted, the allocation will fail and an error message will be generated indicating a dynamic variable drop. Always assign zero to associative array elements that are no longer in use. For other techniques to eliminate dynamic variable drops, see DTrace Options and Tunables.
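
The following sketch shows one way to apply this advice. It assumes a global associative array named start (a hypothetical name) that is keyed by process ID and thread ID, and it assigns zero to each element as soon as the element is no longer needed.

```d
syscall::read:entry
{
        /* The first assignment defines start and its two-element key signature. */
        start[pid, tid] = timestamp;
}

syscall::read:return
/start[pid, tid] != 0/
{
        printf("%s[%d] read took %d ns\n", execname, pid,
            timestamp - start[pid, tid]);

        /* Assign zero so that DTrace can deallocate the element's storage. */
        start[pid, tid] = 0;
}
```

The predicate relies on the rule that an unassigned element reads as zero, so return probes with no recorded entry are skipped.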

Thread-Local Variables

In DTrace, you can declare variable storage that is local to each operating system thread. Thread-local variables are useful in situations where you want to enable a probe and mark every thread that fires the probe with some tag or other data. Creating a program to solve this problem is easy in D because thread-local variables share a common name in your D code but refer to separate data storage associated with each thread. Thread-local variables are referenced by applying the -> operator to the special identifier self:

syscall::read:entry
{
        self->read = 1;
}

This D fragment enables the probe on the read system call and associates a thread-local variable named read with each thread that fires the probe. Similar to global variables, thread-local variables are created automatically on their first assignment and assume the type used on the right side of the first assignment statement (in this example, int).

Each time the variable self->read is referenced in your D program, the data object referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired. You can think of a thread-local variable as an associative array that is implicitly indexed by a tuple that describes the thread's identity in the system. A thread's identity is unique over the lifetime of the system: if the thread exits and the same operating system data structure is used to create a new thread, this thread does not reuse the same DTrace thread-local storage identity.

Once you have defined a thread-local variable, you can reference it for any thread in the system even if the variable in question has not been previously assigned for that particular thread. If a thread's copy of the thread-local variable has not yet been assigned, the data storage for the copy is defined to be filled with zeros. As with associative array elements, underlying storage is not allocated for a thread-local variable until a non-zero value is assigned to it. Also as with associative array elements, assigning zero to a thread-local variable causes DTrace to deallocate the underlying storage. Always assign zero to thread-local variables that are no longer in use. For other techniques to fine-tune the dynamic variable space, see DTrace Options and Tunables.
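
As an illustrative sketch, the following clauses tag each thread that enters read and carry a value from the entry probe to the return probe. The variable names inread and fd are hypothetical. A separate flag is used because a file descriptor of zero is valid and could not be distinguished from an unassigned variable.

```d
syscall::read:entry
{
        self->inread = 1;       /* tag the thread that fired the probe */
        self->fd = arg0;        /* first argument of read(2): the file descriptor */
}

syscall::read:return
/self->inread/
{
        printf("%s: read on fd %d returned %d\n", execname, self->fd, arg0);

        /* Assign zero to both variables so that DTrace can reclaim the storage. */
        self->inread = 0;
        self->fd = 0;
}
```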

Thread-local variables of any type can be defined in your D program, including associative arrays. Some example thread-local variable definitions are:

self->x = 123;              /* integer value */
self->s = "hello";          /* string value */
self->a[123, 'a'] = 456;    /* associative array */

Like any D variable, you are not required to explicitly declare thread-local variables before using them. If you want to create a declaration anyway, you can place one outside of your program clauses by prepending the keyword self:

self int x;    /* declare int x as a thread-local variable */

syscall::read:entry
{
        self->x = 123;
}

Thread-local variables are kept in a separate namespace from global variables so you can reuse names. Remember that x and self->x are not the same variable if you overload names in your program. The following example shows how to use thread-local variables.

Example 5 Computing Time Spent in read

In a text editor, type in the following program and save it in a file named rtime.d.

syscall::read:entry
{
        self->t = timestamp;
}

syscall::read:return
/self->t != 0/
{
        printf("%d/%d spent %d nsecs in read(2)\n",
            pid, tid, timestamp - self->t);
        /*
         * Done with the thread-local variable; assign zero to it to
         * allow the DTrace runtime to reclaim the underlying storage.
         */
        self->t = 0;
}

In your shell, run the following command. You should see output similar to this example:

# dtrace -q -s rtime.d
100480/1 spent 11898 nsecs in read(2)
100441/1 spent 6742 nsecs in read(2)
100480/1 spent 4619 nsecs in read(2)
100452/1 spent 19560 nsecs in read(2)
100452/1 spent 3648 nsecs in read(2)
100441/1 spent 6645 nsecs in read(2)
100452/1 spent 5168 nsecs in read(2)
100452/1 spent 20329 nsecs in read(2)
100452/1 spent 3596 nsecs in read(2)
...
^C

rtime.d uses a thread-local variable named t to capture a timestamp on entry to read by any thread. Then, in the return clause, the program prints out the amount of time spent in read by subtracting self->t from the current timestamp. The built-in D variables pid and tid report the process ID and thread ID of the thread performing the read. Because self->t is no longer needed once this information is reported, it is then assigned 0 to allow DTrace to reuse the underlying storage associated with t for the current thread.

Typically you will see many lines of output without even doing anything because, behind the scenes, server processes and daemons are executing read all the time. Try changing the second clause of rtime.d to use the execname variable to print out the name of the process performing a read to learn more:

printf("%s/%d spent %d nsecs in read(2)\n",
    execname, tid, timestamp - self->t);

If you find a process that is of particular interest, add a predicate to learn more about its read behavior:

syscall::read:entry
/execname == "Xsun"/
{
        self->t = timestamp;
}

For more information about the read system call, see the read(2) man page.
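
For reference, the two modifications described above can be combined into one complete script. This is only a sketch assembled from the fragments in this example; "Xsun" stands for whatever process name is of interest on your system.

```d
syscall::read:entry
/execname == "Xsun"/
{
        self->t = timestamp;
}

syscall::read:return
/self->t != 0/
{
        printf("%s/%d spent %d nsecs in read(2)\n",
            execname, tid, timestamp - self->t);

        /* Assign zero to allow DTrace to reclaim the underlying storage. */
        self->t = 0;
}
```

The return clause needs no execname predicate of its own: self->t is non-zero only for threads that matched the predicate on entry.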

Clause-Local Variables

You can also define D variables whose storage is reused for each D program clause. Clause-local variables are similar to automatic variables in a C, C++, or Java language program that are active during each invocation of a function. Like all D program variables, clause-local variables are created on their first assignment. These variables can be referenced and assigned by applying the -> operator to the special identifier this.

BEGIN
{
        this->secs = timestamp / 1000000000;
        ...
}

If you want to explicitly declare a clause-local variable before using it, you can do so using the this keyword.

this int x;   /* an integer clause-local variable */
this char c;  /* a character clause-local variable */

BEGIN
{
        this->x = 123;
        this->c = 'D';
}

Clause-local variables are only active for the lifetime of a given probe clause. After DTrace performs the actions associated with your clauses for a given probe, the storage for all clause-local variables is reclaimed and reused for the next clause. For this reason, clause-local variables are the only D variables that are not initially filled with zeros. Note that if your program contains multiple clauses for a single probe, any clause-local variables will remain intact as the clauses are executed, as shown in Example 6, Using Clause-Local Variables.

While clause-local variables are persistent across clauses enabling the same probe, their values are undefined in the first clause executed for a given probe. Be sure to assign each clause-local variable an appropriate value before using it, or your program may have unexpected results.

Clause-local variables can be defined using any scalar variable type, but associative arrays may not be defined using clause-local scope. The scope of clause-local variables only applies to the corresponding variable data, not to the name and type identity defined for the variable. Once a clause-local variable is defined, this name and type signature may be used in any subsequent D program clause. You cannot rely on the storage location to be the same across different clauses.

You can use clause-local variables to accumulate intermediate results of calculations or as temporary copies of other variables. Access to a clause-local variable is much faster than access to an associative array. Therefore, if you need to reference an associative array value multiple times in the same D program clause, it is more efficient to copy it into a clause-local variable first and then reference the local variable repeatedly.
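
A minimal sketch of this technique follows. It assumes a global associative array named start and a clause-local variable named begin, both hypothetical names. The array element is copied once in the clause body, and the copy is then used for every later reference.

```d
syscall::read:entry
{
        start[pid, tid] = timestamp;
}

syscall::read:return
/start[pid, tid] != 0/
{
        /* Copy the array element into a clause-local variable. */
        this->begin = start[pid, tid];

        /* Reference the copy repeatedly instead of the associative array. */
        printf("%s: read began at %d and took %d ns\n",
            execname, this->begin, timestamp - this->begin);

        start[pid, tid] = 0;
}
```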

Example 6 Using Clause-Local Variables

int me;                 /* an integer global variable */
this int foo;           /* an integer clause-local variable */

tick-1sec
{
        /*
         * Set foo to be 10 if and only if this is the first clause executed.
         */
        this->foo = (me % 3 == 0) ? 10 : this->foo;
        printf("Clause 1 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
        /*
         * Set foo to be 20 if and only if this is the first clause executed. 
         */
        this->foo = (me % 3 == 0) ? 20 : this->foo;
        printf("Clause 2 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

tick-1sec
{
        /*
         * Set foo to be 30 if and only if this is the first clause executed.
         */
        this->foo = (me % 3 == 0) ? 30 : this->foo;
        printf("Clause 3 is number %d; foo is %d\n", me++ % 3, this->foo++);
}

Because the clauses are always executed in program order, and because clause-local variables are persistent across different clauses enabling the same probe, running the preceding program will always produce the same output.

# dtrace -q -s clause.d
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
Clause 1 is number 0; foo is 10
Clause 2 is number 1; foo is 11
Clause 3 is number 2; foo is 12
^C

Built-In Variables

The following table provides a complete list of D built-in variables. All of these variables are scalar global variables; no thread-local or clause-local variables or built-in associative arrays are currently defined by D.

Table 12 DTrace Built-In Variables

Variable	Description
`int64_t arg0, ..., arg9`	The first 10 input arguments to a probe represented as raw 64-bit integers. If fewer than 10 arguments are passed to the current probe, the remaining variables return zero.
`args[]`	The typed arguments to the current probe, if any. The `args[]` array is accessed using an integer index, but each element is defined to be the type corresponding to the given probe argument. For example, consider the `start` probe in the `io` provider. In this probe, `args[0]` is of type `bufinfo_t *`, `args[1]` is of type `devinfo_t *`, and `args[2]` is of type `fileinfo_t *`.
`uintptr_t caller`	The program counter location that called the current kernel thread, at the time the probe fired.
`uintptr_t ucaller`	The program counter location that called the current user-level thread, at the time the probe fired.
`chipid_t chip`	The CPU chip identifier for the current physical chip. For more information, see sched Provider.
`processorid_t cpu`	The CPU identifier for the current CPU. For more information, see sched Provider.
`cpuinfo_t *curcpu`	The CPU information for the current CPU. For more information, see sched Provider.
`lwpsinfo_t *curlwpsinfo`	The lightweight process (LWP) state of the LWP associated with the current thread. This structure is described in further detail in the `proc` man page.
`psinfo_t *curpsinfo`	The process state of the process associated with the current thread. This structure is described in further detail in the `proc` man page.
`kthread_t *curthread`	The address of the operating system kernel's internal data structure for the current thread, the `kthread_t`. The `kthread_t` is defined in `<sys/thread.h>`.
`string cwd`	The name of the current working directory of the process associated with the current thread.
`uint_t epid`	The enabled probe ID (EPID) for the current probe. This integer uniquely identifies a particular probe that is enabled with a specific predicate and set of actions.
`int errno`	The error value returned by the last system call executed by this thread.
`string execname`	The name that was passed to `exec` to execute the current process.
`gid_t gid`	The real group ID of the current process.
`uint_t id`	The probe ID for the current probe. This ID is the system-wide unique identifier for the probe as published by DTrace and listed in the output of `dtrace -l`.
`uint_t ipl`	The interrupt priority level (IPL) on the current CPU at probe firing time.
`lgrp_id_t lgrp`	The latency group ID for the latency group of which the current CPU is a member. See sched Provider for more information.
`pid_t pid`	The process ID of the current process.
`pid_t ppid`	The parent process ID of the current process.
`string probefunc`	The function name portion of the current probe's description.
`string probemod`	The module name portion of the current probe's description.
`string probename`	The name portion of the current probe's description.
`string probeprov`	The provider name portion of the current probe's description.
`psetid_t pset`	The processor set ID for the processor set containing the current CPU. See sched Provider for more information.
`string root`	The name of the root directory of the process associated with the current thread.
`string kthreadname`	The kernel thread name associated with the currently executing thread.
`string uthreadname`	The user thread name associated with the currently executing thread.
`uint_t stackdepth`	The current thread's stack frame depth at probe firing time.
`id_t tid`	The thread ID of the current thread. For threads associated with user processes, this value is equal to the result of a call to `pthread_self`. For more information, see the `pthread_self`(3C) man page.
`uint64_t timestamp`	The current value of a nanosecond timestamp counter. This counter increments from an arbitrary point in the past and should only be used for relative computations.
`uid_t uid`	The real user ID of the current process.
`uint64_t uregs[]`	The current thread's saved user-mode register values at probe firing time. For information about the use of the `uregs[]` array, see User Process Tracing.
`uint64_t vtimestamp`	The current value of a nanosecond timestamp counter that is virtualized to the amount of time that the current thread has been running on a CPU, minus the time spent in DTrace predicates and actions. This counter increments from an arbitrary point in the past and should only be used for relative time computations.
`uint64_t walltimestamp`	The current number of nanoseconds since 00:00 Coordinated Universal Time, January 1, 1970.
`uint_t dtrace_is_tracing`	Contains non-zero if the current thread is being traced by the current DTrace consumer.
`string zonename`	Name of the zone as specified on creation of the process.
`zoneid_t zoneid`	The zone identifier.

For information about functions built into the D language such as trace, see DTrace Actions and Subroutines.
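
As a short illustration of how several built-in variables from the table can be combined, the following sketch reports system calls that return with an error. The choice of probes and the output format are assumptions made for this example only.

```d
syscall:::return
/errno != 0/
{
        /* %Y formats the walltimestamp value as a date and time. */
        printf("%Y %s[%d] %s:%s failed with errno %d\n",
            walltimestamp, execname, pid, probeprov, probefunc, errno);
}
```

Running the script with the dtrace -q option suppresses the default per-probe output so that only the printf lines are shown.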

External Variables

D uses the backquote character (`) as a special scoping operator for accessing variables that are defined in the operating system and not in your D program. For example, the Oracle Solaris kernel contains a C declaration of a system tunable named kmem_flags for enabling memory allocator debugging features (for more information about kmem_flags, see the Oracle Solaris 11.3 Tunable Parameters Reference Manual). This tunable is declared as a C variable in the kernel source code as follows:

int kmem_flags;

To access the value of this variable in a D program, use the following D notation:

`kmem_flags

DTrace associates each kernel symbol with the type used for the symbol in the corresponding operating system C code, providing easy source-based access to the native operating system data structures. In order to use external operating system variables, you will need access to the corresponding operating system source code.

When you access external variables from a D program, you are accessing the internal implementation details of another program such as the operating system kernel or its device drivers. These implementation details do not form a stable interface upon which you can rely. Any D programs you write that depend on these details might cease to work when you next upgrade the corresponding piece of software. For this reason, external variables are typically used by kernel and device driver developers and service personnel in order to debug performance or functionality problems using DTrace. For more information about the stability of your D programs, see DTrace Stability Mechanisms.

Kernel symbol names are kept in a separate namespace from D variable and function identifiers, so you never need to worry about these names conflicting with your D variables. When you prefix a variable with a backquote, the D compiler searches the known kernel symbols, in the order given by the list of loaded modules, to find a matching variable definition. Because the Oracle Solaris kernel supports dynamically loaded modules with separate symbol namespaces, the same variable name might be used more than once in the active operating system kernel. You can resolve these name conflicts by specifying the name of the kernel module whose variable should be accessed prior to the backquote in the symbol name. For example, each loadable kernel module typically provides a _fini(9E) function, so to refer to the address of the _fini function provided by a kernel module named foo, you would write:

foo`_fini

For more information about the _fini() function, see the _fini(9E) man page.

You can apply any of the D operators to external variables, subject to the usual rules for operand types, except operators that modify the variable's value, such as = or +=. When you launch DTrace, the D compiler loads the set of variable names corresponding to the active kernel modules, so declarations of these variables are not required. For safety reasons, DTrace prevents you from damaging or corrupting the state of the software you are observing.
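
A minimal sketch of read-only access to an external variable follows. It assumes an Oracle Solaris kernel that defines the kmem_flags tunable described earlier and sufficient privileges to run DTrace; on other systems the symbol might not exist.

```d
BEGIN
{
        /* Reading an external variable is permitted. */
        printf("kmem_flags = 0x%x\n", `kmem_flags);

        /* An assignment such as `kmem_flags = 0; is not permitted. */
        exit(0);
}
```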