D programs describe the probes that are to be enabled together with the predicates and actions that are bound to the probes. D programs can also declare variables and define new types. This section provides an introduction to the important features that you are likely to encounter in simple D programs.
D programs consist of a set of one or more probe clauses. Each probe clause takes the general form shown here:
probe_description_1[,probe_description_2]... [/predicate_statement/] { [action_statement;] . . . }
Every probe clause begins with a list of one or more probe descriptions in this form:
provider:module:function:probe_name
The compiler interprets the fields from right to left. For example, the probe
description settimeofday:entry would match a probe with function
settimeofday and name entry regardless of the value
of the probe's provider and module fields. You can regard a probe description as a pattern
that matches one or more probes based on their names. You can omit the leading colons before
a probe name if the probe that you want to use has a unique name. If several providers
publish probes with the same name, use the available fields to obtain the correct probe. If
you do not specify a provider, you might obtain unexpected results if multiple probes have
the same name. Specifying a provider but leaving the module, function, and probe name fields
blank matches all probes in that provider. For example, syscall::: matches
every probe published by the syscall provider.
The optional predicate statement uses criteria such as process ID, command name, or timestamp to determine whether the associated actions should take place. If you omit the predicate, any associated actions always run if the probe is triggered.
You can use the ?, *, and []
shell wildcards with probe clauses. For example, syscall::[gs]et*:
matches all syscall probes for function names that begin with
get or set. If necessary, use the
\ character to escape wildcard characters that form part of a name.
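As a minimal sketch of how a wildcard probe description and a predicate combine in one clause, the following program traces the name of every system call whose function name begins with get or set, but only for one command. The command name "bash" is an assumption chosen for illustration; substitute any command that runs on your system.

```d
/* Sketch: trace "get..." and "set..." system calls made by one command. */
syscall::[gs]et*:entry
/execname == "bash"/            /* "bash" is an assumed example command */
{
        trace(probefunc);       /* record which system call fired the probe */
}
```

Without the predicate, the action would run for every process that triggers a matching probe.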
You can enable the same actions for more than one probe description. For example, the
following D program uses the trace() function to record a timestamp each
time that any process invokes a system call containing the string mem or
soc:
syscall::*mem*:entry, syscall::*soc*:entry
{
trace(timestamp);
}
You can use compiler directives called pragmas in a D program. Pragma lines begin with
a # character, and are usually placed at the beginning of a D program.
The primary use of pragmas is to set run-time DTrace options. For example, the following
pragma statements suppress all output except for traced data and permit destructive
operations.
#pragma D option quiet
#pragma D option destructive
D provides fundamental data types for integers and floating-point constants. You can
perform arithmetic only on integers in D programs. D does not support floating-point
operations. D provides floating-point types for compatibility with ANSI-C declarations and
types. You can trace floating-point data objects and use the printf()
function to format them for output. In the current implementation, DTrace supports only the
64-bit data model for writing D programs.
You can use declarations to introduce D variables and external C symbols, or to define
new types for use in D. The following example program, tick.d, declares
and initializes the variable i when the D program starts, displays its
initial value, increments the variable and prints its value once every second, and displays
the final value when the program exits.
BEGIN
{
i = 0;
trace(i);
}
profile:::tick-1sec
{
printf("i=%d\n",++i);
}
END
{
trace(i);
}
When run, the program produces output such as the following until you type
Ctrl-C:
# dtrace -s tick.d
dtrace: script 'tick.d' matched 3 probes
CPU     ID                    FUNCTION:NAME
  1      1                           :BEGIN         0
  1    618                       :tick-1sec i=1
  1    618                       :tick-1sec i=2
  1    618                       :tick-1sec i=3
  1    618                       :tick-1sec i=4
  1    618                       :tick-1sec i=5
^C
  0      2                             :END         5
Whenever a probe is triggered, dtrace displays the number of the CPU
core on which the probe fired, the ID of the probe, and the name of the function and
the probe. BEGIN and END are DTrace probes that
trigger when the dtrace program starts and finishes.
To suppress all output except that from printa(),
printf(), and trace(), specify #pragma D
option quiet in the program or the -q option to
dtrace.
# dtrace -q -s tick.d
0i=1
i=2
i=3
i=4
i=5
^C
5
Predicates are logic statements that select whether DTrace invokes the actions that are
associated with a probe. For example, the predicates in the following program
sc1000.d examine the value of the variable i. This
program also demonstrates how to include C-style comments.
#pragma D option quiet
BEGIN
{
/* Initialize i */
i = 1000;
}
syscall:::entry
/i > 0/
{
/* Decrement i */
i--;
}
syscall:::entry
/(i % 100) == 0/
{
/* Print i after every 100 system calls */
printf("i = %d\n",i);
}
syscall:::entry
/i == 0/
{
printf("i = 0; 1000 system calls invoked\n");
exit(0); /* Exit with a value of 0 */
}
The program initializes i with a value of 1000, decrements its value
by 1 whenever a process invokes a system call, prints its value after every 100 system
calls, and exits when the value of i reaches 0. Running the program in quiet mode produces
output similar to the following:
# dtrace -s sc1000.d
i = 900
i = 700
i = 800
i = 600
i = 500
i = 400
i = 300
i = 200
i = 100
i = 0
i = 0; 1000 system calls invoked
Note that the countdown does not appear in the expected order: the output for
i=800 appears after the output for i=700. If you
turn off quiet mode, the reason becomes apparent: dtrace
collects information from probes that can be triggered on any of the CPU cores. You cannot
expect runtime output from DTrace to be sequential in a multithreaded
environment.
# dtrace -s sc1000.d
dtrace: script 'sc1000.d' matched 889 probes
CPU ID FUNCTION:NAME
0 457 clock_gettime:entry i = 900
0 413 futex:entry i = 700
1 41 lseek:entry i = 800
1 25 read:entry i = 600
1 25 read:entry i = 500
1 25 read:entry i = 400
1 71 select:entry i = 300
1 71 select:entry i = 200
1 25 read:entry i = 100
1 25 read:entry i = 0
1 25 read:entry i = 0; 1000 system calls invoked
The next example is an executable DTrace script that displays the file descriptor,
output string, and string length specified to the write() system call
whenever the date command is run on the system.
#!/usr/sbin/dtrace -s
#pragma D option quiet
syscall::write:entry
/execname == "date"/
{
printf("%s(%d, %s, %4d)\n", probefunc, arg0, copyinstr(arg1), arg2);
}
If you run the script in one window while running the date command in another, you see output such as the following in the first window:
write(1, Wed Aug 15 10:42:34 BST 2012 , 29)
The D language supports scalar arrays, which correspond directly in concept and syntax with arrays in C. A scalar array is a fixed-length group of consecutive memory locations that each store a value of the same type. You access scalar arrays by referring to each location with an integer starting from zero. In D programs, you would usually use scalar arrays to access array data within the operating system.
For example, you would use the following statement to declare a scalar array
sa of 5 integers:
int sa[5];
As in C, sa[0] refers to the first array element,
sa[1] refers to the second, and so on up to sa[4]
for the fifth element.
The D language also supports a special kind of variable called an associative array. An associative array is similar to a scalar array in that it associates a set of keys with a set of values, but in an associative array the keys are not limited to integers of a fixed range. In the D language, you can index associative arrays by a list of one or more values of any type. Together the individual key values form a tuple that you use to index into the array and access or modify the value that corresponds to that key. Each tuple key must be of the same length and must have the same key types in the same order. The value associated with each element of an associative array is also of a single fixed type for the entire array.
For example, the following statement defines a new associative array
aa of value type int with the tuple signature
string, int, and stores the integer value 828 in the
array:
aa["foo", 271] = 828;
Once you have defined an array, you can access its elements in the same way as any other variable. For example, the following statement modifies the array element previously stored in aa by incrementing the value from 828 to 829:
aa["foo", 271]++;
You can define additional elements for the array by specifying a different tuple with the same tuple signature, as shown here:
aa["bar", 314] = 159;
aa["foo", 577] = 216;
The array elements aa["foo", 271] and aa["foo", 577] are
distinct because the values of their tuples differ in the value of their second key.
Syntactically, scalar arrays and associative arrays are very similar. You can declare an associative array of integers referenced by an integer key as follows:
int ai[int];
You could reference an element of this array using an expression such as
ai[0]. However, from a storage and implementation perspective, the two
kinds of array are very different. The scalar array sa consists of five
consecutive memory locations numbered from zero, and the index refers to an offset in the
storage allocated for the array. An associative array such as ai has no
predefined size and it does not store elements in consecutive memory locations. In addition,
associative array keys have no relationship to the storage location of the corresponding
value. If you access the associative array elements ai[0] and
ai[-5], DTrace allocates only two words of storage, which are not
necessarily consecutive in memory. The tuple keys that you use to index associative arrays
are abstract names for the corresponding value, and they bear no relationship to the
location of the value in memory.
If you create an array using an initial assignment and use a single integer expression
as the array index, for example, a[0] = 2;, the D compiler always creates
a new associative array, even though a[0] = 2 could also be interpreted as an
assignment to a scalar array. If you want to use a scalar array, you must explicitly declare
its type and size.
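The difference between the two kinds of array can be sketched in a short program. This is an illustrative example rather than part of the original guide: it reuses the declarations of sa and ai shown earlier, and the values assigned are arbitrary.

```d
int sa[5];      /* scalar array: five consecutive int slots, indexes 0 to 4 */
int ai[int];    /* associative array: int values keyed by any int */

BEGIN
{
        sa[4] = 10;     /* last valid element of the scalar array */
        ai[0] = 2;      /* associative keys need not be contiguous ... */
        ai[-5] = 7;     /* ... or non-negative */
        printf("sa[4]=%d ai[0]=%d ai[-5]=%d\n", sa[4], ai[0], ai[-5]);
        exit(0);
}
```

Because both arrays are declared explicitly, the compiler does not have to infer whether an indexed assignment refers to a scalar or an associative array.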
The implementation of pointers in the D language gives you the ability to create and
manipulate the memory addresses of data objects in the operating system kernel, and to store
the contents of those data objects in variables and associative arrays. The syntax of D
pointers is the same as the syntax of pointers in ANSI-C. For example, the following
statement declares a D global variable named p that is a pointer to an
integer.
int *p;
This declaration means that p itself is a 64-bit integer whose value
is the address in memory of another integer.
If you want to create a pointer to a data object inside the kernel, you can compute its
address by using the & reference operator. For example, the kernel
source code declares an unsigned long max_pfn variable. You can access
the value of such an external variable in the D language by prefixing
it with the ` (backquote) scope
operator:
value = `max_pfn;
If more than one kernel module declares a variable with the same name, prefix the scoped
external variable with the name of the module. For example, foo`bar would
refer to the address of the bar() function provided by the module
foo.
You can extract the address of an external variable by applying the
& operator and store it as a
pointer:
p = &`max_pfn;
You can use the * dereference operator to refer to the object that a
pointer addresses:
value = *p;
You cannot use the * dereference operator on the left-hand side of an
assignment expression. You may only assign values directly to D variables by name or by
applying the array index operator [] to a scalar array or an associative
array.
You cannot use pointers to perform indirect function calls. You may only call DTrace functions directly by name.
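The pointer statements above can be combined into one small program. This is a sketch only: it assumes that the running kernel exposes the max_pfn symbol and its type information to DTrace, as the preceding text describes, and it relies on D inferring the type of the global variable p from its first assignment.

```d
BEGIN
{
        p = &`max_pfn;                  /* address of the kernel variable */
        printf("max_pfn = %d\n", *p);   /* dereference on the right-hand side only */
        exit(0);
}
```

The dereference appears only as a value being read, which is consistent with the rule that * cannot be used on the left-hand side of an assignment.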
DTrace executes D programs within the address space of the operating system kernel. An Oracle Linux system manages one address space for the operating system kernel and one for each user process. Because each address space provides the illusion that it can access all of the memory on the system, the same virtual address might be used in different address spaces, but it would translate to different locations in physical memory. If your D programs use pointers, you need to be aware of which address space corresponds to those pointers.
For example, if you use the syscall provider to instrument entry to a
system call such as pipe() that takes a pointer to an integer or to an
array of integers as an argument, it is not valid to use the * or [] operators to
dereference that pointer or array. The address is in the address space of the user process
that performed the system call, and not in the address space of the kernel. Dereferencing
the address in D accesses the kernel's address space, which would result in an invalid
address error or return unexpected data to your D program.
To access user process memory from a DTrace probe, use one of the
copyin(), copyinstr(), or
copyinto() functions with an address in user space.
The following D programs show two alternate and equivalent ways to print the file
descriptor, string, and string length arguments that a process passed to the
write() system
call:
syscall::write:entry
{
printf("fd=%d buf=%s count=%d", arg0, stringof(copyin(arg1, arg2)), arg2);
}
syscall::write:entry
{
printf("fd=%d buf=%s count=%d", arg0, copyinstr(arg1, arg2), arg2);
}
The arg0, arg1, and arg2
variables contain the value of the fd, buf, and
count arguments to the system call. Note that the value of
arg1 is an address in the address space of the process, and not in the
address space of the kernel.
In this example, it is necessary to use the stringof() function with
copyin() so that DTrace converts the retrieved user data to a string.
The copyinstr() function always returns a string.
To avoid confusion, you should name and comment variables that store user addresses
appropriately. You should also store user addresses as variables of type
uintptr_t so that you do not accidentally compile D code that
dereferences them.
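One way to follow this advice is sketched below. The variable name user_buf and the choice of the date command are assumptions made for illustration; the variable is stored as a clause-local variable (the this-> prefix), a kind of variable that this section describes separately.

```d
#pragma D option quiet

syscall::write:entry
/execname == "date"/
{
        /* User-space address of the write buffer; never dereference with *. */
        this->user_buf = (uintptr_t)arg1;

        /* Copy the data into kernel memory before formatting it. */
        printf("fd=%d buf=%s", arg0, copyinstr(this->user_buf, arg2));
}
```

Because user_buf has type uintptr_t rather than a pointer type, an accidental *this->user_buf is rejected by the D compiler instead of silently reading kernel memory.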
Thread-local variables are defined within the scope of execution of a thread on the
system. To indicate that a variable is thread-local, you prefix it with
self-> as shown in the following example.
#pragma D option quiet
syscall::read:entry
{
self->t = timestamp; /* Initialize a thread-local variable */
}
syscall::read:return
/self->t != 0/
{
printf("%s (pid:tid=%d:%d) spent %d microseconds in read(2)\n",
execname, pid, tid, ((timestamp - self->t)/1000)); /* Divide by 1000 -> microseconds */
self->t = 0; /* Reset the variable */
}
This D program (readtrace.d) displays the command name, process ID,
thread ID, and elapsed time in microseconds whenever a process invokes the
read() system call.
# dtrace -s readtrace.d
gnome-terminal (pid:tid=2774:2774) spent 27 microseconds in read(2)
gnome-terminal (pid:tid=2774:2774) spent 16 microseconds in read(2)
hald-addon-inpu (pid:tid=1662:1662) spent 26 microseconds in read(2)
hald-addon-inpu (pid:tid=1662:1662) spent 17 microseconds in read(2)
Xorg (pid:tid=2046:2046) spent 18 microseconds in read(2)
...
Clause-local variables are defined within the scope of a probe clause and persist
across clauses that enable the same probe. To indicate that a variable is clause-local, you
prefix it with this-> as shown in the following example.
#pragma D option quiet
proc::do_fork:create
{
/* Extract PID of child process from task_struct pointed to by arg0 */
this->pid = ((struct task_struct *)arg0)->pid;
time[this->pid] = timestamp;
p_pid[this->pid] = pid; /* Current process ID (parent PID of new child) */
p_name[this->pid] = execname; /* Parent command name */
p_exec[this->pid] = ""; /* Child has not yet been exec'ed */
}
proc::do_execve_common:exec
/p_pid[pid] != 0/
{
p_exec[pid] = stringof(arg0); /* Child process pathname */
}
proc::do_exit:exit
/p_pid[pid] != 0 && p_exec[pid] != ""/
{
printf("%s (%d) executed %s (%d) for %d microseconds\n",
p_name[pid], p_pid[pid], p_exec[pid], pid, (timestamp - time[pid])/1000);
}
proc::do_exit:exit
/p_pid[pid] != 0 && p_exec[pid] == ""/
{
printf("%s (%d) forked itself (as %d) for %d microseconds\n",
p_name[pid], p_pid[pid], pid, (timestamp - time[pid])/1000);
}
This D program (activity.d) reports fork() and
exec() activity on a system. The program uses the value of the child
process ID in this->pid to initialise globally unique associative array
entries such as p_pid[this->pid]. When you run the program, you see
output similar to the following as you run different programs.
# dtrace -s activity.d
bash (3572) executed /bin/ls (4323) for 6422 microseconds
bash (3572) executed /usr/bin/w (4324) for 128960 microseconds
firefox (4325) executed /bin/basename (4326) for 8548 microseconds
firefox (4325) executed /bin/uname (4327) for 1999 microseconds
mozilla-plugin- (4328) executed /bin/uname (4329) for 2151 microseconds
mozilla-plugin- (4328) executed /usr/lib64/nspluginwrapper/plugin-config (4330)
for 8182 microseconds
firefox (4325) executed /usr/lib64//xulrunner-1.9.2/mozilla-xremote-client (4331)
for 40067 microseconds
firefox (4333) forked itself (as 4334) for 200 microseconds
firefox (4333) executed /bin/sed (4335) for 1070 microseconds
firefox (4336) forked itself (as 4337) for 229 microseconds
firefox (4336) executed /bin/sed (4338) for 1161 microseconds
...
The speculative tracing facility in DTrace allows you to tentatively trace data and then later decide whether to commit the data to a tracing buffer or discard it. Predicates are the primary mechanism for filtering out uninteresting events. Predicates are useful when you know at the time that a probe fires whether or not the probe event is of interest. However, in some situations, you might not know whether a probe event is of interest until after the probe fires.
For example, if a system call is occasionally failing with an error code in
errno, you might want to examine the code path leading to the error
condition. You can write trace data at one or more probe locations to speculative buffers,
and then choose which data to commit to the principal buffer at another probe location. As a
result, your trace data contains only the output of interest, no post-processing is
required, and the DTrace overhead is minimized.
To create a speculative buffer, use the speculation() function. This
function returns a speculation identifier, which you use in subsequent calls to the
speculate() function.
Call the speculate() function before performing any data-recording
actions in a clause. DTrace directs all subsequent data that you record in a clause to the
speculative buffer. You can create only one speculation in any given clause.
Typically, you assign a speculation identifier to a thread-local variable, and then use
that variable as a predicate to other probes as well as an argument to
speculate(). For
example:
#!/usr/sbin/dtrace -Fs
syscall::open:entry
{
/*
* The call to speculation() creates a new speculation. If this fails,
* dtrace will generate an error message indicating the reason for
* the failed speculation(), but subsequent speculative tracing will be
* silently discarded.
*/
self->spec = speculation();
speculate(self->spec);
/*
* Because this printf() follows the speculate(), it is being
* speculatively traced; it will only appear in the data buffer if the
* speculation is subsequently committed.
*/
printf("%s", copyinstr(arg0));
}
syscall::open:return
/self->spec/
{
/*
* To balance the output with the -F option, we want to be sure that
* every entry has a matching return. Because we speculated the
* open entry above, we want to also speculate the open return.
* This is also a convenient time to trace the errno value.
*/
speculate(self->spec);
trace(errno);
}
If a speculative buffer contains data that you want to retain, use the
commit() function to copy its contents to the principal buffer. If you
want to delete the contents of a speculative buffer, use the discard()
function. The following example clauses commit or discard the speculative buffer based on
the value of the errno
variable:
syscall::open:return
/self->spec && errno != 0/
{
/*
* If errno is non-zero, we want to commit the speculation.
*/
commit(self->spec);
self->spec = 0;
}
syscall::open:return
/self->spec && errno == 0/
{
/*
* If errno is not set, we discard the speculation.
*/
discard(self->spec);
self->spec = 0;
}
Running this script produces output similar to the following example when the
open() system call fails:
# ./specopen.d
dtrace: script './specopen.d' matched 4 probes
CPU FUNCTION
1 => open /var/ld/ld.config
1 <= open 2
1 => open /images/UnorderedList16.gif
1 <= open 4
...
DTrace provides the following built-in functions for aggregating the data that individual probes gather.
| Aggregating Function | Description |
|---|---|
| avg() | Returns the arithmetic mean of the expressions that are specified as arguments. |
| count() | Returns the number of times that the function has been called. |
| lquantize() | Returns a linear frequency distribution of the expressions that are specified as arguments, scaled to the specified lower bound, upper bound, and step interval. Increments the value in the highest bucket that is smaller than the specified expression. |
| max() | Returns the maximum value of the expressions that are specified as arguments. |
| min() | Returns the minimum value of the expressions that are specified as arguments. |
| quantize() | Returns a power-of-two frequency distribution of the expressions that are specified as arguments. Increments the value of the highest power-of-two bucket that is smaller than the specified expression. |
| stddev() | Returns the standard deviation of the expressions that are specified as arguments. |
| sum() | Returns the sum of the expressions that are specified as arguments. |
DTrace indexes the results of an aggregation using a tuple expression similar to that used for an associative array:
@name[list_of_keys] = aggregating_function(args);
The name of the aggregation is prefixed with an @. All aggregations are global. If you do not specify a name, the aggregation is anonymous. The keys are separated by commas, and consist of arguments that are appropriate to the aggregating function.
For example, the following command counts the number of write()
system calls invoked by processes until you type Ctrl-C.
# dtrace -n syscall::write:entry'{ @["write() calls"] = count(); }'
dtrace: description 'syscall::write:entry' matched 1 probe
^C

  write() calls                                              9
The next example counts the number of both read() and
write() system calls:
# dtrace -n syscall::write:entry,syscall::read:entry\
'{ @[strjoin(probefunc,"() calls")] = count(); }'
dtrace: description 'syscall::write:entry,syscall::read:entry' matched 2 probes
^C

  write() calls                                            150
  read() calls                                            1555
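The distribution functions are used in the same way as count(). As a hedged sketch, the following program records how long each read() system call takes, using a thread-local timestamp, and builds a power-of-two distribution per command name. The aggregation name @latency is an arbitrary choice for this example.

```d
syscall::read:entry
{
        self->t = timestamp;            /* start time for this thread */
}

syscall::read:return
/self->t != 0/
{
        /* One power-of-two histogram of nanoseconds per command name. */
        @latency[execname] = quantize(timestamp - self->t);
        self->t = 0;                    /* reset the thread-local variable */
}
```

When you type Ctrl-C, dtrace prints one histogram for each key in the aggregation.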
If you specify the -q option to dtrace or
#pragma D option quiet in a D program, DTrace suppresses the
automatic printing of aggregations. You must use a printa() statement
to display this information.
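A minimal sketch of this pattern follows. It assumes that you want a per-function count of read() and write() calls; the aggregation name @calls and the column width in the format string are illustrative choices.

```d
#pragma D option quiet

syscall::read:entry,
syscall::write:entry
{
        @calls[probefunc] = count();    /* one counter per function name */
}

END
{
        /* %@d formats the aggregated value; %-12s formats the key. */
        printa("%-12s %@d\n", @calls);
}
```

Because quiet mode suppresses the automatic display, the aggregation appears only through the explicit printa() call in the END clause.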