This chapter provides a tour of the DTrace facility and provides examples of several basic tasks.
You can list all DTrace probes by passing the -l option to the dtrace command:
# dtrace -l ID PROVIDER MODULE FUNCTION NAME 1 dtrace BEGIN 2 dtrace END 3 dtrace ERROR 4 syscall nosys entry 5 syscall nosys return 6 syscall rexit entry 7 syscall rexit return 8 syscall forkall entry 9 syscall forkall return 10 syscall read entry 11 syscall read return ...
To count all the probes that are available on your system, you can type the following command:
# dtrace -l | wc -l
The number of probes reported will vary depending on your operating platform and the software you have installed. Some probes do not list an entry under the MODULE or FUNCTION columns, such as the BEGIN and END probes in the previous example. Probes with blank entries in these fields do not correspond to a specifically instrumented program function or location. These probes refer to more abstract concepts, such as the end of a tracing request. A probe that has a module and function as part of its name is called an anchored probe. A probe that is not associated with a module and function is called an unanchored probe.
You can use additional options to list specific probes, as seen in the following examples.
You can list probes that are associated with a specific function by passing that function name to DTrace with the -f option.
# dtrace -l -f cv_wait ID PROVIDER MODULE FUNCTION NAME 12921 fbt genunix cv_wait entry 12922 fbt genunix cv_wait return
You can list probes that are associated with a specific module by passing that module name to DTrace with the -m option.
# dtrace -l -m sd ID PROVIDER MODULE FUNCTION NAME 17147 fbt sd sdopen entry 17148 fbt sd sdopen return 17149 fbt sd sdclose entry 17150 fbt sd sdclose return 17151 fbt sd sdstrategy entry 17152 fbt sd sdstrategy return ...
You can list probes that have a specific name by passing that name to DTrace with the -n option.
# dtrace -l -n BEGIN ID PROVIDER MODULE FUNCTION NAME 1 dtrace BEGIN
You can list probes that are originate from a specific provider by passing the provider name to DTrace with the -P option.
# dtrace -l -P lockstat ID PROVIDER MODULE FUNCTION NAME 469 lockstat genunix mutex_enter adaptive-acquire 470 lockstat genunix mutex_enter adaptive-block 471 lockstat genunix mutex_enter adaptive-spin 472 lockstat genunix mutex_exit adaptive-release 473 lockstat genunix mutex_destroy adaptive-release 474 lockstat genunix mutex_tryenter adaptive-acquire ...
A specific function or specific module can be supported by multiple providers, as the following example shows.
# dtrace -l -f read ID PROVIDER MODULE FUNCTION NAME 10 syscall read entry 11 syscall read return 4036 sysinfo genunix read readch 4040 sysinfo genunix read sysread 7885 fbt genunix read entry 7886 fbt genunix read return
As the previous examples show, the output for a listing of probes displays the following information:
The probe's uniquely assigned integer probe ID
The probe ID is only unique within a given release or patch level of the Solaris operating system.
The provider name
The module name, if applicable
The function name, if applicable
The probe name
You can fully specify a probe by listing each component of the 4–tuple that uniquely identifies that probe. The format for the probe specification is provider:module:function:name. An empty component in a probe specification matches anything. For example, the specification fbt::alloc:entry specifies a probe with the following attributes:
The probe must be from the fbt provider
The probe may be in any module
The probe must be in the alloc function
The probe must be named entry
Elements on the left hand side of the 4–tuple are optional. The probe specification ::open:entry is equivalent to the specification open:entry. Either specification will match probes from all providers and kernel modules that have a function name of open and are named entry.
# dtrace -l -n open:entry ID PROVIDER MODULE FUNCTION NAME 14 syscall open entry 7386 fbt genunix open entry
You can also describe probes with a pattern matching syntax that is similar to the syntax that is described in the File Name Generation section of the sh(1) man page. The syntax supports the special characters *, ?, [, and ]. The probe description syscall::open*:entry matches both the open and open64 system calls. The ? character represents any single character in the name. The [ and ] characters are used to specify a set of specific characters in the name.
You enable probes with the dtrace command by specifying the probes without the -l option. Without further directions, DTrace performs the default action when the specified probe fires. The default probe action indicates only that the specified probe has fired and does not record any other data. The following code example enables every probe in the sd module.
# dtrace -m sd CPU ID FUNCTION:NAME 0 17329 sd_media_watch_cb:entry 0 17330 sd_media_watch_cb:return 0 17167 sdinfo:entry 0 17168 sdinfo:return 0 17151 sdstrategy:entry 0 17152 sdstrategy:return 0 17661 ddi_xbuf_qstrategy:entry 0 17662 ddi_xbuf_qstrategy:return 0 17649 xbuf_iostart:entry 0 17341 sd_xbuf_strategy:entry 0 17385 sd_xbuf_init:entry 0 17386 sd_xbuf_init:return 0 17342 sd_xbuf_strategy:return 0 17177 sd_mapblockaddr_iostart:entry 0 17178 sd_mapblockaddr_iostart:return 0 17179 sd_pm_iostart:entry 0 17365 sd_pm_entry:entry 0 17366 sd_pm_entry:return 0 17180 sd_pm_iostart:return 0 17181 sd_core_iostart:entry 0 17407 sd_add_buf_to_waitq:entry ...
The output in this example shows that the default action displays the CPU where the probe fired, the integer probe ID that is assigned by DTrace, the function where the probe fired, and the probe name.
# dtrace -P syscall dtrace: description 'syscall' matched 452 probes CPU ID FUNCTION:NAME 0 99 ioctl:return 0 98 ioctl:entry 0 99 ioctl:return 0 98 ioctl:entry 0 99 ioctl:return 0 234 sysconfig:entry 0 235 sysconfig:return 0 234 sysconfig:entry 0 235 sysconfig:return 0 168 sigaction:entry 0 169 sigaction:return 0 168 sigaction:entry 0 169 sigaction:return 0 98 ioctl:entry 0 99 ioctl:return 0 234 sysconfig:entry 0 235 sysconfig:return 0 38 brk:entry 0 39 brk:return ...
# dtrace -n zfod dtrace: description 'zfod' matched 3 probes CPU ID FUNCTION:NAME 0 4080 anon_zero:zfod 0 4080 anon_zero:zfod ^C
# dtrace -n clock:entry dtrace: description 'clock:entry' matched 1 probe CPU ID FUNCTION:NAME 0 4198 clock:entry ^C
Actions enable DTrace to interact with the system outside of the DTrace framework. The most common actions record data to a DTrace buffer. Other actions can stop the current process, raise a specific signal on the current process, or cease tracing. Actions that change the system state are considered destructive actions. Data recording actions record data to the principal buffer by default. The principal buffer is present in every DTrace invocation and is always allocated on a per-CPU basis. Tracing and buffer allocation can be restricted to a single CPU by using the -cpu option. See Chapter 11, Buffers and Buffering, in Solaris Dynamic Tracing Guide for more information about DTrace buffering.
The examples in this section use D expressions that consist of built-in D variables. Some of the most commonly used D variables are listed below:
This variable contains the current process ID.
This variable contains the current executable name.
This variable contains the time since boot, expressed in nanoseconds.
This variable contains a pointer to the kthread_t structure that represents the current thread.
This variable contains the module name of the current probe.
This variable contains the function name of the current probe.
This variable contains the name of the current probe.
For a complete list of the built-in variables of the D scripting language, see Variables.
The D scripting language also provides built-in functions that perform specific actions. You can find a complete list of these built-in functions at Chapter 10, Actions and Subroutines, in Solaris Dynamic Tracing Guide. The trace() function records the result of a D expression to the trace buffer, as in the following examples:
trace(pid) traces the current process ID
trace(execname) traces the name of the current executable
trace(curthread->t_pri) traces the t_pri field of the current thread
trace(probefunc) traces the function name of the probe
To indicate a particular action you want a probe to take, type the name of the action between {} characters, as in the following example.
# dtrace -n 'readch {trace(pid)}' dtrace: description 'readch ' matched 4 probes CPU ID FUNCTION:NAME 0 4036 read:readch 2040 0 4036 read:readch 2177 0 4036 read:readch 2177 0 4036 read:readch 2040 0 4036 read:readch 2181 0 4036 read:readch 2181 0 4036 read:readch 7 ...
Since the requested action is trace(pid), the process identification number (PID) appears in the last column of the output.
# dtrace -m 'ufs {trace(execname)}' dtrace: description 'ufs ' matched 889 probes CPU ID FUNCTION:NAME 0 14977 ufs_lookup:entry ls 0 15748 ufs_iaccess:entry ls 0 15749 ufs_iaccess:return ls 0 14978 ufs_lookup:return ls ... 0 15007 ufs_seek:entry utmpd 0 15008 ufs_seek:return utmpd 0 14963 ufs_close:entry utmpd ^C
# dtrace -n 'syscall:::entry {trace(timestamp)}' dtrace: description 'syscall:::entry ' matched 226 probes CPU ID FUNCTION:NAME 0 312 portfs:entry 157088479572713 0 98 ioctl:entry 157088479637542 0 98 ioctl:entry 157088479674339 0 234 sysconfig:entry 157088479767243 ... 0 98 ioctl:entry 157088481033225 0 60 fstat:entry 157088481050686 0 60 fstat:entry 157088481074680 ^C
To specify multiple actions, list the actions separated by the ; character.
# dtrace -n 'zfod {trace(pid);trace(execname)}' dtrace: description 'zfod ' matched 3 probes CPU ID FUNCTION:NAME 0 4080 anon_zero:zfod 2195 dtrace 0 4080 anon_zero:zfod 2195 dtrace 0 4080 anon_zero:zfod 2195 dtrace 0 4080 anon_zero:zfod 2195 dtrace 0 4080 anon_zero:zfod 2195 dtrace 0 4080 anon_zero:zfod 2197 bash 0 4080 anon_zero:zfod 2207 vi 0 4080 anon_zero:zfod 2207 vi ...
The actions in this section record data to the principal buffer by default, but each action may also be used to record data to speculative buffers. See Speculative Tracing for more details on speculative buffers.
void trace(expression)
The most basic action is the trace() action, which takes a D expression as its argument and traces the result to the directed buffer.
void tracemem(address, size_t nbytes)
The tracemem() action copies data from an address in memory to a buffer . The number of bytes that this action copies is specified in nbytes. The address that the data is copied from is specified in addr as a D expression. The buffer that the data is copied to is specified in buf.
void printf(string format, ...)
Like the trace() action, the printf() action traces D expressions. However, the printf() action lets you control formatting in ways similar to the printf(3C) function. Like the printf function, the parameters consists of a format string followed by a variable number of arguments. By default, the arguments are traced to the directed buffer. The arguments are later formatted for output by the dtrace command according to the specified format string.
For more information on the printf() action, see Chapter 12, Output Formatting, in Solaris Dynamic Tracing Guide.
void printa(aggregation) void printa(string format, aggregation)
The printa() action enables you to display and format aggregations. See Chapter 9, Aggregations, in Solaris Dynamic Tracing Guide for more detail on aggregations. If a format value is not provided, the printa() action only traces a directive to the DTrace consumer. The consumer that receives that directive processes and displays the aggregation with the default format. See Chapter 12, Output Formatting, in Solaris Dynamic Tracing Guide for a more detailed description of the printa() format string.
void stack(int nframes) void stack(void)
The stack() action records a kernel stack trace to the directed buffer. The depth of the kernel stack is given by the value given in nframes. If no value is given for nframes, the stack action records a number of stack frames specified by the stackframes option.
void ustack(int nframes, int strsize) void ustack(int nframes) void ustack(void)
The ustack() action records a user stack trace to the directed buffer. The depth of the user stack is equal to the value specified in nframes. If there is no value for nframes, the ustack action records a number of stack frames that is specified by the ustackframes option. The ustack() action determines the address of the calling frames when the probe fires. The ustack() action does not translate the stack frames into symbols until the DTrace consumer processes the ustack() action at the user level. If a value for strsize is specified and not zero, the ustack() action allocates the specified amount of string space and uses it to perform address-to-symbol translation directly from the kernel.
void jstack(int nframes, int strsize) void jstack(int nframes) void jstack(void)
The jstack() action is an alias for ustack() that uses the value specified by the jstackframes option for the number of stack frames. The jstack action uses the value specified by the jstackstrsize option to determine the string space size. The jstacksize action defaults to a non-zero value.
You must explicitly enable destructive actions in order to use them. You can enable destructive actions by using the -w option. If you attempt to use destructive actions in dtrace without explicitly enabling them, dtrace fails with a message similar to the following example:
dtrace: failed to enable 'syscall': destructive actions not allowed
For more information on DTrace actions, including destructive actions, see Chapter 10, Actions and Subroutines, in Solaris Dynamic Tracing Guide.
Some actions are destructive only to a particular process. These actions are available to users with the dtrace_proc or dtrace_user privileges. See Chapter 35, Security, in Solaris Dynamic Tracing Guide for details on DTrace security privileges.
When a probe fires with the stop() action enabled, the process that fired that probe stops upon leaving the kernel. This process stops in the same way as a process that is stopped by a proc(4) action.
void raise(int signal)
The raise() action sends the specified signal to the currently running process.
void copyout(void *buf, uintptr_t addr, size_t nbytes)
The copyout() action copies data from a buffer to an address in memory. The number of bytes that this action copies is specified in nbytes. The buffer that the data is copied from is specified in buf. The address that the data is copied to is specified in addr. That address is in the address space of the process that is associated with the current thread.
void copyoutstr(string str, uintptr_t addr, size_t maxlen)
The copyoutstr() action copies a string to an address in memory. The string to copy is specified in str. The address that the string is copied to is specified in addr. That address is in the address space of the process that is associated with the current thread.
void system(string program, ...)
The system() action causes the program specified by program to be executed by the system as if it were given to the shell as input.
Some destructive actions are destructive to the entire system. Use these actions with caution. These actions affect every process on the system and may affect other systems, depending upon the affected system's network services.
void breakpoint(void)
The breakpoint() action induces a kernel breakpoint, causing the system to stop and transfer control to the kernel debugger. The kernel debugger will emit a string that denotes the DTrace probe that triggered the action.
void panic(void)
When a probe with the panic() action triggers, the kernel panics. This action can force a system crash dump at a time of interest. You can use this action in conjunction with ring buffering and postmortem analysis to diagnose a system problem. For more information, see Chapter 11, Buffers and Buffering, in Solaris Dynamic Tracing Guide and Chapter 37, Postmortem Tracing, in Solaris Dynamic Tracing Guide respectively.
void chill(int nanoseconds)
When a probe with the chill() action triggers, DTrace spins for the specified number of nanoseconds. The chill() action is useful for exploring problems related to timing. Because interrupts are disabled while in DTrace probe context, any use of chill() will induce interrupt latency, scheduling latency, dispatch latency.
For performance-related questions, aggregated data is often more useful than individual data points. DTrace provides several built-in aggregating functions. When an aggregating function is applied to subsets of a collection of data, then applied again to the results of the analysis of those subsets, the results are identical to the results returned by the aggregating function when it is applied to the collection as a whole.
The DTrace facility stores a running count of data items for aggregations. The aggregating functions store only the current intermediate result and the new element that the function is being applied to. The intermediate results are allocated on a per-CPU basis. Because this allocation scheme does not require locks, the implementation is inherently scalable.
A DTrace aggregation takes the following general form:
@name[ keys ] = aggfunc( args );
In this general form, the variables are defined as follows:
The name of the aggregation, preceded by the @ character.
A comma-separated list of D expressions.
One of the DTrace aggregating functions.
A comma-separated list of arguments appropriate to the aggregating function.
Function Name |
Arguments |
Result |
---|---|---|
count |
none |
The number of times that the count function is called. |
sum |
scalar expression |
The total value of the specified expressions. |
avg |
scalar expression |
The arithmetic average of the specified expressions. |
min |
scalar expression |
The smallest value among the specified expressions. |
max |
scalar expression |
The largest value among the specified expressions. |
lquantize |
scalar expression, lower bound, upper bound, step value |
A linear frequency distribution of the values of the specified expressions that is sized by the specified range. This aggregating function increments the value in the highest bucket that is less than the specified expression. |
quantize |
scalar expression |
A power-of-two frequency distribution of the values of the specified expressions. This aggregating function increments the value in the highest power-of-two bucket that is less than the specified expression. |
This example uses the count aggregating function to count the number of write(2) system calls per process. The aggregation does not output any data until the dtrace command is terminated. The output data represents a summary of the data collected during the time that the dtrace command was active.
# cat writes.d #!/usr/sbin/dtrace -s syscall::write:entry] { @numWrites[execname] = count(); } # ./writes.d dtrace: script 'writes.d' matched 1 probe ^C dtrace 1 date 1 bash 3 grep 20 file 197 ls 201 |