While quantize is useful for getting quick insight into data, you
might want to examine a distribution across linear values instead.
To display a linear value distribution, use the lquantize
aggregating function. The lquantize function takes three arguments
in addition to a D expression: a lower bound, an upper bound, and
an optional step. The default step value is 1.
For example, if you wanted to look at the distribution of writes
by file descriptor, a power-of-two quantization would not be
effective. Instead, you could use a linear quantization with a
small range, as shown in the following example. Save it in a file
named wrlquantize.d:
syscall::write:entry
{
  @fds[execname] = lquantize(arg0, 0, 100, 1);
}
Note that you could also omit the last argument because 1 is the
default step value.
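As a minimal sketch of that shorter form, the same clause can be written with only the lower and upper bounds. It relies on the default step of 1 and is expected to behave the same as the four-argument version:

```d
syscall::write:entry
{
  /* Step argument omitted; the default step of 1 is assumed to apply. */
  @fds[execname] = lquantize(arg0, 0, 100);
}
```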
Running this script for several seconds yields a large amount of information. The following example shows a selection of the typical output:
# dtrace -s wrlquantize.d
dtrace: script 'wrlquantize.d' matched 1 probe
^C
...
  gnome-session
           value  ------------- Distribution ------------- count
              25 |                                         0
              26 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 9
              27 |                                         0

  gnome-terminal
           value  ------------- Distribution ------------- count
              15 |                                         0
              16 |@@                                       1
              17 |                                         0
              18 |                                         0
              19 |                                         0
              20 |                                         0
              21 |@@@@@@@@                                 4
              22 |@@                                       1
              23 |@@                                       1
              24 |                                         0
              25 |                                         0
              26 |                                         0
              27 |                                         0
              28 |                                         0
              29 |@@@@@@@@@@@@@                            6
              30 |@@@@@@@@@@@@@                            6
              31 |                                         0
...
You can also use the lquantize aggregating function to aggregate
on time, measured from some point in the past. This technique
enables you to observe a change in behavior over time.
The following example displays the change in system call behavior
over the lifetime of a process that is executing the date command.
Save it in a file named dateprof.d:
syscall::execve:return
/execname == "date"/
{
  self->start = timestamp;
}

syscall:::entry
/self->start/
{
  /*
   * We linearly quantize on the current timestamp minus our
   * process's start time. We divide by 1000 to yield microseconds
   * rather than nanoseconds. The range runs from 0 to 10 milliseconds
   * in steps of 100 microseconds; we expect that no date(1) process
   * will take longer than 10 milliseconds to complete.
   */
  @a["system calls over time"] =
      lquantize((timestamp - self->start) / 1000, 0, 10000, 100);
}

syscall::exit:entry
/self->start/
{
  self->start = 0;
}
This script provides greater insight into system call behavior when many date processes are being executed. To see this result, run sh -c 'while true; do date >/dev/null; done' in one window, while executing the D script in another window. The script produces a profile of the system call behavior of the date command that is similar to the following:
# dtrace -s dateprof.d
dtrace: script 'dateprof.d' matched 298 probes
^C

  system calls over time
           value  ------------- Distribution ------------- count
             < 0 |                                         0
               0 |@@                                       23428
             100 |@@@@@                                    56263
             200 |@@@@@                                    61271
             300 |@@@@@                                    58132
             400 |@@@@@                                    54617
             500 |@@@@                                     45545
             600 |@@                                       26049
             700 |@@@                                      38859
             800 |@@@@                                     51569
             900 |@@@@                                     42553
            1000 |@                                        11339
            1100 |                                         4020
            1200 |                                         2236
            1300 |                                         1264
            1400 |                                         812
            1500 |                                         706
            1600 |                                         764
            1700 |                                         586
            1800 |                                         266
            1900 |                                         155
            2000 |                                         118
            2100 |                                         86
            2200 |                                         93
            2300 |                                         66
            2400 |                                         32
            2500 |                                         32
            2600 |                                         18
            2700 |                                         23
            2800 |                                         26
            2900 |                                         30
            3000 |                                         26
            3100 |                                         1
            3200 |                                         7
            3300 |                                         9
            3400 |                                         3
            3500 |                                         5
            3600 |                                         1
            3700 |                                         6
            3800 |                                         8
            3900 |                                         8
            4000 |                                         8
            4100 |                                         1
            4200 |                                         1
            4300 |                                         6
            4400 |                                         0
The previous output provides a rough idea of the different phases
of the date command, with respect to the services that are
required of the kernel. To better understand these phases, you
might want to know which system calls are made and when they are
made. In this case, you could change the D script to aggregate on
the probefunc variable instead of a constant string.
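The change described above might look like the following sketch. It is the dateprof.d script with the aggregation key replaced by probefunc, so that each system call gets its own distribution; the bounds and step are carried over unchanged, and no output is shown because it depends on the system being traced:

```d
syscall::execve:return
/execname == "date"/
{
  self->start = timestamp;
}

syscall:::entry
/self->start/
{
  /*
   * Key the aggregation by system call name (probefunc) rather than
   * a constant string. Values are microseconds since process start.
   */
  @a[probefunc] =
      lquantize((timestamp - self->start) / 1000, 0, 10000, 100);
}

syscall::exit:entry
/self->start/
{
  self->start = 0;
}
```

Run it the same way as dateprof.d, with the date loop executing in another window.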
The log-linear llquantize aggregating function combines the
capabilities of both the log and linear functions. While the
simple quantize function uses base 2 logarithms, with llquantize,
you specify the base, as well as the minimum and maximum
exponents. Further, each logarithmic range is subdivided linearly
with a number of steps, as specified.
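A minimal sketch of how llquantize might be used follows. It assumes the arguments are, in order, the D expression, the base, the lower exponent, the upper exponent, and the number of steps per order of magnitude. The choice of read latency as the measured value, the aggregation key, and the specific numbers (base 10, exponents 0 through 6, 10 steps) are illustrative assumptions:

```d
syscall::read:entry
{
  self->ts = timestamp;
}

syscall::read:return
/self->ts/
{
  /*
   * Assumed parameters: base 10, magnitudes 10^0 through 10^6
   * nanoseconds, 10 linear steps within each order of magnitude.
   */
  @["read latency (ns)"] = llquantize(timestamp - self->ts, 10, 0, 6, 10);
  self->ts = 0;
}
```

With these values, short reads are resolved in fine-grained buckets while slow reads still fall within the same histogram, which is the trade-off that a purely linear or purely power-of-two quantization cannot offer.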