3.2 About Aggregations

DTrace stores the results of aggregating functions in objects called aggregations. The aggregation results are indexed using a tuple of expressions similar to those used for associative arrays. In D, the syntax for an aggregation is:

@name[ keys ] = aggfunc( args );

where name is the name of the aggregation, keys is a comma-separated list of D expressions, aggfunc is one of the DTrace aggregating functions, and args is a comma-separated list of arguments appropriate to the aggregating function. The aggregation name is a D identifier that is prefixed with the special character @. All aggregations named in your D programs are global variables; there are no thread-local or clause-local aggregations. The aggregation names are kept in a separate identifier namespace from other D global variables. Remember that a and @a are not the same variable if you reuse names. The special aggregation name @ can be used to name an anonymous aggregation in simple D programs. The D compiler treats this name as an alias for the aggregation name @_.

The DTrace aggregating functions are shown in the following table. Most aggregating functions take just a single argument that represents the new datum.

Table 3.1 DTrace Aggregating Functions

Function Name

Arguments

Result

count

None

The number of times called.

sum

Scalar expression

The total value of the specified expressions.

avg

Scalar expression

The arithmetic average of the specified expressions.

min

Scalar expression

The smallest value among the specified expressions.

max

Scalar expression

The largest value among the specified expressions.

stddev

Scalar expression

The standard deviation of the specified expressions.

lquantize

Scalar expression, lower bound, upper bound, step value

A linear frequency distribution, sized by the specified range, of the values of the specified expressions. Increments the value in the highest bucket that is less than the specified expression.

quantize

Scalar expression

A power-of-two frequency distribution of the values of the specified expressions. Increments the value in the highest power-of-two bucket that is less than the specified expression.


For example, to count the number of write() system calls in the system, you could use an informative string as a key and the count aggregating function:

syscall::write:entry
{
  @counts["write system calls"] = count();
}

The dtrace command prints aggregation results by default when the process terminates, either as the result of an explicit END action or when you press Ctrl-C. The following example output shows the result of running this command, waiting for a few seconds, and pressing Ctrl-C:

# dtrace -s writes.d
dtrace: script './writes.d' matched 1 probe
^C
  write system calls                               179
#

You can count system calls per process name using the execname variable as the key to an aggregation:

syscall::write:entry
{
  @counts[execname] = count();
}

The following example output shows the result of running this command, waiting for a few seconds, and pressing Ctrl-C:

# dtrace -s writesbycmd.d
dtrace: script 'writesbycmd.d' matched 1 probe
^C
  dirname                                                           1
  dtrace                                                            1
  gnome-panel                                                       1
  mozilla-xremote                                                   1
  ps                                                                1
  avahi-daemon                                                      2
  basename                                                          2
  gconfd-2                                                          2
  java                                                              2
  master                                                            2
  pickup                                                            2
  qmgr                                                              2
  sed                                                               2
  dbus-daemon                                                       3
  rtkit-daemon                                                      3
  uname                                                             3
  w                                                                 5
  bash                                                              9
  cat                                                               9
  gnome-session                                                     9
  Xorg                                                             21
  firefox                                                         149
  gnome-terminal                                                 9421
#

Alternatively, you might want to further examine writes organized by both executable name and file descriptor. The file descriptor is the first argument to write(), so the following example uses a key consisting of both execname and arg0:

syscall::write:entry
{
  @counts[execname, arg0] = count();
}

Running this command results in a table with both executable name and file descriptor, as shown in the following example:

# dtrace -s writesbycmdfd.d
dtrace: script 'writesbycmdfd.d' matched 1 probe
^C

  basename                                                  1        1
  dbus-daemon                                              70        1
  dircolors                                                 1        1
  dtrace                                                    1        1
  gnome-panel                                              35        1
  gnome-terminal                                           16        1
  gnome-terminal                                           18        1
  init                                                      4        1
  master                                                   89        1
  ps                                                        1        1
  pulseaudio                                               20        1
  tput                                                      1        1
  Xorg                                                      2        2
#

The following example displays the average time spent in the write() system call, organized by process name. This example uses the avg aggregating function, specifying the expression to average as the argument. The example averages the wall clock time spent in the system call:

syscall::write:entry
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  @time[execname] = avg(timestamp - self->ts);
  self->ts = 0;
}

The following example output shows the result of running this command, waiting for a few seconds, and pressing Ctrl-C:

# dtrace -s writetime.d 
dtrace: script 'writetime.d' matched 2 probes
^C

  gnome-session                                                  8260
  udisks-part-id                                                 9279
  gnome-terminal                                                 9378
  mozilla-xremote                                               10061
  abrt-handle-eve                                               13414
  vgdisplay                                                     13459
  avahi-daemon                                                  14043
  vgscan                                                        14190
  uptime                                                        14533
  lsof                                                          14903
  ip                                                            15075
  date                                                          15371
  ...
  ps                                                            91792
  sestatus                                                      98374
  pstree                                                       102566
  sysctl                                                       175427
  iptables                                                     192835
  udisks-daemon                                                250405
  python                                                       282544
  dbus-daemon                                                  491069
  lsblk                                                        582138
  Xorg                                                        2337328
  gconfd-2                                                   17880523
  cat                                                        59752284
#

The average can be useful, but often does not provide sufficient detail to understand the distribution of data points. To understand the distribution in further detail, use the quantize aggregating function as shown in the following example:

syscall::write:entry
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  @time[execname] = quantize(timestamp - self->ts);
  self->ts = 0;
}

Because each line of output becomes a frequency distribution diagram, the output of this script is substantially longer than previous ones. The following example shows a selection of sample output:

# dtrace -s wrquantize.d 
dtrace: script 'wrquantize.d' matched 2 probes
^C
...
  bash                                              
           value  ------------- Distribution ------------- count    
            8192 |                                         0        
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         4        
           32768 |                                         0        
           65536 |                                         0        
          131072 |@@@@@@@@                                 1        
          262144 |                                         0        

  gnome-terminal                                    
           value  ------------- Distribution ------------- count    
            4096 |                                         0        
            8192 |@@@@@@@@@@@@@                            5        
           16384 |@@@@@@@@@@@@@                            5        
           32768 |@@@@@@@@@@@                              4        
           65536 |@@@                                      1        
          131072 |                                         0        

  Xorg                                              
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@                                  4        
            8192 |@@@@@@@@@@@@@                            8        
           16384 |@@@@@@@@@@@@                             7        
           32768 |@@@                                      2        
           65536 |@@                                       1        
          131072 |                                         0        
          262144 |                                         0        
          524288 |                                         0        
         1048576 |                                         0        
         2097152 |@@@                                      2        
         4194304 |                                         0        

  firefox                                           
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@                                      22       
            8192 |@@@@@@@@@@@                              90       
           16384 |@@@@@@@@@@@@@                            107      
           32768 |@@@@@@@@@                                72       
           65536 |@@@                                      28       
          131072 |                                         3        
          262144 |                                         0        
          524288 |                                         1        
         1048576 |                                         1        
         2097152 |                                         0

The rows for the frequency distribution are always power-of-two values. Each rows indicates the count of the number of elements greater than or equal to the corresponding value, but less than the next larger row value. For example, the above output shows that firefox had 107 writes taking between 16,384 nanoseconds and 32,767 nanoseconds, inclusive.

While quantize is useful for getting quick insight into the data, you might want to examine a distribution across linear values instead. To display a linear value distribution, use the lquantize aggregating function. The lquantize function takes three arguments in addition to a D expression: a lower bound, an upper bound, and a step. For example, if you wanted to look at the distribution of writes by file descriptor, a power-of-two quantization would not be effective. Instead, use a linear quantization with a small range, as shown in the following example:

syscall::write:entry
{
  @fds[execname] = lquantize(arg0, 0, 100, 1);
}

Running this script for several seconds yields a large amount of information. The following example shows a selection of typical output:

# dtrace -s wrlquantize.d
dtrace: script 'wrlquantize.d' matched 1 probe
^C
 ...
  gnome-session                                     
           value  ------------- Distribution ------------- count    
              25 |                                         0        
              26 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 9        
              27 |                                         0        

  gnome-terminal                                    
           value  ------------- Distribution ------------- count    
              15 |                                         0        
              16 |@@                                       1        
              17 |                                         0        
              18 |                                         0        
              19 |                                         0        
              20 |                                         0        
              21 |@@@@@@@@                                 4        
              22 |@@                                       1        
              23 |@@                                       1        
              24 |                                         0        
              25 |                                         0        
              26 |                                         0        
              27 |                                         0        
              28 |                                         0        
              29 |@@@@@@@@@@@@@                            6        
              30 |@@@@@@@@@@@@@                            6        
              31 |                                         0        
 ...

You can also use the lquantize aggregating function to aggregate on time since some point in the past. This technique allows you to observe a change in behavior over time. The following example displays the change in system call behavior over the lifetime of a process executing the date command:

syscall::execve:return
/execname == "date"/
{
  self->start = timestamp;
}

syscall:::entry
/self->start/
{
  /*
   * We linearly quantize on the current virtual time minus our
   * process’s start time. We divide by 1000 to yield microseconds
   * rather than nanoseconds. The range runs from 0 to 10 milliseconds
   * in steps of 100 microseconds; we expect that no date(1) process
   * will take longer than 10 milliseconds to complete.
   */
  @a["system calls over time"] =
  lquantize((timestamp - self->start) / 1000, 0, 10000, 100);
}

syscall::exit:entry
/self->start/
{
  self->start = 0;
}

This script provides greater insight into system call behavior when many date processes are executed. To see this result, run sh -c 'while true; do date >/dev/null; done' in one window, while executing the D script in another. The script produces a profile of the system call behavior of the date command:

# dtrace -s dateprof.d 
dtrace: script 'dateprof.d' matched 298 probes
^C

  system calls over time                            
           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |@@                                       23428    
             100 |@@@@@                                    56263    
             200 |@@@@@                                    61271    
             300 |@@@@@                                    58132    
             400 |@@@@@                                    54617    
             500 |@@@@                                     45545    
             600 |@@                                       26049    
             700 |@@@                                      38859    
             800 |@@@@                                     51569    
             900 |@@@@                                     42553    
            1000 |@                                        11339    
            1100 |                                         4020     
            1200 |                                         2236     
            1300 |                                         1264     
            1400 |                                         812      
            1500 |                                         706      
            1600 |                                         764      
            1700 |                                         586      
            1800 |                                         266      
            1900 |                                         155      
            2000 |                                         118      
            2100 |                                         86       
            2200 |                                         93       
            2300 |                                         66       
            2400 |                                         32       
            2500 |                                         32       
            2600 |                                         18       
            2700 |                                         23       
            2800 |                                         26       
            2900 |                                         30       
            3000 |                                         26       
            3100 |                                         1        
            3200 |                                         7        
            3300 |                                         9        
            3400 |                                         3        
            3500 |                                         5        
            3600 |                                         1        
            3700 |                                         6        
            3800 |                                         8        
            3900 |                                         8        
            4000 |                                         8        
            4100 |                                         1        
            4200 |                                         1        
            4300 |                                         6        
            4400 |                                         0

This output provides a rough idea of the different phases of the date command with respect to the services required of the kernel. To better understand these phases, you might want to understand which system calls are being called when. If so, you could change the D script to aggregate on the variable probefunc instead of a constant string.

Similarly, you can use the stddev aggregating function to characterize the distribution of data points. This example shows the average and standard deviation of the time it takes to exec processes:

syscall::execve:entry
{
 self->ts = timestamp;
}

syscall::execve:return
/ self->ts /
{
  t = timestamp - self->ts;
  @execavg[probefunc] = avg(t);
  @execsd[probefunc] = stddev(t);
  self->ts = 0;
}

END
{
  printf("AVERAGE:");
  printa(@execavg);
  printf("\nSTDDEV:");
  printa(@execsd);
}

With sample output as follows:

# dtrace -q -s stddev.d
^C
AVERAGE:
  execve                                                       253839

STDDEV:
  execve                                                       260226
Note

The standard deviation is approximated as √((Σ(x2)/N)-(Σx/N)2). This is an imprecise approximation, but it should suffice for most purposes to which DTrace is put.