3 Aggregations

When instrumenting the system to answer performance-related questions, it is useful to consider how data can be aggregated to answer a specific question, rather than thinking in terms of data gathered by individual probes. For example, if you want to know the number of system calls by user ID, you would not necessarily care about the datum collected at each system call. In this case, you simply want to see a table of user IDs and system call counts. Historically, you would answer this question by gathering data at each system call and post-processing the data using a tool like awk or perl. In DTrace, by contrast, aggregating data is a first-class operation. This chapter describes the DTrace facilities for manipulating aggregations.

Aggregation Concepts

An aggregating function is one that has the following property:

func(func(x0) U func(x1) U ... U func(xn)) = func(x0 U x1 U ... U xn)

where xn is a set of arbitrary data. In other words, applying an aggregating function to subsets of the whole and then applying it again to the results yields the same result as applying it to the whole itself. For example, consider the SUM function, which yields the summation of a given data set. If the raw data consists of {2, 1, 2, 5, 4, 3, 6, 4, 2}, the result of applying SUM to the entire set is {29}. Similarly, the result of applying SUM to the subset consisting of the first three elements is {5}, the result of applying SUM to the set consisting of the subsequent three elements is {12}, and the result of applying SUM to the remaining three elements is also {12}. SUM is an aggregating function because applying it to the set of these results, {5, 12, 12}, yields the same result, {29}, as applying SUM to the original data.

Not all functions are aggregating functions. An example of a non-aggregating function is the MEDIAN function. This function determines the median element of the set. The median is defined to be that element of a set for which as many elements in the set are greater than the element, as those that are less than it. The MEDIAN is derived by sorting the set and selecting the middle element. Returning to the original raw data, if MEDIAN is applied to the set consisting of the first three elements, the result is {2}. The sorted set is {1, 2, 2}; {2} is the set consisting of the middle element. Likewise, applying MEDIAN to the next three elements yields {4} and applying MEDIAN to the final three elements yields {4}. Thus, applying MEDIAN to each of the subsets yields the set {2, 4, 4}. Applying MEDIAN to this set yields the result {4}. Note that sorting the original set yields {1, 2, 2, 2, 3, 4, 4, 5, 6}. Thus, applying MEDIAN to this set yields {3}. Because these results do not match, MEDIAN is not an aggregating function. Nor is MODE, the most common element of a set.

Many common functions that are used to understand a set of data are aggregating functions. These functions include the following:

  • Counting the number of elements in the set.

  • Computing the minimum value of the set.

  • Computing the maximum value of the set.

  • Summing all of the elements in the set.

  • Histogramming the values in the set, as quantized into certain bins.

Moreover, some functions, which strictly speaking are not aggregating functions themselves, can nonetheless be constructed as such. For example, average (arithmetic mean) can be constructed by aggregating the count of the number of elements in the set and the sum of all elements in the set, reporting the ratio of the two aggregates as the final result. Another important example is standard deviation.
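As a worked illustration of this construction, the following sketch reuses the raw data from the SUM example, split into the same three subsets. COUNT and AVG are written here in the same style as SUM, and the symbols N, S1, and S2 are introduced only for this illustration.

```latex
% Average built from two aggregates (SUM and COUNT) over the subsets
% {2, 1, 2}, {5, 4, 3}, and {6, 4, 2}:
\[
\mathrm{AVG}(x_0 \cup x_1 \cup x_2)
  = \frac{\mathrm{SUM}(x_0) + \mathrm{SUM}(x_1) + \mathrm{SUM}(x_2)}
         {\mathrm{COUNT}(x_0) + \mathrm{COUNT}(x_1) + \mathrm{COUNT}(x_2)}
  = \frac{5 + 12 + 12}{3 + 3 + 3}
  = \frac{29}{9} \approx 3.22
\]

% Standard deviation built from three aggregates: the count N,
% the sum S_1 of the elements, and the sum S_2 of their squares.
\[
N = 9, \qquad S_1 = \sum x = 29, \qquad S_2 = \sum x^2 = 115
\]
\[
\sqrt{\frac{S_2}{N} - \left(\frac{S_1}{N}\right)^2}
  = \sqrt{\frac{115}{9} - \left(\frac{29}{9}\right)^2} \approx 1.55
\]
```

In both cases only a fixed number of running totals needs to be kept, regardless of how many data points are observed.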

Applying aggregating functions to data as it is traced has a number of advantages, including the following:

  • The entire data set need not be stored. Whenever a new element is to be added to the set, the aggregating function is calculated, given the set consisting of the current intermediate result and the new element. When the new result is calculated, the new element can be discarded. This process reduces the amount of storage that is required by a factor of the number of data points, which is often quite large.

  • Data collection does not induce pathological scalability problems. Aggregating functions enable intermediate results to be kept per-CPU instead of in a shared data structure. DTrace then applies the aggregating function to the set consisting of the per-CPU intermediate results to produce the final system-wide result.

Basic Aggregation Statement

DTrace stores the results of aggregating functions in objects called aggregations. In D, the syntax for an aggregation is as follows:

@name[ keys ] = aggfunc( args );

The aggregation name is a D identifier that is prefixed with the special character @. All aggregations that are named in your D programs are global variables. There are no thread-local or clause-local aggregations. The aggregation names are kept in an identifier namespace that is separate from other D global variables. If you reuse names, remember that a and @a are not the same variable. The special aggregation name @ can be used to name an anonymous aggregation in simple D programs. The D compiler treats this name as an alias for the aggregation name @_.

Aggregations are indexed with keys, where keys are a comma-separated list of D expressions, similar to the tuples of expressions used for associative arrays. Keys can also be actions with non-void return values, such as stack, func, sym, mod, ustack, uaddr, and usym.
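As a minimal sketch of the anonymous aggregation and a single key, the following clause counts system calls by name. It assumes the syscall provider is available and uses the built-in probefunc variable as the key.

```d
/* Count every system call, keyed by the system call name. */
syscall:::entry
{
  /* @ is the anonymous aggregation, an alias for @_. */
  @[probefunc] = count();
}
```

Because the aggregation is anonymous, this form suits short scripts that need only one aggregation.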

The aggfunc is one of the DTrace aggregating functions, and args is a comma-separated list of arguments that is appropriate to that function. The DTrace aggregating functions are described in the following table. Most aggregating functions take just a single argument that represents the new datum.

Table 3-1 DTrace Aggregating Functions

| Function Name | Arguments | Result |
|---------------|-----------|--------|
| count | None | Number of times called. |
| sum | Scalar expression | Total value of the specified expressions. |
| avg | Scalar expression | Arithmetic average of the specified expressions. |
| min | Scalar expression | Smallest value among the specified expressions. |
| max | Scalar expression | Largest value among the specified expressions. |
| stddev | Scalar expression | Standard deviation of the specified expressions. |
| quantize | Scalar expression [, increment] | Power-of-two frequency distribution (histogram) of the values of the specified expressions. An optional increment (weight) can be specified. |
| lquantize | Scalar expression, lower bound, upper bound [, step value [, increment]] | Linear frequency distribution of the values of the specified expressions, sized by the specified range. The default step value is 1. |
| llquantize | Scalar expression, base, lower exponent, upper exponent, number of steps per order of magnitude [, increment] | Log-linear frequency distribution. The logarithmic base is specified, along with lower and upper exponents and the number of steps per order of magnitude. |
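The sum, min, and max functions take a single scalar expression, as the following sketch illustrates. The aggregation names total, smallest, and largest are hypothetical. The example assumes that arg2 of the write() entry probe is the requested byte count, which is not necessarily the number of bytes actually written.

```d
syscall::write:entry
{
  /* arg2 is the byte count passed to write(). */
  @total[execname] = sum(arg2);     /* total bytes requested per process */
  @smallest[execname] = min(arg2);  /* smallest single request */
  @largest[execname] = max(arg2);   /* largest single request */
}
```

Each aggregation is printed separately when tracing ends, with one row per process name.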

Aggregation Examples

The following is a series of examples that illustrate aggregations.

Basic Aggregation

To count the number of write() system calls in the system, you could use an informative string as a key together with the count aggregating function. Save the following script in a file named writes.d:

syscall::write:entry
{
  @counts["write system calls"] = count();
}

The dtrace command prints aggregation results by default when the process terminates, either as the result of an explicit END action or when you press Ctrl-C. The following example shows the result of running this command, waiting a few seconds, and then pressing Ctrl-C:

# dtrace -s writes.d
dtrace: script './writes.d' matched 1 probe
^C
write system calls                               179
#

Using Keys

You can count write() system calls per process name by specifying the execname variable as the key to an aggregation. Save the following script in a file named writesbycmd.d:

syscall::write:entry
{
  @counts[execname] = count();
}

The following example output shows the result of running this command, waiting a few seconds, and then pressing Ctrl-C:

# dtrace -s writesbycmd.d
dtrace: script 'writesbycmd.d' matched 1 probe
^C
  dirname                                                           1
  dtrace                                                            1
  gnome-panel                                                       1
  mozilla-xremote                                                   1
  ps                                                                1
  avahi-daemon                                                      2
  basename                                                          2
  gconfd-2                                                          2
  java                                                              2
  pickup                                                            2
  qmgr                                                              2
  sed                                                               2
  dbus-daemon                                                       3
  rtkit-daemon                                                      3
  uname                                                             3
  w                                                                 5
  bash                                                              9
  cat                                                               9
  gnome-session                                                     9
  Xorg                                                             21
  firefox                                                         149
  gnome-terminal                                                 9421
#

Alternatively, you might want to further examine writes that are organized by both executable name and file descriptor. The file descriptor is the first argument to write(). The following example uses a key that is a tuple, which consists of both execname and arg0:

syscall::write:entry
{
  @counts[execname, arg0] = count();
}

Running this script, saved here in a file named writesbycmdfd.d, results in a table that lists the executable name, the file descriptor, and the count, as shown in the following example:

# dtrace -s writesbycmdfd.d
dtrace: script 'writesbycmdfd.d' matched 1 probe
^C

  basename                                                  1        1
  dbus-daemon                                              70        1
  dircolors                                                 1        1
  dtrace                                                    1        1
  gnome-panel                                              35        1
  gnome-terminal                                           16        1
  gnome-terminal                                           18        1
  init                                                      4        1
  ps                                                        1        1
  pulseaudio                                               20        1
  tput                                                      1        1
  Xorg                                                      2        2
#

A limited set of actions can be used as aggregation keys. Consider the following use of the mod() and stack() actions:

profile-10
{
  @hotmod[mod(arg0)] = count();
  @hotstack[stack()] = count();
}

Here, the hotmod aggregation counts probe firings by module, using the profile probe's arg0 to determine the kernel program counter. The hotstack aggregation counts probe firings by stack. The aggregation output reveals which modules and kernel call stacks are the hottest.

Using the avg Function

The following example displays the average time spent in the write() system call, organized by process name. This example uses the avg aggregating function, specifying the expression to average as the argument. The example averages the wall clock time spent in the system call and is saved in a file named writetime.d:

syscall::write:entry
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  @time[execname] = avg(timestamp - self->ts);
  self->ts = 0;
}

The following output shows the result of running this command, waiting a few seconds, and then pressing Ctrl-C:

# dtrace -s writetime.d 
dtrace: script 'writetime.d' matched 2 probes
^C

  gnome-session                                                  8260
  udisks-part-id                                                 9279
  gnome-terminal                                                 9378
  mozilla-xremote                                               10061
  abrt-handle-eve                                               13414
  vgdisplay                                                     13459
  avahi-daemon                                                  14043
  vgscan                                                        14190
  uptime                                                        14533
  lsof                                                          14903
  ip                                                            15075
  date                                                          15371
  ...
  ps                                                            91792
  sestatus                                                      98374
  pstree                                                       102566
  sysctl                                                       175427
  iptables                                                     192835
  udisks-daemon                                                250405
  python                                                       282544
  dbus-daemon                                                  491069
  lsblk                                                        582138
  Xorg                                                        2337328
  gconfd-2                                                   17880523
  cat                                                        59752284
#

Using the stddev Function

You can use the stddev aggregating function to characterize the distribution of data points. The following example shows the average and standard deviation of the time that it takes to exec processes. Save it in a file named stddev.d:

syscall::execve:entry
{
  self->ts = timestamp;
}

syscall::execve:return
/ self->ts /
{
  /* Use a clause-local variable so that concurrent firings do not share it. */
  this->t = timestamp - self->ts;
  @execavg[probefunc] = avg(this->t);
  @execsd[probefunc] = stddev(this->t);
  self->ts = 0;
}

END
{
  printf("AVERAGE:");
  printa(@execavg);
  printf("\nSTDDEV:");
  printa(@execsd);
}

The sample output is as follows:

# dtrace -q -s stddev.d
^C
AVERAGE:
  execve                                                       253839

STDDEV:
  execve                                                       260226

Note:

The standard deviation is approximated as $\sqrt{\left(\sum x^2 / N\right) - \left(\sum x / N\right)^2}$, which is an imprecise approximation, but should suffice for most purposes to which DTrace is put.

Using the quantize Function

The average and standard deviation can be useful for crude characterization, but often do not provide sufficient detail to understand the distribution of data points. To understand the distribution in further detail, use the quantize aggregating function, as shown in the following example, which is saved in a file named wrquantize.d:

syscall::write:entry
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  @time[execname] = quantize(timestamp - self->ts);
  self->ts = 0;
}

Because each line of output becomes a frequency distribution diagram, the output of this script is substantially longer than previous scripts. The following example shows a selection of sample output:

# dtrace -s wrquantize.d 
dtrace: script 'wrquantize.d' matched 2 probes
^C
...
  bash                                              
           value  ------------- Distribution ------------- count    
            8192 |                                         0        
           16384 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@         4        
           32768 |                                         0        
           65536 |                                         0        
          131072 |@@@@@@@@                                 1        
          262144 |                                         0        

  gnome-terminal                                    
           value  ------------- Distribution ------------- count    
            4096 |                                         0        
            8192 |@@@@@@@@@@@@@                            5        
           16384 |@@@@@@@@@@@@@                            5        
           32768 |@@@@@@@@@@@                              4        
           65536 |@@@                                      1        
          131072 |                                         0        

  Xorg                                              
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@                                  4        
            8192 |@@@@@@@@@@@@@                            8        
           16384 |@@@@@@@@@@@@                             7        
           32768 |@@@                                      2        
           65536 |@@                                       1        
          131072 |                                         0        
          262144 |                                         0        
          524288 |                                         0        
         1048576 |                                         0        
         2097152 |@@@                                      2        
         4194304 |                                         0        

  firefox                                           
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@                                      22       
            8192 |@@@@@@@@@@@                              90       
           16384 |@@@@@@@@@@@@@                            107      
           32768 |@@@@@@@@@                                72       
           65536 |@@@                                      28       
          131072 |                                         3        
          262144 |                                         0        
          524288 |                                         1        
         1048576 |                                         1        
         2097152 |                                         0

The rows for the frequency distribution are always power-of-two values. Each row indicates a count of the number of elements that are greater than or equal to the corresponding value, but less than the next larger row's value. For example, the previous output shows that firefox had 107 writes, taking between 16,384 nanoseconds and 32,767 nanoseconds, inclusive.
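The row selection described above can be written out as a short worked example. This sketch covers positive values only, and the sample value of 20,000 nanoseconds is chosen purely for illustration.

```latex
% A positive value v is counted in the row labeled 2^k, where
\[
k = \lfloor \log_2 v \rfloor
\qquad\text{so that}\qquad
2^k \le v < 2^{k+1}.
\]
% Example: a write that takes v = 20000 ns.
\[
2^{14} = 16384 \;\le\; 20000 \;<\; 32768 = 2^{15}
\]
% The write is therefore counted in the row labeled 16384.
```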

The previous example shows how many writes fall into each range of write times. You might also want to know which write times contribute the most to the overall run time. You can use the optional increment argument of the quantize function for this purpose. The default increment is 1, but this argument can be any D expression and can have a negative value.

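The following modified script uses the elapsed time itself as the increment, so that each row reports the total time spent in writes of that duration instead of the number of writes:

syscall::write:entry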
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  self->delta = timestamp - self->ts;
  @time[execname] = quantize(self->delta, self->delta);
  self->ts = 0;
}

Using the lquantize Function

While quantize is useful for getting quick insight into data, you might want to examine a distribution across linear values instead. To display a linear value distribution, use the lquantize aggregating function. The lquantize function takes three arguments in addition to a D expression: a lower bound, an upper bound, and an optional step. Note that the default step value is 1.

For example, if you wanted to look at the distribution of writes by file descriptor, a power-of-two quantization would not be effective. Instead, as shown in the following example, you could use a linear quantization with a small range, which is saved in a file named wrlquantize.d:

syscall::write:entry
{
  @fds[execname] = lquantize(arg0, 0, 100, 1);
}

Note that you could also omit the last argument because 1 is the default step value.

Running this script for several seconds yields a large amount of information. The following example shows a selection of the typical output:

# dtrace -s wrlquantize.d
dtrace: script 'wrlquantize.d' matched 1 probe
^C
 ...
  gnome-session                                     
           value  ------------- Distribution ------------- count    
              25 |                                         0        
              26 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 9        
              27 |                                         0        

  gnome-terminal                                    
           value  ------------- Distribution ------------- count    
              15 |                                         0        
              16 |@@                                       1        
              17 |                                         0        
              18 |                                         0        
              19 |                                         0        
              20 |                                         0        
              21 |@@@@@@@@                                 4        
              22 |@@                                       1        
              23 |@@                                       1        
              24 |                                         0        
              25 |                                         0        
              26 |                                         0        
              27 |                                         0        
              28 |                                         0        
              29 |@@@@@@@@@@@@@                            6        
              30 |@@@@@@@@@@@@@                            6        
              31 |                                         0        
 ...

You can also use the lquantize aggregating function to aggregate on time, starting with some point of time in the past. This technique enables you to observe a change in behavior over time.

The following example displays the change in system call behavior over the lifetime of a process that is executing the date command. Save it in a file named dateprof.d:

syscall::execve:return
/execname == "date"/
{
  self->start = timestamp;
}

syscall:::entry
/self->start/
{
  /*
   * We linearly quantize on the current time minus our
   * process's start time. We divide by 1000 to yield microseconds
   * rather than nanoseconds. The range runs from 0 to 10 milliseconds
   * in steps of 100 microseconds; we expect that no date(1) process
   * will take longer than 10 milliseconds to complete.
   */
  @a["system calls over time"] =
    lquantize((timestamp - self->start) / 1000, 0, 10000, 100);
}

syscall::exit:entry
/self->start/
{
  self->start = 0;
}

This script provides greater insight into system call behavior when many date processes are being executed. To see this result, run sh -c 'while true; do date >/dev/null; done' in one window, while executing the D script in another window. The script produces a profile of the system call behavior of the date command that is similar to the following:

# dtrace -s dateprof.d 
dtrace: script 'dateprof.d' matched 298 probes
^C

  system calls over time                            
           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |@@                                       23428    
             100 |@@@@@                                    56263    
             200 |@@@@@                                    61271    
             300 |@@@@@                                    58132    
             400 |@@@@@                                    54617    
             500 |@@@@                                     45545    
             600 |@@                                       26049    
             700 |@@@                                      38859    
             800 |@@@@                                     51569    
             900 |@@@@                                     42553    
            1000 |@                                        11339    
            1100 |                                         4020     
            1200 |                                         2236     
            1300 |                                         1264     
            1400 |                                         812      
            1500 |                                         706      
            1600 |                                         764      
            1700 |                                         586      
            1800 |                                         266      
            1900 |                                         155      
            2000 |                                         118      
            2100 |                                         86       
            2200 |                                         93       
            2300 |                                         66       
            2400 |                                         32       
            2500 |                                         32       
            2600 |                                         18       
            2700 |                                         23       
            2800 |                                         26       
            2900 |                                         30       
            3000 |                                         26       
            3100 |                                         1        
            3200 |                                         7        
            3300 |                                         9        
            3400 |                                         3        
            3500 |                                         5        
            3600 |                                         1        
            3700 |                                         6        
            3800 |                                         8        
            3900 |                                         8        
            4000 |                                         8        
            4100 |                                         1        
            4200 |                                         1        
            4300 |                                         6        
            4400 |                                         0

The previous output provides a rough idea of the different phases of the date command, with respect to the services that are required of the kernel. To better understand these phases, you might want to understand which system calls are being called and when they are called. In this case, you could change the D script to aggregate on the probefunc variable instead of a constant string.
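One possible form of that change is sketched below. It is a variant of dateprof.d in which only the aggregation key differs: probefunc replaces the constant string, so dtrace prints one distribution per system call.

```d
syscall::execve:return
/execname == "date"/
{
  self->start = timestamp;
}

syscall:::entry
/self->start/
{
  /*
   * Same quantization as dateprof.d (microseconds since the exec,
   * 0 to 10 milliseconds in steps of 100 microseconds), but keyed
   * by system call name instead of a constant string.
   */
  @a[probefunc] =
    lquantize((timestamp - self->start) / 1000, 0, 10000, 100);
}

syscall::exit:entry
/self->start/
{
  self->start = 0;
}
```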

The log-linear llquantize aggregating function combines the capabilities of the logarithmic quantize function and the linear lquantize function. While quantize always uses base 2 logarithms, with llquantize you specify the base, as well as the minimum and maximum exponents. Each logarithmic range is then subdivided linearly into the specified number of steps.
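The following sketch applies llquantize to the write() timing example used earlier in this chapter. The argument values (base 10, exponents 3 and 6, and 10 steps) are illustrative assumptions rather than recommended settings.

```d
syscall::write:entry
{
  self->ts = timestamp;
}

syscall::write:return
/self->ts/
{
  /*
   * Base 10, lower exponent 3 (10^3 ns = 1 microsecond), upper
   * exponent 6, and 10 linear steps per order of magnitude.
   */
  @time[execname] = llquantize(timestamp - self->ts, 10, 3, 6, 10);
  self->ts = 0;
}
```

Compared with quantize, this form gives finer resolution within each order of magnitude while still covering a wide range of values.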

Printing Aggregations

By default, multiple aggregations are displayed in the order in which they are introduced in the D program. You can override this behavior by using the printa function to print the aggregations. The printa function also enables you to precisely format the aggregation data by using a format string, as described in Output Formatting.

If an aggregation is not formatted with a printa statement in your D program, the dtrace command snapshots the aggregation data and prints the results after tracing has completed, using the default aggregation format. If a given aggregation is formatted with a printa statement, the default behavior is disabled for that aggregation. To achieve results equivalent to the default behavior, add the printa(@aggregation-name) statement to an END probe clause in your program.

The default output format for the avg, count, min, max, and sum aggregating functions displays an integer decimal value corresponding to the aggregated value for each tuple. The default output format for the quantize, lquantize, and llquantize aggregating functions displays an ASCII table with the results. Aggregation tuples are printed as though trace had been applied to each tuple element.
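As a minimal sketch of formatted output, the following script prints a two-column report from an END clause. The column headings and field widths are arbitrary choices for this illustration. In the printa format string, the %@ conversion marks where the aggregated value is placed.

```d
#pragma D option quiet

syscall::write:entry
{
  @counts[execname] = count();
}

END
{
  printf("%-20s %8s\n", "COMMAND", "WRITES");
  /* %-20s formats the key; %@8u formats the aggregated count. */
  printa("%-20s %@8u\n", @counts);
}
```

Because @counts is printed explicitly with printa, dtrace does not print it again in the default format when tracing ends.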

Data Normalization

When aggregating data over some period of time, you might want to normalize the data with respect to some constant factor. This technique enables you to compare disjoint data more easily. For example, when aggregating system calls, you might want to output system calls as a per-second rate instead of as an absolute value over the course of the run. The DTrace normalize action enables you to normalize data in this way. The parameters to normalize are an aggregation and a normalization factor. The output of the aggregation shows each value divided by the normalization factor.

The following example, saved in a file named normalize.d, counts system calls by process name and normalizes the counts to a per-second rate:

#pragma D option quiet

BEGIN
{
  /*
   * Get the start time, in nanoseconds.
   */
  start = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

END
{
  /*
   * Normalize the aggregation based on the number of seconds we have
   * been running. (There are 1,000,000,000 nanoseconds in one second.)
   */
  normalize(@func, (timestamp - start) / 1000000000);
}

Running the previous script for a brief period of time results in the following output:

# dtrace -s normalize.d
^C
  memballoon                                                        1
  udisks-daemon                                                     1
  vmstats                                                           1
  rtkit-daemon                                                      2
  automount                                                         2
  gnome-panel                                                       3
  gnome-settings-                                                   5
  NetworkManager                                                    6
  gvfs-afc-volume                                                   6
  metacity                                                          6
  qpidd                                                             9
  hald-addon-inpu                                                  14
  gnome-terminal                                                   19
  Xorg                                                             35
  VBoxClient                                                       52
  X11-NOTIFY                                                      104
  java                                                            143
  dtrace                                                          309
  sh                                                            36467
  date                                                          68142

The normalize action sets the normalization factor for the specified aggregation, but this action does not modify the underlying data. To return to the raw values, use the denormalize action, which takes only an aggregation as its argument. Adding the denormalize action to the preceding example returns both raw system call counts and per-second rates. Type the following source code and save it in a file named denorm.d:

#pragma D option quiet

BEGIN
{
  start = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

END
{
  this->seconds = (timestamp - start) / 1000000000;
  printf("Ran for %d seconds.\n", this->seconds);
  printf("Per-second rate:\n");
  normalize(@func, this->seconds);
  printa(@func);
  printf("\nRaw counts:\n");
  denormalize(@func);
  printa(@func);
}

Running the previous script for a brief period of time produces output similar to the following:

# dtrace -s denorm.d
^C
Ran for 7 seconds.
Per-second rate:

  audispd                                                           0
  auditd                                                            0
  memballoon                                                        0
  rtkit-daemon                                                      0
  timesync                                                          1
  gnome-power-man                                                   1
  vmstats                                                           1
  automount                                                         2
  udisks-daemon                                                     2
  gnome-panel                                                       2
  metacity                                                          2
  gnome-settings-                                                   3
  qpidd                                                             4
  clock-applet                                                      4
  gvfs-afc-volume                                                   5
  crond                                                             6
  gnome-terminal                                                    7
  vminfo                                                           15
  hald-addon-inpu                                                  32
  VBoxClient                                                       45
  Xorg                                                             63
  X11-NOTIFY                                                       90
  java                                                            126
  dtrace                                                          315
  sh                                                            31430
  date                                                          58724

Raw counts:

  audispd                                                           1
  auditd                                                            4
  memballoon                                                        4
  rtkit-daemon                                                      6
  timesync                                                          8
  gnome-power-man                                                   9
  vmstats                                                          12
  automount                                                        16
  udisks-daemon                                                    16
  gnome-panel                                                      20
  metacity                                                         20
  gnome-settings-                                                  22
  qpidd                                                            28
  clock-applet                                                     34
  gvfs-afc-volume                                                  40
  crond                                                            42
  gnome-terminal                                                   54
  vminfo                                                          105
  hald-addon-inpu                                                 225
  VBoxClient                                                      318
  Xorg                                                            444
  X11-NOTIFY                                                      634
  java                                                            883
  dtrace                                                         2207
  sh                                                           220016
  date                                                         411073

Aggregations can also be renormalized. If normalize is called more than once for the same aggregation, the normalization factor is the factor specified in the most recent call. The following example prints per-second system call rates over time. Type the following source code and save it in a file named renormalize.d:

#pragma D option quiet

BEGIN
{
  start = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

tick-10sec
{
  normalize(@func, (timestamp - start) / 1000000000);
  printa(@func);
}

Clearing Aggregations

When using DTrace to build simple monitoring scripts, you can periodically clear the values in an aggregation by using the clear function. This function takes an aggregation as its only parameter. The clear function clears only the aggregation's values, while the aggregation's keys are retained. Therefore, the presence of a key in an aggregation that has an associated value of zero indicates that the key had a non-zero value that was subsequently set to zero as part of a clear. To discard both an aggregation's values and its keys, use the trunc function. See Truncating Aggregations.

The following example uses clear to show the system call rate only for the most recent ten-second period:

#pragma D option quiet

BEGIN
{
  last = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

tick-10sec
{
  normalize(@func, (timestamp - last) / 1000000000);
  printa(@func);
  clear(@func);
  last = timestamp;
}
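To illustrate the statement that clear retains an aggregation's keys, the following sketch prints the same aggregation immediately before and immediately after a call to clear. The five-second interval and the printed labels are arbitrary choices for this illustration. Based on the description of clear above, the second printa is expected to list the same keys, each with a value of 0:

```dtrace
#pragma D option quiet

syscall:::entry
{
  @func[execname] = count();
}

tick-5sec
{
  printf("Before clear:\n");
  printa(@func);

  /* Zero the values; the keys remain in the aggregation. */
  clear(@func);

  printf("\nAfter clear:\n");
  printa(@func);
  exit(0);
}
```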

Truncating Aggregations

When looking at aggregation results, you often care only about the top several results. The keys and values that are associated with anything other than the highest values are not of interest. You might also choose to discard an entire aggregation result, removing both the keys and values. The DTrace trunc function is used in both of these situations.

The parameters to trunc are an aggregation and an optional truncation value. Without the truncation value, trunc discards both the aggregation values and the aggregation keys for the entire aggregation. When a truncation value n is present, trunc discards the aggregation values and keys, except for those values and keys that are associated with the highest n values. For example, trunc(@foo, 10) truncates the aggregation named foo after the top ten values, whereas trunc(@foo) discards the entire aggregation. The entire aggregation is also discarded if 0 is specified as the truncation value.

To see the bottom n values instead of the top n values, specify a negative truncation value to trunc. For example, trunc(@foo, -10) truncates the aggregation named foo after the bottom ten values.
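As a minimal sketch of the negative truncation value and of trunc without a truncation value, the following script prints the ten applications with the fewest system calls in each ten-second period. It then discards the whole aggregation, keys included, so that each period starts empty. The script reuses the probes and variable names from the other examples in this chapter, and the choice of ten entries is arbitrary:

```dtrace
#pragma D option quiet

BEGIN
{
  last = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

tick-10sec
{
  /* Negative truncation value: keep only the ten lowest counts. */
  trunc(@func, -10);
  normalize(@func, (timestamp - last) / 1000000000);
  printa(@func);

  /* No truncation value: discard all keys and values. */
  trunc(@func);
  last = timestamp;
}
```

Using trunc(@func) in place of clear(@func) at the end of the clause means that an application that stops making system calls does not linger in later output with a value of 0.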

The following example displays only the per-second system call rates of the top ten system-calling applications in a ten-second period. Type the following source code and save it in a file named truncagg.d:

#pragma D option quiet

BEGIN
{
  last = timestamp;
}

syscall:::entry
{
  @func[execname] = count();
}

tick-10sec
{
  trunc(@func, 10);
  normalize(@func, (timestamp - last) / 1000000000);
  printa(@func);
  clear(@func);
  last = timestamp;
}

The following example shows the output from running the previous script on a lightly loaded system:

# dtrace -s truncagg.d 

  dbus-daemon                                                       0
  NetworkManager                                                    1
  gmain                                                             1
  systemd-logind                                                    1
  sendmail                                                          1
  systemd                                                           1
  httpd                                                             2
  tuned                                                             5
  dtrace                                                           44

  rpcbind                                                           0
  dbus-daemon                                                       0
  gmain                                                             0
  sshd                                                              1
  systemd-logind                                                    1
  sendmail                                                          1
  systemd                                                           1
  httpd                                                             2
  tuned                                                             5
  dtrace                                                           41

  dbus-daemon                                                       0
  gmain                                                             1
  sshd                                                              1
  systemd-logind                                                    1
  sendmail                                                          1
  systemd                                                           1
  httpd                                                             2
  tuned                                                             5
  automount                                                         7
  dtrace                                                           41
^C

#

Minimizing Drops

Because DTrace buffers some aggregation data in the kernel, space might not be available when a new key is added to an aggregation. In this case, the data is dropped, a counter is incremented, and dtrace generates a message indicating an aggregation drop. This situation rarely occurs because DTrace keeps state information consisting of the aggregation's key and intermediate results at user level, where space can grow dynamically. In the unlikely event that an aggregation drop occurs, you can reduce the likelihood of further drops by increasing the aggregation buffer size with the aggsize option.

You can also use this option to minimize the memory footprint of DTrace. As with any size option, aggsize can be specified with any size suffix. The resizing policy of this buffer is dictated by the bufresize option. For more information about buffering, see Buffers and Buffering.
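The following sketch shows one way to set these options from within a script by using option pragmas. The 16m size is an illustrative value only, not a recommendation, and the two-part aggregation key is chosen simply to produce more keys than the earlier examples:

```dtrace
#pragma D option quiet

/* Illustrative aggregation buffer size; any size suffix is accepted. */
#pragma D option aggsize=16m

/* auto lets DTrace reduce the size if allocation fails; manual fails instead. */
#pragma D option bufresize=auto

syscall:::entry
{
  /* Keying on two values produces many more distinct keys. */
  @calls[execname, probefunc] = count();
}
```

The same option can also be given on the command line with the -x option, for example dtrace -x aggsize=16m -s script.d, where script.d is a placeholder file name.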

An alternative method to eliminate aggregation drops is to increase the rate at which aggregation data is consumed at the user level. This rate defaults to once per second and can be explicitly tuned with the aggrate option. As with any rate option, aggrate can be specified with any time suffix, but defaults to rate-per-second. For more information about the aggsize and aggrate options, see Options and Tunables.
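As a brief sketch of tuning the consumption rate, the following script asks the consumer to read aggregation data ten times per second instead of once per second. The value 10hz is an arbitrary illustration; the hz suffix states the rate explicitly, although a bare number is also interpreted as a per-second rate:

```dtrace
#pragma D option quiet

/* Read aggregation data ten times per second (illustrative value). */
#pragma D option aggrate=10hz

syscall:::entry
{
  @calls[execname, probefunc] = count();
}
```

A higher rate gives the kernel buffer less time to fill between reads, at the cost of more frequent work in the dtrace consumer.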