DTrace Aggregations

For performance-related questions, aggregated data is often more useful than individual data points. DTrace provides several built-in aggregating functions. An aggregating function has the following property: applying the function to subsets of a collection of data, and then applying it again to the results from those subsets, yields the same result as applying the function to the collection as a whole.
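The property described above can be written compactly. As a sketch, let f stand for an aggregating function and let x_0 through x_n be disjoint subsets of the collected data (this notation is introduced here only for illustration):

```latex
f\bigl(f(x_0) \cup f(x_1) \cup \cdots \cup f(x_n)\bigr) = f(x_0 \cup x_1 \cup \cdots \cup x_n)

% Worked instance with f = \max and two subsets:
\max\bigl(\max\{3, 9\},\ \max\{4, 7\}\bigr) = \max(9, 7) = 9 = \max\{3, 9, 4, 7\}
```

The same reasoning holds for min and sum. An average does not combine this way directly, so it is reasonable to assume that it is carried as a running sum and a running count, with the division done only when the result is reported.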

DTrace does not store every data item that contributes to an aggregation. Each aggregating function keeps only the current intermediate result and combines it with each new element as that element arrives. The intermediate results are allocated on a per-CPU basis. Because this allocation scheme does not require locks, the implementation is inherently scalable.

DTrace Aggregation Syntax

A DTrace aggregation takes the following general form:

@name[ keys ] = aggfunc( args );

In this general form, the variables are defined as follows:

name: The name of the aggregation, preceded by the @ character.
keys: A comma-separated list of D expressions.
aggfunc: One of the DTrace aggregating functions.
args: A comma-separated list of arguments appropriate to the aggregating function.
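As a minimal sketch of the general form, the following clause uses two keys and one argument. The aggregation name @bytes is chosen here for illustration. For the write(2) entry probe, arg0 is the file descriptor and arg2 is the requested byte count.

```d
#!/usr/sbin/dtrace -s

/*
 * Total the bytes requested through write(2).
 * name:    @bytes (illustrative name)
 * keys:    execname, arg0 (program name and file descriptor)
 * aggfunc: sum
 * args:    arg2 (requested byte count)
 */
syscall::write:entry
{
    @bytes[execname, arg0] = sum(arg2);
}
```

Each distinct combination of key values gets its own running total, so the report contains one line per program and file descriptor pair.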

Table 2-1 DTrace Aggregating Functions

Function Name	Arguments	Result
`count`	none	The number of times that the `count` function is called.
`sum`	scalar expression	The total value of the specified expressions.
`avg`	scalar expression	The arithmetic average of the specified expressions.
`min`	scalar expression	The smallest value among the specified expressions.
`max`	scalar expression	The largest value among the specified expressions.
`lquantize`	scalar expression, lower bound, upper bound, step value	A linear frequency distribution of the values of the specified expressions, sized by the specified range. This aggregating function increments the count in the highest bucket whose value is less than or equal to the specified expression.
`quantize`	scalar expression	A power-of-two frequency distribution of the values of the specified expressions. This aggregating function increments the count in the highest power-of-two bucket whose value is less than or equal to the specified expression.
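The two distribution functions in the table are easiest to compare side by side. The following sketch records both for the same probe. The aggregation names @sizes and @fds and the range of 0 to 100 are assumptions made for illustration.

```d
#!/usr/sbin/dtrace -s

syscall::write:entry
{
    /* Power-of-two distribution of requested write sizes, per program. */
    @sizes[execname] = quantize(arg2);

    /* Linear distribution of file descriptors: 0 to 100 in steps of 1. */
    @fds[execname] = lquantize(arg0, 0, 100, 1);
}
```

quantize needs no range because its buckets double in size, which suits values that span several orders of magnitude. lquantize suits values that are known to fall within a small, fixed range.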

Example 2-14 Using an Aggregating Function

This example uses the count aggregating function to count the number of write(2) system calls, grouped by the name of the calling process (execname). The aggregation does not output any data until the dtrace command is terminated. The output data represents a summary of the data collected during the time that the dtrace command was active.

# cat writes.d
#!/usr/sbin/dtrace -s

/* Count write(2) system calls, keyed by the name of the calling program. */
syscall::write:entry
{
    @numWrites[execname] = count();
}

# ./writes.d
dtrace: script 'writes.d' matched 1 probe
^C
  dtrace                           1
  date                             1
  bash                             3
  grep                            20
  file                           197
  ls                             201