2.8 More Complex Data Aggregations

You can use the lquantize() and quantize() functions to display linear and power-of-two frequency distributions of data. See Aggregations in the Oracle Linux Dynamic Tracing Guide for a description of aggregation functions.

In the following example, we display the distribution of the requested sizes, passed as arg2 to read(), for the read() calls made by all running instances of firefox.

# dtrace -n 'syscall::read:entry /execname=="firefox"/{@dist["firefox"]=quantize(arg2);}'
dtrace: description 'syscall::read:entry ' matched 1 probe
^C

  firefox                                           
           value  ------------- Distribution ------------- count    
               0 |                                         0        
               1 |@                                        566      
               2 |                                         0        
               4 |                                         0        
               8 |                                         7        
              16 |                                         4        
              32 |                                         0        
              64 |                                         0        
             128 |                                         8        
             256 |@                                        436      
             512 |                                         8        
            1024 |@@                                       959      
            2048 |@                                        230      
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       13785    
            8192 |                                         3        
           16384 |                                         4        
           32768 |                                         0        
           65536 |                                         0        
          131072 |                                         73       
          262144 |                                         0

If a program is as simple as this one, it is often convenient to run it from the command line.
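To help read the output above, the following sketch shows how a value maps to a row of a power-of-two distribution: each row counts values that are greater than or equal to its label and less than the next label. The helper name quantize_bucket is hypothetical and is not part of DTrace. The sketch handles only non-negative values, which is what read() sizes are.

```shell
#!/bin/bash

# Hypothetical helper (not a DTrace function): print the label of the
# power-of-two row in which quantize() would count a non-negative value.
quantize_bucket() {
  local value=$1 bucket=1
  if [ "$value" -lt 0 ]; then
    echo "negative values are not handled by this sketch" >&2
    return 1
  fi
  if [ "$value" -eq 0 ]; then
    echo 0
    return 0
  fi
  # Double the bucket label until the next label would exceed the value
  while [ $((bucket * 2)) -le "$value" ]; do
    bucket=$((bucket * 2))
  done
  echo "$bucket"
}

# Show the row for a few sample read sizes
for size in 1 300 4096 5000 8191 8192; do
  printf '%6d -> row %d\n' "$size" "$(quantize_bucket "$size")"
done
```

For example, a 5000-byte request is counted in the 4096 row, together with every other request of 4096 to 8191 bytes.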

The following script, diskact.d, uses io provider probes (enabled by the sdt kernel module) to display the distribution of I/O throughput for the block devices on the system.

Example 2.18 diskact.d: Display the distribution of I/O throughput for block devices

#pragma D option quiet

/* diskact.d -- Display the distribution of I/O throughput for block devices */

io:::start
{
  start[args[0]->b_edev, args[0]->b_blkno] = timestamp;
}

io:::done
/start[args[0]->b_edev, args[0]->b_blkno]/
{
  /*
     We want to get an idea of our throughput to this device in KB/sec
     but we have values measured in bytes and nanoseconds.
     We want to calculate:
    
     bytes / 1024
     ------------------------
     nanoseconds / 1000000000
    
     As DTrace uses integer arithmetic and the denominator is between
     0 and 1 for most I/O, the calculation as shown would truncate the
     denominator to zero. So we restate the fraction as:
    
     bytes         1000000000      bytes * 976562
     ----------- * ------------- = --------------
     nanoseconds   1024            nanoseconds
    
     This is easy to calculate using integer arithmetic.
   */
  this->elapsed = timestamp - start[args[0]->b_edev, args[0]->b_blkno];
  @[args[1]->dev_statname, args[1]->dev_pathname] =
    quantize((args[0]->b_bcount * 976562) / this->elapsed);
  start[args[0]->b_edev, args[0]->b_blkno] = 0;
}

END
{
  printa(" %s (%s)\n%@d\n", @);
}

We use the #pragma D option quiet statement to suppress unwanted output and the printa() function to display the results of the aggregation.

See io Provider in the Oracle Linux Dynamic Tracing Guide for a description of the arguments to the io:::start and io:::done probes.

See Output Formatting in the Oracle Linux Dynamic Tracing Guide for a description of the printa() function.
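The integer rescaling described in the comment of diskact.d can be checked outside DTrace. The following sketch uses bash integer arithmetic, which also truncates, to compare the two forms of the calculation. The function names and the sample I/O (4096 bytes completed in 250000 nanoseconds) are assumptions chosen for illustration.

```shell
#!/bin/bash

# Rescaled form used by diskact.d: bytes * 976562 / nanoseconds,
# where 976562 is 1000000000 / 1024 truncated to an integer.
kb_per_sec() {
  local bytes=$1 nsecs=$2
  echo $(( bytes * 976562 / nsecs ))
}

# Naive form: (bytes / 1024) / (nanoseconds / 1000000000).
# With integer arithmetic the denominator truncates to 0 for any
# I/O that completes in less than one second.
naive_kb_per_sec() {
  local kb=$(( $1 / 1024 )) secs=$(( $2 / 1000000000 ))
  if [ "$secs" -eq 0 ]; then
    echo "undefined (division by zero)"
    return 0
  fi
  echo $(( kb / secs ))
}

# Hypothetical I/O: 4096 bytes completed in 250000 ns (250 microseconds)
echo "rescaled: $(kb_per_sec 4096 250000) KB/sec"
echo "naive:    $(naive_kb_per_sec 4096 250000)"
```

The exact throughput for this sample is 16000 KB/sec. The rescaled form yields 15999 because the constant is truncated, which is well within the resolution of a power-of-two distribution.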

After running the program for about a minute, we type Ctrl-C to display the results:

# dtrace -s diskact.d
^C
 sda (/dev/sda)

           value  ------------- Distribution ------------- count    
             128 |                                         0        
             256 |@@@@@@@@@@@                              2        
             512 |@@@@@@                                   1        
            1024 |                                         0        
            2048 |                                         0        
            4096 |                                         0        
            8192 |@@@@@@@@@@@@@@@@@                        3        
           16384 |                                         0        
           32768 |@@@@@@                                   1        
           65536 |                                         0        

 dm-3 (/dev/dm-3)

           value  ------------- Distribution ------------- count    
               4 |                                         0        
               8 |                                         1        
              16 |@@                                       24       
              32 |@@@@@                                    72       
              64 |@@@@@@                                   86       
             128 |@@@@@@@@@@@                              144      
             256 |                                         5        
             512 |                                         1        
            1024 |@@@@@@@@@                                123      
            2048 |@@@@                                     60       
            4096 |                                         6        
            8192 |@                                        17       
           16384 |@                                        7        
           32768 |                                         2        
           65536 |                                         0        

The next example is a bash shell script that uses an embedded D program to display cumulative read and write block counts for a local file system according to their location on the file system's underlying block device. We use the lquantize() aggregation function to display the results linearly in tenths of the total distance across the device.

Example 2.20 fsact: Display cumulative read and write activity across a file system device

#!/bin/bash

# fsact -- Display cumulative read and write activity across a file system device
#
#          Usage: fsact [<filesystem>]

# Load the required DTrace modules
grep profile /proc/modules > /dev/null 2>&1 || modprobe profile
grep sdt /proc/modules > /dev/null 2>&1 || modprobe sdt

# If no file system is specified, assume /
[ $# -eq 1 ] && FSNAME=$1 || FSNAME="/"
[ ! -e "$FSNAME" ] && echo "$FSNAME not found" && exit 1

# Determine the underlying device (held in MNTPNT), its major and minor
# numbers, and the file system size in file system blocks
MNTPNT=$(df "$FSNAME" | gawk '{ getline; print $1; exit }')
MAJOR=$(printf "%d\n" 0x$(stat -Lc "%t" "$MNTPNT"))
MINOR=$(printf "%d\n" 0x$(stat -Lc "%T" "$MNTPNT"))
FSSIZE=$(stat -fc "%b" "$FSNAME")

# Run the embedded D program
dtrace -qs /dev/stdin << EOF
io:::done
/args[1]->dev_major == $MAJOR && args[1]->dev_minor == $MINOR/
{
  iodir = args[0]->b_flags & B_READ ? "READ" : "WRITE";
  /* Normalize the block number as an integer in the range 0 to 10 */ 
  blkno = (args[0]->b_blkno)*10/$FSSIZE;
  /* Aggregate blkno linearly over the range 0 to 10 in steps of 1 */ 
  @a[iodir] = lquantize(blkno,0,10,1);
}

tick-10s
{
  printf("%Y\n",walltimestamp);
  /* Display the results of the aggregation */
  printa("%s\n%@d\n",@a);
  /* To reset the aggregation every tick, uncomment the following line */
  /* clear(@a); */
}
EOF

We embed the D program in a shell script so that we can set up the parameters that it needs: the major and minor numbers of the underlying device and the total size of the file system in file system blocks. Because the here-document delimiter is unquoted, the shell substitutes the values of these parameters directly into the D program text before dtrace reads it.
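The MAJOR and MINOR assignments in fsact rely on the fact that stat reports device numbers in hexadecimal for the %t and %T formats, while the io provider arguments are compared as decimal integers. The following sketch isolates that conversion step. The helper name hex_to_dec and the sample value fd are assumptions for illustration.

```shell
#!/bin/bash

# Hypothetical helper: convert a bare hexadecimal string, such as the
# output of stat -c "%t", to decimal by prefixing it with 0x.
hex_to_dec() {
  printf '%d\n' "0x$1"
}

# Sample value only; on a real system this would come from stat
hex_to_dec fd
```

With the sample value fd, the helper prints 253.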

Note

An alternative way of passing values into the D program is to use C preprocessor directives, for example:

dtrace -C -D MAJ=$MAJOR -D MIN=$MINOR -D FSZ=$FSSIZE -qs /dev/stdin << EOF

You can then refer to the variables in the D program by their macro names instead of their shell names:

/args[1]->dev_major == MAJ && args[1]->dev_minor == MIN/

blkno = (args[0]->b_blkno)*10/FSZ;
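The normalization step in the D program, and the way lquantize(blkno,0,10,1) then assigns a row, can be sketched in bash, whose integer arithmetic truncates in the same way. The function names and the file system size of 1000000 blocks are assumptions for illustration only.

```shell
#!/bin/bash

# Hypothetical helper: scale a block number to an integer in tenths of
# the device, mirroring blkno = b_blkno * 10 / FSSIZE in the D program.
normalize_blkno() {
  local blkno=$1 fssize=$2
  echo $(( blkno * 10 / fssize ))
}

# Hypothetical helper: print the label of the row in which
# lquantize(value, lo, hi, step) would count the value.
lquantize_row() {
  local v=$1 lo=$2 hi=$3 step=$4
  if [ "$v" -lt "$lo" ]; then
    echo "< $lo"
  elif [ "$v" -ge "$hi" ]; then
    echo ">= $hi"
  else
    echo $(( lo + (v - lo) / step * step ))
  fi
}

FSSIZE=1000000   # assumed size in blocks, for illustration only
for blkno in 0 450000 999999 1200000; do
  tenth=$(normalize_blkno "$blkno" "$FSSIZE")
  printf '%8d -> row %s\n' "$blkno" "$(lquantize_row "$tenth" 0 10 1)"
done
```

Under these assumptions, block 450000 is counted in row 4, and any block number at or beyond the assumed size is counted in the >= 10 row.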

The following is sample output from running fsact after making the script executable.

# chmod +x fsact
# ./fsact
2013 Nov 10 17:14:35
WRITE

           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |                                         3        
               1 |                                         3        
               2 |                                         1        
               3 |                                         0        
               4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   442      
               5 |@                                        16       
               6 |                                         1        
               7 |                                         0        
               8 |                                         1        
               9 |                                         0        

READ

           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |@@@                                      118      
               1 |@@@@@@                                   273      
               2 |@@@                                      151      
               3 |@                                        48       
               4 |@@@@@@@@@@@@@@@@@@@@                     874      
               5 |@@@@@                                    231      
               6 |                                         0        
               7 |                                         0        
               8 |                                         0        
               9 |@                                        44       
           >= 10 |                                         0        

2013 Nov 10 17:14:45
WRITE

           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |                                         3        
               1 |                                         3        
               2 |                                         1        
               3 |                                         0        
               4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  1443     
               5 |                                         16       
               6 |                                         1        
               7 |                                         0        
               8 |                                         1        
               9 |                                         0        

READ

           value  ------------- Distribution ------------- count    
             < 0 |                                         0        
               0 |@@@@@@@@@@@@                             1376     
               1 |@@@@                                     509      
               2 |@@                                       240      
               3 |@                                        103      
               4 |@@@@@@@@@                                1046     
               5 |@@@@@@                                   727      
               6 |                                         0        
               7 |                                         3        
               8 |@@                                       254      
               9 |@@@                                      324      
           >= 10 |                                         0        

^C