14 Statically Defined Tracing of Kernel Modules
DTrace provides a facility for developers to define customized
probes in kernel modules. These static probes appear as additional
probes of the sdt
provider and impose little to
no overhead if the sdt
module is not loaded.
For example, for x86_64, the overhead is a single-byte NOP,
followed by a 4-byte NOP. This chapter provides a full example of
how to define and use static probes in a kernel module.
The general principles for naming probes and choosing insertion
points are the same for kernel modules as they are for user-space
applications. You should define probes that have a semantic meaning
that is readily understood by your DTrace user community. Typically,
you might name probes for the routine in which you place them and
their position in that routine. For example, if your probes provide
information about data values on entry to or return from a routine
named foo
, you might name them
foo-entry
and foo-return
. The
data values that are returned by such probes could present the
routine as a black box, rather than return
intermediate values from the internal implementation of the module.
To gather data from deeper within a module, you might insert
additional probes with names such as foo-stage1
or foo-post-hardware-init
.
In one respect, using static probes with kernel modules can be
simpler than for user-space applications. You do not need to modify
the build files unless you want to conditionally compile a module to
include the probes. Inserting the probes in the source code is
slightly more complex, as you cannot use the dtrace
-h command to generate the probe macros. However, using a
DTRACE_PROBE
macro to insert a probe is a
relatively simple change to make to the source code.
You can insert sdt
static probes in any Oracle Linux
kernel module for which you have the source files and the necessary
build infrastructure, but note that DTrace supports statically
defined tracing of 64-bit kernel modules only.
For more information about the sdt
provider, see
sdt provider.
For an introduction to the concepts of statically defined tracing as applied to user-space applications, see Statically Defined Tracing of User Applications.
Inserting Static Probe Points
You can embed static probes within the source code for which you want to capture the current state of a module and its data.
The following example pseudo character device driver consists of three source files:
-
revdev.h
-
Is the header file for the module.
-
rev_mod.c
-
Defines the module's properties and its
init
andexit
routines. -
rev_dev.c
-
Defines the driver's
open
,read
,release
,unlocked_ioctl
, andwrite
routines. The static probes are inserted in theread
,unlocked_ioctl
, andwrite
routines, although probes could also be inserted in the other routines, if required.
revdev.h Example
The module header file revdev.h
must be
prepared, as indicated in bold font in the following example, by
adding lines to include linux/sdt.h
and to
define probe macros.
#include <asm/uaccess.h> #include <linux/cdev.h> #include <linux/fs.h> #include <linux/kernel.h> #include <linux/miscdevice.h> #include <linux/module.h> #include <linux/mutex.h> #include <linux/types.h> #include <linux/sdt.h> #define DEVICE "revdev" #define REVDEV_IOCTL_ENTRY_PROBE(name, file, cmd, arg) \ DTRACE_PROBE3(ioctl__##name, struct file *, file, \ unsigned int, cmd, unsigned long, arg) #define REVDEV_IOCTL_RETURN_PROBE(name, str) \ DTRACE_PROBE1(ioctl__##name, struct char *, str) #define REVDEV_READ_ENTRY_PROBE(name,fp,buf) \ DTRACE_PROBE2(read__##name, file *, fp, char *, buf) #define REVDEV_READ_RETURN_PROBE(name,buf,n) \ DTRACE_PROBE2(read__##name, char *, buf, size_t, n) #define REVDEV_WRITE_ENTRY_PROBE(name,fp,buf,n) \ DTRACE_PROBE3(write__##name, file *, fp, char *, buf, size_t, n) #define REVDEV_WRITE_RETURN_PROBE(name,buf,n) \ DTRACE_PROBE2(write__##name, char *, buf, size_t, n)
The DTRACE_PROBE
macros that are defined in
/lib/modules/`uname
-r`/build/include/linux/sdt.h
support from zero to
eight arguments.
You can define your own macros for the inserted probes, as shown in the preceding example. Unlike user-space static probes, you cannot use the dtrace -h command to create a header file that includes suitable probe definitions. You do not need to create a provider definition file for the probes.
The probes are named according to the first argument of the
DTRACE_PROBE
macro. The suffix
N in the macro name
DTRACE_PROBEN
refers the number of arguments that are passed to the probe. The
first argument to the probe macro is the probe name. As
described in Declaring Probes, two consecutive
underscores are converted to a single dash. The remaining macro
arguments are pairs of arguments that define the DTrace
argn
variables
that are assigned when the probe fires. Each pair of arguments
defines the variable type and a variable name, for example:
#define REVDEV_WRITE_ENTRY_PROBE(name, fp, buf, n) \ DTRACE_PROBE3(write__##name, file *, fp, char *, buf, size_t, n)
The values of fp
, buf
, and
n
are made available by the
arg0
, arg1
, and
arg2
variables in DTrace when the probe
fires.
The provider, module, and function elements of the complete
probe are named for sdt
, the driver module
name (without the .ko
), and the driver
routine.
The probes inherit the stability attributes of the
sdt
provider.
rev_mod.c Example
No changes are made in the following example, which does not
insert any probes in the module's init
and
exit
routines. Note that there is no
restriction on inserting probes in these routines.
#include "revdev.h" MODULE_AUTHOR("DTrace Example"); MODULE_DESCRIPTION("Using DTrace SDT probes with a device driver"); MODULE_VERSION("v1.0"); MODULE_LICENSE("GPL"); extern const struct file_operations revdev_fops; static struct miscdevice revdev = { .minor = 0, .name = DEVICE, .fops = &revdev_fops, }; DEFINE_MUTEX(revdev_mutex); static int revdev_entry(void){ /* Register device */ int retval; retval = misc_register(&revdev); if (retval < 0) { printk(KERN_ERR "revdev: Could not register device"); return retval; } mutex_init(&revdev_mutex); return 0; } static void revdev_exit(void){ misc_deregister(&revdev); } /* Define module init and exit calls */ module_init(revdev_entry); module_exit(revdev_exit);
rev_dev.c Example
No existing lines of code are modified in this example. Only
line insertions are required for the entry
and return
probes in each of the
read
, unlocked_ioctl
, and
write
routines.
The changes in this example appear in bold font.
#include "revdev.h" static struct device_buffer { char data[80]; } devbuf; static char *oddeven[] = { "Even", "Odd" }; extern struct mutex revdev_mutex; static long revdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { char *cp; REVDEV_IOCTL_ENTRY_PROBE(entry, file, cmd, arg); cp = oddeven[arg%2]; REVDEV_IOCTL_RETURN_PROBE(return, cp); return -EAGAIN; } static int revdev_open(struct inode *inode, struct file *fp){ if (!mutex_trylock(&revdev_mutex)){ printk(KERN_INFO "revdev: Device already in use"); return -EBUSY; } return 0; } static void revstr(char *s) { /* After Kernighan and Ritchie */ int i, j, t; for (i = 0, j = strlen(s)-1; i < j; i++, j--) t = s[i], s[i] = s[j], s[j] = t; } static ssize_t revdev_read(struct file *fp, char* buf, size_t n, loff_t *o){ int retval; REVDEV_READ_ENTRY_PROBE(entry, fp, devbuf.data); revstr(devbuf.data); n = strlen(devbuf.data); retval = copy_to_user(buf, devbuf.data, n); REVDEV_READ_RETURN_PROBE(return, buf, n); if (retval != 0) return -EINVAL; return 0; } static ssize_t revdev_write(struct file *fp, const char* buf, size_t n, loff_t *o){ int retval; REVDEV_WRITE_ENTRY_PROBE(entry, fp, buf, n); retval = copy_from_user(devbuf.data, buf, n); devbuf.data[n-retval] = '\0'; REVDEV_WRITE_RETURN_PROBE(return, devbuf.data, n); if (retval != 0) return -EINVAL; return 0; } static int revdev_close(struct inode *inode, struct file *fp){ mutex_unlock(&revdev_mutex); return 0; } const struct file_operations revdev_fops = { .owner = THIS_MODULE, .read = revdev_read, .write = revdev_write, .unlocked_ioctl = revdev_ioctl, .open = revdev_open, .release = revdev_close, };
Building Modules With Static Probes
Note:
The following example requires that you link the module against a UEK version that supports the DTrace modules, which can be either UEK R5 or UEK R4 for Oracle Linux 7 or UEK R4 for Oracle Linux 6.
A bug in the current implementation means that a module containing SDT probes must be built from two or more source files.
The following Kbuild
and
Makefile
are used to build the example pseudo
driver module revdev.ko
and a test program
named testrevdev.
Makefile Example
Note:
All of the command lines in the Makefile, such as those
beginning with gcc
in the following
example, must start with tabs.
KERNEL_DIR = /lib/modules/`uname -r`/build modules:: testrevdev install:: modules_install testrevdev: testrevdev.c gcc -o testrevdev testrevdev.c %:: $(MAKE) -C $(KERNEL_DIR) M=`pwd` $@
The source file for testrevdev is
testrevdev.c
.
testrevdev.c Example
#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #define DEVICE_FILE "/dev/revdev" int main() { char buf[81]; int i, fd, n; if ((fd = open(DEVICE_FILE, O_RDWR)) != 0) { perror("open"); exit(1); } i=0; while (1) { (i++)%20; printf("Write: "); scanf(" %80[^\n]", buf); n = strlen(buf); if (!strncmp(buf, "exit", 4)) break; else if (!strncmp(buf, "ioctl", 5)) ioctl(fd,128,i); else { write(fd, buf, n); read(fd, buf, n); buf[n]='\0'; printf(" Read: %s\n", buf); } } close(fd); exit(0); }
When run, testrevdev reads a string that you
enter, writes the string to the revdev
device, and then reads the reversed string from the device.
If the input string begins with ioctl
, the
program calls ioctl
on the open file
descriptor, which invokes the device's
unlocked_ioctl
routine. An input string that
begins with exit
terminates the program.
To build the module and test program, use the make command:
# make make -C /lib/modules/`uname -r`/build M=`pwd` modules make[1]: Entering directory `/usr/src/kernels/4.1.12-version.el6uek.x86_64' CC [M] /root/revdev/rev_dev.o CC [M] /root/revdev/rev_mod.o SDTSTB /root/revdev/revdev.sdtstub.S AS [M] /root/revdev/revdev.sdtstub.o LD [M] /root/revdev/revdev.o Building modules, stage 2. MODPOST 1 modules SDTINF /root/revdev/revdev.sdtinfo.c CC /root/revdev/revdev.mod.o CTF LD [M] /root/revdev/revdev.ko make[1]: Leaving directory `/usr/src/kernels/4.1.12-version.el6uek.x86_64'
Using DTrace to Test Modules With Static Probes
You can use DTrace to display information when one of the embedded static probes in a module fires.
To test the example module revdev.ko
:
-
Set up a
udev
rule to create the/dev/revdev
device file:# echo "KERNEL==\"revdev\", MODE=\"0660\"" > /etc/udev/rules.d/10-revdev.rules
-
Load the
revdev.ko
module:# insmod revdev.ko
You can use dtrace to test that the probes are now available:
# dtrace -l -m revdev ID PROVIDER MODULE FUNCTION NAME 4 sdt revdev revdev_ioctl ioctl-return 5 sdt revdev revdev_ioctl ioctl-entry 6 sdt revdev revdev_write write-return 7 sdt revdev revdev_write write-entry 8 sdt revdev revdev_read read-return 9 sdt revdev revdev_read read-entry
-
Enter the following DTrace script (traceflow):
#!/usr/sbin/dtrace -qs #pragma D option nspec=10 self int indent; syscall:::entry /execname == "testrevdev"/ { self->specflag = 0; self->spec = speculation(); self->indent += 2; speculate(self->spec); } syscall:::entry /self->spec/ { speculate(self->spec); printf("%*s ", self->indent, "->"); printf("%s() entry\n",probefunc); self->indent += 2; } syscall:::return /self->spec/ { speculate(self->spec); self->indent -= 2; printf("%*s ", self->indent, "<-"); printf("%s() return\n",probefunc); } syscall:::return /self->spec && self->specflag == 0/ { discard(self->spec); self->indent -= 2; self->spec = 0; } syscall:::return /self->spec && self->specflag == 1/ { commit(self->spec); self->indent -= 2; self->spec = 0; } sdt:revdev::ioctl-entry /self->spec/ { speculate(self->spec); self->specflag = 1; printf("%*s ", self->indent, "=>"); printf("%s() entry file: %s cmd: %d arg: %d\n", probefunc, d_path(&(((struct file *)arg0)->f_path)), arg1, arg2); } sdt:revdev::ioctl-return /self->spec/ { speculate(self->spec); printf("%*s ", self->indent, "<="); printf("%s() return cpstr: %s\n", probefunc, stringof((char*)arg0)); } sdt:revdev::read-entry /self->spec/ { speculate(self->spec); self->specflag = 1; printf("%*s ", self->indent, "=>"); printf("%s() entry file: %s devbuf: %s\n", probefunc, d_path(&(((struct file *)arg0)->f_path)), stringof((char *)arg1)); } sdt:revdev::read-return /self->spec/ { speculate(self->spec); printf("%*s ", self->indent, "<="); printf("%s() return string: %s len: %d\n", probefunc, stringof((char *)arg0), arg1); } sdt:revdev::write-entry /self->spec/ { speculate(self->spec); self->specflag = 1; printf("%*s ", self->indent, "=>"); printf("%s() entry file: %s string: %s len: %d\n", probefunc, d_path(&(((struct file *)arg0)->f_path)), stringof((char *)arg1), arg2); } sdt:revdev::write-return /self->spec/ { speculate(self->spec); printf("%*s ", self->indent, "<="); printf("%s() return string: %s len: %d\n", probefunc, stringof((char *)arg0), arg1); }
When one of the inserted probes fires, traceflow displays information about data values in the module by using the probe argument variables (
arg0
,arg1
,arg2
,...).Note:
Argument variables that return pointer types, such as
file *
andchar *
, must be explicitly cast.The script uses
d_path
andstringof
to create printable file paths and strings. For example,(struct file *)arg0
casts the value ofarg0
to a file pointer (struct file *
). Thef_path
member of thestruct file
contains the path structure (struct path
) for a file. Asd_path
takes a path pointer (struct path *
) as its argument, the&
operator is used to return a pointer to thestruct path
.See d_path and String Conversion for more information.
-
Make traceflow executable:
# chmod +x traceflow
-
In one window, run traceflow:
# ./traceflow
-
In another window, run testrevdev and enter input, for example:
# ./testrevdev Write: hello Read: olleh Write: world Read: dlrow Write: ioctl Write: ioctl Write: exit
In the window that traceflow is running, you should see output similar to the following , as DTrace responds to the probes in
revdev.ko
that are firing:# ./traceflow -> write() entry => revdev_write() entry file: /dev/revdev string: hello len: 5 <= revdev_write() return string: hello len: 5 <- write() return -> read() entry => revdev_read() entry file: /dev/revdev devbuf: hello <= revdev_read() return string: olleh len: 5 <- read() return -> write() entry => revdev_write() entry file: /dev/revdev string: world len: 5 <= revdev_write() return string: world len: 5 <- write() return -> read() entry => revdev_read() entry file: /dev/revdev devbuf: world <= revdev_read() return string: dlrow len: 5 <- read() return -> ioctl() entry => revdev_ioctl() entry file: /dev/revdev cmd: 128 arg: 3 <= revdev_ioctl() return cpstr: Odd <- ioctl() return -> ioctl() entry => revdev_ioctl() entry file: /dev/revdev cmd: 128 arg: 4 <= revdev_ioctl() return cpstr: Even <- ioctl() return