Writing Device Drivers

Post-Mortem Debugging

When kadb is running and the system panics, control is passed to the debugger so that you can investigate the source of the problem. However, kadb is not always the best tool for problem analysis; frequently it is easier to use ':c' to continue execution and allow the system to save a crash dump. When the system reboots, you can perform post-mortem analysis on the saved crash dump. This process is analogous to debugging an application crash from a process core file.

Post-mortem analysis offers several advantages to driver developers: it allows more than one developer to examine a problem in parallel; it allows developers to retrieve information on a problem that occurred in production at a customer site, where interactive debugging is not acceptable; and it is required for certain types of advanced kernel analysis, such as checking for kernel memory leaks.

Getting Started With MDB

MDB provides sophisticated debugging support for analyzing kernel problems. This section provides an overview of MDB's features. For a more complete discussion of MDB's capabilities, refer to the Solaris Modular Debugger Guide.

MDB's command syntax is compatible with the kadb syntax, and MDB can execute all of the kadb (and legacy adb) macros. These macros are stored in /usr/lib/adb and /usr/platform/`uname -i`/lib/adb for 32-bit kernels, and in /usr/lib/adb/sparcv9 and /usr/platform/`uname -i`/lib/adb/sparcv9 for 64-bit kernels.

In addition to macro files, MDB supports 'debugger commands' or dcmds. These dcmds can be dynamically loaded at runtime from a set of debugger modules. MDB provides a first-class programming API for implementing debugger modules so that driver developers can implement their own custom debugging support. MDB also provides a host of usability features, such as command line editing, command history, an output pager, and online help.

MDB provides a rich set of modules and dcmds for debugging the Solaris kernel and associated modules and device drivers. These facilities allow you to formulate complex debugging queries: locate all the memory allocated by a particular thread; print a visual picture of a kernel STREAM; determine what type of structure a particular address refers to; locate leaked memory blocks in the kernel; analyze memory to locate stack traces; and more.


Note -

In earlier versions of the Solaris operating environment, adb(1) was the recommended tool for post-mortem analysis. In the Solaris 8 operating environment, mdb(1) is the new recommended tool for post-mortem analysis; it provides an upward-compatible syntax and feature set. In addition, mdb includes features that surpass the set of commands available from the legacy crash(1M) utility.


To get started, run mdb and supply it with a system crash dump:

% cd /var/crash/testsystem
% ls
bounds     unix.0     vmcore.0
% mdb unix.0 vmcore.0
Loading modules: [ unix krtld genunix ip logindmux ptm pts nfs lofs ]
> ::status
debugging crash dump vmcore.0 (32-bit) from testsystem
operating system: 5.8 Generic (sun4u)

When mdb responds with the '>' prompt, it is ready for commands. To examine the running kernel on a live system, type:

# mdb -k
Loading modules: [ unix krtld genunix ip logindmux ptm nfs ipc ]
> ::status
debugging live kernel (32-bit) on testsystem
operating system: 5.8 Generic (sun4u)

Important MDB Commands

This section provides a tutorial for some of the MDB debugger commands most applicable to driver authors. The output shown here depends on the type of system used; these examples were produced on a Sun Ultra 1 workstation running the 32-bit kernel.

The Solaris Modular Debugger Guide provides details about each debugger command discussed here, as well as more information about all aspects of MDB. Online help is available from within MDB using the ::help built-in command.

Navigating the Device Tree

MDB provides the ::prtconf dcmd to display the kernel device tree. The output of this dcmd is similar to the output of the prtconf(1M) command:

> ::prtconf
DEVINFO  NAME
704c9f00 SUNW,Ultra-1
    704c9e00 packages (driver not attached)
        704c9c00 terminal-emulator (driver not attached)
        704c9b00 deblocker (driver not attached)
        704c9a00 obp-tftp (driver not attached)
        704c9900 disk-label (driver not attached)
        704c9800 sun-keyboard (driver not attached)
        704c9700 ufs-file-system (driver not attached)
    704c9d00 chosen (driver not attached)
    704c9600 openprom (driver not attached)
        704c9400 client-services (driver not attached)
    704c9500 options, instance #0
    704c9300 aliases (driver not attached)
    704c9200 memory (driver not attached)
    704c9100 virtual-memory (driver not attached)
    704c9000 counter-timer (driver not attached)
    704c8f00 sbus, instance #0
        704c8d00 SUNW,CS4231 (driver not attached)
        704c8c00 auxio (driver not attached)
        704c8b00 flashprom (driver not attached)
        704c8a00 SUNW,fdtwo (driver not attached)
...

Each line of output represents a node in the kernel's device tree. The address to the left of each node name is the address of the devinfo node. The node can then be displayed using the $<devinfo macro or the ::devinfo dcmd:

> 704c9f00::devinfo
704c9f00 SUNW,Ultra-1
         Driver properties at 0x704bb208:
           pm-hardware-state: "no-suspend-resume"
         System properties at 0x704bb190:
           relative-addressing: 0x1
           MMU_PAGEOFFSET: 0x1fff
           MMU_PAGESIZE: 0x2000
           PAGESIZE: 0x2000

> 704c9f00$<devinfo
0x704cbd20:
                name:
SUNW,Ultra-1
0x704c9f00:     parent          child           sibling
                0               704c9e00        0
0x704c9f10:     addr            nodeid          instance
                704be758        f00297e4        ffffffff
0x704c9f1c:     ops             parent_data     driver_data
                rootnex_ops     702baad0        0
0x704c9f28:     drv_prop_ptr    sys_prop_ptr    minor
                704bb208        704bb190        0
0x704c9f34:     next
                0
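
The parent, child, and sibling fields in the $<devinfo output are what link the nodes into a tree. As a rough illustration of how a walker such as ::prtconf can follow those links, here is a minimal user-space sketch in C. The struct node type, its fields, and the print_tree and demo_tree functions are hypothetical stand-ins, not the kernel's devinfo structure or MDB's implementation, and the sketch prints only the node name and instance number rather than the full attachment state.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for a devinfo node. */
struct node {
    const char *name;
    int instance;          /* -1 when the node has no instance number */
    struct node *parent;
    struct node *child;    /* first child of this node */
    struct node *sibling;  /* next node that shares the same parent */
};

/*
 * Print a node, its later siblings, and all of their descendants,
 * indenting four spaces per tree level. Returns the number of nodes
 * visited.
 */
static int print_tree(const struct node *n, int depth, FILE *out)
{
    int count = 0;

    assert(out != NULL);
    for (; n != NULL; n = n->sibling) {
        fprintf(out, "%*s%s", depth * 4, "", n->name);
        if (n->instance >= 0)
            fprintf(out, ", instance #%d", n->instance);
        fputc('\n', out);
        count++;
        count += print_tree(n->child, depth + 1, out);
    }
    return count;
}

/* Usage sketch: build a four-node tree by hand and print it. */
static int demo_tree(FILE *out)
{
    struct node root  = { "SUNW,Ultra-1", -1, NULL, NULL, NULL };
    struct node opts  = { "options", 0, &root, NULL, NULL };
    struct node sbus  = { "sbus", 0, &root, NULL, NULL };
    struct node auxio = { "auxio", -1, &sbus, NULL, NULL };

    root.child = &opts;     /* first child of the root */
    opts.sibling = &sbus;   /* second child of the root */
    sbus.child = &auxio;    /* only child of sbus */
    return print_tree(&root, 0, out);
}
```

Following the child pointer descends one level and following the sibling pointer stays on the same level, which is why a node with a child of 704c9e00 and a sibling of 0, as in the output above, is a root with at least one descendant and no peers.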

Use ::prtconf to see where your driver has attached in the device tree, and to display device properties. You can also specify the verbose (-v) flag to ::prtconf to display the properties for each device node:

> ::prtconf -v
DEVINFO  NAME
704c9f00 SUNW,Ultra-1
         Driver properties at 0x704bb208:
           pm-hardware-state: "no-suspend-resume"
         System properties at 0x704bb190:
           relative-addressing: 0x1
           MMU_PAGEOFFSET: 0x1fff
           MMU_PAGESIZE: 0x2000
           PAGESIZE: 0x2000
         ...
         704c8400 espdma, instance #0
            704c8200 esp, instance #0
                     Driver properties at 0x702ba7d8:
                       target0-sync-speed: 0x2710
                       target0-TQ
                       scsi-selection-timeout: <000000fa.> (device: <0x3d/0x00000000>)
                       scsi-options: <00001ff8.> (device: <0x3d/0x00000000>)
                       scsi-watchdog-tick: <0000000a.> (device: <0x3d/0x00000000>)
                       scsi-tag-age-limit: <00000002.> (device: <0x3d/0x00000000>)
                       scsi-reset-delay: <00000bb8.> (device: <0x3d/0x00000000>)
         ...
                704c7c00 sd, instance #1 (driver not attached)
                         System properties at 0x704ba4e8:
                           lun: 0x0
                           target: 0x1
                           class_prop: "atapi"
                           class: "scsi"
         ...

Another way to locate instances of your driver is the ::devbindings dcmd. Given a driver name, it displays a list of all instances of the named driver:

> ::devbindings sd
704c8100 sd (driver not attached)
704c7d00 sd, instance #0
         Driver properties at 0x702ba5a8:
           pm-hardware-state: "needs-suspend-resume"
           ddi-kernel-ioctl
         System properties at 0x704ba588:
           lun: 0x0
           target: 0x0
           class_prop: "atapi"
           class: "scsi"
704c7c00 sd, instance #1 (driver not attached)
         System properties at 0x704ba4e8:
           lun: 0x0
           target: 0x1
           class_prop: "atapi"
           class: "scsi"
704c7b00 sd, instance #2 (driver not attached)
         System properties at 0x704ba448:
           lun: 0x0
           target: 0x2
           class_prop: "atapi"
           class: "scsi"
...

Retrieving Driver Soft State Information

A common task when debugging a driver is retrieving the "soft state" for a particular driver instance. The soft state is allocated with the ddi_soft_state_zalloc(9F) routine and obtained by drivers using ddi_get_soft_state(9F). If you know the name of the soft state pointer (the first argument to ddi_soft_state_init(9F)), you can retrieve the soft state for a particular driver instance with the ::softstate dcmd:

> bst_state::softstate 0x3
702b7578

In this case, ::softstate is used to fetch the soft state for instance 3 of the bst sample driver. The address returned is that of the bst_soft structure the driver uses to track state for this instance.
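
To make the lookup concrete, the following is a minimal user-space sketch of the idea behind soft state: a table of per-instance pointers indexed by instance number. It is a simplified model under stated assumptions, not the DDI implementation. The names soft_state_table, ss_init, ss_zalloc, ss_get, ss_fini, and bst_demo, and the fields of struct bst_soft, are hypothetical; in a real driver the corresponding roles are played by ddi_soft_state_init(9F), ddi_soft_state_zalloc(9F), and ddi_get_soft_state(9F).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-instance state; a real bst_soft has driver-specific fields. */
struct bst_soft {
    int instance;
    int open_count;
};

/* Simplified model of a soft state table: one pointer slot per instance. */
struct soft_state_table {
    size_t item_size;   /* size of each per-instance structure */
    size_t n_items;     /* number of instance slots */
    void **items;       /* items[i] is NULL until instance i is allocated */
};

/* Create an empty table with room for n_items instances. */
static int ss_init(struct soft_state_table *t, size_t item_size, size_t n_items)
{
    t->items = calloc(n_items, sizeof (void *));
    if (t->items == NULL)
        return -1;
    t->item_size = item_size;
    t->n_items = n_items;
    return 0;
}

/* Allocate zero-filled state for one instance; fails if out of range or in use. */
static int ss_zalloc(struct soft_state_table *t, size_t instance)
{
    if (instance >= t->n_items || t->items[instance] != NULL)
        return -1;
    t->items[instance] = calloc(1, t->item_size);
    return t->items[instance] != NULL ? 0 : -1;
}

/* Look up the state pointer for an instance; NULL if none was allocated. */
static void *ss_get(const struct soft_state_table *t, size_t instance)
{
    return instance < t->n_items ? t->items[instance] : NULL;
}

/* Release every allocated instance and the table itself. */
static void ss_fini(struct soft_state_table *t)
{
    for (size_t i = 0; i < t->n_items; i++)
        free(t->items[i]);
    free(t->items);
    t->items = NULL;
    t->n_items = 0;
}

/* Usage sketch: allocate state for instance 3 and look it up again. */
static int bst_demo(void)
{
    struct soft_state_table bst_state;
    struct bst_soft *sp;

    if (ss_init(&bst_state, sizeof (struct bst_soft), 8) != 0)
        return -1;
    if (ss_zalloc(&bst_state, 3) != 0) {
        ss_fini(&bst_state);
        return -1;
    }
    sp = ss_get(&bst_state, 3);
    assert(sp != NULL);
    sp->instance = 3;
    printf("instance 3 soft state at %p\n", (void *)sp);
    ss_fini(&bst_state);
    return 0;
}
```

In this model, the address printed by bst_demo plays the same role as the 702b7578 value printed by ::softstate: it identifies the per-instance structure, which can then be examined field by field.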

Detecting Kernel Memory Leaks

The ::findleaks dcmd provides powerful and efficient detection of memory leaks in kernel crash dumps. The full set of kernel memory debugging features should be enabled for ::findleaks to be effective. For more information, see "kmem_flags". Running ::findleaks during driver development and testing can detect code that leaks memory and wastes kernel resources. See "Debugging With the Kernel Memory Allocator" in the Solaris Modular Debugger Guide for a complete discussion of ::findleaks.


Note -

Use ::findleaks to detect and eliminate kernel memory leaks caused by your code. Code that leaks kernel memory can render the system vulnerable to denial-of-service attacks.


Writing Debugger Commands

MDB provides a powerful API for implementing new debugger facilities that you can use to debug your driver. The Solaris Modular Debugger Guide explains the programming API in more detail, and the SUNWmdbdm package installs sample MDB source code in the directory /usr/demo/mdb. You can use MDB to automate lengthy debugging chores or help to validate that your driver is behaving properly. You can also package your MDB debugging modules with your driver product so that these facilities will be available to service personnel at a customer site.