Writing Device Drivers

Part III Building a Device Driver

The third part of this book provides advice on building device drivers for the Oracle Solaris OS.

Chapter 21 Compiling, Loading, Packaging, and Testing Drivers

This chapter describes the procedure for driver development, including code layout, compilation, packaging, and testing.

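This chapter provides information on the following subjects:

    • Driver Development Summary

    • Driver Code Layout

    • Preparing for Driver Installation

    • Installing, Updating, and Removing Drivers

    • Loading and Unloading Drivers

    • Driver Packaging

    • Criteria for Testing Drivers

    • Testing Specific Types of Drivers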

Driver Development Summary

This chapter and the following two chapters, Chapter 22, Debugging, Testing, and Tuning Device Drivers and Chapter 23, Recommended Coding Practices, provide detailed information on developing a device driver.

    Take the following steps to build a device driver:

  1. Write, compile, and link the new code.

    See Driver Code Layout for the conventions on naming files. Use a C compiler to compile the driver. Link the driver using ld(1). See Compiling and Linking the Driver and Module Dependencies.

  2. Create the necessary hardware configuration files.

    Create a hardware configuration file unique to the device called xx.conf where xx is the prefix for the device. This file is used to update the driver.conf(4) file. See Writing a Hardware Configuration File. For a pseudo device driver, create a pseudo(4) file.

  3. Copy the driver to the appropriate module directory.

    See Copying the Driver to a Module Directory.

  4. Install the device driver using add_drv(1M).

    Installing the driver with add_drv is usually done as part of a postinstall script. See Installing Drivers with add_drv. Use the update_drv(1M) command to make any changes to the driver. See Updating Driver Information.

  5. Load the driver.

    The driver can be loaded automatically by accessing the device. See Loading and Unloading Drivers and Package Postinstall. Drivers can also be loaded by using the modload(1M) command. The modload command does not call any routines in the module and therefore is useful for testing. See Loading and Unloading Test Modules.

  6. Test the driver.

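    Drivers should be rigorously tested in the following areas:

    • Configuration testing

    • Functionality testing

    • Error handling

    • Testing loading and unloading

    • Stress, performance, and interoperability testing

    • DDI/DKI compliance testing

    • Installation and packaging testing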

    For additional driver-specific testing, see Testing Specific Types of Drivers.

  7. Remove the driver if necessary.

    Use the rem_drv(1M) command to remove a device driver. See Removing the Driver and Package Preremove.

Driver Code Layout

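The code for a device driver is usually divided into the following files:

    • Header files (.h files)

    • Source files (.c files)

    • Optional configuration file (driver.conf file)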

Header Files

Header files provide the following definitions:

Some of the header file definitions, such as the state structure, might be needed only by the device driver. This information should go in private header files that are only included by the device driver itself.

Any information that an application might require, such as the I/O control commands, should be in public header files. These files are included by the driver and by any applications that need information about the device.

While there is no standard for naming private and public files, one convention is to name the private header file xximpl.h and the public header file xxio.h.

Source Files

A C source file (a .c file) for a device driver has the following responsibilities:

Configuration Files

In general, the configuration file for a driver defines all of the properties that the driver needs. Entries in the driver configuration file specify possible device instances that the driver can probe for existence. Driver global properties can be set in the driver's configuration file. See the driver.conf(4) man page for more information.

Driver configuration files are required for devices that are not self-identifying.

Driver configuration files are optional for self-identifying devices (SID). For self-identifying devices, the configuration file can be used to add properties into SID nodes.

The following properties are examples of properties that are not set in the driver configuration file:

Preparing for Driver Installation

    The following steps precede installation of a driver:

  1. Compile the driver.

  2. Create a configuration file if necessary.

  3. Identify the driver module to the system through either of the following alternatives:

    • Match the driver's name to the name of the device node.

    • Use either add_drv(1M) or update_drv(1M) to inform the system of the module names.

The system maintains a one-to-one association between the name of the driver module and the name of the dev_info node. For example, consider a dev_info node for a device that is named mydevice. The device mydevice is handled by a driver module that is also named mydevice. The mydevice module resides in a subdirectory that is called drv, which is in the module path. The module is in drv/mydevice if you are using a 32-bit kernel. The module is in drv/sparcv9/mydevice if you are using a 64-bit SPARC kernel. The module is in drv/amd64/mydevice if you are using a 64-bit x86 kernel.

If the driver is a STREAMS network driver, then the driver name must meet the following constraints:

If the driver must manage dev_info nodes with different names, the add_drv(1M) utility can create aliases. The -i flag specifies the names of other dev_info nodes that the driver handles. The update_drv command can also modify aliases for an installed device driver.

Compiling and Linking the Driver

You need to compile each driver source file and link the resulting object files into a driver module. The OS is compatible with both the Oracle Solaris Studio C compiler and the GNU C compiler from the Free Software Foundation, Inc. The examples in this section use the Oracle Solaris Studio C compiler unless otherwise noted. For information on the Sun Studio C compiler, see the Sun Studio 12: C User’s Guide and the Sun Studio Documentation. For more information on compile and link options, see the Sun Studio Man Pages. The GNU C compiler is supplied in the /usr/sfw directory. For information on the GNU C compiler, see http://gcc.gnu.org/ or check the man pages in /usr/sfw/man.

The example below shows a driver that is called xx with two C source files. A driver module that is called xx is generated. The driver that is created in this example is for a 32-bit kernel. You must use ld -r even if your driver has only one object module.


% cc -D_KERNEL -c xx1.c
% cc -D_KERNEL -c xx2.c
% ld -r -o xx xx1.o xx2.o

The _KERNEL symbol must be defined to indicate that this code defines a kernel module. No other symbols should be defined, except for driver private symbols. The DEBUG symbol can be defined to enable any calls to ASSERT(9F).

If you are compiling for a 64-bit SPARC architecture using Sun Studio 9, Sun Studio 10, or Sun Studio 11, use the -xarch=v9 option:


% cc -D_KERNEL -xarch=v9 -c xx.c

If you are compiling for a 64-bit SPARC architecture using Sun Studio 12, use the -m64 option:


% cc -D_KERNEL -m64 -c xx.c

If you are compiling for a 64-bit x86 architecture using Sun Studio 10 or Sun Studio 11, use both the -xarch=amd64 option and the -xmodel=kernel option:


% cc -D_KERNEL -xarch=amd64 -xmodel=kernel -c xx.c

If you are compiling for a 64-bit x86 architecture using Sun Studio 12, use the -m64 option, the -xarch=sse2a option, and the -xmodel=kernel option:


% cc -D_KERNEL -m64 -xarch=sse2a -xmodel=kernel -c xx.c

Note –

Sun Studio 9 does not support 64-bit x86 architectures. Use Sun Studio 10, Sun Studio 11, or Sun Studio 12 to compile and debug drivers for 64-bit x86 architectures.
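The per-architecture options above can be gathered into a small helper for a build script. The following is a minimal sketch that assumes Sun Studio 12. The function name kernel_cflags and the target labels 32bit, sparcv9, and amd64 are hypothetical names chosen for this illustration, not part of any compiler interface.

```shell
#!/bin/sh
# kernel_cflags TARGET
# Print the Sun Studio 12 cc options for a kernel module, as listed in
# the text above. The TARGET labels are assumptions made for this sketch.
kernel_cflags() {
    case "$1" in
        32bit)   echo "-D_KERNEL" ;;
        sparcv9) echo "-D_KERNEL -m64" ;;
        amd64)   echo "-D_KERNEL -m64 -xarch=sse2a -xmodel=kernel" ;;
        *)       echo "kernel_cflags: unknown target: $1" >&2
                 return 1 ;;
    esac
}

# Possible use in a build script:
#   cc `kernel_cflags amd64` -c xx1.c
```

Keeping the options in one place makes it less likely that one source file of a multi-file driver is compiled with different flags from the others.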


After the driver is stable, you might want to add optimization flags to build a production quality driver. See the cc(1) man page in Sun Studio Man Pages for specific information on optimizations in the Sun Studio C compiler.

Global variables should be treated as volatile in device drivers. The volatile tag is discussed in greater detail in Declaring a Variable Volatile. Use of the flag depends on the platform. See the man pages.

Module Dependencies

If the driver module depends on symbols exported by another kernel module, the dependency can be specified by the -dy and -N options of the loader, ld(1). If the driver depends on a symbol exported by misc/mySymbol, the example below should be used to create the driver binary.


% ld -dy -r -o xx xx1.o xx2.o -N misc/mySymbol

Writing a Hardware Configuration File

If a device is non-self-identifying, the kernel might require a hardware configuration file for that device. If the driver is called xx, the hardware configuration file for the driver should be called xx.conf.

On the x86 platform, device information is now supplied by the booting system. Hardware configuration files should no longer be needed, even for non-self-identifying devices.

See the driver.conf(4), pseudo(4), sbus(4), scsi_free_consistent_buf(9F), and update_drv(1M) man pages for more information on hardware configuration files.

Arbitrary properties can be defined in hardware configuration files. Entries in the configuration file are in the form property=value, where property is the property name and value is its initial value. The configuration file approach enables devices to be configured by changing the property values.
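As an illustration of the property=value form, the following sketch prints a configuration entry for a pseudo device in the style described by driver.conf(4) and pseudo(4). The helper name print_pseudo_conf and the debug-level property are hypothetical. A real driver defines its own property names, and entries for hardware devices need the additional properties that the parent bus requires.

```shell
#!/bin/sh
# print_pseudo_conf DRIVER [PROPERTY=VALUE ...]
# Print one configuration entry for instance 0 of a pseudo device,
# followed by any extra property=value pairs. The entry ends with a
# semicolon, as driver.conf(4) entries do.
print_pseudo_conf() {
    drv=$1
    shift
    printf 'name="%s" parent="pseudo" instance=0' "$drv"
    for prop in "$@"; do
        printf ' %s' "$prop"
    done
    printf ';\n'
}

# Possible use (debug-level is a made-up property name):
#   print_pseudo_conf xx debug-level=1 > xx.conf
```

With the arguments shown in the comment, the sketch writes a single line that names the driver, its pseudo parent, the instance number, and the extra property.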

Installing, Updating, and Removing Drivers

Before a driver can be used, the system must be informed that the driver exists. The add_drv(1M) utility must be used to correctly install the device driver. After a driver is installed, that driver can be loaded and unloaded from memory without using the add_drv command.

Copying the Driver to a Module Directory

Three conditions determine a device driver module's path:

Device drivers reside in the following locations:

/platform/`uname -i`/kernel/drv

Contains 32-bit drivers that run only on a specific platform.

/platform/`uname -i`/kernel/drv/sparcv9

Contains 64-bit drivers that run only on a specific SPARC-based platform.

/platform/`uname -i`/kernel/drv/amd64

Contains 64-bit drivers that run only on a specific x86-based platform.

/platform/`uname -m`/kernel/drv

Contains 32-bit drivers that run only on a specific family of platforms.

/platform/`uname -m`/kernel/drv/sparcv9

Contains 64-bit drivers that run only on a specific family of SPARC-based platforms.

/platform/`uname -m`/kernel/drv/amd64

Contains 64-bit drivers that run only on a specific family of x86-based platforms.

/usr/kernel/drv

Contains 32-bit drivers that are independent of platforms.

/usr/kernel/drv/sparcv9

Contains 64-bit drivers on SPARC-based systems that are independent of platforms.

/usr/kernel/drv/amd64

Contains 64-bit drivers on x86-based systems that are independent of platforms.

To install a 32-bit driver, the driver and its configuration file must be copied to a drv directory in the module path. For example, to copy a driver to /usr/kernel/drv, type:


$ su
# cp xx /usr/kernel/drv
# cp xx.conf /usr/kernel/drv

To install a 64-bit SPARC driver, copy the driver to a drv/sparcv9 directory in the module path. Copy the driver configuration file to the drv directory in the module path. For example, to copy a driver to /usr/kernel/drv/sparcv9, you would type:


$ su
# cp xx /usr/kernel/drv/sparcv9
# cp xx.conf /usr/kernel/drv

To install a 64-bit x86 driver, copy the driver to a drv/amd64 directory in the module path. Copy the driver configuration file to the drv directory in the module path. For example, to copy a driver to /usr/kernel/drv/amd64, you would type:


$ su
# cp xx /usr/kernel/drv/amd64
# cp xx.conf /usr/kernel/drv

Note –

All driver configuration files (.conf files) must go in the drv directory in the module path. The .conf files cannot go into any subdirectory of the drv directory.
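The copy rules in this section can be sketched as one shell function. This is an illustration only: the name stage_driver and its argument order are assumptions, and the ISA argument is expected to be a string such as sparcv9 or amd64, for example as printed by isainfo -k. Any other value is treated as a 32-bit kernel.

```shell
#!/bin/sh
# stage_driver MODULE CONF ISA MODROOT
# Copy MODULE into MODROOT/drv (32-bit) or MODROOT/drv/ISA (64-bit).
# Copy CONF into MODROOT/drv, because .conf files must not be placed
# in a subdirectory of drv.
stage_driver() {
    module=$1 conf=$2 isa=$3 modroot=$4
    case "$isa" in
        sparcv9|amd64) bindir="$modroot/drv/$isa" ;;
        *)             bindir="$modroot/drv" ;;
    esac
    mkdir -p "$bindir" || return 1
    cp "$module" "$bindir/" || return 1
    cp "$conf" "$modroot/drv/" || return 1
}
```

A call such as stage_driver xx xx.conf amd64 /usr/kernel, run as superuser, would place the module in /usr/kernel/drv/amd64 and the configuration file in /usr/kernel/drv.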


Installing Drivers with add_drv

Use the add_drv(1M) command to install the driver in the system. If the driver installs successfully, add_drv runs devfsadm(1M) to create the logical names in the /dev directory.


# add_drv xx

In this case, the device identifies itself as xx. The device special files have default ownership and permissions (0600 root sys). The add_drv command also allows additional names for the device (aliases) to be specified. See the add_drv(1M) man page for information on adding aliases and setting file permissions explicitly.


Note –

Do not use the add_drv command to install a STREAMS module. See the STREAMS Programming Guide for details.


If the driver creates minor nodes that do not represent terminal devices such as disks, tapes, or ports, you can modify /etc/devlink.tab to cause devfsadm to create logical device names in /dev. Alternatively, logical names can be created by a program that is run at driver installation time.

Updating Driver Information

Use the update_drv(1M) command to notify the system of any changes to an installed device driver. By default, the system re-reads the driver configuration file and reloads the driver binary module.

Removing the Driver

To remove a driver from the system, use the rem_drv(1M) command, and then delete the driver module and configuration file from the module path. A driver cannot be used again until that driver is reinstalled with add_drv(1M). The removal of a SCSI HBA driver requires a reboot to take effect.

Loading and Unloading Drivers

Opening a special file (accessing the device) that is associated with a device driver causes that driver to be loaded. You can use the modload(1M) command to load the driver into memory, but modload does not call any routines in the module. The preferred method is to open the device.

Normally, the system automatically unloads device drivers when the device is no longer in use. During development, you might want to use modunload(1M) to unload the driver explicitly. In order for modunload to be successful, the device driver must be inactive. No outstanding references to the device should exist, such as through open(2) or mmap(2).

The modunload command takes a runtime-dependent module_id as an argument. To find the module_id, use grep to search the output of modinfo(1M) for the driver name in question. Check in the first column.


# modunload -i module-id

To unload all currently unloadable modules, specify module ID zero:


# modunload -i 0

In addition to being inactive, the driver must have working detach(9E) and _fini(9E) routines for modunload(1M) to succeed.
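Looking up the module ID can be scripted. The sketch below filters modinfo(1M) output for an exact module name instead of using a loose grep match. The function name module_id is hypothetical, and the code assumes that the module name is the sixth whitespace-separated field (Id, Loadaddr, Size, Info, Rev, Module Name) and that an awk that accepts the -v option, such as nawk, is available.

```shell
#!/bin/sh
# module_id DRIVER
# Read modinfo(1M) output on standard input and print the Id (first
# column) of the module whose name is exactly DRIVER.
# Assumes the module name is field 6 of each line.
module_id() {
    awk -v drv="$1" '$6 == drv { print $1 }'
}

# Possible use:
#   modunload -i `modinfo | module_id xx`
```

Matching the whole field avoids unloading a different module whose name or description merely contains the driver name.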

Driver Packaging

The normal delivery vehicle for software is a package that contains all of the software components. A package provides a controlled mechanism for installation and removal of all the components of a software product. In addition to the files for using the product, the package includes control files for installing and uninstalling the application. The postinstall and preremove installation scripts are two such control files.

Package Postinstall

After a package with a driver binary is installed onto a system, the add_drv(1M) command must be run. The add_drv command completes the installation of the driver. Typically, add_drv is run in a postinstall script, as in the following example.

#!/bin/sh
#
#       @(#)postinstall 1.1

PATH="/usr/bin:/usr/sbin:${PATH}"
export PATH

#
# Driver info
#
DRV=<driver-name>
DRVALIAS="<company-name>,<driver-name>"
DRVPERM='* 0666 root sys'

ADD_DRV=/usr/sbin/add_drv

#
# Select the correct add_drv options to execute.
# add_drv touches /reconfigure to cause the
# next boot to be a reconfigure boot.
#
if [ "${BASEDIR}" = "/" ]; then
    #
    # On a running system, modify the
    # system files and attach the driver
    #
    ADD_DRV_FLAGS=""
else     
    #
    # On a client, modify the system files
    # relative to BASEDIR
    #
    ADD_DRV_FLAGS="-b ${BASEDIR}"
fi       
 
#
# Make sure add_drv has not been previously executed
# before attempting to add the driver.
#
grep "^${DRV} " $BASEDIR/etc/name_to_major > /dev/null 2>&1
if [ $? -ne 0 ]; then
    ${ADD_DRV} ${ADD_DRV_FLAGS} -m "${DRVPERM}" -i "${DRVALIAS}" ${DRV}
    if [ $? -ne 0 ]; then
        echo "postinstall: add_drv $DRV failed\n" >&2
        exit 1
    fi
fi
exit 0

Package Preremove

When removing a package that includes a driver, the rem_drv(1M) command must be run prior to removing the driver binary and other components. The following example demonstrates a preremove script that uses the rem_drv command for driver removal.

#!/bin/sh
#
#       @(#)preremove  1.1
 
PATH="/usr/bin:/usr/sbin:${PATH}"
export PATH
 
#
# Driver info
#
DRV=<driver-name>
REM_DRV=/usr/sbin/rem_drv
 
#
# Select the correct rem_drv options to execute.
# rem_drv touches /reconfigure to cause the
# next boot to be a reconfigure boot.
#
if [ "${BASEDIR}" = "/" ]; then
    #
    # On a running system, modify the
    # system files and remove the driver
    #
    REM_DRV_FLAGS=""
else     
    #
    # On a client, modify the system files
    # relative to BASEDIR
    #
    REM_DRV_FLAGS="-b ${BASEDIR}"
fi
 
${REM_DRV} ${REM_DRV_FLAGS} ${DRV}
 
exit 0

Criteria for Testing Drivers

Once a device driver is functional, that driver should be thoroughly tested prior to distribution. Besides testing the features found in traditional UNIX device drivers, Oracle Solaris drivers require testing of additional features, such as power management and the dynamic loading and unloading of drivers.

Configuration Testing

A driver's ability to handle multiple device configurations is an important part of the test process. Once the driver is working on a simple, or default, configuration, additional configurations should be tested. Depending on the device, configuration testing can be accomplished by changing jumpers or DIP switches. If the number of possible configurations is small, all configurations should be tried. If the number is large, various classes of possible configurations should be defined, and a sampling of configurations from each class should be tested. Defining these classes depends on the potential interactions among the different configuration parameters. These interactions are a function of the type of the device and the way in which the driver was written.

For each device configuration, the basic functions must be tested, which include loading, opening, reading, writing, closing, and unloading the driver. Any function that depends upon the configuration deserves special attention. For example, changing the base memory address of device registers is not likely to affect the behavior of most driver functions. If a driver works well with one address, that driver is likely to work as well with a different address. On the other hand, a special I/O control call might have different effects depending on the particular device configuration.

Loading the driver with varying configurations ensures that the probe(9E) and attach(9E) entry points can find the device at different addresses. For basic functional testing, using regular UNIX commands such as cat(1) or dd(1M) is usually sufficient for character devices. Mounting or booting might be required for block devices.

Functionality Testing

After a driver has been completely tested for configuration, all of the driver's functionality should be thoroughly tested. These tests require exercising the operation of all of the driver's entry points.

Many drivers require custom applications to test functionality. However, basic drivers for devices such as disks, tapes, or asynchronous boards can be tested using standard system utilities. All entry points should be tested in this process, including devmap(9E), chpoll(9E), and ioctl(9E), if applicable. The ioctl() tests might be quite different for each driver. For nonstandard devices, a custom testing application is generally required.

Error Handling

A driver might perform correctly in an ideal environment but fail in cases of errors, such as erroneous operations or bad data. Therefore, an important part of driver testing is the testing of the driver's error handling.

All possible error conditions of a driver should be exercised, including error conditions for actual hardware malfunctions. Some hardware error conditions might be difficult to induce, but an effort should be made to force or to simulate such errors if possible. All of these conditions could be encountered in the field. Cables should be removed or be loosened, boards should be removed, and erroneous user application code should be written to test those error paths. See also Chapter 13, Hardening Oracle Solaris Drivers.


Caution –

Be sure to take proper electrical precautions when testing.


Testing Loading and Unloading

Because a driver that does not load or unload can force unscheduled downtime, loading and unloading must be thoroughly tested.

A script like the following example should suffice:

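#!/bin/sh
# Repeatedly unload and reload the driver module.
cd <location_of_driver>
while [ 1 ]
do
    # The module ID is in the first column of the modinfo output.
    modunload -i `modinfo | grep " <driver_name> " | cut -c1-3` &
    modload <driver_name> &
done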

Stress, Performance, and Interoperability Testing

To help ensure that a driver performs well, that driver should be subjected to vigorous stress testing. For example, running single threads through a driver does not test locking logic or condition variables that have to wait. Device operations should be performed by multiple processes at once to cause several threads to execute the same code simultaneously.

Techniques for performing simultaneous tests depend upon the driver. Some drivers require special testing applications, while starting several UNIX commands in the background is suitable for others. Appropriate testing depends upon where the particular driver uses locks and condition variables. Testing a driver on a multiprocessor machine is more likely to expose problems than testing on a single-processor machine.

Interoperability between drivers must also be tested, particularly because different devices can share interrupt levels. If possible, configure another device at the same interrupt level as the one being tested. A stress test can determine whether the driver correctly claims its own interrupts and operates according to expectations. Stress tests should be run on both devices at once. Even if the devices do not share an interrupt level, this test can still be valuable. For example, consider a case in which serial communication devices experience errors when a network driver is tested. The same problem might be causing the rest of the system to encounter interrupt latency problems as well.

Driver performance under these stress tests should be measured using UNIX performance-measuring tools. This type of testing can be as simple as using the time(1) command along with commands to be used in the stress tests.
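Starting several commands in the background at once can be sketched with a short shell function. The name run_parallel is hypothetical. The function starts the requested number of copies of a command, waits for all of them, and reports failure if any copy failed.

```shell
#!/bin/sh
# run_parallel COUNT COMMAND [ARG ...]
# Start COUNT copies of COMMAND at the same time, wait for every copy,
# and return nonzero if any copy exited with a failure status.
run_parallel() {
    count=$1
    shift
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        pids="$pids $!"
        i=$((i + 1))
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

For a driver that supports concurrent opens, a script that calls run_parallel with a dd(1M) or cat(1) command that reads from the device under test could itself be run under time(1) to combine the stress test with a simple measurement. The command and the device path depend entirely on the driver.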

DDI/DKI Compliance Testing

To ensure compatibility with later releases and reliable support for the current release, every driver should be DDI/DKI compliant. Check that the driver uses only the kernel routines that are documented in man pages section 9: DDI and DKI Kernel Functions, only the entry points that are documented in man pages section 9: DDI and DKI Driver Entry Points, and only the data structures that are documented in man pages section 9: DDI and DKI Properties and Data Structures.

Installation and Packaging Testing

Drivers are delivered to customers in packages. A package can be added to or removed from the system using a standard mechanism (see the Application Packaging Developer’s Guide).

The ability of a user to add or remove the package from a system should be tested. In testing, the package should be both installed and removed from every type of media to be used for the release. This testing should include several system configurations. Packages must not make unwarranted assumptions about the directory environment of the target system. Certain valid assumptions, however, can be made about where standard kernel files are kept. Also test adding and removing of packages on newly installed machines that have not been modified for a development environment. A common packaging error is for a package to rely on a tool or file that is used in development only. For example, no tools from the Source Compatibility package, SUNWscpu, should be used in driver installation programs.

The driver installation must be tested on a minimal Oracle Solaris system without any optional packages.

Testing Specific Types of Drivers

This section provides some suggestions about how to test certain types of standard devices.

Tape Drivers

Tape drivers should be tested by performing several archive and restore operations. The cpio(1) and tar(1) commands can be used for this purpose. Use the dd(1M) command to write an entire disk partition to tape. Next, read back the data, and write the data to another partition of the same size. Then compare the two copies. The mt(1) command can exercise most of the I/O controls that are specific to tape drivers. See the mtio(7I) man page. Try to use all the options. These three techniques can test the error-handling capabilities of tape drivers:

Tape drivers typically implement exclusive-access open(9E) calls. These open() calls can be tested by opening a device and then having a second process try to open the same device.
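The write, read back, and compare sequence described earlier in this section can be sketched as follows. The function name tape_roundtrip and the 64-Kbyte block size are assumptions. The sketch also assumes a tape device node that rewinds on close, so that the second dd reads from the start of the tape.

```shell
#!/bin/sh
# tape_roundtrip SOURCE TAPE COPY
# Write SOURCE to TAPE, read TAPE back into COPY, then compare SOURCE
# and COPY. Returns zero only if both transfers succeed and the two
# copies are identical.
tape_roundtrip() {
    src=$1 tape=$2 copy=$3
    dd if="$src" of="$tape" bs=64k 2>/dev/null || return 1
    dd if="$tape" of="$copy" bs=64k 2>/dev/null || return 1
    cmp -s "$src" "$copy"
}
```

SOURCE and COPY would normally be two unmounted disk partitions of the same size. Because the function overwrites both TAPE and COPY, use only scratch media and scratch partitions.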

Disk Drivers

Disk drivers should be tested in both the raw and block device modes. For block device tests, create a new file system on the device. Then try to mount the new file system. Then try to perform multiple file operations.


Note –

The file system uses a page cache, so reading the same file over and over again does not really exercise the driver. The page cache can be forced to retrieve data from the device by memory-mapping the file with mmap(2). Then use msync(3C) to invalidate the in-memory copies.


Copy another (unmounted) partition of the same size to the raw device. Then use a command such as fsck(1M) to verify the correctness of the copy. The new partition can also be mounted and then later compared to the old partition on a file-by-file basis.

Asynchronous Communication Drivers

Asynchronous drivers can be tested at the basic level by setting up a login line to the serial ports. A good test is to see whether a user can log in on this line. To sufficiently test an asynchronous driver, however, all the I/O control functions must be tested, with many interrupts at high speed. A test involving a loopback serial cable and high data transfer rates can help determine the reliability of the driver. You can run uucp(1C) over the line to provide some exercise. However, because uucp performs its own error handling, verify that the driver is not reporting excessive numbers of errors to the uucp process.

These types of devices are usually STREAMS-based. See the STREAMS Programming Guide for more information.

Network Drivers

Network drivers can be tested using standard network utilities. The ftp(1) and rcp(1) commands are useful because the files can be compared on each end of the network. Test the driver under heavy network load by running various commands from multiple processes at the same time.

Heavy network loading includes the following conditions:

Network cables should be unplugged while the tests are executing to ensure that the driver recovers gracefully from the resulting error conditions. Another important test is for the driver to receive multiple packets in rapid succession, that is, back-to-back packets. In this case, a relatively fast host on a lightly loaded network should send multiple packets in quick succession to the test machine. Verify that the receiving driver does not drop the second and subsequent packets.

These types of devices are usually STREAMS-based. See the STREAMS Programming Guide for more information.

Chapter 22 Debugging, Testing, and Tuning Device Drivers

This chapter presents an overview of the various tools that are provided to assist with testing, debugging, and tuning device drivers. This chapter provides information on the following subjects:

Testing Drivers

To avoid data loss and other problems, you should take special care when testing a new device driver. This section discusses various testing strategies. For example, setting up a separate system that you control through a serial connection is the safest way to test a new driver. You can load test modules with various kernel variable settings to test performance under different kernel conditions. Should your system crash, you should be prepared to restore backup data, analyze any crash dumps, and rebuild the device directory.

Enable the Deadman Feature to Avoid a Hard Hang

If your system is in a hard hang, then you cannot break into the debugger. If you enable the deadman feature, the system panics instead of hanging indefinitely. You can then use the kmdb(1) kernel debugger to analyze your problem.

The deadman feature checks every second whether the system clock is updating. If the system clock is not updating, then you are in an indefinite hang. If the system clock has not been updated for 50 seconds, the deadman feature induces a panic and puts you in the debugger.

    Take the following steps to enable the deadman feature:

  1. Make sure you are capturing crash images with dumpadm(1M).

  2. Set the snooping variable in the /etc/system file. See the system(4) man page for information on the /etc/system file.

    set snooping=1

  3. Reboot the system so that the /etc/system file is read again and the snooping setting takes effect.

Note that any zones on your system inherit the deadman setting as well.
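Before relying on the deadman feature, it can be useful to confirm that the setting is present in the file. The following sketch checks a system(4) file for an active set snooping=1 line. The function name deadman_configured is hypothetical. The check inspects only the file contents, so it cannot tell whether the system has been rebooted since the line was added.

```shell
#!/bin/sh
# deadman_configured [FILE]
# Succeed if FILE (default /etc/system) contains an uncommented
# "set snooping=1" line. Spaces around "set" and "=" are tolerated.
deadman_configured() {
    grep '^[ ]*set[ ][ ]*snooping[ ]*=[ ]*1[ ]*$' "${1:-/etc/system}" >/dev/null
}

# Possible use:
#   deadman_configured || echo "deadman is not set in /etc/system" >&2
```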

If your system hangs while the deadman feature is enabled, you should see output similar to the following example on your console:

panic[cpu1]/thread=30018dd6cc0: deadman: timed out after 9 seconds of
clock inactivity

panic: entering debugger (continue to save dump)

Inside the debugger, use the ::cpuinfo command to investigate why the clock interrupt was not able to fire and advance the system time.

Testing With a Serial Connection

Using a serial connection is a good way to test drivers. Use the tip(1) command to make a serial connection between a host system and a test system. With this approach, the tip window on the host console is used as the console of the test machine. See the tip(1) man page for additional information.

A tip window has the following advantages:


Note –

Although using a tip connection and a second machine is not required to debug an Oracle Solaris device driver, this technique is still recommended.


To Set Up the Host System for a tip Connection

  1. Connect the host system to the test machine using serial port A on both machines.

    This connection must be made with a null modem cable.

  2. On the host system, make sure there is an entry in /etc/remote for the connection. See the remote(4) man page for details.

    The terminal entry must match the serial port that is used. The operating system comes with the correct entry for serial port B, but a terminal entry must be added for serial port A:


    debug:\
            :dv=/dev/term/a:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

    Note –

    The baud rate must be set to 9600.


  3. In a shell window on the host, run tip(1) and specify the name of the entry:


    % tip debug
    connected

    The shell window is now a tip window with a connection to the console of the test machine.


    Caution –

    Do not use STOP-A for SPARC machines or F1-A for x86 architecture machines on the host machine to stop the test machine. This action actually stops the host machine. To send a break to the test machine, type ~# in the tip window. Commands such as ~# are recognized only if these characters are the first characters on the line. If the command has no effect, press either the Return key or Control-U.


Setting Up a Target System on the SPARC Platform

A quick way to set up the test machine on the SPARC platform is to unplug the keyboard before turning on the machine. The machine then automatically uses serial port A as the console.

Another way to set up the test machine is to use boot PROM commands to make serial port A the console. On the test machine, at the boot PROM ok prompt, direct console I/O to the serial line. To make the test machine always come up with serial port A as the console, set the environment variables: input-device and output-device.


Example 22–1 Setting input-device and output-device With Boot PROM Commands


ok setenv input-device ttya
ok setenv output-device ttya

The eeprom command can also be used to make serial port A the console. As superuser, run the commands in the following example to point the input-device and output-device parameters at serial port A.


Example 22–2 Setting input-device and output-device With the eeprom Command


# eeprom input-device=ttya
# eeprom output-device=ttya

The eeprom commands cause the console to be redirected to serial port A at each subsequent system boot.
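After changing these parameters, the current values can be read back and compared before rebooting. The following is a minimal sketch, assuming a POSIX-compatible shell such as ksh or bash. The function name console_is_ttya is hypothetical. The function only inspects name=value lines that are passed on standard input, so the function itself makes no changes to the system.

```shell
# Hypothetical helper: succeed only if both input-device and output-device
# are reported as ttya in the "name=value" lines read from standard input.
console_is_ttya() {
    in_dev=""
    out_dev=""
    while IFS='=' read -r name value; do
        case "$name" in
            input-device)  in_dev=$value ;;
            output-device) out_dev=$value ;;
        esac
    done
    [ "$in_dev" = "ttya" ] && [ "$out_dev" = "ttya" ]
}

# Possible use on the test machine (as superuser):
#   { eeprom input-device; eeprom output-device; } | console_is_ttya \
#       && echo "console is set to serial port A"
```

Reading the values from standard input keeps the check independent of how the values were obtained, so the same function can be tried against saved output.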

Setting Up a Target System on the x86 Platform

On x86 platforms, use the eeprom command to make serial port A the console. This procedure is the same as the SPARC platform procedure. See Setting Up a Target System on the SPARC Platform. The eeprom command causes the console to switch to serial port A (COM1) during reboot.


Note –

x86 machines do not transfer console control to the tip connection until an early stage in the boot process unless the BIOS supports console redirection to a serial port. In SPARC machines, the tip connection maintains console control throughout the boot process.


Setting Up Test Modules

The system(4) file in the /etc directory enables you to set the value of kernel variables at boot time. With kernel variables, you can toggle different behaviors in a driver and take advantage of debugging features that are provided by the kernel. The kernel variables moddebug and kmem_flags, which can be very useful in debugging, are discussed later in this section. See also Enable the Deadman Feature to Avoid a Hard Hang.

Changes to kernel variables after boot are unreliable, because /etc/system is read only once when the kernel boots. After this file is modified, the system must be rebooted for the changes to take effect. If a change in the file causes the system not to work, boot with the ask (-a) option. Then specify /dev/null as the system file.


Note –

Kernel variables cannot be relied on to be present in subsequent releases.


Setting Kernel Variables

The set command changes the value of module or kernel variables. To set module variables, specify the module name and the variable:


set module_name:variable=value

For example, to set the variable test_debug in a driver that is named myTest, add the following line to the /etc/system file:


set myTest:test_debug=1

To set a variable that is exported by the kernel itself, omit the module name.

You can also use a bitwise OR operation to set bits in a variable without clearing the bits that are already set, for example:


set moddebug | 0x80000000
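The set syntax described above can be generated by a small script, which is convenient when a test harness switches debugging variables between runs. The following is a minimal sketch, assuming a POSIX-compatible shell such as ksh or bash. The function name system_set_line is hypothetical and is not part of the operating system.

```shell
# Hypothetical helper: print a "set" line in /etc/system syntax.
#   system_set_line variable value          (variable exported by the kernel)
#   system_set_line module variable value   (variable that belongs to a module)
system_set_line() {
    case $# in
        2) printf 'set %s=%s\n' "$1" "$2" ;;
        3) printf 'set %s:%s=%s\n' "$1" "$2" "$3" ;;
        *) echo "usage: system_set_line [module] variable value" >&2
           return 1 ;;
    esac
}

# Possible use (as superuser), keeping a backup copy first:
#   cp -p /etc/system /etc/system.orig
#   system_set_line myTest test_debug 1 >> /etc/system
```

Because /etc/system is read only at boot, a line that is appended in this way takes effect after the next reboot.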

Loading and Unloading Test Modules

The commands modload(1M), modunload(1M), and modinfo(1M) can be used to add test modules, which is a useful technique for debugging and stress-testing drivers. These commands are generally not needed in normal operation, because the kernel automatically loads needed modules and unloads unused modules. The moddebug kernel variable works with these commands to provide information and set controls.

Using the modload Command

Use modload(1M) to force a module into memory. The modload command verifies that the driver has no unresolved references when that driver is loaded. Loading a driver does not necessarily mean that the driver can attach. When a driver loads successfully, the driver's _info(9E) entry point is called. The attach() entry point is not necessarily called.

Using the modinfo Command

Use modinfo(1M) to confirm that the driver is loaded.


Example 22–3 Using modinfo to Confirm a Loaded Driver


$ modinfo
 Id Loadaddr   Size Info Rev Module Name
  6 101b6000    732   -   1  obpsym (OBP symbol callbacks)
  7 101b65bd  1acd0 226   1  rpcmod (RPC syscall)
  7 101b65bd  1acd0 226   1  rpcmod (32-bit RPC syscall)
  7 101b65bd  1acd0   1   1  rpcmod (rpc interface str mod)
  8 101ce8dd  74600   0   1  ip (IP STREAMS module)
  8 101ce8dd  74600   3   1  ip (IP STREAMS device)
...
$ modinfo | grep mydriver
169 781a8d78   13fb   0   1  mydriver (Test Driver 1.5)

The number in the info field is the major number that has been chosen for the driver. The modunload(1M) command can be used to unload a module if the module ID is provided. The module ID is found in the left column of modinfo output.

Sometimes a driver does not unload as expected after a modunload is issued, because the driver is determined to be busy. This situation occurs when the driver fails detach(9E), either because the driver really is busy, or because the detach entry point is implemented incorrectly.
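Looking up the module ID by hand becomes tedious when a driver is loaded and unloaded repeatedly during testing. The following is a minimal sketch, assuming a POSIX-compatible shell and an awk that supports the -v option (on some systems, nawk might be needed instead). The function name module_id is hypothetical. The function relies only on the column layout shown in the modinfo example above.

```shell
# Hypothetical helper: read modinfo-style output on standard input and print
# the module ID (first column) of the first entry whose module name
# (sixth column) exactly matches the argument.
module_id() {
    awk -v name="$1" '$6 == name { print $1; exit }'
}

# Possible use (as superuser):
#   id=$(modinfo | module_id mydriver)
#   [ -n "$id" ] && modunload -i "$id"
```

Matching the name column exactly avoids selecting a different module whose description happens to contain the driver name, which a plain grep could do.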

Using the modunload Command

To remove all of the currently unused modules from memory, run modunload(1M) with a module ID of 0:


# modunload -i 0

Setting the moddebug Kernel Variable

The moddebug kernel variable controls the module loading process. The possible values of moddebug are:

0x80000000

Prints messages to the console when loading or unloading modules.

0x40000000

Gives more detailed error messages.

0x20000000

Prints more detail when loading or unloading, such as including the address and size.

0x00001000

No auto-unloading drivers. The system does not attempt to unload the device driver when the system resources become low.

0x00000080

No auto-unloading streams. The system does not attempt to unload the STREAMS module when the system resources become low.

0x00000010

No auto-unloading of kernel modules of any type.

0x00000001

If running with kmdb, moddebug causes a breakpoint to be executed and a return to kmdb immediately before each module's _init() routine is called. This setting also generates additional debug messages when the module's _info() and _fini() routines are executed.
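Each moddebug value is a single bit, so several behaviors can be requested at once by combining the values with a bitwise OR. The following is a minimal sketch of that arithmetic, assuming a POSIX-compatible shell with 64-bit arithmetic, such as a current ksh or bash. The function name moddebug_value is hypothetical.

```shell
# Hypothetical helper: OR together the flag values given as arguments and
# print the combined value as an eight-digit hexadecimal number.
moddebug_value() {
    value=0
    for flag in "$@"; do
        value=$(( value | $flag ))
    done
    printf '0x%08x\n' "$value"
}

# Load/unload messages, detailed errors, and address and size details:
#   moddebug_value 0x80000000 0x40000000 0x20000000
# prints 0xe0000000
```

The printed value can be used as the value in a set moddebug line in the /etc/system file.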

Setting kmem_flags Debugging Flags

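The kmem_flags kernel variable enables debugging features in the kernel's memory allocator. Set kmem_flags to 0xf to enable the allocator's debugging features. These features include runtime checks that detect conditions such as writing to a buffer after the buffer is freed, using memory before the memory is initialized, and writing past the end of a buffer.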

The Oracle Solaris Modular Debugger Guide describes how to use the kernel memory allocator to analyze such problems.


Note –

Testing and developing with kmem_flags set to 0xf can help detect latent memory corruption bugs. Because setting kmem_flags to 0xf changes the internal behavior of the kernel memory allocator, you should thoroughly test without kmem_flags as well.


Avoiding Data Loss on a Test System

A driver bug can sometimes render a system incapable of booting. By taking precautions, you can avoid system reinstallation in this event, as described in this section.

Back Up Critical System Files

A number of driver-related system files are difficult, if not impossible, to reconstruct. Files such as /etc/name_to_major, /etc/driver_aliases, /etc/driver_classes, and /etc/minor_perm can be corrupted if the driver crashes the system during installation. See the add_drv(1M) man page.

To be safe, make a backup copy of the root file system after the test machine is in the proper configuration. If you plan to modify the /etc/system file, make a backup copy of the file before making modifications.
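The files named above are small, so copying them to a safe location before each driver installation costs little. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name backup_driver_files and its optional root_dir argument are hypothetical, and the list of files is limited to the files that this section mentions.

```shell
# Hypothetical helper: copy the driver-related system files named in this
# section into a backup directory.
#   backup_driver_files backup_dir [root_dir]
# root_dir is an illustrative parameter. It defaults to the running system's
# root and allows files on a mounted alternate root to be saved instead.
backup_driver_files() {
    backup_dir=$1
    root_dir=${2:-}
    mkdir -p "$backup_dir" || return 1
    for f in name_to_major driver_aliases driver_classes minor_perm system; do
        if [ -f "$root_dir/etc/$f" ]; then
            cp -p "$root_dir/etc/$f" "$backup_dir/$f" || return 1
        else
            echo "warning: $root_dir/etc/$f not found" >&2
        fi
    done
}

# Possible use (as superuser):
#   backup_driver_files /var/tmp/driver-backup
```

A copy of these files does not replace a backup of the root file system. The copy only shortens recovery when one of these files is the sole casualty.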

Procedure: To Boot With an Alternate Kernel

To avoid rendering a system inoperable, you should boot from a copy of the kernel and associated binaries rather than from the default kernel.

  1. Make a copy of the drivers in /platform/*.


    # cp -r /platform/`uname -i`/kernel /platform/`uname -i`/kernel.test
    
  2. Place the driver module in /platform/`uname -i`/kernel.test/drv.

  3. Boot the alternate kernel instead of the default kernel.

    After you have created and stored the alternate kernel, you can boot this kernel in a number of ways.

    • You can boot the alternate kernel by rebooting:


      # reboot -- kernel.test/unix
      
    • On a SPARC-based system, you can also boot from the PROM:


      ok boot kernel.test/sparcv9/unix
      

      Note –

      To boot with the kmdb debugger, use the -k option as described in Getting Started With the Modular Debugger.


    • On an x86-based system, when the Select (b)oot or (i)nterpreter: message is displayed in the boot process, type the following:


      boot kernel.test/unix
      

Example 22–4 Booting an Alternate Kernel

The following example demonstrates booting with an alternate kernel.


ok boot kernel.test/sparcv9/unix
Rebooting with command: boot kernel.test/sparcv9/unix
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
    args:
kernel.test/sparcv9/unix


Example 22–5 Booting an Alternate Kernel With the -a Option

Alternatively, the module path can be changed by booting with the ask (-a) option. This option results in a series of prompts for configuring the boot method.


ok boot -a
Rebooting with command: boot -a
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
args: -a
Enter filename [kernel/sparcv9/unix]: kernel.test/sparcv9/unix
Enter default directory for modules
[/platform/sun4u/kernel.test /kernel /usr/kernel]: <CR>
Name of system file [etc/system]: <CR>
SunOS Release 5.10 Version Generic 64-bit
Copyright 1983-2002 Sun Microsystems, Inc. All rights reserved.
root filesystem type [ufs]: <CR>
Enter physical name of root device
[/sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a]: <CR>

Consider Alternative Back-Up Plans

If the system is attached to a network, the test machine can be added as a client of a server. If a problem occurs, the system can be booted from the network. The local disks can then be mounted, and any fixes can be made. Alternatively, the system can be booted directly from the Oracle Solaris system CD-ROM.

Another way to recover from disaster is to have another bootable root file system. Use format(1M) to make a partition that is the exact size of the original. Then use dd(1M) to copy the bootable root file system. After making a copy, run fsck(1M) on the new file system to ensure its integrity.

Subsequently, if the system cannot boot from the original root partition, boot the backup partition. Use dd(1M) to copy the backup partition onto the original partition. You might have a situation where the system cannot boot even though the root file system is undamaged. For example, the damage might be limited to the boot block or the boot program. In such a case, you can boot from the backup partition with the ask (-a) option. You can then specify the original file system as the root file system.

Capture System Crash Dumps

When a system panics, the system writes an image of kernel memory to the dump device. The dump device is by default the most suitable swap device. The dump is a system crash dump, similar to core dumps generated by applications. On rebooting after a panic, savecore(1M) checks the dump device for a crash dump. If a dump is found, savecore makes a copy of the kernel's symbol table, which is called unix.n. The savecore utility then dumps a core file that is called vmcore.n in the core image directory. By default, the core image directory is /var/crash/machine_name. If /var/crash has insufficient space for a core dump, the system displays the needed space but does not actually save the dump. The mdb(1) debugger can then be used on the core dump and the saved kernel.

In the Oracle Solaris operating system, crash dump is enabled by default. The dumpadm(1M) command is used to configure system crash dumps. Use the dumpadm command to verify that crash dumps are enabled and to determine the location of core files that have been saved.


Note –

You can prevent the savecore utility from filling the file system. Add a file that is named minfree to the directory in which the dumps are to be saved. In this file, specify the number of kilobytes to remain free after savecore has run. If insufficient space is available, the core file is not saved.
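Creating the minfree file is a one-line operation, but validating the value first avoids leaving a file that does not contain a number. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name set_minfree is hypothetical, and the directory and size in the usage comment are examples only.

```shell
# Hypothetical helper: write a minfree file into a crash dump directory.
#   set_minfree dump_dir kilobytes
set_minfree() {
    dump_dir=$1
    kilobytes=$2
    case "$kilobytes" in
        ''|*[!0-9]*)
            echo "set_minfree: kilobytes must be a whole number" >&2
            return 1 ;;
    esac
    if [ ! -d "$dump_dir" ]; then
        echo "set_minfree: $dump_dir is not a directory" >&2
        return 1
    fi
    echo "$kilobytes" > "$dump_dir/minfree"
}

# Possible use (as superuser), keeping 512000 Kbytes (500 Mbytes) free:
#   set_minfree /var/crash/testsystem 512000
```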


Recovering the Device Directory

Damage to the /devices and /dev directories can occur if the driver crashes during attach(9E). If either directory is damaged, you can rebuild the directory by booting the system and running fsck(1M) to repair the damaged root file system. The root file system can then be mounted. Recreate the /devices and /dev directories by running devfsadm(1M) and specifying the /devices directory on the mounted disk.

The following example shows how to repair a damaged root file system on a SPARC system. In this example, the damaged disk is /dev/dsk/c0t3d0s0, and an alternate boot disk is /dev/dsk/c0t1d0s0.


Example 22–6 Recovering a Damaged Device Directory


ok boot disk1
...
Rebooting with command: boot kernel.test/sparcv9/unix
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@31,0:a File and \
    args:
kernel.test/sparcv9/unix
...
# fsck /dev/dsk/c0t3d0s0
** /dev/dsk/c0t3d0s0
** Last Mounted on /
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1478 files, 9922 used, 29261 free
     (141 frags, 3640 blocks, 0.4% fragmentation)
# mount /dev/dsk/c0t3d0s0 /mnt
# devfsadm -r /mnt


Note –

A fix to the /devices and /dev directories can allow the system to boot while other parts of the system are still corrupted. Such repairs are only a temporary fix to save information, such as system crash dumps, before reinstalling the system.


Debugging Tools

This section describes two debuggers that can be applied to device drivers. Both debuggers are described in detail in the Oracle Solaris Modular Debugger Guide.

The kmdb and mdb debuggers mostly share the same user interface. Many debugging techniques therefore can be applied with the same commands in both tools. Both debuggers support macros, dcmds, and dmods. A dcmd (pronounced dee-command) is a routine in the debugger that can access any of the properties of the current target program. A dcmd can be dynamically loaded at runtime. A dmod, which is short for debugger module, is a package of dcmds that can be loaded to provide non-standard behavior.

Both mdb and kmdb are backward-compatible with legacy debuggers such as adb and kadb. The mdb debugger can execute all of the macros that are available to kmdb as well as any legacy user-defined macros for adb. See the Oracle Solaris Modular Debugger Guide for information about where to find standard macro sets.

Postmortem Debugging

Postmortem analysis offers numerous advantages to driver developers. More than one developer can examine a problem in parallel. Multiple instances of the debugger can be used simultaneously on a single crash dump. The analysis can be performed offline so that the crashed system can be returned to service, if possible. Postmortem analysis enables the use of user-developed debugger functionality in the form of dmods. Dmods can bundle functionality that would be too memory-intensive for real-time debuggers, such as kmdb.

When a system panics while kmdb is loaded, control is passed to the debugger for immediate investigation. If kmdb does not seem appropriate for analyzing the current problem, a good strategy is to use :c to continue execution and save the crash dump. When the system reboots, you can perform postmortem analysis with mdb on the saved crash dump. This process is analogous to debugging an application crash from a process core file.


Note –

In earlier versions of the Oracle Solaris operating system, adb(1) was the recommended tool for postmortem analysis. In the current Oracle Solaris operating system, mdb(1) is the recommended tool for postmortem analysis. The mdb feature set surpasses the set of commands from the legacy crash(1M) utility. The crash utility is no longer available in the Oracle Solaris operating system.


Using the kmdb Kernel Debugger

The kmdb debugger is an interactive kernel debugger that provides control of kernel execution, inspection of kernel state, and live modification of kernel code.

This section assumes that you are already familiar with the kmdb debugger. The focus in this section is on kmdb capabilities that are useful in device driver design. To learn how to use kmdb in detail, refer to the kmdb(1) man page and to the Oracle Solaris Modular Debugger Guide. If you are familiar with kadb, refer to the kadb(1M) man page for the major differences between kadb and kmdb.

The kmdb debugger can be loaded and unloaded at will. Instructions for loading and unloading kmdb are in the Oracle Solaris Modular Debugger Guide. For safety and convenience, booting with an alternate kernel is highly encouraged. The boot process is slightly different between the SPARC platform and the x86 platform, as described in this section.


Note –

By default, kmdb uses the CPU ID as the prompt when kmdb is running. In the examples in this chapter [0] is used as the prompt unless otherwise noted.


Booting kmdb With an Alternate Kernel on the SPARC Platform

Use either of the following commands to boot a SPARC system with both kmdb and an alternate kernel:


boot kmdb -D kernel.test/sparcv9/unix 
boot kernel.test/sparcv9/unix -k

Booting kmdb With an Alternate Kernel on the x86 Platform

Use either of the following commands to boot an x86 system with both kmdb and an alternate kernel:


b kmdb -D kernel.test/unix 
b kernel.test/unix -k

Setting Breakpoints in kmdb

Use the bp command to set a breakpoint, as shown in the following example.


Example 22–7 Setting Standard Breakpoints in kmdb


[0]> myModule`myBreakpointLocation::bp
        

If the target module has not been loaded, then an error message that indicates this condition is displayed, and the breakpoint is not created. In this case you can use a deferred breakpoint. A deferred breakpoint activates automatically when the specified module is loaded. Set a deferred breakpoint by specifying the target location after the bp command. The following example demonstrates a deferred breakpoint.


Example 22–8 Setting Deferred Breakpoints in kmdb


[0]> ::bp myModule`myBreakpointLocation

For more information on using breakpoints, see the Oracle Solaris Modular Debugger Guide. You can also get help by typing either of the following two lines:


> ::help bp
> ::bp dcmd

kmdb Macros for Driver Developers

The kmdb(1) debugger supports macros that can be used to display kernel data structures. Use $M to display kmdb macros. Macros are used in the form:


[ address ] $<macroname

Note –

Neither the information displayed by these macros nor the format in which the information is displayed constitutes an interface. Therefore, the information and format can change at any time.


The kmdb macros in the following table are particularly useful to developers of device drivers. For convenience, legacy macro names are shown where applicable.

Table 22–1 kmdb Macros

Dcmd                      Legacy Macro       Description

::devinfo                 devinfo            Print a summary of a device node
                          devinfo_brief
                          devinfo.prop

::walk devinfo_parents    devinfo.parent     Walk the ancestors of a device node

::walk devinfo_sibling    devinfo.sibling    Walk the siblings of a device node

::minornodes              devinfo.minor      Print the minor nodes that correspond to the given device node

::major2name              (none)             Print the name of the driver that corresponds to a given major number

::devbindings             (none)             Print the device nodes that are bound to a given driver name or major number

The ::devinfo dcmd displays a node state that can have one of the following values:

DS_ATTACHED

The driver's attach(9E) routine returned successfully.

DS_BOUND

The node is bound to a driver, but the driver's probe(9E) routine has not yet been called.

DS_INITIALIZED

The parent nexus has assigned a bus address for the driver. The implementation-specific initializations have been completed. The driver's probe(9E) routine has not yet been called at this point.

DS_LINKED

The device node has been linked into the kernel's device tree, but the system has not yet found a driver for this node.

DS_PROBED

The driver's probe(9E) routine returned successfully.

DS_READY

The device is fully configured.

Using the mdb Modular Debugger

The mdb(1) modular debugger can be applied to live operating system components, operating system crash dumps, user processes, user process core dumps, and object files.

The mdb debugger provides sophisticated debugging support for analyzing kernel problems. This section provides an overview of mdb features. For a complete discussion of mdb, refer to the Oracle Solaris Modular Debugger Guide.

Although mdb can be used to alter live kernel state, mdb lacks the kernel execution control that is provided by kmdb. As a result kmdb is preferred for runtime debugging. The mdb debugger is used more for static situations.


Note –

The prompt for mdb is >.


Getting Started With the Modular Debugger

The mdb debugger provides an extensive programming API for implementing debugger modules so that driver developers can implement custom debugging support. The mdb debugger also provides many usability features, such as command-line editing, command history, an output pager, and online help.


Note –

The adb macros should no longer be used. That functionality has largely been superseded by the dcmds in mdb.


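The mdb debugger provides a rich set of modules and dcmds. With these tools, you can debug the Oracle Solaris kernel, any associated modules, and device drivers. These facilities enable you to perform tasks such as locating leaked kernel memory, displaying kernel data structures, and examining the device tree, as described later in this chapter.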

To get started, switch to the crash directory and type mdb, specifying a system crash dump, as illustrated in the following example.


Example 22–9 Invoking mdb on a Crash Dump


% cd /var/crash/testsystem
% ls
bounds     unix.0    vmcore.0
% mdb unix.0 vmcore.0
Loading modules: [ unix krtld genunix ufs_log ip usba s1394 cpc nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.10 Generic (sun4u)
panic message: zero
dump content: kernel pages only

When mdb responds with the > prompt, you can run commands.
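When a test machine panics repeatedly, the crash directory accumulates several numbered dumps, and usually the most recent pair is the one to open. The following is a minimal sketch, assuming a POSIX-compatible shell and the unix.n and vmcore.n naming described earlier in this chapter. The function name latest_dump is hypothetical.

```shell
# Hypothetical helper: print the "unix.N vmcore.N" pair with the highest N
# found in a crash directory. Fail if no complete pair exists.
latest_dump() {
    crash_dir=$1
    latest=""
    for core in "$crash_dir"/vmcore.*; do
        n=${core##*.}
        case "$n" in
            ''|*[!0-9]*) continue ;;   # also skips an unmatched pattern
        esac
        [ -f "$crash_dir/unix.$n" ] || continue
        if [ -z "$latest" ] || [ "$n" -gt "$latest" ]; then
            latest=$n
        fi
    done
    [ -n "$latest" ] || return 1
    echo "unix.$latest vmcore.$latest"
}

# Possible use:
#   cd /var/crash/testsystem && mdb $(latest_dump .)
```

The suffixes are compared numerically rather than alphabetically so that, for example, vmcore.10 is treated as newer than vmcore.9.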

To examine the running kernel on a live system, run mdb from the system prompt as follows.


Example 22–10 Invoking mdb on a Running Kernel


# mdb -k
Loading modules: [ unix krtld genunix ufs_log ip usba s1394 ptm cpc ipc nfs ]
> ::status
debugging live kernel (64-bit) on testsystem
operating system: 5.10 Generic (sun4u)

Useful Debugging Tasks With kmdb and mdb

This section provides examples of useful debugging tasks. The tasks in this section can be performed with either mdb or kmdb unless specifically noted. This section assumes a basic knowledge of the use of kmdb and mdb. Note that the information presented here is dependent on the type of system used. A Sun Blade 100 workstation running the 64-bit kernel was used to produce these examples.


Caution –

Because irreversible destruction of data can result from modifying data in kernel structures, you should exercise extreme caution. Do not modify or rely on data in structures that are not part of the Oracle Solaris DDI. See the Intro(9S) man page for information on structures that are part of the Oracle Solaris DDI.


Exploring System Registers With kmdb

The kmdb debugger can display machine registers as a group or individually. To display all registers as a group, use $r as shown in the following example.


Example 22–11 Reading All Registers on a SPARC Processor With kmdb


[0]> $r

g0    0                                 l0      0
g1    100130a4      debug_enter         l1      edd00028
g2    10411c00      tsbmiss_area+0xe00  l2      10449c90
g3    10442000      ti_statetbl+0x1ba   l3      1b
g4    3000061a004                       l4      10474400     ecc_syndrome_tab+0x80
g5    0                                 l5      3b9aca00
g6    0                                 l6      0
g7    2a10001fd40                       l7      0
o0    0                                 i0      0
o1    c                                 i1      10449e50
o2    20                                i2      0
o3    300006b2d08                       i3      10
o4    0                                 i4      0
o5    0                                 i5      b0
sp    2a10001b451                       fp      2a10001b521
o7    1001311c      debug_enter+0x78    i7      1034bb24     zsa_xsint+0x2c4
y     0
tstate: 1604  (ccr=0x0, asi=0x0, pstate=0x16, cwp=0x4)
pstate: ag:0 ie:1 priv:1 am:0 pef:1 mm:0 tle:0 cle:0 mg:0 ig:0
winreg: cur:4 other:0 clean:7 cansave:1 canrest:5 wstate:14
tba   0x10000000
pc    edd000d8 edd000d8:        ta      %icc,%g0 + 125
npc   edd000dc edd000dc:        nop

The debugger exports each register value to a variable with the same name as the register. If you read the variable, the current value of the register is returned. If you write to the variable, the value of the associated machine register is changed. The following example reads the %eax register on an x86 machine, changes the value to 0, confirms the change, and then restores the original value.


Example 22–12 Reading and Writing Registers on an x86 Machine With kmdb


[0]> <eax=K
        c1e6e0f0
[0]> 0>eax
[0]> <eax=K
        0
[0]> c1e6e0f0>eax

If you need to inspect the registers of a different processor, you can use the ::cpuregs dcmd. The ID of the processor to be examined can be supplied as either the address to the dcmd or as the value of the -c option, as shown in the following example.


Example 22–13 Inspecting the Registers of a Different Processor


[0]> 0::cpuregs
   %cs = 0x0158            %eax = 0xc1e6e0f0 kmdbmod`kaif_dvec
   %ds = 0x0160            %ebx = 0x00000000

The following example switches from processor 0 to processor 3 on a SPARC machine. The %g3 register is inspected and then cleared. To confirm the new value, %g3 is read again.


Example 22–14 Retrieving the Value of an Individual Register From a Specified Processor


[0]> 3::switch
[3]> <g3=K
        24
[3]> 0>g3
[3]> <g3=K
        0

Detecting Kernel Memory Leaks

The ::findleaks dcmd provides powerful, efficient detection of memory leaks in kernel crash dumps. The full set of kernel-memory debugging features must be enabled for ::findleaks to be effective. For more information, see Setting kmem_flags Debugging Flags. Run ::findleaks during driver development and testing to detect code that leaks memory, thus wasting kernel resources. See Chapter 9, Debugging With the Kernel Memory Allocator, in Oracle Solaris Modular Debugger Guide for a complete discussion of ::findleaks.


Note –

Code that leaks kernel memory can render the system vulnerable to denial-of-service attacks.


Writing Debugger Commands With mdb

The mdb debugger provides a powerful API for implementing debugger facilities that you customize to debug your driver. The Oracle Solaris Modular Debugger Guide explains the programming API in detail.

The SUNWmdbdm package installs sample mdb source code in the directory /usr/demo/mdb. You can use mdb to automate lengthy debugging chores or help to validate that your driver is behaving properly. You can also package your mdb debugging modules with your driver product. With packaging, these facilities are available to service personnel at a customer site.

Obtaining Kernel Data Structure Information

The Oracle Solaris kernel provides data type information in structures that can be inspected with either kmdb or mdb.


Note –

The kmdb and mdb dcmds can be used only with objects that contain compressed symbolic debugging information that has been designed for use with mdb. This information is currently available only for certain Oracle Solaris kernel modules. The SUNWzlib package must be installed to process the symbolic debugging information.


The following example demonstrates how to display the data in the scsi_pkt structure.


Example 22–15 Displaying Kernel Data Structures With a Debugger


> 7079ceb0::print -t 'struct scsi_pkt'
{
    opaque_t pkt_ha_private = 0x7079ce20
    struct scsi_address pkt_address = {
        struct scsi_hba_tran *a_hba_tran = 0x70175e68
        ushort_t a_target = 0x6
        uchar_t a_lun = 0
        uchar_t a_sublun = 0
    }
    opaque_t pkt_private = 0x708db4d0
    int (*)() *pkt_comp = sd_intr
    uint_t pkt_flags = 0
    int pkt_time = 0x78
    uchar_t *pkt_scbp = 0x7079ce74
    uchar_t *pkt_cdbp = 0x7079ce64
    ssize_t pkt_resid = 0
    uint_t pkt_state = 0x37
    uint_t pkt_statistics = 0
    uchar_t pkt_reason = 0
}

The size of a data structure can be useful in debugging. Use the ::sizeof dcmd to obtain the size of a structure, as shown in the following example.


Example 22–16 Displaying the Size of a Kernel Data Structure


> ::sizeof struct scsi_pkt
sizeof (struct scsi_pkt) = 0x58

The address of a specific member within a structure is also useful in debugging. Several methods are available for determining a member's address.

Use the ::offsetof dcmd to obtain the offset for a given member of a structure, as in the following example.


Example 22–17 Displaying the Offset to a Kernel Data Structure


> ::offsetof struct scsi_pkt pkt_state
offsetof (struct scsi_pkt, pkt_state) = 0x48

Use the ::print dcmd with the -a option to display the addresses of all members of a structure, as in the following example.


Example 22–18 Displaying the Relative Addresses of a Kernel Data Structure


> ::print -a struct scsi_pkt
{
    0 pkt_ha_private
    8 pkt_address {
    ...
    }
    18 pkt_private
    ...
}

If an address is specified with ::print in conjunction with the -a option, the absolute address for each member is displayed.


Example 22–19 Displaying the Absolute Addresses of a Kernel Data Structure


> 10000000::print -a struct scsi_pkt
{
    10000000 pkt_ha_private
    10000008 pkt_address {
    ...
    }
    10000018 pkt_private
    ...
}
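The absolute addresses in Example 22–19 are simply the base address of the structure plus the offset of each member. When only one member is of interest, the same arithmetic can be done from the ::offsetof result. The following is a minimal sketch of that calculation, assuming a POSIX-compatible shell with 64-bit arithmetic. The function name member_address is hypothetical.

```shell
# Hypothetical helper: add a member offset to the base address of a
# structure and print the sum in hexadecimal.
#   member_address base_address offset
member_address() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# For the scsi_pkt at 0x7079ceb0 and the pkt_state offset of 0x48:
#   member_address 0x7079ceb0 0x48
# prints 0x7079cef8
```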

The ::print, ::sizeof and ::offsetof dcmds enable you to debug problems when your driver interacts with the Oracle Solaris kernel.


Caution –

This facility provides access to raw kernel data structures. You can examine any structure whether or not that structure appears as part of the DDI. Therefore, you should refrain from relying on any data structure that is not explicitly part of the DDI.



Note –

These dcmds should be used only with objects that contain compressed symbolic debugging information that has been designed for use with mdb. Symbolic debugging information is currently available for certain Oracle Solaris kernel modules only. The SUNWzlib (32-bit) or SUNWzlibx (64-bit) decompression software must be installed to process the symbolic debugging information. The kmdb debugger can process symbolic type data with or without the SUNWzlib or SUNWzlibx packages.


Obtaining Device Tree Information

The mdb debugger provides the ::prtconf dcmd for displaying the kernel device tree. The output of the ::prtconf dcmd is similar to the output of the prtconf(1M) command.


Example 22–20 Using the ::prtconf Dcmd


> ::prtconf
300015d3e08      SUNW,Sun-Blade-100
    300015d3c28      packages (driver not attached)
        300015d3868      SUNW,builtin-drivers (driver not attached)
        300015d3688      deblocker (driver not attached)
        300015d34a8      disk-label (driver not attached)
        300015d32c8      terminal-emulator (driver not attached)
        300015d30e8      obp-tftp (driver not attached)
        300015d2f08      dropins (driver not attached)
        300015d2d28      kbd-translator (driver not attached)
        300015d2b48      ufs-file-system (driver not attached)
    300015d3a48      chosen (driver not attached)
    300015d2968      openprom (driver not attached)

You can display an individual node by using a dcmd such as ::devinfo, as shown in the following example.


Example 22–21 Displaying Device Information for an Individual Node


> 300015d3e08::devinfo
300015d3e08      SUNW,Sun-Blade-100
        System properties at 0x300015abdc0:
            name='relative-addressing' type=int items=1
                value=00000001
            name='MMU_PAGEOFFSET' type=int items=1
                value=00001fff
            name='MMU_PAGESIZE' type=int items=1
                value=00002000
            name='PAGESIZE' type=int items=1
                value=00002000
        Driver properties at 0x300015abe00:
            name='pm-hardware-state' type=string items=1
                value='no-suspend-resume'

Use ::prtconf to see where your driver has attached in the device tree, and to display device properties. You can also specify the verbose (-v) flag to ::prtconf to display the properties for each device node, as follows.


Example 22–22 Using the ::prtconf Dcmd in Verbose Mode


> ::prtconf -v
DEVINFO          NAME
300015d3e08      SUNW,Sun-Blade-100
        System properties at 0x300015abdc0:
            name='relative-addressing' type=int items=1
                value=00000001
            name='MMU_PAGEOFFSET' type=int items=1
                value=00001fff
            name='MMU_PAGESIZE' type=int items=1
                value=00002000
            name='PAGESIZE' type=int items=1
                value=00002000
        Driver properties at 0x300015abe00:
            name='pm-hardware-state' type=string items=1
                value='no-suspend-resume'
        ...
        300015ce798      pci10b9,5229, instance #0
                Driver properties at 0x300015ab980:
                    name='target2-dcd-options' type=any items=4
                        value=00.00.00.a4
                    name='target1-dcd-options' type=any items=4
                        value=00.00.00.a2
                    name='target0-dcd-options' type=any items=4
                        value=00.00.00.a4

Another way to locate instances of your driver is the ::devbindings dcmd. Given a driver name, the command displays a list of all instances of the named driver as demonstrated in the following example.


Example 22–23 Using the ::devbindings Dcmd to Locate Driver Instances


> ::devbindings dad
300015ce3d8      ide-disk (driver not attached)
300015c9a60      dad, instance #0
        System properties at 0x300015ab400:
            name='lun' type=int items=1
                value=00000000
            name='target' type=int items=1
                value=00000000
            name='class_prop' type=string items=1
                value='ata'
            name='type' type=string items=1
                value='ata'
            name='class' type=string items=1
                value='dada'
...
300015c9880      dad, instance #1
        System properties at 0x300015ab080:
            name='lun' type=int items=1
                value=00000000
            name='target' type=int items=1
                value=00000002
            name='class_prop' type=string items=1
                value='ata'
            name='type' type=string items=1
                value='ata'
            name='class' type=string items=1
                value='dada'

Retrieving Driver Soft State Information

A common problem when debugging a driver is retrieving the soft state for a particular driver instance. The soft state is allocated with the ddi_soft_state_zalloc(9F) routine, and the driver obtains it through ddi_get_soft_state(9F). The name of the soft state pointer is the first argument to ddi_soft_state_init(9F). With that name, you can use the mdb ::softstate dcmd to retrieve the soft state for a particular driver instance:


> *bst_state::softstate 0x3
702b7578

In this case, ::softstate is used to fetch the soft state for instance 3 of the bst sample driver. This pointer references a bst_soft structure that is used by the driver to track state for this instance.
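
To picture what this lookup involves, it can help to think of the soft state anchor as a table of per-instance pointers. The following user-space sketch is only a simplified model under that assumption, not the kernel implementation. The names model_soft, model_state, model_init, model_zalloc, and model_get are hypothetical stand-ins for the roles played by the driver's state structure, ddi_soft_state_init(9F), ddi_soft_state_zalloc(9F), and ddi_get_soft_state(9F). The real DDI structure has additional fields and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-instance state, standing in for a driver's bst_soft. */
struct model_soft {
    int instance;
    int open_count;
};

/* Simplified model of a soft state anchor: one slot per instance number. */
struct model_state {
    size_t size;      /* size of each per-instance structure */
    size_t n_items;   /* number of slots in the table */
    void **slots;     /* slots[i] is NULL until instance i is allocated */
};

/* Model of the init step: create an empty table. Returns 0 on success. */
static int
model_init(struct model_state *sp, size_t size, size_t n_items)
{
    assert(sp != NULL && size > 0 && n_items > 0);
    sp->slots = calloc(n_items, sizeof (void *));
    if (sp->slots == NULL)
        return (-1);
    sp->size = size;
    sp->n_items = n_items;
    return (0);
}

/* Model of the zalloc step: allocate zeroed state for one instance. */
static int
model_zalloc(struct model_state *sp, size_t item)
{
    if (item >= sp->n_items || sp->slots[item] != NULL)
        return (-1);
    sp->slots[item] = calloc(1, sp->size);
    return (sp->slots[item] == NULL ? -1 : 0);
}

/* Model of the lookup step: map an instance number to its state pointer. */
static void *
model_get(const struct model_state *sp, size_t item)
{
    return (item < sp->n_items ? sp->slots[item] : NULL);
}
```

In this model, model_get(&st, 3) plays the part of the ::softstate lookup for instance 3: it returns the pointer stored in slot 3, or NULL if that instance was never allocated.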

Modifying Kernel Variables

You can use both kmdb and mdb to modify kernel variables or other kernel state. Kernel state modification with mdb should be done with care, because mdb does not stop the kernel before making modifications. Groups of modifications can be made atomically by using kmdb, because kmdb stops the kernel before allowing access by the user. The mdb debugger is capable of making single atomic modifications only.

Be sure to use the proper format specifier to perform the modification. The formats are:

Use the ::sizeof dcmd to determine the size of the variable to be modified.

The following example overwrites the value of moddebug with the value 0x80000000.


Example 22–24 Modifying a Kernel Variable With a Debugger


> moddebug/W 0x80000000
    moddebug:       0 = 0x80000000

Tuning Drivers

The Oracle Solaris OS provides kernel statistics structures so that you can implement counters for your driver. The DTrace facility enables you to analyze performance in real time. This section presents the following topics on device performance:

Kernel Statistics

To assist in performance tuning, the Oracle Solaris kernel provides the kstat(3KSTAT) facility. The kstat facility provides a set of functions and data structures for device drivers and other kernel modules to export module-specific kernel statistics.

A kstat is a data structure for recording quantifiable aspects of a device's usage. A kstat is stored as a null-terminated linked list. Each kstat has a common header section and a type-specific data section. The header section is defined by the kstat_t structure.

The article “Using kstat From Within a Program in the Oracle Solaris OS” on the Sun Developer Network at http://developers.sun.com/solaris/articles/kstat_api.html provides two practical examples on how to use the kstat(3KSTAT) and libkstat(3LIB) APIs to extract metrics from the Oracle Solaris OS. The examples include “Walking Through All the kstat” and “Getting NIC kstat Output Using the Java Platform.”

Kernel Statistics Structure Members

The members of a kstat structure are:

ks_class[KSTAT_STRLEN]

Categorizes the kstat type as bus, controller, device_error, disk, hat, kmem_cache, kstat, misc, net, nfs, pages, partition, rps, ufs, vm, or vmem.

ks_crtime

Time at which the kstat was created. ks_crtime is commonly used in calculating rates of various counters.

ks_data

Points to the data section for the kstat.

ks_data_size

Total size of the data section in bytes.

ks_instance

The instance of the kernel module that created this kstat. ks_instance is combined with ks_module and ks_name to give the kstat a unique, meaningful name.

ks_kid

Unique ID for the kstat.

ks_module[KSTAT_STRLEN]

Identifies the kernel module that created this kstat. ks_module is combined with ks_instance and ks_name to give the kstat a unique, meaningful name. KSTAT_STRLEN sets the maximum length of ks_module.

ks_name[KSTAT_STRLEN]

A name assigned to the kstat in combination with ks_module and ks_instance. KSTAT_STRLEN sets the maximum length of ks_name.

ks_ndata

Indicates the number of data records for those kstat types that support multiple records: KSTAT_TYPE_RAW, KSTAT_TYPE_NAMED, and KSTAT_TYPE_TIMER.

ks_next

Points to next kstat in the chain.

ks_resv

A reserved field.

ks_snaptime

The timestamp for the last data snapshot, useful in calculating rates.

ks_type

The data type, which can be KSTAT_TYPE_RAW for binary data, KSTAT_TYPE_NAMED for name/value pairs, KSTAT_TYPE_INTR for interrupt statistics, KSTAT_TYPE_IO for I/O statistics, or KSTAT_TYPE_TIMER for event timers.
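
Because ks_crtime and ks_snaptime are timestamps, a rate is a counter difference divided by a time difference. The following sketch assumes that the timestamps are expressed in nanoseconds, as hrtime_t values are. The function name xx_rate_per_sec, the local NANOSEC definition, and the use of int64_t in place of hrtime_t are assumptions made so that the fragment compiles outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSEC 1000000000LL   /* nanoseconds per second */

/*
 * Average rate of a counter, in events per second, between two
 * snapshots. Pass the creation time and a count of 0 as the "old"
 * snapshot to get the average rate since the kstat was created.
 */
static double
xx_rate_per_sec(uint64_t old_count, int64_t old_time,
    uint64_t new_count, int64_t new_time)
{
    assert(new_time >= old_time);
    if (new_time == old_time)
        return (0.0);   /* no time elapsed, so no meaningful rate */
    return ((double)(new_count - old_count) * NANOSEC /
        (double)(new_time - old_time));
}
```

For example, a counter that advances from 1000 to 1500 between two snapshots taken one second apart yields a rate of 500 events per second.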

Kernel Statistics Structures

The structures for the different kinds of kstats are:

kstat(9S)

Each kernel statistic (kstat) that is exported by device drivers consists of a header section and a data section. The kstat(9S) structure is the header portion of the statistic.

kstat_intr(9S)

Structure for interrupt kstats. The types of interrupts are:

  • Hard interrupt – Sourced from the hardware device itself

  • Soft interrupt – Induced by the system through the use of some system interrupt source

  • Watchdog interrupt – Induced by a periodic timer call

  • Spurious interrupt – An interrupt entry point was entered but there was no interrupt to service

  • Multiple service – An interrupt was detected and serviced just prior to returning from any of the other types

Drivers generally report only claimed hard interrupts and soft interrupts from their handlers, but measurement of the spurious class of interrupts is useful for auto-vectored devices to locate any interrupt latency problems in a particular system configuration. Devices that have more than one interrupt of the same type should use multiple structures.

kstat_io(9S)

Structure for I/O kstats.

kstat_named(9S)

Structure for named kstats. A named kstat is an array of name-value pairs. These pairs are kept in the kstat_named structure.

Kernel Statistics Functions

The functions for using kstats are:

kstat_create(9F)

Allocate and initialize a kstat(9S) structure.

kstat_delete(9F)

Remove a kstat from the system.

kstat_install(9F)

Add a fully initialized kstat to the system.

kstat_named_init(9F), kstat_named_setstr(9F)

Initialize a named kstat. kstat_named_setstr() associates str, a string, with the named kstat pointer.

kstat_queue(9F)

A large number of I/O subsystems have at least two basic queues of transactions to be managed. One queue is for transactions that have been accepted for processing but for which processing has yet to begin. The other queue is for transactions that are actively being processed but not yet done. For this reason, two cumulative time statistics are kept: wait time and run time. Wait time is prior to service. Run time is during the service. The kstat_queue() family of functions manages these times based on the transitions between the driver wait queue and run queue:
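
As a rough illustration of the wait-time and run-time bookkeeping described above, the following user-space sketch models one queue's counters and the transitions between two queues. It is a simplified model, not the kernel code: the structure q_model and the functions q_update, q_enter, q_exit, and q_wait_to_run are hypothetical names, and the caller supplies the current time explicitly instead of reading a system clock.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of one queue's counters, patterned on the wait-queue and
 * run-queue fields of kstat_io(9S). All names here are hypothetical.
 */
struct q_model {
    int64_t  busy_time;    /* cumulative time the queue was non-empty */
    int64_t  len_time;     /* cumulative (queue length * time) product */
    int64_t  last_update;  /* time of the last state change */
    uint32_t count;        /* transactions currently in the queue */
};

/* Account for the interval since the last state change. */
static void
q_update(struct q_model *q, int64_t now)
{
    int64_t delta = now - q->last_update;

    assert(delta >= 0);
    if (q->count != 0) {
        q->busy_time += delta;
        q->len_time += delta * q->count;
    }
    q->last_update = now;
}

/* A transaction enters the queue at time "now". */
static void
q_enter(struct q_model *q, int64_t now)
{
    q_update(q, now);
    q->count++;
}

/* A transaction leaves the queue at time "now". */
static void
q_exit(struct q_model *q, int64_t now)
{
    assert(q->count > 0);
    q_update(q, now);
    q->count--;
}

/* Move a transaction from the wait queue to the run queue. */
static void
q_wait_to_run(struct q_model *waitq, struct q_model *runq, int64_t now)
{
    q_exit(waitq, now);
    q_enter(runq, now);
}
```

In this model, dividing len_time by the elapsed time gives the average queue length, and dividing busy_time by the elapsed time gives the fraction of time that the queue was occupied.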

Kernel Statistics for Oracle Solaris Ethernet Drivers

The kstat interface described in the following table is an effective way to obtain Ethernet physical layer statistics from the driver. Ethernet drivers should export these statistics to guide users in better diagnosis and repair of Ethernet physical layer problems. With the exception of link_up, all statistics have a default value of 0 when not present. The value of the link_up statistic should be assumed to be 1.

The following example displays all of the shared link setup statistics. In this case, mii is used to filter the statistics.

kstat ce:0:mii:link_*

Table 22–2 Ethernet MII/GMII Physical Layer Interface Kernel Statistics

Kstat Variable 

Type 

Description 

xcvr_addr

KSTAT_DATA_UINT32

Provides the MII address of the transceiver that is currently in use. 

  • (0) - (31) are for the MII address of the physical layer device in use for a given Ethernet device.

  • (-1) is used where there is no externally accessible MII interface, and therefore the MII address is undefined or irrelevant.

xcvr_id

KSTAT_DATA_UINT32

Provides the specific vendor ID or device ID of the transceiver that is currently in use. 

xcvr_inuse

KSTAT_DATA_UINT32

Indicates the type of transceiver that is currently in use. The IEEE aPhyType enumerates the following set:

  • (0) other: undefined

  • (1) none: an MII interface is present, but no transceiver is connected

  • (2) 10 Mbits/s Clause 7 10 Mbits/s Manchester

  • (3) 100BASE-T4 Clause 23 100 Mbits/s 8B/6T

  • (4) 100BASE-X Clause 24 100 Mbits/s 4B/5B

  • (5) 100BASE-T2 Clause 32 100 Mbits/s PAM5X5

  • (6) 1000BASE-X Clause 36 1000 Mbits/s 8B/10B

  • (7) 1000BASE-T Clause 40 1000 Mbits/s 4D-PAM5

This set is smaller than the set specified by ifMauType, which is defined to include all of the above plus their half duplex/full duplex options. Because this information can be provided by the cap_* statistics, the missing definitions can be derived from the combination of xcvr_inuse and cap_* to provide all the combinations of ifMauType.

cap_1000fdx

KSTAT_DATA_CHAR

Indicates the device is 1 Gbits/s full duplex capable. 

cap_1000hdx

KSTAT_DATA_CHAR

Indicates the device is 1 Gbits/s half duplex capable. 

cap_100fdx

KSTAT_DATA_CHAR

Indicates the device is 100 Mbits/s full duplex capable. 

cap_100hdx

KSTAT_DATA_CHAR

Indicates the device is 100 Mbits/s half duplex capable. 

cap_10fdx

KSTAT_DATA_CHAR

Indicates the device is 10 Mbits/s full duplex capable. 

cap_10hdx

KSTAT_DATA_CHAR

Indicates the device is 10 Mbits/s half duplex capable. 

cap_asmpause

KSTAT_DATA_CHAR

Indicates the device is capable of asymmetric pause Ethernet flow control. 

cap_pause

KSTAT_DATA_CHAR

Indicates the device is capable of symmetric pause Ethernet flow control when cap_pause is set to 1 and cap_asmpause is set to 0. When cap_asmpause is set to 1, cap_pause has the following meaning:

  • cap_pause = 0 Transmit pauses based on receive congestion.

  • cap_pause = 1 Receive pauses and slow down transmit to avoid congestion.

cap_rem_fault

KSTAT_DATA_CHAR

Indicates the device is capable of remote fault indication. 

cap_autoneg

KSTAT_DATA_CHAR

Indicates the device is capable of auto-negotiation. 

adv_cap_1000fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 1 Gbits/s full duplex capability. 

adv_cap_1000hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 1 Gbits/s half duplex capability. 

adv_cap_100fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 100 Mbits/s full duplex capability. 

adv_cap_100hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 100 Mbits/s half duplex capability. 

adv_cap_10fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 10 Mbits/s full duplex capability. 

adv_cap_10hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 10 Mbits/s half duplex capability. 

adv_cap_asmpause

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of asymmetric pause Ethernet flow control. 

adv_cap_pause

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of symmetric pause Ethernet flow control when adv_cap_pause is set to 1 and adv_cap_asmpause is set to 0. When adv_cap_asmpause is set to 1, adv_cap_pause has the following meaning:

  • adv_cap_pause = 0 Transmit pauses based on receive congestion.

  • adv_cap_pause = 1 Receive pauses and slow down transmit to avoid congestion.

adv_rem_fault

KSTAT_DATA_CHAR

Indicates the device is experiencing a fault that it is going to forward to the link partner. 

adv_cap_autoneg

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of auto-negotiation. 

lp_cap_1000fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 1 Gbits/s full duplex capable. 

lp_cap_1000hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 1 Gbits/s half duplex capable. 

lp_cap_100fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 100 Mbits/s full duplex capable. 

lp_cap_100hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 100 Mbits/s half duplex capable. 

lp_cap_10fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 10 Mbits/s full duplex capable. 

lp_cap_10hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 10 Mbits/s half duplex capable. 

lp_cap_asmpause

KSTAT_DATA_CHAR

Indicates the link partner device is capable of asymmetric pause Ethernet flow control. 

lp_cap_pause

KSTAT_DATA_CHAR

Indicates the link partner device is capable of symmetric pause Ethernet flow control when lp_cap_pause is set to 1 and lp_cap_asmpause is set to 0. When lp_cap_asmpause is set to 1, lp_cap_pause has the following meaning:

  • lp_cap_pause = 0 Link partner will transmit pauses based on receive congestion.

  • lp_cap_pause = 1 Link partner will receive pauses and slow down transmit to avoid congestion.

lp_rem_fault

KSTAT_DATA_CHAR

Indicates the link partner is experiencing a fault with the link. 

lp_cap_autoneg

KSTAT_DATA_CHAR

Indicates the link partner device is capable of auto-negotiation. 

link_asmpause

KSTAT_DATA_CHAR

Indicates the link is operating with asymmetric pause Ethernet flow control. 

link_pause

KSTAT_DATA_CHAR

Indicates the resolution of the pause capability. Indicates the link is operating with symmetric pause Ethernet flow control when link_pause is set to 1 and link_asmpause is set to 0. When link_asmpause is set to 1 and is relative to a local view of the link, link_pause has the following meaning:

  • link_pause = 0 This station will transmit pauses based on receive congestion.

  • link_pause = 1 This station will receive pauses and slow down transmit to avoid congestion.

link_duplex

KSTAT_DATA_CHAR

Indicates the link duplex. 

  • link_duplex = 0 Link is down and duplex is unknown.

  • link_duplex = 1 Link is up and in half duplex mode.

  • link_duplex = 2 Link is up and in full duplex mode.

link_up

KSTAT_DATA_CHAR

Indicates whether the link is up or down. 

  • link_up = 0 Link is down.

  • link_up = 1 Link is up.
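
The link_duplex, link_pause, and link_asmpause encodings in the table can be turned into readable text with a small helper. The following sketch is one possible decoding, assuming the values have already been read from the kstat. The function names xx_duplex_str and xx_pause_str and the returned strings are hypothetical, and treating the case where both pause statistics are 0 as "none" is an assumption that the table does not state explicitly.

```c
#include <assert.h>

/* Hypothetical helper: describe link_duplex as listed in Table 22-2. */
static const char *
xx_duplex_str(int link_duplex)
{
    switch (link_duplex) {
    case 0:
        return ("down");        /* link is down, duplex unknown */
    case 1:
        return ("half");
    case 2:
        return ("full");
    default:
        return ("invalid");     /* value not defined by the table */
    }
}

/* Hypothetical helper: describe the link_pause / link_asmpause pair. */
static const char *
xx_pause_str(int link_pause, int link_asmpause)
{
    if (!link_asmpause) {
        /* Assumption: neither statistic set means no flow control. */
        return (link_pause ? "symmetric" : "none");
    }
    /*
     * link_asmpause is set, so link_pause selects the direction:
     * 0 means this station transmits pauses, 1 means it receives them.
     */
    return (link_pause ? "asymmetric-rx" : "asymmetric-tx");
}
```

For example, link_duplex = 2 with link_pause = 1 and link_asmpause = 0 would be reported by these helpers as a full duplex link with symmetric flow control.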

DTrace for Dynamic Instrumentation

DTrace is a comprehensive dynamic tracing facility for examining the behavior of both user programs and the operating system itself. With DTrace, you can collect data at strategic locations in your environment, referred to as probes. DTrace enables you to record such data as stack traces, timestamps, the arguments to a function, or simply counts of how often the probe fires. Because DTrace enables you to insert probes dynamically, you do not need to recompile your code. The DTrace BigAdmin System Administration Portal contains many links to articles, XPerts sessions, and other information about DTrace.

Chapter 23 Recommended Coding Practices

This chapter describes how to write drivers that are robust. Drivers that are written in accordance with the guidelines that are discussed in this chapter are easier to debug. The recommended practices also protect the system from hardware and software faults.

This chapter provides information on the following subjects:

Debugging Preparation Techniques

Driver code is more difficult to debug than user programs because:

Be sure to build debugging support into your driver. This support facilitates both maintenance work and future development.

Use a Unique Prefix to Avoid Kernel Symbol Collisions

The name of each function, data element, and driver preprocessor definition must be unique for each driver.

A driver module is linked into the kernel. The name of each symbol unique to a particular driver must not collide with other kernel symbols. To avoid such collisions, each function and data element for a particular driver must be named with a prefix common to that driver. The prefix must be sufficient to uniquely name each driver symbol. Typically, this prefix is the name of the driver or an abbreviation for the name of the driver. For example, xx_open() would be the name of the open(9E) routine of driver xx.

A driver must necessarily include a number of system header files. The globally-visible names within these header files cannot be predicted. To avoid collisions with these names, each driver preprocessor definition must be given a unique name by using an identifying prefix.

A distinguishing driver symbol prefix also is an aid to deciphering system logs and panics when troubleshooting. Instead of seeing an error related to an ambiguous attach() function, you see an error message about xx_attach().

Use cmn_err() to Log Driver Activity

Use the cmn_err(9F) function to print messages to a system log from within the device driver. The cmn_err(9F) function for kernel modules is similar to the printf(3C) function for applications, and provides additional format characters, such as the %b format to print device register bits. Use the tail(1) command to monitor these messages in /var/adm/messages.


% tail -f /var/adm/messages
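
The %b format takes two arguments: an integer value and a string that names the bits. The following user-space sketch models that decoding so that the layout of the bit-name string is easier to see. It is an approximation based on the description in cmn_err(9F), not the kernel routine, and the exact kernel output might differ. The function name xx_fmt_bits and the register bit names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * User-space model of the %b conversion. "bits" starts with the output
 * base encoded as a character (\10 for octal; this model prints any
 * other base as hexadecimal), followed by groups of one bit number
 * (1 to 32, encoded as a character) and the name of that bit.
 * Returns buf.
 */
static char *
xx_fmt_bits(char *buf, size_t len, unsigned int val, const char *bits)
{
    size_t used;
    int any = 0;
    int base;

    assert(buf != NULL && len > 0);
    assert(bits != NULL && *bits != '\0');
    base = *bits++;
    (void) snprintf(buf, len, base == 8 ? "%o" : "%x", val);
    used = strlen(buf);

    while (*bits != '\0') {
        int bit = *bits++;
        int set = bit >= 1 && bit <= 32 &&
            (val & (1u << (bit - 1))) != 0;

        if (set && used + 1 < len)
            buf[used++] = any ? ',' : '<';
        /* Copy (or skip) the name: every character above ' '. */
        for (; *bits > ' '; bits++) {
            if (set && used + 1 < len)
                buf[used++] = *bits;
        }
        if (set)
            any = 1;
    }
    if (any && used + 1 < len)
        buf[used++] = '>';
    buf[used] = '\0';
    return (buf);
}
```

With the bit-name string "\020\3Intr\2DMA\1Enable", this model formats the value 3 as 3<DMA,Enable>. In a driver, the same value and string would be passed directly to cmn_err(9F) as the two arguments that correspond to %b.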

Use ASSERT() to Catch Invalid Assumptions

Assertions are an extremely valuable form of active documentation. The syntax for ASSERT(9F) is as follows:

void ASSERT(EXPRESSION)

The ASSERT() macro halts the execution of the kernel if a condition that is expected to be true is actually false. ASSERT() provides a way for the programmer to validate the assumptions made by a piece of code.

The ASSERT() macro is defined only when the DEBUG compilation symbol is defined. When DEBUG is not defined, the ASSERT() macro has no effect.

The following example assertion tests the assumption that a particular pointer value is not NULL:

ASSERT(ptr != NULL);

If the driver has been compiled with DEBUG, and if the value of ptr is NULL at this point in execution, then the following panic message is printed to the console:

panic: assertion failed: ptr != NULL, file: driver.c, line: 56

Note –

Because ASSERT(9F) uses the DEBUG compilation symbol, any conditional debugging code should also use DEBUG.
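
One consequence of ASSERT() having no effect without DEBUG is that the asserted expression must not contain side effects, because those side effects would disappear from the production driver. The following user-space sketch illustrates the point. XX_ASSERT is a hypothetical stand-in for the kernel macro, built on the standard assert() so that the fragment compiles outside the kernel, and the xx_stack structure and functions are illustrative only.

```c
#include <assert.h>

/*
 * User-space stand-in for the kernel ASSERT() macro: active only when
 * DEBUG is defined, and compiled to nothing otherwise.
 */
#ifdef DEBUG
#define XX_ASSERT(EX)   assert(EX)
#else
#define XX_ASSERT(EX)   ((void)0)
#endif

#define XX_DEPTH 4

struct xx_stack {
    int items[XX_DEPTH];
    int depth;
};

/* Push a value. Returns 0 on success, or -1 if the stack is full. */
static int
xx_push(struct xx_stack *sp, int v)
{
    if (sp->depth == XX_DEPTH)
        return (-1);
    sp->items[sp->depth++] = v;
    return (0);
}

/* Pop a value. The caller must ensure that the stack is not empty. */
static int
xx_pop(struct xx_stack *sp)
{
    /* Check the assumption only; the decrement stays outside the macro. */
    XX_ASSERT(sp != NULL && sp->depth > 0);
    sp->depth--;
    return (sp->items[sp->depth]);
}
```

Had the decrement been written inside the assertion, for example as XX_ASSERT(sp->depth-- > 0), xx_pop() would behave differently depending on whether DEBUG was defined.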


Use mutex_owned() to Validate and Document Locking Requirements

The syntax for mutex_owned(9F) is as follows:

int mutex_owned(kmutex_t *mp);

A significant portion of driver development involves properly handling multiple threads. Comments should always be used when a mutex is acquired. Comments can be even more useful when an apparently necessary mutex is not acquired. To determine whether a mutex is held by a thread, use mutex_owned() within ASSERT(9F):

/* xsp points to the per-instance state; struct xxstate is illustrative */
void helper(struct xxstate *xsp)
{
    /* this routine should always be called with xsp's mutex held */
    ASSERT(mutex_owned(&xsp->mu));
    /* ... */
}

Note –

mutex_owned() is only valid within ASSERT() macros. You should not use mutex_owned() to control the behavior of a driver.


Use Conditional Compilation to Toggle Costly Debugging Features

You can insert code for debugging into a driver through conditional compiles by using a preprocessor symbol such as DEBUG or by using a global variable. With conditional compilation, unnecessary code can be removed in the production driver. Use a variable to set the amount of debugging output at runtime. The output can be specified by setting a debugging level at runtime with an ioctl or through a debugger. Commonly, these two methods are combined.

The following example relies on the compiler to remove unreachable code, in this case, the code following the always-false test of zero. The example also provides a local variable that can be set in /etc/system or patched by a debugger.

#ifdef DEBUG
/* comments on values of xxdebug and what they do */
static int xxdebug;
#define dcmn_err if (xxdebug) cmn_err
#else
#define dcmn_err if (0) cmn_err
#endif
/* ... */
    dcmn_err(CE_NOTE, "Error!\n");

This method handles the fact that cmn_err(9F) has a variable number of arguments. Another method relies on the fact that the macro has one argument, a parenthesized argument list for cmn_err(9F). The macro removes this argument. This macro also removes the reliance on the optimizer by expanding the macro to nothing if DEBUG is not defined.

#ifdef DEBUG
/* comments on values of xxdebug and what they do */
static int xxdebug;
#define dcmn_err(X) if (xxdebug) cmn_err X
#else
#define dcmn_err(X) /* nothing */
#endif
/* ... */
/* Note: double parentheses are required when using dcmn_err. */
    dcmn_err((CE_NOTE, "Error!"));

You can extend this technique in many ways. One way is to specify different messages from cmn_err(9F), depending on the value of xxdebug. However, in such a case, you must be careful not to obscure the code with too much debugging information.

Another common scheme is to write an xxlog() function, which uses vsprintf(9F) or vcmn_err(9F) to handle variable argument lists.
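
A minimal sketch of such an xxlog() function follows. It is written for user space so that it can be compiled and tried outside the kernel: vsnprintf() into a buffer stands in for the vcmn_err(9F) call that a driver would make. The xx_last_msg buffer, the level convention, and the return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int xxdebug;             /* current debugging level; 0 disables output */
static char xx_last_msg[128];   /* most recent message, kept for inspection */

/*
 * Format a message if "level" is enabled by xxdebug. Returns 1 if the
 * message was emitted, or 0 if it was suppressed.
 * In a driver, the vsnprintf() call would be replaced by a call to
 * vcmn_err(9F), which also takes a format string and a va_list.
 */
static int
xxlog(int level, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    if (level > xxdebug)
        return (0);
    va_start(ap, fmt);
    (void) vsnprintf(xx_last_msg, sizeof (xx_last_msg), fmt, ap);
    va_end(ap);
    return (1);
}
```

With xxdebug set to 1, a call such as xxlog(1, "unit %d busy", 3) is formatted, while xxlog(2, ...) is suppressed. Because the level test happens at runtime, the amount of output can be changed without rebuilding the driver.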

Declaring a Variable Volatile

volatile is a keyword that must be applied when declaring any variable that will reference a device register. Without the use of volatile, the compile-time optimizer can inadvertently delete important accesses. Neglecting to use volatile might result in bugs that are difficult to track down.

The correct use of volatile is necessary to prevent elusive bugs. The volatile keyword instructs the compiler to use exact semantics for the declared objects, in particular, not to remove or reorder accesses to the object. Two instances where device drivers must use the volatile qualifier are:

The following example shows why volatile is needed. A busy flag is used to prevent a thread from continuing while the device is busy, and the flag is not protected by a lock:

while (busy) {
    /* do something else */
}

The testing thread will continue when another thread turns off the busy flag:

busy = 0;

Because busy is accessed frequently in the testing thread, the compiler can potentially optimize the test by placing the value of busy in a register and testing the contents of the register without reading the value of busy in memory before every test. The testing thread would never see busy change, and the other thread would only change the value of busy in memory, resulting in deadlock. Declaring the busy flag as volatile forces its value to be read before each test.


Note –

An alternative to the busy flag is to use a condition variable. See Condition Variables in Thread Synchronization.


When using the volatile qualifier, avoid the risk of accidental omission. For example, the following code

struct device_reg {
    volatile uint8_t csr;
    volatile uint8_t data;
};
struct device_reg *regp;

is preferable to the next example:

struct device_reg {
    uint8_t csr;
    uint8_t data;
};
volatile struct device_reg *regp;

Although the two examples are functionally equivalent, the second one requires the writer to ensure that volatile is used in every declaration of type struct device_reg. The first example results in the data being treated as volatile in all declarations and is therefore preferred. As mentioned above, using the DDI data access functions to access device registers makes qualifying variables as volatile unnecessary.

Serviceability

To ensure serviceability, the driver must be enabled to take the following actions:

Periodic Health Checks

A latent fault is one that does not show itself until some other action occurs. For example, a hardware failure occurring in a device that is a cold standby could remain undetected until a fault occurs on the master device. At this point, the system now contains two defective devices and might be unable to continue operation.

Latent faults that remain undetected typically cause system failure eventually. Without latent fault checking, the overall availability of a redundant system is jeopardized. To avoid this situation, a device driver must detect latent faults and report them in the same way as other faults.

You should provide the driver with a mechanism for making periodic health checks on the device. In a fault-tolerant situation where the device can be the secondary or failover device, early detection of a failed secondary device is essential to ensure that the secondary device can be repaired or replaced before any failure in the primary device occurs.

Periodic health checks can be used to perform the following activities: