|Skip Navigation Links|
|Exit Print View|
|Writing Device Drivers Oracle Solaris 10 1/13 Information Library|
Once a device driver is functional, that driver should be thoroughly tested prior to distribution. Besides testing the features in traditional UNIX device drivers, Oracle Solaris drivers require testing power management features, such as dynamic loading and unloading of drivers.
A driver's ability to handle multiple device configurations is an important part of the test process. Once the driver is working on a simple, or default, configuration, additional configurations should be tested. Depending on the device, configuration testing can be accomplished by changing jumpers or DIP switches. If the number of possible configurations is small, all configurations should be tried. If the number is large, various classes of possible configurations should be defined, and a sampling of configurations from each class should be tested. Defining these classes depends on the potential interactions among the different configuration parameters. These interactions are a function of the type of the device and the way in which the driver was written.
For each device configuration, the basic functions must be tested, which include loading, opening, reading, writing, closing, and unloading the driver. Any function that depends upon the configuration deserves special attention. For example, changing the base memory address of device registers is not likely to affect the behavior of most driver functions. If a driver works well with one address, that driver is likely to work as well with a different address. On the other hand, a special I/O control call might have different effects depending on the particular device configuration.
Loading the driver with varying configurations ensures that the probe(9E) and attach(9E) entry points can find the device at different addresses. For basic functional testing, using regular UNIX commands such as cat(1) or dd(1M) is usually sufficient for character devices. Mounting or booting might be required for block devices.
After a driver has been completely tested for configuration, all of the driver's functionality should be thoroughly tested. These tests require exercising the operation of all of the driver's entry points.
Many drivers require custom applications to test functionality. However, basic drivers for devices such as disks, tapes, or asynchronous boards can be tested using standard system utilities. All entry points should be tested in this process, including devmap(9E), chpoll(9E), and ioctl(9E), if applicable. The ioctl() tests might be quite different for each driver. For nonstandard devices, a custom testing application is generally required.
A driver might perform correctly in an ideal environment but fail in cases of errors, such as erroneous operations or bad data. Therefore, an important part of driver testing is the testing of the driver's error handling.
All possible error conditions of a driver should be exercised, including error conditions for actual hardware malfunctions. Some hardware error conditions might be difficult to induce, but an effort should be made to force or to simulate such errors if possible. All of these conditions could be encountered in the field. Cables should be removed or be loosened, boards should be removed, and erroneous user application code should be written to test those error paths. See also Chapter 13, Hardening Oracle Solaris Drivers.
Caution - Be sure to take proper electrical precautions when testing.
Because a driver that does not load or unload can force unscheduled downtime, loading and unloading must be thoroughly tested.
A script like the following example should suffice:
#!/bin/sh cd <location_of_driver> while [ 1 ] do modunload -i 'modinfo | grep " <driver_name> " | cut -cl-3' & modload <driver_name> & done
To help ensure that a driver performs well, that driver should be subjected to vigorous stress testing. For example, running single threads through a driver does not test locking logic or conditional variables that have to wait. Device operations should be performed by multiple processes at once to cause several threads to execute the same code simultaneously.
Techniques for performing simultaneous tests depend upon the driver. Some drivers require special testing applications, while starting several UNIX commands in the background is suitable for others. Appropriate testing depends upon where the particular driver uses locks and condition variables. Testing a driver on a multiprocessor machine is more likely to expose problems than testing on a single-processor machine.
Interoperability between drivers must also be tested, particularly because different devices can share interrupt levels. If possible, configure another device at the same interrupt level as the one being tested. A stress test can determine whether the driver correctly claims its own interrupts and operates according to expectations. Stress tests should be run on both devices at once. Even if the devices do not share an interrupt level, this test can still be valuable. For example, consider a case in which serial communication devices experience errors when a network driver is tested. The same problem might be causing the rest of the system to encounter interrupt latency problems as well.
Driver performance under these stress tests should be measured using UNIX performance-measuring tools. This type of testing can be as simple as using the time(1) command along with commands to be used in the stress tests.
To ensure compatibility with later releases and reliable support for the current release, every driver should be DDI/DKI compliant. Check that only kernel routines in man pages section 9: DDI and DKI Kernel Functions and man pages section 9: DDI and DKI Driver Entry Points and data structures in man pages section 9: DDI and DKI Properties and Data Structures are used.
Drivers are delivered to customers in packages. A package can be added or be removed from the system using a standard mechanism (see the Application Packaging Developer’s Guide).
The ability of a user to add or remove the package from a system should be tested. In testing, the package should be both installed and removed from every type of media to be used for the release. This testing should include several system configurations. Packages must not make unwarranted assumptions about the directory environment of the target system. Certain valid assumptions, however, can be made about where standard kernel files are kept. Also test adding and removing of packages on newly installed machines that have not been modified for a development environment. A common packaging error is for a package to rely on a tool or file that is used in development only. For example, no tools from the Source Compatibility package, SUNWscpu, should be used in driver installation programs.
The driver installation must be tested on a minimal Oracle Solaris system without any optional packages.
Tape drivers should be tested by performing several archive and restore operations. The cpio(1) and tar(1) commands can be used for this purpose. Use the dd(1M) command to write an entire disk partition to tape. Next, read back the data, and write the data to another partition of the same size. Then compare the two copies. The mt(1) command can exercise most of the I/O controls that are specific to tape drivers. See the mtio(7I) man page. Try to use all the options. These three techniques can test the error-handling capabilities of tape drivers:
Remove the tape and try various operations
Write-protect the tape and try a write
Turn off power in the middle of different operations
Tape drivers typically implement exclusive-access open(9E) calls. These open() calls can be tested by opening a device and then having a second process try to open the same device.
Disk drivers should be tested in both the raw and block device modes. For block device tests, create a new file system on the device. Then try to mount the new file system. Then try to perform multiple file operations.
Note - The file system uses a page cache, so reading the same file over and over again does not really exercise the driver. The page cache can be forced to retrieve data from the device by memory-mapping the file with mmap(2). Then use msync(3C) to invalidate the in-memory copies.
Copy another (unmounted) partition of the same size to the raw device. Then use a command such as fsck(1M) to verify the correctness of the copy. The new partition can also be mounted and then later compared to the old partition on a file-by-file basis.
Asynchronous drivers can be tested at the basic level by setting up a login line to the serial ports. A good test is to see whether a user can log in on this line. To sufficiently test an asynchronous driver, however, all the I/O control functions must be tested, with many interrupts at high speed. A test involving a loopback serial cable and high data transfer rates can help determine the reliability of the driver. You can run uucp(1C) over the line to provide some exercise. However, because uucp performs its own error handling, verify that the driver is not reporting excessive numbers of errors to the uucp process.
These types of devices are usually STREAMS-based. See the STREAMS Programming Guide for more information.
Network drivers can be tested using standard network utilities. The ftp(1) and rcp(1) commands are useful because the files can be compared on each end of the network. The driver should be tested under heavy network loading, so that various commands can be run by multiple processes.
Heavy network loading includes the following conditions:
Traffic to the test machine is heavy.
Traffic among all machines on the network is heavy.
Network cables should be unplugged while the tests are executing to ensure that the driver recovers gracefully from the resulting error conditions. Another important test is for the driver to receive multiple packets in rapid succession, that is, back-to-back packets. In this case, a relatively fast host on a lightly loaded network should send multiple packets in quick succession to the test machine. Verify that the receiving driver does not drop the second and subsequent packets.
These types of devices are usually STREAMS-based. See the STREAMS Programming Guide for more information.
Drivers supporting SR-IOV require additional testing. Standard bare-metal testing is also required, and the utilities used for bare-metal testing such as ftp and rcp for network devices can be used.
See Chapter 21, SR-IOV Drivers for information about SR-IOV Drivers.
Use the following commands to test the status of the Virtual Functions (VFs):
VF enabled – hotplug install
VF disabled – hotplug uninstall
VF assigned – hotplug list
Use the ldm(1M) command in addition to the hotplug commands when testing the status of VFs on a SPARC system.
It is also important to test SR-IOV devices in a variety of virtualized configurations. Try the following options when testing SR-IOV drivers on both SPARC and x86 platforms:
Do not configure any Virtual Functions (VFs)
Configure only one VF
Increase configured VFs by powers of 2 until the maximum number of VFs is reached
On SPARC platforms, test the functionality with a varying number of IO Domains and varying distribution of the VFs within those domains. Try the following configurations:
Assign a single VF to a single IO Domain
Assign powers of 2 VFs (up to the maximum) to a single IO Domain
Create 2,4, or 8 IO Domains and assign varying numbers of VF to each of the domains
Assign some VFs to the root domain and assign some VFs to an IO Domain
Test the following features if the device or the platform support it :
Boot the IO Domain from a VF
Physically hotplug or unplug the SR-IOV card
Perform dynamic reconfiguration operations on the SR-IOV card
If your device is supported on multiple versions of Oracle Solaris, a final testing can be performed by mixing the OS versions across the Root and IO Domains for some of the tests.