The guidelines are organized into the following categories:
Use these guidelines when you write the code for your driver:
The name of each function, data element, and driver preprocessor definition must be unique for each driver.
A driver module is linked into the kernel. The name of each symbol unique to a particular driver must not collide with other kernel symbols. To avoid such collisions, each function and data element for a particular driver must be named with a prefix common to that driver. The prefix must be sufficient to uniquely name each driver symbol. Typically, this prefix is the name of the driver or an abbreviation for the name of the driver. For example, xx_open() would be the name of the open(9E) routine of driver xx.
A driver must include a number of system header files when it is built. The globally visible names within these header files cannot be predicted. To avoid collisions with these names, give each driver preprocessor definition a unique name by using an identifying prefix.
A distinguishing driver symbol prefix also helps you decipher system logs and panic messages when troubleshooting. Instead of seeing an error related to an ambiguous attach() function, you see an error message about xx_attach().
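As a minimal sketch of this naming convention, the following fragment uses xx as the prefix for a hypothetical driver. The function bodies are user-space stand-ins so that the fragment compiles on its own. The XX_MAX_UNITS macro and the xx_open_count array are illustrative assumptions, not part of any real driver.

```c
/* Hypothetical driver "xx": every symbol carries the xx_ or XX_ prefix. */
#define	XX_MAX_UNITS	4	/* preprocessor names get the prefix too */

static int xx_open_count[XX_MAX_UNITS];	/* file-scope data is prefixed */

/* Stand-in for the open(9E) routine of driver xx. */
int
xx_open(int unit)
{
	if (unit < 0 || unit >= XX_MAX_UNITS)
		return (-1);
	xx_open_count[unit]++;
	return (0);
}

/* Stand-in for the close(9E) routine of driver xx. */
int
xx_close(int unit)
{
	if (unit < 0 || unit >= XX_MAX_UNITS || xx_open_count[unit] == 0)
		return (-1);
	xx_open_count[unit]--;
	return (0);
}
```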
The -n option of the add_drv(1M) command enables you to update the system configuration files for a driver without loading or attaching the driver.
You can use the cmn_err(9F) function to display information from your driver similar to the way you might use print statements to display information from a user program. The cmn_err(9F) function writes low priority messages to /dev/log. The syslogd(1M) daemon reads messages from /dev/log and writes low priority messages to /var/adm/messages. Use the following command to monitor the output from your cmn_err(9F) messages:
% tail -f /var/adm/messages
Be sure to remove cmn_err() calls that are used for development or debugging before you compile your production version driver. You might want to use cmn_err() calls in a production driver to write error messages that would be useful to a system administrator.
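One way to keep development messages out of a production build is to wrap them in a macro that is compiled in only when a debug symbol is defined. The following is a sketch under stated assumptions: XX_DEBUG, XX_DPRINTF, xx_transfer(), and xx_msg_count are hypothetical names, and the cmn_err() defined here is a user-space stand-in so that the fragment builds outside the kernel. A real driver would include sys/cmn_err.h instead of defining its own cmn_err().

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * User-space stand-ins (assumptions) so that this sketch builds outside
 * the kernel. A real driver includes <sys/cmn_err.h>, which supplies
 * cmn_err() and the CE_* levels.
 */
#define	CE_NOTE	1
#define	CE_WARN	2

static int xx_msg_count;	/* counts messages, for demonstration only */

static void
cmn_err(int level, const char *fmt, ...)
{
	va_list ap;

	(void) fprintf(stderr, "%s",
	    level == CE_WARN ? "WARNING: " : "NOTICE: ");
	va_start(ap, fmt);
	(void) vfprintf(stderr, fmt, ap);
	va_end(ap);
	(void) fputc('\n', stderr);
	xx_msg_count++;
}

/* Development-only tracing: compiled in only when XX_DEBUG is defined. */
#ifdef XX_DEBUG
#define	XX_DPRINTF(...)	cmn_err(CE_NOTE, __VA_ARGS__)
#else
#define	XX_DPRINTF(...)	((void)0)
#endif

/* Hypothetical routine that reports a failed transfer. */
int
xx_transfer(int unit, int hw_status)
{
	XX_DPRINTF("xx%d: transfer started", unit);	/* debug builds only */
	if (hw_status != 0) {
		/* Kept in production: useful to a system administrator. */
		cmn_err(CE_WARN, "xx%d: transfer failed, status 0x%x",
		    unit, (unsigned int)hw_status);
		return (-1);
	}
	return (0);
}
```

Compiling without XX_DEBUG removes the tracing call entirely, while the warning for the administrator remains.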
When the driver exits, whether intentionally or prematurely, you need to perform such tasks as closing opened files, freeing allocated memory, releasing mutex locks, and destroying any mutexes that have been created. In addition, the system must be able to close all minor devices and detach driver instances even after the hardware fails. An orderly approach is to reverse _init() actions in the _fini() routine, reverse open() operations in the close() routine, and reverse attach() operations in the detach() routine.
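The reverse-order approach can be sketched as follows. This is a user-space illustration, not kernel code: xx_state_t, the buffer sizes, and the fail_step parameter (which injects a failure at a chosen step) are assumptions made for the example. The functions stand in for attach(9E) and detach(9E).

```c
#include <stdlib.h>

/* Hypothetical per-instance state. */
typedef struct xx_state {
	char	*xs_rx_buf;
	char	*xs_tx_buf;
	int	xs_lock_ready;	/* stand-in for an initialized mutex */
} xx_state_t;

/*
 * Stand-in for attach(9E): acquire resources in order and unwind in
 * reverse order on failure. fail_step (1-3) forces a failure at that
 * step for demonstration; 0 means no forced failure.
 */
int
xx_attach(xx_state_t *sp, int fail_step)
{
	sp->xs_rx_buf = NULL;
	sp->xs_tx_buf = NULL;
	sp->xs_lock_ready = 0;

	if (fail_step == 1 || (sp->xs_rx_buf = malloc(512)) == NULL)
		goto fail_rx;
	if (fail_step == 2 || (sp->xs_tx_buf = malloc(512)) == NULL)
		goto fail_tx;
	if (fail_step == 3)
		goto fail_lock;
	sp->xs_lock_ready = 1;
	return (0);

fail_lock:
	free(sp->xs_tx_buf);
	sp->xs_tx_buf = NULL;
fail_tx:
	free(sp->xs_rx_buf);
	sp->xs_rx_buf = NULL;
fail_rx:
	return (-1);
}

/* Stand-in for detach(9E): release in the reverse order of xx_attach(). */
void
xx_detach(xx_state_t *sp)
{
	sp->xs_lock_ready = 0;
	free(sp->xs_tx_buf);
	sp->xs_tx_buf = NULL;
	free(sp->xs_rx_buf);
	sp->xs_rx_buf = NULL;
}
```

Because each failure label releases only what was acquired before it, a partial attach leaves nothing allocated.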
Use ASSERT(9F) to catch unexpected error returns.
ASSERT() is a macro that halts the kernel execution if a condition that was expected to be true turns out to be false. To activate ASSERT(), you need to include the sys/debug.h header file and specify the DEBUG preprocessor symbol during compilation.
The mutex_owned(9F) function helps determine whether the current thread owns a specified mutex. To determine whether a mutex is held by a thread, use mutex_owned() within ASSERT().
The Solaris OS provides various debugging functions, such as ASSERT() and mutex_owned(), that can be turned on by specifying the DEBUG preprocessor symbol when the driver is compiled. With conditional compilation, unnecessary code can be removed from the production driver. This approach can also be accomplished by using a global variable.
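The following user-space sketch imitates this pattern. XX_ASSERT, xx_lock_t, xx_lock_owned(), and xx_counter_bump() are hypothetical stand-ins for the kernel ASSERT() macro, a mutex, and mutex_owned(). In a driver, the equivalent check would be an ASSERT() call whose argument is a mutex_owned() test on the driver's own mutex.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reports the failed condition and stops the program. */
void
xx_assert_fail(const char *expr, const char *file, int line)
{
	(void) fprintf(stderr, "assertion failed: %s, file: %s, line: %d\n",
	    expr, file, line);
	abort();
}

/*
 * XX_ASSERT is a hypothetical analogue of the kernel ASSERT() macro.
 * It checks the condition only when DEBUG is defined at compile time.
 * Without DEBUG, sizeof references the expression without evaluating it.
 */
#ifdef DEBUG
#define	XX_ASSERT(cond) \
	((cond) ? (void)0 : xx_assert_fail(#cond, __FILE__, __LINE__))
#else
#define	XX_ASSERT(cond)	((void)sizeof (cond))
#endif

/* Stand-in for a mutex: records the ID of the owning thread, 0 if none. */
typedef struct xx_lock {
	int	xl_owner;
} xx_lock_t;

/* Stand-in for mutex_owned(9F). */
int
xx_lock_owned(const xx_lock_t *lp, int self)
{
	return (lp->xl_owner == self);
}

/* Updates a counter that must be modified only while the lock is held. */
int
xx_counter_bump(xx_lock_t *lp, int self, int *counter)
{
	XX_ASSERT(xx_lock_owned(lp, self));
	return (++*counter);
}
```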
Use a separate instance of the driver for each device to be controlled.
Use DDI functions as much as possible in your device drivers.
These interfaces shield the driver from platform-specific dependencies such as mismatches between processor and device endianness and any other data order dependencies. With these interfaces, a single-source driver can run on the SPARC platform, x86 platform, and related processor architectures.
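To illustrate the data order problem that the DDI access functions hide, the following sketch assembles a 32-bit value from bytes that a device stores least significant byte first. The helper xx_get_le32() is hypothetical and is shown only to make the problem concrete. In a driver, prefer the DDI access functions, such as ddi_get32(9F), over hand-written conversions.

```c
#include <stdint.h>

/*
 * Assemble a 32-bit value from four bytes stored least significant byte
 * first. The result is the same on big-endian and little-endian hosts
 * because the code never reinterprets memory as a wider integer.
 */
uint32_t
xx_get_le32(const uint8_t *p)
{
	return ((uint32_t)p[0] |
	    ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) |
	    ((uint32_t)p[3] << 24));
}
```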
Anticipate corrupted data.
Always check the integrity of data before that data is used. The driver must avoid releasing bad data to the rest of the system.
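As a minimal sketch of such a check, assume a hypothetical device that reports each completed transfer through a descriptor that contains a ring index and a length. The structure layout and the limits XX_RING_SIZE and XX_BUF_SIZE are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define	XX_RING_SIZE	8	/* number of buffers in the ring (assumed) */
#define	XX_BUF_SIZE	2048	/* size of each buffer in bytes (assumed) */

/* Hypothetical completion descriptor written by the device. */
typedef struct xx_desc {
	uint32_t	xd_index;	/* which ring buffer was filled */
	uint32_t	xd_len;		/* number of valid bytes */
} xx_desc_t;

/* Returns 1 if the descriptor is safe to use, 0 if it must be discarded. */
int
xx_desc_valid(const xx_desc_t *dp)
{
	if (dp == NULL)
		return (0);
	if (dp->xd_index >= XX_RING_SIZE)
		return (0);	/* index would reach outside the ring */
	if (dp->xd_len == 0 || dp->xd_len > XX_BUF_SIZE)
		return (0);	/* length would overrun the buffer */
	return (1);
}
```

The driver would call a check like this before it uses the index or length, and would discard or report any descriptor that fails.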
A device should only write to DMA buffers that are controlled solely by the driver.
This technique prevents a DMA fault from corrupting an arbitrary part of the system's main memory.
Use the ddi_umem_alloc(9F) function when you need to make DMA transfers.
This function guarantees that only whole, aligned pages are transferred.
The device driver must not be an unlimited drain on system resources if the device locks up. The driver should time out if a device claims to be continuously busy. The driver should also detect a pathological (stuck) interrupt request and take appropriate action.
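A bounded wait is one way to keep a stuck device from consuming resources indefinitely. The following user-space sketch polls a status source a limited number of times. XX_STATUS_BUSY, XX_MAX_POLLS, xx_wait_ready(), and the simulated device xx_sim_status() are hypothetical names chosen for this example.

```c
#include <stdint.h>

#define	XX_STATUS_BUSY	0x1u	/* assumed busy bit in the status word */
#define	XX_MAX_POLLS	1000	/* arbitrary upper bound on polls */

typedef uint32_t (*xx_read_status_t)(void *arg);

/* Returns 0 when the device reports ready, -1 if it stays busy. */
int
xx_wait_ready(xx_read_status_t read_status, void *arg)
{
	int i;

	for (i = 0; i < XX_MAX_POLLS; i++) {
		if ((read_status(arg) & XX_STATUS_BUSY) == 0)
			return (0);
		/* A real driver would pause between polls here. */
	}
	return (-1);	/* give up instead of waiting forever */
}

/*
 * Simulated device for demonstration: reports busy for the number of
 * reads stored in *arg, then reports ready.
 */
uint32_t
xx_sim_status(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (XX_STATUS_BUSY);
	}
	return (0);
}
```

When xx_wait_ready() returns -1, the caller can mark the device as faulty and fail outstanding requests rather than block.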
See Thread Interaction in Writing Device Drivers for more information.
User requests can be destructive. The design of the driver should take into consideration the construction of each type of potential ioctl() request.
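One defensive pattern is to validate both the command and its argument before acting on either. In the following sketch, the command codes, the speed range, and the function xx_ioctl_check() are hypothetical. A real ioctl(9E) routine would also need to copy in and validate any user buffers.

```c
#include <errno.h>
#include <stdint.h>

#define	XX_IOC_SET_SPEED	1	/* hypothetical command codes */
#define	XX_IOC_RESET		2
#define	XX_SPEED_MAX		3	/* highest speed setting (assumed) */

/* Returns 0 if the request is acceptable, or an errno value if not. */
int
xx_ioctl_check(int cmd, intptr_t arg)
{
	switch (cmd) {
	case XX_IOC_SET_SPEED:
		if (arg < 0 || arg > XX_SPEED_MAX)
			return (EINVAL);	/* out-of-range argument */
		return (0);
	case XX_IOC_RESET:
		return (arg == 0 ? 0 : EINVAL);	/* takes no argument */
	default:
		return (ENOTTY);	/* unknown command */
	}
}
```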
Try to avoid situations where a driver continues to function without detecting a device failure.
A driver should switch to an alternative device rather than try to work around a device failure.
All devices need to be able to be installed or removed without requiring a reboot of the system.
Power management provides the ability to control and manage the electrical power usage of a computer system or device. Power management enables systems to conserve energy by using less power when idle and by shutting down completely when not in use.
Without the volatile keyword, the compile-time optimizer can delete important accesses to a register.
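The following sketch shows the effect of the keyword on a polling loop. The XX_CSR_READY bit and the xx_csr_ready() function are hypothetical, and an ordinary variable stands in for the device register so that the fragment can run in user space.

```c
#include <stdint.h>

#define	XX_CSR_READY	0x80u	/* assumed ready bit in the status register */

/*
 * Polls a status register until the ready bit is set or the poll budget
 * runs out. Returns 1 if the device became ready, 0 otherwise.
 */
int
xx_csr_ready(volatile const uint32_t *csr, int max_polls)
{
	while (max_polls-- > 0) {
		/*
		 * Because *csr is volatile, the compiler must perform a
		 * fresh load on every pass. Without the qualifier, the
		 * optimizer could read the register once and reuse the
		 * value for the whole loop.
		 */
		if (*csr & XX_CSR_READY)
			return (1);
	}
	return (0);
}
```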
Perform periodic health checks to detect and report faulty devices.
A periodic health check should include the following activities:
Check any register or memory location on the device whose value might have been altered since the last poll.
Timestamp outgoing requests such as transmit blocks or commands that are issued by the driver.
Initiate a test action on the device that should be completed before the next scheduled check.
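The timestamp activity in the list above can be sketched as follows. This is a user-space illustration under stated assumptions: xx_health_t, XX_STALL_LIMIT, and the three functions are hypothetical, and time is passed in as a plain number of seconds rather than read from a kernel clock.

```c
#include <stdint.h>

#define	XX_STALL_LIMIT	5	/* seconds before a request counts as stalled */

/* Hypothetical health-check state for one device instance. */
typedef struct xx_health {
	uint64_t	xh_oldest_tx;	/* issue time of oldest pending request */
	int		xh_tx_pending;	/* nonzero while a request is outstanding */
} xx_health_t;

/* Record the time at which an outgoing request is issued. */
void
xx_note_tx(xx_health_t *hp, uint64_t now)
{
	if (!hp->xh_tx_pending) {
		hp->xh_oldest_tx = now;
		hp->xh_tx_pending = 1;
	}
}

/* Record that the device has completed the outstanding request. */
void
xx_note_done(xx_health_t *hp)
{
	hp->xh_tx_pending = 0;
}

/* Periodic check: returns 1 if healthy, 0 if a request has stalled. */
int
xx_health_check(const xx_health_t *hp, uint64_t now)
{
	if (hp->xh_tx_pending && now - hp->xh_oldest_tx > XX_STALL_LIMIT)
		return (0);
	return (1);
}
```

A driver could run a check like this from a periodic timeout and report the device as faulty when the check fails.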
Testing a device driver can cause the system to panic and can harm the kernel.
The following tips can help you avoid problems when testing your driver:
Install drivers in the /tmp directory until you are finished modifying and testing the _info(), _init(), and attach() routines. Copy the driver binary to the /tmp directory. Link to the driver from the kernel driver directory.
If a driver has an error in its _info(), _init(), or attach() function, your machine could get into an endless cycle of panics. The Solaris OS automatically reboots itself after a panic and loads any drivers it can during boot. If an error in your attach() function panics the system when the driver is loaded, then the system panics again when it reboots and tries to load the driver. The cycle of panic, reboot, panic continues because the system attempts to reload the faulty driver every time it reboots after a panic.
To avoid an infinite panic, keep the driver in the /tmp area until it is well tested. Link to the driver in the /tmp area from the kernel driver area. The Solaris OS removes all files from the /tmp area every time the system reboots. If your driver causes a panic, the Solaris OS reboots successfully because the driver has been removed automatically from the /tmp area. The link in the kernel driver area points to nothing. The faulty driver did not get loaded, so the system does not go back into a panic. You can modify the driver, copy it again to the /tmp area, and continue testing and developing. When the driver is well tested, copy it to the /usr/kernel/drv area so that it will remain available after a reboot.
The following example shows you where to link the driver for a 32-bit platform. For other architectures, see the instructions in Installing a Driver.
# cp mydriver /tmp
# ln -s /tmp/mydriver /usr/kernel/drv/mydriver
If your system is in a hard hang, then you cannot break into the debugger. If you enable the deadman feature, the system panics instead of hanging indefinitely. You can then use the kmdb(1) kernel debugger to analyze your problem.
The deadman feature checks every second whether the system clock is updating. If the system clock is not updating, then you are in an indefinite hang. If the system clock has not been updated for 50 seconds, the deadman feature induces a panic and puts you in the debugger.
Take the following steps to enable the deadman feature:
Make sure you are capturing crash images with dumpadm(1M).
Set the snooping variable to 1 in the /etc/system file by adding the line set snooping=1.
Reboot the system so that the /etc/system file is read again and the snooping setting takes effect.
Note that any zones on your system inherit the deadman setting as well.
If your system hangs while the deadman feature is enabled, you should see output similar to the following example on your console:
panic[cpu1]/thread=30018dd6cc0: deadman: timed out after 9 seconds of clock inactivity
panic: entering debugger (continue to save dump)
Inside the debugger, use the ::cpuinfo command to investigate why the clock interrupt was not able to fire and advance the system time.
This technique is explained in Testing With a Serial Connection in Writing Device Drivers.
Booting from a copy of the kernel and the associated binaries rather than from the default kernel avoids inadvertently rendering the system inoperable.
This approach isolates experiments with the kernel variable settings. See Setting Up Test Modules in Writing Device Drivers.
If your test system is set up as a client of a server, then you can boot from the network if problems occur. You could also create a special partition to hold a copy of a bootable root file system. See Avoiding Data Loss on a Test System in Writing Device Drivers.
Use fsck(1M) to repair the damaged root file system temporarily if your system crashes during the attach(9E) process so that any crash dumps can be salvaged. See Recovering the Device Directory in Writing Device Drivers.
Keep a driver in the /tmp directory until the driver has been well tested. If a panic occurs, the driver is removed from the /tmp directory and the system reboots successfully.
The Solaris OS provides various tools for debugging and tuning your device driver:
You might receive the following warning message from the add_drv(1M) command:
Warning: Driver (driver_name) successfully added to system but failed to attach
This message might have one of the following causes:
The hardware has not been detected properly. The system cannot find the device.
The configuration file is missing. See Writing a Configuration File for information on when you need a configuration file and what information goes into a configuration file. Be sure to put the configuration file in /kernel/drv or /usr/kernel/drv and not in the driver directory.
Use the kmdb(1) kernel debugger for runtime debugging.
The kmdb debugger provides typical runtime debugger facilities, such as breakpoints, watch points, and single-stepping. For more information, see Solaris Modular Debugger Guide.
Use the mdb(1) modular debugger for postmortem debugging.
Postmortem debugging is performed on a system crash dump rather than on a live system. With postmortem debugging, the same crash dump can be analyzed by different people or processes simultaneously. In addition, mdb enables you to create loadable debugger modules called dmods to perform rigorous analysis on the dump. For more information, see Solaris Modular Debugger Guide.
Use the kstat(3KSTAT) facility to export module-specific kernel statistics for your device driver.
Use the DTrace facility to add instrumentation to your driver dynamically so that you can perform tasks such as analyzing the system and measuring performance. For information on DTrace, see the Solaris Dynamic Tracing Guide and the DTrace User Guide.
If your driver does not behave as expected on a 64-bit platform, make sure you are using a 64-bit driver. By default, compilation on the Solaris OS yields a 32-bit result on every architecture. To obtain a 64-bit result, follow the instructions in Building a Driver.
Use the file(1) command to determine whether you have a 64-bit driver.
% file qotd_3
qotd_3: ELF 32-bit LSB relocatable 80386 Version 1
If you are using a 64-bit system and you are not certain whether you are currently running the 64-bit kernel or the 32-bit kernel, use the -k option of the isainfo(1) command. The -v option reports all instruction set architectures of the system. The -k option reports the instruction set architecture that is currently in use.
% isainfo -v
64-bit sparcv9 applications
        vis2 vis
32-bit sparc applications
        vis2 vis v8plus div32 mul32
% isainfo -kv
64-bit sparcv9 kernel modules
If your driver seems to have an error in a function that you did not write, make sure you have called that function with the correct arguments and specified the correct include files. Many kernel functions have the same names as system calls and user functions. For example, read() and write() can be system calls, user library functions, or kernel functions. Similarly, ioctl() and mmap() can be system calls or kernel functions. The man mmap command displays the mmap(2) man page. To see the arguments, description, and include files for the kernel function, use the man mmap.9e command. If you do not know whether the function you want is in section 9E or section 9F, list every section that documents the name, for example with the man -l mmap command.