Writing Device Drivers

Detecting Corrupted Data

The following sections describe where data corruption can occur and how to detect it.

Corruption of Device Management and Control Data

The driver should assume that any data obtained from the device, whether by programmed I/O (PIO) or direct memory access (DMA), could have been corrupted. In particular, extreme care should be taken with pointers, memory offsets, and array indexes that are based on data from the device. Such values can be malignant: dereferencing them can cause a kernel panic. All such values should be checked for range and, if required, alignment before use.
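The range and alignment checks described above might look like the following minimal sketch. The helper name offset_is_safe() and its parameters are hypothetical and are not part of any DDI interface. The sketch assumes the device supplies a byte offset into a driver-owned buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true only if a device-supplied offset
 * can safely be used to access obj_size bytes inside a buffer of
 * buf_size bytes. align must be a power of two (1 means no alignment
 * requirement).
 */
static bool
offset_is_safe(uint64_t dev_off, size_t obj_size, size_t buf_size,
    size_t align)
{
    /* Reject a zero or non-power-of-two alignment (caller error). */
    if (align == 0 || (align & (align - 1)) != 0)
        return (false);

    /* Alignment check. */
    if ((dev_off & (align - 1)) != 0)
        return (false);

    /* The object itself must fit in the buffer. */
    if (obj_size > buf_size)
        return (false);

    /* Range check, written as a subtraction so it cannot overflow. */
    if (dev_off > buf_size - obj_size)
        return (false);

    return (true);
}
```

Comparing dev_off against buf_size - obj_size, rather than testing dev_off + obj_size, means that a very large corrupted offset cannot wrap around and pass the check.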

Even a pointer that is not malignant can still be misleading. For example, a pointer can point to a valid but not correct instance of an object. Where possible, the driver should cross-check the pointer with the object to which it is pointing, or otherwise validate the data obtained through that pointer.
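One way to cross-check a reference against the object it names is sketched below. The sketch assumes the device identifies a completed command by a slot number plus a tag that the driver assigned when the command was issued. The struct cmd layout, the NCMDS table size, and cmd_lookup() are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCMDS 8             /* hypothetical size of the command table */

struct cmd {
    uint32_t tag;           /* tag the driver gave to the device */
    bool     in_use;        /* true while the command is outstanding */
};

/*
 * Hypothetical lookup: returns the command only if the slot is in
 * range, the command is outstanding, and the tag reported by the
 * device matches the tag stored in the object itself.
 */
static struct cmd *
cmd_lookup(struct cmd *table, uint32_t slot, uint32_t tag)
{
    struct cmd *c;

    if (slot >= NCMDS)
        return (NULL);      /* malignant: out of range */

    c = &table[slot];
    if (!c->in_use || c->tag != tag)
        return (NULL);      /* misleading: valid slot, wrong object */

    return (c);
}
```

A slot that is in range but fails the tag comparison is the "valid but not correct" case: the access would not panic the system, but it would complete the wrong command.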

Other types of data can also be misleading, such as packet lengths, status words, or channel IDs. These data types should be checked to the extent possible. A packet length can be range-checked to ensure that the length is neither negative nor larger than the containing buffer. A status word can be checked for "impossible" bits. A channel ID can be matched against a list of valid IDs.
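The three checks named above can each be expressed in a few lines. In this sketch, the STATUS_DEFINED_BITS mask and the contents of valid_channels are invented for illustration. A real driver would take these values from the device specification and from its own state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: only the low 4 status bits are defined by this device. */
#define STATUS_DEFINED_BITS 0x0000000Fu

/* Hypothetical list of channel IDs that the driver currently has open. */
static const uint16_t valid_channels[] = { 1, 2, 5 };

/* The length must be neither negative nor larger than the buffer. */
static bool
pkt_len_ok(int32_t len, size_t buf_size)
{
    return (len >= 0 && (size_t)len <= buf_size);
}

/* Any bit outside the defined set is an "impossible" bit. */
static bool
status_ok(uint32_t status)
{
    return ((status & ~STATUS_DEFINED_BITS) == 0);
}

/* The channel ID must appear in the list of valid IDs. */
static bool
channel_ok(uint16_t id)
{
    size_t n = sizeof (valid_channels) / sizeof (valid_channels[0]);

    for (size_t i = 0; i < n; i++) {
        if (valid_channels[i] == id)
            return (true);
    }
    return (false);
}
```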

Where a value is used to identify a stream, the driver must ensure that the stream still exists. Because STREAMS processing is asynchronous, a stream can be dismantled while device interrupts are still outstanding.

The driver should not reread data from the device. The data should be read once, validated, and stored in the driver's local state. This technique avoids the hazard of data that is correct when initially read, but is incorrect when reread later.
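The read-once pattern can be sketched as follows. The struct dev_regs and struct drv_state layouts and the latch_rx_len() helper are hypothetical. The register is modeled as a volatile field so that the sketch is self-contained, whereas a Solaris driver would normally read device registers through a DDI access routine such as ddi_get32(9F).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register block, modeled as plain memory for illustration. */
struct dev_regs {
    volatile uint32_t rx_len;
};

/* Hypothetical driver soft state holding the validated copy. */
struct drv_state {
    uint32_t rx_len;
};

/*
 * Read the length register exactly once, validate the local copy,
 * and store it. All later code uses st->rx_len and never goes back
 * to the device for this value.
 */
static bool
latch_rx_len(const struct dev_regs *regs, struct drv_state *st,
    uint32_t max_len)
{
    uint32_t len = regs->rx_len;    /* the only device read */

    if (len > max_len)
        return (false);             /* reject; soft state unchanged */

    st->rx_len = len;
    return (true);
}
```

Because the validated copy lives in the driver's own state, a later change in the register value cannot alter a length that has already passed validation.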

The driver should also ensure that all loops are bounded. For example, a device that returns a continuous BUSY status should not be able to lock up the entire system.
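A bounded polling loop might be sketched as shown below. The STATUS_BUSY bit, the wait_not_busy() helper, and the fake_status() simulated device are assumptions made for illustration. A real driver would read the status through its register access routines and would normally delay between polls, for example with drv_usecwait(9F).

```c
#include <stdint.h>

#define STATUS_BUSY 0x1u    /* hypothetical busy bit */

typedef uint32_t (*read_status_fn)(void *arg);

/*
 * Poll until the device is no longer busy, but give up after
 * max_polls attempts. Returns 0 on success and -1 on timeout, so
 * the caller can report a fault instead of spinning forever.
 */
static int
wait_not_busy(read_status_fn read_status, void *arg, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if ((read_status(arg) & STATUS_BUSY) == 0)
            return (0);
        /* A real driver would insert a short delay here. */
    }
    return (-1);
}

/*
 * Simulated device for illustration only: reports BUSY for the next
 * *remaining reads, then reports idle.
 */
static uint32_t
fake_status(void *arg)
{
    unsigned *remaining = arg;

    if (*remaining == 0)
        return (0);
    (*remaining)--;
    return (STATUS_BUSY);
}
```

When the simulated device is busy for only a few reads, the wait succeeds. When the device stays busy, the loop stops after max_polls reads and returns an error.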

Corruption of Received Data

Device errors can result in corrupted data being placed in receive buffers. Such corruption is indistinguishable from corruption that occurs beyond the domain of the device, for example, within a network. Typically, existing software is already in place to handle such corruption. One example is the integrity checks at the transport layer of a protocol stack. Another example is integrity checks within the application that uses the device.

If the received data is not to be checked for integrity at a higher layer, the data can be integrity-checked within the driver itself. Methods of detecting corruption in received data are typically device-specific. Checksums and cyclic redundancy checks (CRCs) are examples of the kinds of checks that can be done.
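As one illustration of an in-driver integrity check, the following sketch computes a bitwise CRC-32 using the reflected polynomial 0xEDB88320 (the variant used by Ethernet) and compares the result with an expected value. The choice of CRC, the function names, and the assumption that the expected value arrives separately from the payload are all illustrative. The check that applies to a given device is defined by that device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
 * final XOR of 0xFFFFFFFF). A table-driven version would be faster;
 * this form is kept short for clarity.
 */
static uint32_t
crc32_ieee(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* The mask is all ones if the low bit is set, else zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return (crc ^ 0xFFFFFFFFu);
}

/* Hypothetical check: accept the payload only if the CRC matches. */
static bool
payload_ok(const uint8_t *payload, size_t len, uint32_t expected_crc)
{
    return (crc32_ieee(payload, len) == expected_crc);
}
```

A payload whose computed CRC differs from the expected value would be discarded or reported as faulty rather than passed up the stack.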