Writing Device Drivers

Controlling Device Access

This section describes aspects of the open() and close() entry points that are specific to block device drivers. See Chapter 10, Drivers for Character Devices for more information on open(9E) and close(9E).

open() Entry Point (Block Drivers)

The open(9E) entry point is used to gain access to a given device. The open(9E) routine of a block driver is called when a user thread issues an open(2) or mount(2) system call on a block special file associated with the minor device, or when a layered driver calls open(9E). See File I/O for more information.

The open(9E) entry point should check for the following:

Example 11–2 demonstrates a block driver open(9E) entry point.


Example 11–2 Block Driver open(9E) Routine

static int
xxopen(dev_t *devp, int flags, int otyp, cred_t *credp)
{
    minor_t instance;
    struct xxstate *xsp;

    instance = getminor(*devp);
    xsp = ddi_get_soft_state(statep, instance);
    if (xsp == NULL)
        return (ENXIO);
    mutex_enter(&xsp->mu);
    /*
     * Only honor FEXCL. If a regular open or a layered open
     * is still outstanding on the device, the exclusive open
     * must fail.
     */
    if ((flags & FEXCL) && (xsp->open || xsp->nlayered)) {
        mutex_exit(&xsp->mu);
        return (EAGAIN);
    }
    switch (otyp) {
    case OTYP_LYR:
        xsp->nlayered++;
        break;
    case OTYP_BLK:
        xsp->open = 1;
        break;
    default:
        mutex_exit(&xsp->mu);
        return (EINVAL);
    }
    mutex_exit(&xsp->mu);
    return (0);
}

The otyp argument specifies the type of open on the device. OTYP_BLK is the typical open type for a block device. A device can be opened several times with otyp set to OTYP_BLK, but close(9E) is called only once, when the final close of type OTYP_BLK has occurred for the device. otyp is set to OTYP_LYR if the device is being used as a layered device. For every open of type OTYP_LYR, the layering driver issues a corresponding close of type OTYP_LYR. The example keeps track of each type of open so that close(9E) can determine when the device is no longer in use.

close() Entry Point (Block Drivers)

The arguments of the close(9E) entry point are identical to the arguments of open(9E), except that dev is the device number itself rather than a pointer to the device number.

The close(9E) routine should verify otyp in the same way as was described for the open(9E) entry point. In Example 11–3, close(9E) must determine when the device can really be closed based on the number of block opens and layered opens.


Example 11–3 Block Device close(9E) Routine

static int
xxclose(dev_t dev, int flag, int otyp, cred_t *credp)
{
    minor_t instance;
    struct xxstate *xsp;

    instance = getminor(dev);
    xsp = ddi_get_soft_state(statep, instance);
    if (xsp == NULL)
        return (ENXIO);
    mutex_enter(&xsp->mu);
    switch (otyp) {
    case OTYP_LYR:
        xsp->nlayered--;
        break;
    case OTYP_BLK:
        xsp->open = 0;
        break;
    default:
        mutex_exit(&xsp->mu);
        return (EINVAL);
    }

    if (xsp->open || xsp->nlayered) {
        /* not done yet */
        mutex_exit(&xsp->mu);
        return (0);
    }
    /* cleanup (rewind tape, free memory, etc.) */
    /* wait for I/O to drain */
    mutex_exit(&xsp->mu);

    return (0);
}
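
The bookkeeping shared by the two examples can be exercised outside the kernel. The following is only a userland sketch of that logic, not driver code: the model_ names, the MODEL_ constants, and the enum are hypothetical stand-ins for the kernel definitions, and locking and soft-state lookup are left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's OTYP_* and FEXCL constants. */
enum model_otyp { MODEL_OTYP_BLK, MODEL_OTYP_LYR };
#define MODEL_FEXCL 0x1

/* Mirrors the open and nlayered members used by the xxstate examples. */
struct model_state {
    int open;       /* nonzero while any OTYP_BLK open is outstanding */
    int nlayered;   /* number of outstanding OTYP_LYR opens */
};

/* Same decisions as xxopen(), minus locking and soft-state lookup. */
static int
model_open(struct model_state *sp, int flags, enum model_otyp otyp)
{
    if ((flags & MODEL_FEXCL) && (sp->open || sp->nlayered))
        return (EAGAIN);
    switch (otyp) {
    case MODEL_OTYP_LYR:
        sp->nlayered++;
        break;
    case MODEL_OTYP_BLK:
        sp->open = 1;
        break;
    default:
        return (EINVAL);
    }
    return (0);
}

/* Returns 1 when the device is idle and cleanup may run, 0 otherwise. */
static int
model_close(struct model_state *sp, enum model_otyp otyp)
{
    switch (otyp) {
    case MODEL_OTYP_LYR:
        /* every layered close pairs with an earlier layered open */
        assert(sp->nlayered > 0);
        sp->nlayered--;
        break;
    case MODEL_OTYP_BLK:
        sp->open = 0;
        break;
    }
    return (!sp->open && sp->nlayered == 0);
}
```

In this model, two OTYP_BLK opens followed by one OTYP_BLK close leave open at 0, which matches the single close(9E) call the kernel makes for the final block close. Layered opens are counted because each one receives its own close.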

strategy() Entry Point

The strategy(9E) entry point is used to read and write data buffers to and from a block device. The name strategy refers to the fact that this entry point might implement some optimal strategy for ordering requests to the device.

strategy(9E) can be written to process one request at a time (synchronous transfer), or to queue multiple requests to the device (asynchronous transfer). When choosing a method, the abilities and limitations of the device should be taken into account.

The strategy(9E) routine is passed a pointer to a buf(9S) structure. This structure describes the transfer request, and contains status information on return. buf(9S) and strategy(9E) are the focus of block device operations.
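
As a rough illustration of the asynchronous approach, the sketch below models a first-in, first-out work list in userland C. It is an assumption-laden model rather than driver code: struct model_buf and the model_ functions are hypothetical, and only the link member borrows its name from the av_forw member of buf(9S). A real driver would also hold a mutex around the list and start the device when the list becomes non-empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for buf(9S). */
struct model_buf {
    struct model_buf *av_forw;  /* next request on the work list */
    long b_blkno;               /* starting block of the request */
};

/* Work list owned by the driver model. */
struct model_queue {
    struct model_buf *head;     /* next request to hand to the device */
    struct model_buf *tail;     /* most recently queued request */
};

/* Append a request, as a queueing strategy routine might do. */
static void
model_enqueue(struct model_queue *q, struct model_buf *bp)
{
    assert(bp != NULL);
    bp->av_forw = NULL;
    if (q->head == NULL)
        q->head = bp;
    else
        q->tail->av_forw = bp;
    q->tail = bp;
}

/* Remove the oldest request, or return NULL if the list is empty. */
static struct model_buf *
model_dequeue(struct model_queue *q)
{
    struct model_buf *bp = q->head;

    if (bp != NULL) {
        q->head = bp->av_forw;
        if (q->head == NULL)
            q->tail = NULL;
        bp->av_forw = NULL;
    }
    return (bp);
}
```

A synchronous design needs no such list, because each request completes before the next one is accepted.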

buf Structure

The following buf structure members are important to block drivers:


     int              b_flags;         /* Buffer status */
     struct buf       *av_forw;        /* Driver work list link */
     struct buf       *av_back;        /* Driver work list link */
     size_t           b_bcount;        /* # of bytes to transfer */
     union {
         caddr_t      b_addr;          /* Buffer's virtual address */
     } b_un;
     daddr_t          b_blkno;         /* Block number on device */
     diskaddr_t       b_lblkno;        /* Expanded block number on device */
     size_t           b_resid;         /* # of bytes not transferred */
                                       /* after error */
     int              b_error;         /* Expanded error field */
     void             *b_private;      /* "opaque" driver private area */
     dev_t            b_edev;          /* Expanded dev field */

b_flags contains status and transfer attributes of the buf structure. If B_READ is set, the buf structure indicates a transfer from the device to memory; otherwise, it indicates a transfer from memory to the device. If the driver encounters an error during data transfer, it should set the B_ERROR flag in the b_flags member and provide a more specific error value in b_error. Drivers should use bioerror(9F) to do both rather than setting B_ERROR directly.


Caution –

Drivers should never clear b_flags.
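
The flag handling described above can be modeled in userland C. This sketch rests on several assumptions: the MODEL_ flag values are made up (real drivers take B_READ and B_ERROR from <sys/buf.h>), struct model_buf holds only two of the buf(9S) members, and model_bioerror() imitates the documented effect of bioerror(9F) rather than reproducing its source.

```c
#include <assert.h>
#include <stddef.h>
#include <errno.h>

/* Made-up flag values for illustration only. */
#define MODEL_B_READ   0x1
#define MODEL_B_ERROR  0x2

/* Hypothetical, cut-down stand-in for buf(9S). */
struct model_buf {
    int b_flags;
    int b_error;
};

/*
 * Imitates bioerror(9F): a nonzero error marks the buffer as failed,
 * zero removes the error indication. Only the error bit is touched,
 * so every other bit in b_flags is preserved.
 */
static void
model_bioerror(struct model_buf *bp, int error)
{
    assert(bp != NULL);
    if (error != 0)
        bp->b_flags |= MODEL_B_ERROR;
    else
        bp->b_flags &= ~MODEL_B_ERROR;
    bp->b_error = error;
}

/* Returns 1 for a device-to-memory transfer, 0 for memory-to-device. */
static int
model_is_read(const struct model_buf *bp)
{
    return ((bp->b_flags & MODEL_B_READ) != 0);
}
```

Because the model changes a single bit with |= and &= ~, it never assigns a fresh value to the whole b_flags word, which is the kind of overwrite the caution warns against.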


av_forw and av_back

Pointers that the driver can use to manage a list of buffers. See Asynchronous Data Transfers (Block Drivers) for a discussion of the av_forw and av_back pointers.

b_bcount

Specifies the number of bytes to be transferred by the device.

b_un.b_addr

The kernel virtual address of the data buffer. This address is valid only after a call to bp_mapin(9F).

b_blkno

The starting 32-bit logical block number on the device for the data transfer, expressed in DEV_BSIZE (512 bytes) units. The driver should use either b_blkno or b_lblkno, but not both.

b_lblkno

The starting 64-bit logical block number on the device for the data transfer, expressed in DEV_BSIZE (512 bytes) units. The driver should use either b_blkno or b_lblkno, but not both.

b_resid

Set by the driver to indicate the number of bytes that were not transferred because of an error. See Example 11–8 for an example of setting b_resid. The b_resid member is overloaded: it is also used by disksort(9F).

b_error

Set to an error number by the driver when a transfer error occurs. It is set in conjunction with the b_flags B_ERROR bit. See Intro(9E) for details regarding error values. Drivers should use bioerror(9F) rather than setting b_error directly.

b_private

For exclusive use by the driver to store driver-private data.

b_edev

Contains the device number of the device involved in the transfer.
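
The arithmetic behind b_blkno, b_lblkno, b_bcount, and b_resid can be worked through in userland C. This is a sketch under stated assumptions, not driver code: MODEL_DEV_BSIZE stands in for DEV_BSIZE, the model_ functions are hypothetical, and clipping a request at the end of the device is just one possible policy. A driver might instead reject the whole request.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for DEV_BSIZE; the text gives its value as 512 bytes. */
#define MODEL_DEV_BSIZE 512u

/* Byte offset on the device that corresponds to a starting block number. */
static uint64_t
model_blk_to_offset(uint64_t blkno)
{
    /* guard against overflow of the 64-bit byte offset */
    assert(blkno <= UINT64_MAX / MODEL_DEV_BSIZE);
    return (blkno * MODEL_DEV_BSIZE);
}

/*
 * Bytes left untransferred when a request of bcount bytes starting at
 * blkno is clipped to a device that holds nblocks blocks. The result
 * is the value a driver following this policy would store in b_resid.
 */
static uint64_t
model_resid(uint64_t blkno, uint64_t bcount, uint64_t nblocks)
{
    uint64_t need, avail, resid;

    if (blkno >= nblocks)
        return (bcount);    /* starts past the end: nothing moves */
    /* blocks touched by the request, rounding a partial block up */
    need = bcount / MODEL_DEV_BSIZE + (bcount % MODEL_DEV_BSIZE != 0);
    avail = nblocks - blkno;
    resid = (need <= avail) ? 0 : bcount - avail * MODEL_DEV_BSIZE;
    assert(resid <= bcount);
    return (resid);
}
```

For instance, block 8 starts at byte offset 4096, and a 1024-byte request at block 9 of a 10-block device leaves 512 bytes untransferred in this model.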

bp_mapin() Function

When a buf structure pointer is passed into the device driver's strategy(9E) routine, the data buffer referred to by b_un.b_addr is not necessarily mapped in the kernel's address space. This means that the driver cannot directly access the data. Most block-oriented devices have DMA capability, and therefore do not need to access the data buffer directly. Instead, they use the DMA mapping routines to allow the device's DMA engine to do the data transfer. For details about using DMA, see Chapter 8, Direct Memory Access (DMA).

If a driver needs to directly access the data buffer (as opposed to having the device access the data), it must first map the buffer into the kernel's address space using bp_mapin(9F). bp_mapout(9F) should be used when the driver no longer needs to access the data directly.


Caution –

bp_mapout(9F) should only be called on buffers that have been allocated and are owned by the device driver. It must not be called on buffers passed to the driver through the strategy(9E) entry point, for example by a file system. Because bp_mapin(9F) does not keep a reference count, bp_mapout(9F) removes any kernel mapping that a layer above the device driver might rely on.