Writing Device Drivers

Device Configuration

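Each driver must provide entry points that the kernel uses for device configuration. The sections below describe them: probe(9E), attach(9E), detach(9E), and getinfo(9E), along with the obsolete identify(9E).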

Every device driver must have an attach(9E) and a getinfo(9E) routine. probe(9E) is required only for non-self-identifying devices. For self-identifying devices, an explicit probe routine may be provided, or nulldev(9F) may be specified in the dev_ops(9S) structure for the probe(9E) entry point.

Note -

The identify(9E) entry point is obsolete and is no longer required.

The driver is single-threaded on a per-device basis when the kernel calls these entry points for autoconfiguration, with the exception of getinfo(9E). The kernel may be in a multithreaded state when calling getinfo(9E), which can occur at any time. No calls to attach(9E) will occur on the same device concurrently. However, calls to attach(9E) on different devices that the driver handles might occur concurrently.

Instance Numbers

The system assigns an instance number to each device. The driver cannot reliably predict the instance number assigned to a particular device. Instead, the driver should retrieve the assigned instance number by calling ddi_get_instance(9F), as shown in Example 5-4.

Instance numbers are derived in an implementation-specific manner from different properties for the different device types. The following properties are used to derive instance numbers:

The reg property is used for SBus, PCI, VMEbus, ISA, EISA, and MCA devices. Non-self-identifying device drivers provide this property in the hardware configuration file. See sbus(4), pci(4), isa(4), and vme(4).
The target and lun properties are used for SCSI target devices. These are provided in the hardware configuration file. See scsi(4).
The instance property is used for pseudo-devices. This is provided in the hardware configuration file. See pseudo(4).

Persistent Instances

Once an instance number has been assigned to a particular physical device by the system, it stays the same even across reconfiguration and reboot. Because of this, instance numbers seen by a driver may not appear to be in consecutive order.

`identify()`

The identify(9E) entry point is obsolete and is no longer required. identify(9E) was used to determine whether a driver accessed the device pointed to by dip. identify(9E) is currently supported only to provide backward compatibility with older drivers and should not be implemented. A driver should specify nulldev(9F) for this entry point in the dev_ops(9S) structure.

`probe()`

This entry point is not required for self-identifying devices such as SBus or PCI devices. nulldev(9F) may be used instead.

For non-self-identifying devices (see "Device Identification") this entry point should determine whether the hardware device is present on the system and return:

DDI_PROBE_SUCCESS if the probe was successful

DDI_PROBE_FAILURE if the probe failed

DDI_PROBE_DONTCARE if the probe was unsuccessful, yet attach(9E) should still be called

DDI_PROBE_PARTIAL if the instance is not present now, but may be present in the future

For a given device instance, attach(9E) will not be called before probe(9E) has succeeded at least once on that device.

probe(9E) must free all the resources it allocates, for two reasons: it may be called multiple times, and attach(9E) will not necessarily be called even if probe(9E) succeeds.

For probe to determine whether the instance of the device is present, probe(9E) may need to do many of the things also commonly done by attach(9E). In particular, it may need to map the device registers.

Probing the device registers is device specific. The driver usually has to perform a series of tests on the hardware to ensure that the hardware is really there. The test criteria must be rigorous enough to avoid misidentifying devices. For example, a device may appear to be present when in fact it is not, because a different device behaves like the expected device.

When the driver's probe(9E) routine is called, it does not know whether the device being probed exists on the bus. Therefore, it is possible that the driver may attempt to access device registers for a nonexistent device. A bus fault may be generated on some buses as a result.

Buses such as ISA, EISA, and MCA do not generate bus faults as a result of such accesses. Example 5-2 is a sample probe(9E) routine for devices on these buses.

Example 5-2 probe(9E) Routine

static int
xxprobe(dev_info_t *dip)
{
	int			instance;
	caddr_t			reg_addr;
	ddi_acc_handle_t	data_access_handle;

	/* define device access attributes */
	ddi_device_acc_attr_t access_attr = {
		DDI_DEVICE_ATTR_V0,
		DDI_STRUCTURE_BE_ACC,
		DDI_STRICTORDER_ACC
	};

	/*
	 * inumber, rnumber, offset, len, and some_value are
	 * device-specific placeholders; the presence tests near
	 * the end of the routine are pseudocode.
	 */
	if (ddi_dev_is_sid(dip) == DDI_SUCCESS)	/* no need to probe */
		return (DDI_PROBE_DONTCARE);
	instance = ddi_get_instance(dip);	/* assigned instance */
	if (ddi_intr_hilevel(dip, inumber)) {
		cmn_err(CE_CONT,
		    "?xx driver does not support high level interrupts."
		    " Probe failed.\n");
		return (DDI_PROBE_FAILURE);
	}

	/* Map device registers and try to contact device. */
	if (ddi_regs_map_setup(dip, rnumber, &reg_addr, offset, len,
	    &access_attr, &data_access_handle) != DDI_SUCCESS)
		return (DDI_PROBE_FAILURE);
	if (ddi_get8(data_access_handle, (uint8_t *)reg_addr) !=
	    some_value)
		goto failed;
	/* free allocated resources */
	ddi_regs_map_free(&data_access_handle);
	if (device is present and ready for attach)
		return (DDI_PROBE_SUCCESS);
	else if (device is present but not ready for attach)
		return (DDI_PROBE_PARTIAL);
	else		/* device is not present */
		return (DDI_PROBE_FAILURE);
failed:
	/* free allocated resources */
	ddi_regs_map_free(&data_access_handle);
	return (DDI_PROBE_FAILURE);
}

The string printed in the high-level interrupt case begins with a '?' character. This causes the message to be printed only if the kernel was booted with the verbose (-v) flag (see kernel(1M)). Otherwise, the message goes only into the message log, where it can be seen by running dmesg(1M).

ddi_dev_is_sid(9F) may be used in a driver's probe(9E) routine to determine if the device is self-identifying. This is useful in drivers written for self-identifying and non-self-identifying versions of the same device.

For VME device drivers, a fault may occur as a result of attempting to access device registers for a device that is not present. In this case, the ddi_peek(9F) and ddi_poke(9F) family of routines must be used to access the device registers. Example 5-3 shows a probe(9E) routine that uses ddi_peek(9F) and ddi_poke(9F) to check for the existence of the device.

Example 5-3 probe(9E) Routine Using ddi_peek(9F)

static int
xxprobe(dev_info_t *dip)
{
	int			instance;
	caddr_t			reg_addr;
	ddi_acc_handle_t	data_access_handle;

	/* define device access attributes */
	ddi_device_acc_attr_t access_attr = {
		DDI_DEVICE_ATTR_V0,
		DDI_STRUCTURE_BE_ACC,
		DDI_STRICTORDER_ACC
	};

	if (ddi_dev_is_sid(dip) == DDI_SUCCESS)	/* no need to probe */
		return (DDI_PROBE_DONTCARE);
	instance = ddi_get_instance(dip);	/* assigned instance */
	if (ddi_intr_hilevel(dip, inumber)) {
		cmn_err(CE_CONT,
		    "?xx driver does not support high level interrupts."
		    " Probe failed.\n");
		return (DDI_PROBE_FAILURE);
	}
	/*
	 * Map device registers and try to contact device.
	 */
	if (ddi_regs_map_setup(dip, rnumber, &reg_addr, offset, len,
	    &access_attr, &data_access_handle) != DDI_SUCCESS)
		return (DDI_PROBE_FAILURE);
	/* a failed peek means that nothing responded at reg_addr */
	if (ddi_peek8(dip, (int8_t *)reg_addr, NULL) != DDI_SUCCESS)
		goto failed;
	/* free allocated resources */
	ddi_regs_map_free(&data_access_handle);
	if (device is present and ready for attach)
		return (DDI_PROBE_SUCCESS);
	else if (device is present but not ready for attach)
		return (DDI_PROBE_PARTIAL);
	else		/* device is not present */
		return (DDI_PROBE_FAILURE);
failed:
	/* free allocated resources */
	ddi_regs_map_free(&data_access_handle);
	return (DDI_PROBE_FAILURE);
}

In this example, ddi_regs_map_setup(9F) is used to map the device registers. ddi_peek8(9F) reads a single byte from the location reg_addr and returns DDI_SUCCESS only if the access did not fault.

`attach()`

The system calls attach(9E) to attach a device instance to the system or to resume operation after power has been suspended. attach(9E) should handle the following commands:

DDI_ATTACH - attach(9E) is called with DDI_ATTACH to initialize the device instance.
DDI_PM_RESUME - attach(9E) is called with DDI_PM_RESUME to restore the hardware state of a device when the device has been suspended.
DDI_RESUME - attach(9E) is called with DDI_RESUME to restore the hardware state of a device when the entire system has been suspended.

Only the DDI_ATTACH command is discussed in this section. For information on DDI_PM_RESUME and DDI_RESUME, see Chapter 8, Power Management.

Note that attach(9E) is single-threaded when processing the DDI_ATTACH command, but is not single-threaded when processing the DDI_RESUME or DDI_PM_RESUME commands.

The responsibilities of the DDI_ATTACH case of attach(9E) include:

Optionally allocating a soft state structure for the instance
Registering an interrupt handler
Mapping device registers
Initializing per-instance mutexes and condition variables
Creating minor device nodes for the instance
Creating power-manageable components

Example 5-4 provides an example of an attach(9E) routine.

Example 5-4 attach(9E) Routine

static int
xxattach(dev_info_t *dip, ddi_attach_cmd_t cmd)
{
	struct xxstate	*xsp;
	int		instance;
	int		i;

	/* define device access attributes */
	ddi_device_acc_attr_t access_attr = {
		DDI_DEVICE_ATTR_V0,
		DDI_STRUCTURE_BE_ACC,
		DDI_STRICTORDER_ACC
	};

	switch (cmd) {
	case DDI_ATTACH:

		/* get assigned instance number */
		instance = ddi_get_instance(dip);
		if (ddi_soft_state_zalloc(statep, instance) != 0)
			return (DDI_FAILURE);
		xsp = ddi_get_soft_state(statep, instance);

		/* retrieve interrupt block cookie */
		if (ddi_get_iblock_cookie(dip, inumber,
		    &xsp->iblock_cookie) != DDI_SUCCESS) {
			ddi_soft_state_free(statep, instance);
			return (DDI_FAILURE);
		}

		/*
		 * initialize locks. Note that mutex_init wants a
		 * ddi_iblock_cookie, not the _address_ of one,
		 * as the fourth argument.
		 */
		mutex_init(&xsp->mu, "xx mutex", MUTEX_DRIVER,
		    (void *)xsp->iblock_cookie);
		cv_init(&xsp->cv, "xx cv", CV_DRIVER, NULL);

		/* set up interrupt handler for the device */
		if (ddi_add_intr(dip, inumber, NULL,
		    &xsp->idevice_cookie, intr_handler, intr_handler_arg)
		    != DDI_SUCCESS) {
			cv_destroy(&xsp->cv);
			mutex_destroy(&xsp->mu);
			ddi_soft_state_free(statep, instance);
			return (DDI_FAILURE);
		}

		/* map device registers */
		if (ddi_regs_map_setup(dip, rnumber, &xsp->regp, offset,
		    sizeof (struct device_reg), &access_attr,
		    &xsp->data_access_handle) != DDI_SUCCESS) {
			ddi_remove_intr(dip, inumber, xsp->iblock_cookie);
			cv_destroy(&xsp->cv);
			mutex_destroy(&xsp->mu);
			ddi_soft_state_free(statep, instance);
			return (DDI_FAILURE);
		}
		xsp->dip = dip;
		/* initialize the rest of the software state structure */
		/* make device quiescent (device-specific) */

		/*
		 * for devices with programmable bus interrupt level:
		 * program device interrupt level using xsp->idevice_cookie
		 */

		/*
		 * if the device has power manageable components,
		 * include the following statements
		 */
		if (pm_create_components(dip, num_components) != DDI_SUCCESS)
			goto failed;
		for (i = 0; i < num_components; i++) {
			if (pm_idle_component(dip, i) == DDI_FAILURE)
				goto failed;
		}

		/*
		 * If the driver manages devices with "remote" hardware,
		 * suspend/resume will not be called unless requested, by
		 * setting the "pm_hardware_state" property to the value
		 * "needs_suspend_resume".
		 */
		if (ddi_prop_update_string(DDI_DEV_T_NONE, dip,
		    "pm_hardware_state", "needs_suspend_resume") !=
		    DDI_PROP_SUCCESS) {
			goto failed;
		}
		if (ddi_create_minor_node(dip, "minor name", S_IFCHR,
		    minor_number, node_type, 0) != DDI_SUCCESS)
			goto failed;

		/*
		 * initialize driver data, prepare for a later open of
		 * the device (device-specific)
		 */
		ddi_report_dev(dip);
		return (DDI_SUCCESS);

	case DDI_PM_RESUME:
		/* For information, see Chapter 8, Power Management */
	case DDI_RESUME:
		/* For information, see Chapter 8, Power Management */
	default:
		return (DDI_FAILURE);
	}
failed:
	/* free allocated resources */
	ddi_regs_map_free(&xsp->data_access_handle);
	ddi_remove_intr(dip, inumber, xsp->iblock_cookie);
	pm_destroy_components(dip);
	cv_destroy(&xsp->cv);
	mutex_destroy(&xsp->mu);
	ddi_soft_state_free(statep, instance);
	return (DDI_FAILURE);
}

During the autoconfiguration process, attach(9E) checks for the DDI_ATTACH command and then calls ddi_get_instance(9F) to get the instance number the system has assigned to the dev_info node indicated by dip. Since the driver must be able to return a pointer to its dev_info node for each instance, attach(9E) must save dip, usually in a field of a per-instance state structure.

If any of the resource allocation routines fail, the code at the failed label should free any resources that had already been allocated before returning DDI_FAILURE. This can be done with a series of checks that look like this:

	if (xsp->regp)
		ddi_regs_map_free(&xsp->data_access_handle);

There should be such a check and a deallocation operation for each allocation operation that may have been performed.
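The check-and-release pattern can be sketched in user-space C as follows. This is a minimal sketch under stated assumptions, not kernel code: struct xx_res, xx_unwind(), xx_setup(), and the fail_at parameter are all hypothetical stand-ins for the DDI allocation routines, introduced only so that the unwind logic can be exercised outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical record of which setup steps have completed. */
struct xx_res {
	bool	have_lock;	/* stands in for mutex_init()/cv_init() */
	bool	have_intr;	/* stands in for ddi_add_intr() */
	bool	have_regs;	/* stands in for ddi_regs_map_setup() */
	int	releases;	/* counts release operations performed */
};

/* Release only what was acquired, in reverse order of acquisition. */
static void
xx_unwind(struct xx_res *r)
{
	if (r->have_regs) {
		r->have_regs = false;
		r->releases++;
	}
	if (r->have_intr) {
		r->have_intr = false;
		r->releases++;
	}
	if (r->have_lock) {
		r->have_lock = false;
		r->releases++;
	}
}

/*
 * Simulated attach. Steps are numbered 1..3; fail_at names the step
 * that fails (0 means none). Returns 0 on success, -1 on failure.
 */
static int
xx_setup(struct xx_res *r, int fail_at)
{
	assert(fail_at >= 0 && fail_at <= 3);
	memset(r, 0, sizeof (*r));

	if (fail_at == 1)
		goto failed;
	r->have_lock = true;
	if (fail_at == 2)
		goto failed;
	r->have_intr = true;
	if (fail_at == 3)
		goto failed;
	r->have_regs = true;
	return (0);

failed:
	xx_unwind(r);
	return (-1);
}
```

Because every release is guarded by its own flag, a single failed label can serve every failure point, whichever step failed.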

Note also that drivers should return DDI_FAILURE for all commands they do not recognize.

Registering Interrupts Overview

In the call to ddi_add_intr(9F), inumber specifies which of several possible interrupt specifications is to be handled by intr_handler. For example, if the device interrupts at only one level, pass 0 for inumber. The interrupt specifications referred to by inumber are described by the interrupts property (see driver.conf(4), isa(4), eisa(4), mca(4), sysbus(4), vme(4), and sbus(4)). intr_handler is a pointer to a function, in this case xxintr(), to be called when the device issues the specified interrupt. intr_handler_arg is an argument of type caddr_t to be passed to intr_handler; it may be a pointer to a data structure representing the device instance that issued the interrupt. ddi_add_intr(9F) returns a device cookie in xsp->idevice_cookie for use with devices that have programmable bus-interrupt levels. The device cookie contains the following fields:

	u_short			idev_vector;
 	u_short			idev_priority;

The idev_priority field of the returned structure contains the bus interrupt priority level, and the idev_vector field contains the vector number for vectored bus architectures such as VMEbus.

Note -

There is a potential race condition in attach(9E). The interrupt routine is eligible to be called as soon as ddi_add_intr(9F) returns. This may result in the interrupt routine being called before any mutexes have been initialized with the interrupt block cookie. If the interrupt routine acquires the mutex before it has been initialized, undefined behavior may result. See "Registering Interrupts" for a solution to this problem.

Mapping Device Registers

In the ddi_regs_map_setup(9F) call, dip is the dev_info pointer passed to attach(9E). rnumber specifies which register set to map if there is more than one. For devices with only one register set, pass 0 for rnumber. The register specifications referred to by rnumber are described by the reg property (see driver.conf(4), isa(4), eisa(4), mca(4), sysbus(4), vme(4), sbus(4), and pci(4)). ddi_regs_map_setup(9F) maps a device register set (register specification) and returns a bus address base in xsp->regp. This address is offset bytes from the base of the device register set, and the mapping extends sizeof(struct device_reg) bytes beyond that. To map all of a register set, pass zero for both offset and length.
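The offset and length convention can be illustrated with a small user-space model. This is only a sketch of the rule stated above, not the behavior of ddi_regs_map_setup(9F) itself: struct xx_window and xx_map_window() are hypothetical names, and the treatment of a zero length combined with a nonzero offset is an assumption made to keep the model simple.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the window that a request covers. */
struct xx_window {
	size_t	start;	/* first byte of the register set that is mapped */
	size_t	len;	/* number of bytes mapped */
};

/*
 * Model of the convention: offset 0 with len 0 selects the whole
 * register set; otherwise the window is [offset, offset + len).
 * A zero len with a nonzero offset is not modeled and is rejected.
 * Returns 0 on success, -1 if the request does not fit in the set.
 */
static int
xx_map_window(size_t set_size, size_t offset, size_t len,
    struct xx_window *w)
{
	assert(w != NULL);

	if (offset == 0 && len == 0) {
		w->start = 0;
		w->len = set_size;
		return (0);
	}
	if (len == 0 || offset >= set_size || len > set_size - offset)
		return (-1);
	w->start = offset;
	w->len = len;
	return (0);
}
```

For a 64-byte register set, a request with offset 16 and length 8 covers bytes 16 through 23, while a request with both values zero covers all 64 bytes.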

Minor Device Nodes

A minor device node contains the information exported by the device that the system uses to create a special file for the device under /devices in the file system.

In the call to ddi_create_minor_node(9F), the minor name is the character string that is the last part of the base name of the special file to be created for this minor device number; for example, "b,raw" in "fd@1,f7200000:b,raw". S_IFCHR means create a character special file. Finally, the node type is one of the following system macros, or any string constant that does not conflict with the values of these macros (see ddi_create_minor_node(9F) for more information).

Table 5-1 Possible Node Types


Constant	Description
`DDI_NT_SERIAL`	Serial port
`DDI_NT_SERIAL_DO`	Dialout ports
`DDI_NT_BLOCK`	Hard disks
`DDI_NT_BLOCK_CHAN`	Hard disks with channel or target numbers
`DDI_NT_CD`	ROM drives (CD-ROM)
`DDI_NT_CD_CHAN`	ROM drives with channel or target numbers
`DDI_NT_FD`	Floppy disks
`DDI_NT_TAPE`	Tape drives
`DDI_NT_NET`	Network devices
`DDI_NT_DISPLAY`	Display devices
`DDI_NT_MOUSE`	Mouse
`DDI_NT_KEYBOARD`	Keyboard
`DDI_PSEUDO`	General pseudo devices

The node types DDI_NT_BLOCK, DDI_NT_BLOCK_CHAN, DDI_NT_CD, and DDI_NT_CD_CHAN cause disks(1M) to identify the device instance as a disk and to create a symbolic link in the /dev/dsk or /dev/rdsk directory pointing to the device node in the /devices directory tree.

The node type DDI_NT_TAPE causes tapes(1M) to identify the device instance as a tape and to create a symbolic link from the /dev/rmt directory to the device node in the /devices directory tree.

The node type DDI_NT_SERIAL causes ports(1M) to identify the device instance as a serial port, to create symbolic links from the /dev/term and /dev/cua directories to the device node in the /devices directory tree, and to add a new entry to /etc/inittab.

Vendor-supplied strings should include an identifying value that makes them unique, such as the vendor's name or stock symbol (if appropriate). The string (along with the other node types not consumed by disks(1M), tapes(1M), or ports(1M)) can be used in conjunction with devlinks(1M) and devlink.tab(4) to create logical names in /dev.

Deferred Attach

open(9E) might be called before attach(9E) has succeeded. open(9E) must then return ENXIO, which will cause the system to attempt to attach the device. If the attach succeeds, the open is retried automatically.
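The driver's side of this protocol can be sketched in user-space C. The sketch assumes, as one common arrangement, that a missing per-instance state structure is how the driver detects that attach has not yet succeeded. The table xx_states, the XX_MAX_INSTANCES limit, and the functions xx_open() and xx_attach() are hypothetical stand-ins for the soft state routines and the real entry points.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	XX_MAX_INSTANCES	4	/* hypothetical table size */

/* Hypothetical per-instance state. */
struct xx_state {
	int	opened;
};

/* A NULL slot means that attach has not succeeded for that instance. */
static struct xx_state *xx_states[XX_MAX_INSTANCES];

/* Simulated open: refuse with ENXIO until the instance is attached. */
static int
xx_open(unsigned int instance)
{
	struct xx_state *xsp;

	if (instance >= XX_MAX_INSTANCES)
		return (ENXIO);
	xsp = xx_states[instance];
	if (xsp == NULL)
		return (ENXIO);
	xsp->opened = 1;
	return (0);
}

/* Simulated attach: publish the state structure for the instance. */
static void
xx_attach(unsigned int instance, struct xx_state *storage)
{
	assert(instance < XX_MAX_INSTANCES && storage != NULL);
	storage->opened = 0;
	xx_states[instance] = storage;
}
```

In this model the first xx_open() on an unattached instance returns ENXIO, and the same call succeeds once xx_attach() has run, which mirrors the retry described above.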

`detach()`

detach(9E) handles the following commands:

DDI_DETACH - detach(9E) is called with DDI_DETACH for each device instance when the system attempts to unload a driver module.
DDI_PM_SUSPEND - detach(9E) is called with DDI_PM_SUSPEND to suspend activity of a device before power is removed. This command is issued when the device is being suspended.
DDI_SUSPEND - detach(9E) is called with DDI_SUSPEND to suspend activity of a device before power is removed. This command is issued when the entire system is being suspended.

This section discusses only the DDI_DETACH command. For information on DDI_PM_SUSPEND and DDI_SUSPEND, see Chapter 8, Power Management. Note that detach(9E) is single-threaded when processing the DDI_DETACH command, but is not single-threaded when processing the DDI_SUSPEND or DDI_PM_SUSPEND commands.

When processing the DDI_DETACH command, detach(9E) is the inverse operation of attach(9E). The main purpose of the DDI_DETACH case of detach(9E) is to free resources allocated by attach(9E) for the specified device. For example, detach(9E) should unmap any mapped device registers, remove any interrupts registered with the system, and free the soft state structure for this device instance.

The system calls the DDI_DETACH case of detach(9E) for a device instance only if the device instance is not open. No calls to other driver entry points for that device instance occur during detach(9E), although interrupts and time-outs may occur.

If the detach(9E) routine entry in the dev_ops(9S) structure is initialized to nodev, it implies that detach(9E) always fails, and the driver will not be unloaded. This is the simplest way to specify that a driver is not unloadable.

Example 5-5 detach(9E) Routine

static int
xxdetach(dev_info_t *dip, ddi_detach_cmd_t cmd)
{
	struct xxstate	*xsp;
	int		instance;

	switch (cmd) {
	case DDI_DETACH:
		instance = ddi_get_instance(dip);
		xsp = ddi_get_soft_state(statep, instance);
		/* make device quiescent (device-specific) */
		ddi_remove_minor_node(dip, NULL);
		pm_destroy_components(dip);
		ddi_regs_map_free(&xsp->data_access_handle);
		ddi_remove_intr(dip, inumber, xsp->iblock_cookie);
		mutex_destroy(&xsp->mu);
		cv_destroy(&xsp->cv);
		ddi_soft_state_free(statep, instance);
		return (DDI_SUCCESS);
	case DDI_PM_SUSPEND:
		/* For information, see Chapter 8, Power Management */
	case DDI_SUSPEND:
		/* For information, see Chapter 8, Power Management */
	default:
		return (DDI_FAILURE);
	}
}

In the call to ddi_regs_map_free(9F), xsp->data_access_handle is the data access handle previously allocated by the call to ddi_regs_map_setup(9F) in attach(9E). Similarly, in the call to ddi_remove_intr(9F), inumber is the same value that was passed to ddi_add_intr(9F).

Callbacks

The detach(9E) routine must not return DDI_SUCCESS while it has callback functions pending. This is critical only for callbacks registered for device instances that are not currently open, since the DDI_DETACH case is not entered if the device is open.

There are two types of callback routines of interest: callbacks that can be canceled, and callbacks that must run to completion. Callbacks that can be canceled do not pose a problem; the driver should cancel the callback before detach(9E) returns DDI_SUCCESS. Each of the callback cancellation routines in Table 5-2 atomically cancels callbacks so that a callback routine does not run while it is being canceled.

Table 5-2 Example of Functions With Cancelable Callbacks


Function	Canceling Function
timeout(9F)	untimeout(9F)
bufcall(9F)	unbufcall(9F)
esbbcall(9F)	unbufcall(9F)

Some callbacks cannot be canceled. For these, the driver must wait until the callback has been called. In some cases, such as ddi_dma_buf_bind_handle(9F), the callback must also be prevented from rescheduling itself. See "Canceling DMA Callbacks" for an example.

Following is a list of some functions that may establish callbacks that cannot be canceled:

esballoc(9F)
ddi_dma_addr_bind_handle(9F)
ddi_dma_buf_bind_handle(9F)
scsi_init_pkt(9F)
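One way to track callbacks that cannot be canceled is a per-instance count of outstanding callbacks plus a flag that blocks rescheduling. The following user-space sketch illustrates the bookkeeping only. All names in it (struct xx_state, xx_request_callback(), xx_callback(), xx_detach(), and the XX_ constants) are hypothetical, and a real driver would protect these fields with its per-instance mutex.

```c
#include <assert.h>

#define	XX_SUCCESS	0	/* stands in for DDI_SUCCESS */
#define	XX_FAILURE	(-1)	/* stands in for DDI_FAILURE */

/* Hypothetical per-instance callback bookkeeping. */
struct xx_state {
	int	pending;	/* callbacks requested but not yet run */
	int	detaching;	/* set once detach begins */
};

/* Record a callback request; refused once detach has begun. */
static int
xx_request_callback(struct xx_state *xsp)
{
	if (xsp->detaching)
		return (XX_FAILURE);
	xsp->pending++;
	return (XX_SUCCESS);
}

/* Body of the callback: once it has run, it is no longer pending. */
static void
xx_callback(struct xx_state *xsp)
{
	assert(xsp->pending > 0);
	xsp->pending--;
}

/*
 * Simulated detach: succeeds only when nothing is outstanding.
 * The detaching flag stays set after a failure so that callbacks
 * cannot reschedule themselves before the next attempt.
 */
static int
xx_detach(struct xx_state *xsp)
{
	xsp->detaching = 1;
	if (xsp->pending > 0)
		return (XX_FAILURE);
	return (XX_SUCCESS);
}
```

In this sketch a detach attempted while a callback is outstanding fails, and a later attempt succeeds once the callback has run. Clearing the flag when a detach is abandoned is not modeled.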

`getinfo()`

The system calls getinfo(9E) to obtain configuration information that only the driver knows. The mapping of minor numbers to device instances is entirely under the control of the driver. The system sometimes needs to ask the driver which device a particular dev_t represents.

getinfo(9E) is called during module loading and at other times during the life of the driver. It can take one of two commands as its infocmd argument: DDI_INFO_DEVT2INSTANCE, which asks for a device's instance number, and DDI_INFO_DEVT2DEVINFO, which asks for a pointer to the device's dev_info structure.

In the DDI_INFO_DEVT2INSTANCE case, arg is a dev_t, and getinfo(9E) must translate the minor number to an instance number. In the following example, the minor number is the instance number, so it simply passes back the minor number. In this case, the driver must not assume that a state structure is available, since getinfo(9E) may be called before attach(9E). The mapping the driver defines between minor device number and instance number does not necessarily follow the mapping shown in the example. In all cases, however, the mapping must be static.

In the DDI_INFO_DEVT2DEVINFO case, arg is again a dev_t, so getinfo(9E) first decodes the instance number for the device. It then passes back the dev_info pointer saved in the driver's soft state structure for the appropriate device. This is shown in Example 5-6.

Example 5-6 getinfo(9E) Routine

static int
xxgetinfo(dev_info_t *dip, ddi_info_cmd_t infocmd, void *arg,
    void **result)
{
	struct xxstate	*xsp;
	dev_t		dev;
	int		instance, error;

	switch (infocmd) {
	case DDI_INFO_DEVT2INSTANCE:
		/* the minor number is the instance number */
		dev = (dev_t)arg;
		*result = (void *)(uintptr_t)getminor(dev);
		error = DDI_SUCCESS;
		break;

	case DDI_INFO_DEVT2DEVINFO:
		dev = (dev_t)arg;
		instance = getminor(dev);
		xsp = ddi_get_soft_state(statep, instance);
		if (xsp == NULL)	/* instance is not attached */
			return (DDI_FAILURE);
		*result = (void *)xsp->dip;
		error = DDI_SUCCESS;
		break;

	default:
		error = DDI_FAILURE;
		break;
	}
	return (error);
}
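
The minor-to-instance mapping need not be the identity mapping used in Example 5-6, provided that it is static. As a hedged illustration, the user-space sketch below packs a unit number into the low bits of the minor number and the instance number into the remaining bits. The bit layout, the XX_UNIT_BITS value, and the function names are assumptions made for the example, not part of the DDI.

```c
#include <assert.h>

/*
 * Hypothetical layout: the low XX_UNIT_BITS bits of the minor number
 * select a unit within the device; the remaining bits hold the
 * instance number.
 */
#define	XX_UNIT_BITS	3u
#define	XX_UNIT_MASK	((1u << XX_UNIT_BITS) - 1u)

/* Build the minor number for a given instance and unit. */
static unsigned int
xx_make_minor(unsigned int instance, unsigned int unit)
{
	assert(unit <= XX_UNIT_MASK);
	return ((instance << XX_UNIT_BITS) | unit);
}

/* Recover the instance number; needs no per-instance state. */
static unsigned int
xx_minor_to_instance(unsigned int minor)
{
	return (minor >> XX_UNIT_BITS);
}

/* Recover the unit number within the instance. */
static unsigned int
xx_minor_to_unit(unsigned int minor)
{
	return (minor & XX_UNIT_MASK);
}
```

Because the decoding is pure arithmetic on the minor number, a DDI_INFO_DEVT2INSTANCE request could be answered this way without a state structure, which matters because getinfo(9E) may be called before attach(9E).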