Writing Device Drivers

Device Power Management Model

The following sections describe the device power management model in detail. The model consists of five elements: components, idleness, power levels, dependency, and policy.

Components

In the power management model, each device is composed of zero or more power-manageable components. If a device has no components, then the device is not power manageable.

Components correspond to parts of the device that can be put into a state that requires less power than normal. The driver writer decides which components a device implements, with one exception: component zero must represent all parts of the device that have hardware state that would be lost if power were completely removed from the device.

The device driver notifies the system of the device components by calling pm_create_components(9F) in its attach(9E) entry point as part of driver initialization.

Idleness

Each component of a device may be in one of two states: busy or idle. The device driver notifies the framework of changes in the device state by calling pm_busy_component(9F) and pm_idle_component(9F).

Power Levels

The current implementation of the Device Power Management framework tracks only two power levels for each component: its current power level and its normal power level. The normal power level of a component is the power level required for normal operation of the component, and is the level to which the framework returns the component when the device is needed. The device driver informs the framework of the normal power level of the component by calling pm_set_normal_power(9F).

The current power level of the component is the power level at which the component is currently operating. The device driver should ensure that the component is set to the normal power level at initialization time. The framework assumes that when a device attaches it will be operating at its normal power level until the framework power manages it.

Power-level values that represent powered-on states must be positive integers. A value of zero means the device has been set to the lowest operating power available.

Dependency

A device component might depend on one or more other devices. A device component depends on another device if the component can be powered off only when all the components of all the devices it depends on are also powered off. For example, the component of the frame buffer device that represents the monitor depends on the mouse and keyboard devices. The frame buffer monitor component can thus only be powered off when both the mouse and keyboard devices are powered off.

The power.conf(4) file specifies the dependencies among devices.
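The dependency rule can be modeled in a few lines of ordinary C. The following is a user-space sketch of the rule only, not framework code. The structure and function names (model_dev, dev_is_off, can_power_off) are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define	MAX_COMPONENTS	4

/* Hypothetical model of a device: the power level of each component. */
struct model_dev {
	int ncomponents;
	int level[MAX_COMPONENTS];	/* 0 means powered off */
};

/* Return 1 if every component of the device is at power level 0. */
static int
dev_is_off(struct model_dev *dev)
{
	int i;

	for (i = 0; i < dev->ncomponents; i++) {
		if (dev->level[i] != 0)
			return (0);
	}
	return (1);
}

/*
 * Return 1 if a component may be powered off, that is, if all the
 * components of all the devices it depends on are powered off.
 */
static int
can_power_off(struct model_dev **deps, size_t ndeps)
{
	size_t i;

	for (i = 0; i < ndeps; i++) {
		if (!dev_is_off(deps[i]))
			return (0);
	}
	return (1);
}
```

In the frame buffer example, deps would hold the keyboard and mouse devices, and can_power_off() would return 1 for the monitor component only after both are at power level 0.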

Policy

The power.conf(4) file lists the devices that may be powered off and specifies dependencies between devices. Associated with each component of a device is a threshold of idle time. The threshold for each power-manageable device component is also specified in the power.conf(4) file.

The system checks the state of each device specified in power.conf(4). When a component has been idle for threshold seconds and all the devices that it depends on are powered off, that component of the device is set to power level zero.
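The policy check described above can be written as a small predicate. This is a user-space sketch under stated assumptions, not the framework's implementation: the comp_status structure and the should_power_off() function are hypothetical names, and the comparison assumes that reaching the threshold exactly is enough to qualify.

```c
#include <assert.h>

/* Hypothetical snapshot of one component as the policy check sees it. */
struct comp_status {
	int busy;		/* nonzero while the component is marked busy */
	int idle_secs;		/* seconds since the component became idle */
	int threshold_secs;	/* idle threshold taken from power.conf(4) */
	int deps_off;		/* nonzero if every device it depends on is off */
};

/* Return 1 if the component qualifies for power level zero. */
static int
should_power_off(const struct comp_status *cs)
{
	if (cs->busy)
		return (0);	/* busy components are never powered off */
	if (!cs->deps_off)
		return (0);	/* a device it depends on is still powered */
	return (cs->idle_secs >= cs->threshold_secs);
}
```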

Device Power Management Interfaces

A device driver that supports a device with power-manageable components must notify the system of the existence of these components and their normal power values, and notify the system of the component state transitions from idle to busy and vice versa.

The notification of the existence of the components and their normal power values is typically done in the driver's attach(9E) entry point as part of driver initialization. The following interfaces handle creating and destroying device components and setting and getting the normal power levels of device components.

pm_create_components()

	int pm_create_components(dev_info_t *dip, int components);

pm_create_components(9F) notifies the system that the device indicated by dip has the number of components indicated by components. This function is called in the attach(9E) routine of the device driver.

pm_destroy_components()

	void pm_destroy_components(dev_info_t *dip);

pm_destroy_components(9F) removes all the components associated with the device indicated by dip from the system. This function is called in the detach(9E) routine.

pm_set_normal_power()

	void pm_set_normal_power(dev_info_t *dip, int component, int level);

pm_set_normal_power(9F) sets the normal power level for the specified component. Whenever the system turns the component on again, it calls into the driver to set the current power level to the normal power level.

pm_get_normal_power()

	int pm_get_normal_power(dev_info_t *dip, int component);

pm_get_normal_power(9F) retrieves the current setting of the normal power level for a component.
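As an illustration of how these interfaces fit together at initialization time, the following sketch registers two components and sets their normal power levels. So that it compiles outside the kernel, the framework functions are replaced by simple user-space stand-ins. Those stand-ins, the helper xx_pm_init(), and the constants XX_NCOMPONENTS and XX_POWER_ON are assumptions made for this example. In a real driver the calls would be made from attach(9E) after including <sys/ddi.h> and <sys/sunddi.h>.

```c
#include <assert.h>

/* ---- User-space stand-ins (assumptions, not the real framework) ---- */
typedef struct dev_info { int unused; } dev_info_t;
#define	DDI_SUCCESS	0
#define	DDI_FAILURE	(-1)
#define	STUB_MAX_COMPONENTS	4

static int stub_ncomponents;
static int stub_normal_power[STUB_MAX_COMPONENTS];

static int
pm_create_components(dev_info_t *dip, int components)
{
	(void) dip;
	if (components < 0 || components > STUB_MAX_COMPONENTS)
		return (DDI_FAILURE);
	stub_ncomponents = components;
	return (DDI_SUCCESS);
}

static void
pm_set_normal_power(dev_info_t *dip, int component, int level)
{
	(void) dip;
	stub_normal_power[component] = level;
}

static int
pm_get_normal_power(dev_info_t *dip, int component)
{
	(void) dip;
	return (stub_normal_power[component]);
}

/* ---- Driver side: what attach(9E) might do ---- */
#define	XX_NCOMPONENTS	2	/* hypothetical component count */
#define	XX_POWER_ON	1	/* hypothetical normal power level */

static int
xx_pm_init(dev_info_t *dip)
{
	int i;

	/* Tell the framework how many components the device has. */
	if (pm_create_components(dip, XX_NCOMPONENTS) != DDI_SUCCESS)
		return (DDI_FAILURE);

	/* Record the level each component needs for normal operation. */
	for (i = 0; i < XX_NCOMPONENTS; i++)
		pm_set_normal_power(dip, i, XX_POWER_ON);

	return (DDI_SUCCESS);
}
```

The matching cleanup in detach(9E) would be a single call to pm_destroy_components(9F).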

Busy-Idle State Transitions

The driver must keep the framework informed of device state transitions from idle to busy or busy to idle. Where these transitions happen is entirely device specific. Some components are created and marked busy and never change. Some are created and never marked busy (components created by pm_create_components(9F) are created in an idle state). For example, a frame buffer currently supports two components: component 0 represents the frame buffer electronics and is always busy, and component 1 represents the monitor and is always idle (but dependent on the keyboard and mouse).


Note -

Component 0 represents the state of the device that would be lost if power is removed.


Some devices, such as the keyboard and mouse, are never marked busy but have their idle time reset each time a keystroke or mouse event is processed. The transitions from idle to busy and from busy to idle depend on the nature of the device and the abstraction represented by the specific component. For example, SCSI disk target drivers typically export a single component, which represents whether the SCSI target disk drive is spun up or not. It is marked busy whenever there is an outstanding request to the drive and idle when the last queued request finishes.

The following interfaces notify the Power Management framework of busy-idle state transitions.

pm_busy_component()

	int pm_busy_component(dev_info_t *dip, int component);

pm_busy_component(9F) marks the component as busy.

While the component is busy, it will not be powered off. If the component is already powered off, marking it busy does not change its power level. To power the component back on, the driver must call ddi_dev_is_needed(9F). Calls to pm_busy_component(9F) are stacked and require a corresponding number of calls to pm_idle_component(9F) to idle the component.

pm_idle_component()

	int pm_idle_component(dev_info_t *dip, int component);

pm_idle_component(9F) marks the component as idle. An idle component is subject to being powered off.

pm_idle_component(9F) must be called once for each call to pm_busy_component(9F) in order to idle the component.
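The stacking behavior can be illustrated with a driver that marks its component busy once per outstanding request, as in the SCSI disk example. The sketch below runs in user space: the two framework functions are replaced by stand-ins that keep a simple counter, which is an assumption about how the stacking might be modeled and not the framework's actual implementation. The names stub_busy_count, stub_component_is_idle(), xx_request_start(), and xx_request_done() are hypothetical.

```c
#include <assert.h>

/* ---- User-space stand-ins (assumptions, not the real framework) ---- */
typedef struct dev_info { int unused; } dev_info_t;
#define	DDI_SUCCESS	0
#define	DDI_FAILURE	(-1)

static int stub_busy_count;	/* models the stacked busy calls */

static int
pm_busy_component(dev_info_t *dip, int component)
{
	(void) dip;
	(void) component;
	stub_busy_count++;
	return (DDI_SUCCESS);
}

static int
pm_idle_component(dev_info_t *dip, int component)
{
	(void) dip;
	(void) component;
	if (stub_busy_count == 0)
		return (DDI_FAILURE);	/* unbalanced call in this model */
	stub_busy_count--;
	return (DDI_SUCCESS);
}

/* The component is idle only when every busy call has been matched. */
static int
stub_component_is_idle(void)
{
	return (stub_busy_count == 0);
}

/* ---- Driver side ---- */

/* Called when a request is queued to the device. */
static void
xx_request_start(dev_info_t *dip)
{
	(void) pm_busy_component(dip, 0);
	/* hand the request to the hardware here */
}

/* Called when the device reports that a request has completed. */
static void
xx_request_done(dev_info_t *dip)
{
	/* finish request processing here */
	(void) pm_idle_component(dip, 0);
}
```

With two requests outstanding, the component stays busy after the first completion and becomes idle only after the second.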

Device Power State Transitions

A device driver can call ddi_dev_is_needed(9F) to request that a component be set to a given power level. This is necessary before using a component that has been powered off. For example, if the disk has been powered off, a SCSI disk target driver's read(9E) or write(9E) routine might need to spin up the disk before it can complete the read or write. ddi_dev_is_needed(9F) notifies the Power Management framework of device state transitions.

ddi_dev_is_needed()

	int ddi_dev_is_needed(dev_info_t *dip, int component, int level);

ddi_dev_is_needed(9F) is called when the driver discovers that a component needed for some operation has been powered off. This interface arranges for the driver to be called to set the current power level of the component to the level specified in the request. All the devices that depend on this component are also brought back to normal power by this call.

When a component has been powered off by pm(7D) and a request or interrupt occurs that requires the component to be powered up, the driver must call ddi_dev_is_needed(9F) so that the framework can restore the component (and all of the devices that depend on it) to normal power.
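The following sketch shows where such a call might sit in a driver's I/O path. It is written to compile in user space, so ddi_dev_is_needed() is replaced by a stand-in that simply calls the driver's power routine. The real framework also restores the devices that depend on the component, which this model omits. The names xx_power_level, xx_start_io(), XX_POWER_OFF, and XX_POWER_ON are hypothetical.

```c
#include <assert.h>

typedef struct dev_info { int unused; } dev_info_t;	/* stand-in type */
#define	DDI_SUCCESS	0
#define	DDI_FAILURE	(-1)

#define	XX_POWER_OFF	0
#define	XX_POWER_ON	1	/* hypothetical normal power level */

/* Driver's record of the current power level of component 0. */
static int xx_power_level = XX_POWER_OFF;

/* Driver's power(9E) routine, reduced to bookkeeping for this sketch. */
static int
xxpower(dev_info_t *dip, int component, int level)
{
	(void) dip;
	(void) component;
	/* device-specific power change would happen here */
	xx_power_level = level;
	return (DDI_SUCCESS);
}

/*
 * Stand-in for ddi_dev_is_needed(9F): arranges for the driver to be
 * called to set the component to the requested level.
 */
static int
ddi_dev_is_needed(dev_info_t *dip, int component, int level)
{
	return (xxpower(dip, component, level));
}

/* Power the component up if necessary before touching the hardware. */
static int
xx_start_io(dev_info_t *dip)
{
	if (xx_power_level == XX_POWER_OFF) {
		if (ddi_dev_is_needed(dip, 0, XX_POWER_ON) != DDI_SUCCESS)
			return (DDI_FAILURE);
	}
	/* the component is powered; the request can be issued here */
	return (DDI_SUCCESS);
}
```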

Entry Points Used by Device Power Management

The Power Management framework uses the following entry points:

power()

	int xxpower(dev_info_t *dip, int component, int level);

The system calls the power(9E) entry point (either directly or as a result of a call to ddi_dev_is_needed(9F)) when it determines that a component's current power level needs to be changed. The action taken by this entry point is device driver specific. In the example of the SCSI target disk driver mentioned previously, setting the power level of the component to 0 results in sending a SCSI command to spin down the disk, while setting the power level to the normal power level results in sending a SCSI command to spin up the disk. Example 8-1 shows a sample power(9E) routine.


Example 8-1 power(9E) Routine

int
xxpower(dev_info_t *dip, int component, int level)
{
	struct xxstate *xsp;
	int instance;

	instance = ddi_get_instance(dip);
	xsp = ddi_get_soft_state(statep, instance);

	/*
	 * Make sure that the request is valid
	 */
	if (!xx_valid_power_level(component, level))
		return (DDI_FAILURE);

	mutex_enter(&xsp->mu);
	if (xsp->xx_power_level[component] != level) {
		/* device- and component-specific setting of power level */
		xsp->xx_power_level[component] = level;
	}
	mutex_exit(&xsp->mu);
	return (DDI_SUCCESS);
}

detach()

	int detach(dev_info_t *dip, ddi_detach_cmd_t cmd);

Before the system sets component 0 (entire device) to power level 0, it calls the driver's detach(9E) entry point with a detach command of DDI_PM_SUSPEND to allow the driver to save all hardware state to memory.

If the device is busy and has outstanding operations, it should fail the detach(9E) call. The framework will try again later after the device has been idle for its threshold time. Otherwise, the driver must arrange to block all subsequent accesses to the hardware until the device has been resumed (which the driver can initiate by calling ddi_dev_is_needed(9F)), and save all hardware state to memory.

Example 8-2 shows an example of a detach(9E) routine with DDI_PM_SUSPEND implemented.


Example 8-2 detach(9E) Routine Showing the Use of DDI_PM_SUSPEND

int
xxdetach(dev_info_t *dip, ddi_detach_cmd_t cmd)
{
	struct xxstate *xsp;
	int instance;

	instance = ddi_get_instance(dip);
	xsp = ddi_get_soft_state(statep, instance);

	switch (cmd) {
	case DDI_DETACH:
		/* code not shown; see Chapter 5, Autoconfiguration, for discussion */

	case DDI_SUSPEND:
		/* code not shown; see Example 8-4 */

	case DDI_PM_SUSPEND:
		/*
		 * We won't be called with DDI_PM_SUSPEND when already called
		 * with DDI_SUSPEND.
		 */
		mutex_enter(&xsp->mu);
		if (xsp->xx_busy) {
			mutex_exit(&xsp->mu);
			return (DDI_FAILURE);
		}

		xsp->xx_pm_suspended = 1;
		/* save device register contents into xsp->xx_device_state */

		/*
		 * This section is optional, only needed if the driver
		 * maintains a running timeout (but be sure to drop the
		 * mutex in any case).
		 */
		/* cancel timeouts */
		if (xsp->xx_timeout_id) {
			timeout_id_t temp_timeout_id = xsp->xx_timeout_id;

			xsp->xx_timeout_id = 0;
			mutex_exit(&xsp->mu);
			untimeout(temp_timeout_id);
		} else {
			mutex_exit(&xsp->mu);
		}
		return (DDI_SUCCESS);

	default:
		return (DDI_FAILURE);
	}
}

attach()

	int attach(dev_info_t *dip, ddi_attach_cmd_t cmd);

When a device that has been suspended is needed again, its power(9E) entry point is called to restore the power level of component 0 (entire device) to its normal power. The driver's attach(9E) entry point is then called with an attach command value of DDI_PM_RESUME to restore the device hardware state saved in the detach(9E) routine and unblock any pending operations. Example 8-3 shows an attach(9E) routine with DDI_PM_RESUME implemented.


Example 8-3 attach(9E) Showing the Use of DDI_PM_RESUME

int
xxattach(dev_info_t *dip, ddi_attach_cmd_t cmd)
{
	struct xxstate *xsp;
	int instance;

	instance = ddi_get_instance(dip);
	xsp = ddi_get_soft_state(statep, instance);

	switch (cmd) {
	case DDI_ATTACH:
		/* code not shown; see Chapter 5, Autoconfiguration, for discussion */

	case DDI_RESUME:
		/* code not shown; see Example 8-5 for DDI_RESUME implementation */

	case DDI_PM_RESUME:
		/*
		 * We won't be DDI_PM_RESUMEd while DDI_SUSPENDed
		 */
		mutex_enter(&xsp->mu);
		/* restore device register contents from xsp->xx_device_state */

		/*
		 * This section is optional, only needed if the driver
		 * maintains a running timeout.
		 */
		/* restart timeouts (arguments elided) */
		xsp->xx_timeout_id = timeout({...});

		xsp->xx_pm_suspended = 0;	/* allow new operations */
		cv_broadcast(&xsp->cv);
		mutex_exit(&xsp->mu);
		return (DDI_SUCCESS);

	default:
		return (DDI_FAILURE);
	}
}