Writing Device Drivers

Device Power Management Interfaces

A device driver that supports a device with power-manageable components must notify the system of the existence of these components and the power levels that they support by creating a pm-components(9P) property. This is typically done from the driver's attach(9E) entry point by calling ddi_prop_update_string_array(9F), but may be done from a driver.conf(4) file instead. See the pm-components(9P) man page for details.
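As a rough illustration of the attach-time approach, the following sketch builds a pm-components string array for a hypothetical one-component device and registers it. The `xx_` names, the component name, and the level names are invented for illustration. The `dev_info_t` structure and the body of `ddi_prop_update_string_array()` are userland stand-ins so that the sketch compiles outside the kernel; a real driver would rely on the kernel headers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>  /* dev_t */

/*
 * Userland stand-ins (assumptions) so the sketch compiles outside the
 * kernel. A real driver gets these from <sys/ddi.h> and <sys/sunddi.h>.
 */
#define DDI_SUCCESS       0
#define DDI_FAILURE       (-1)
#define DDI_PROP_SUCCESS  0
#define DDI_DEV_T_NONE    ((dev_t)-1)

typedef struct dev_info {
    const char *prop_name;      /* last property name recorded */
    char **prop_data;           /* last string array recorded */
    unsigned int prop_nelem;    /* number of strings recorded */
} dev_info_t;

/* Stand-in that only records its arguments on the mock dev_info. */
static int
ddi_prop_update_string_array(dev_t dev, dev_info_t *dip, char *name,
    char **data, unsigned int nelements)
{
    (void) dev;
    assert(dip != NULL);
    dip->prop_name = name;
    dip->prop_data = data;
    dip->prop_nelem = nelements;
    return (DDI_PROP_SUCCESS);
}

/* Hypothetical device: one component with two power levels. */
static char *xx_pmc[] = {
    "NAME=spindle-motor",
    "0=off",
    "1=full-speed"
};

/* Intended to be called from the hypothetical driver's attach(9E). */
static int
xx_create_pm_components(dev_info_t *dip)
{
    if (ddi_prop_update_string_array(DDI_DEV_T_NONE, dip, "pm-components",
        xx_pmc, sizeof (xx_pmc) / sizeof (xx_pmc[0])) != DDI_PROP_SUCCESS)
        return (DDI_FAILURE);
    return (DDI_SUCCESS);
}
```

The first string names the component and each following string pairs a numeric power level with a descriptive name; the pm-components(9P) man page remains the authority on the exact format.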

Busy-Idle State Transitions

The driver must keep the framework informed of device state transitions from idle to busy and from busy to idle. Where these transitions occur is entirely device-specific: it depends on the nature of the device and on the abstraction that the specific component represents. For example, SCSI disk target drivers typically export a single component, which represents whether the SCSI target disk drive is spun up. The component is marked busy whenever a request to the drive is outstanding and idle when the last queued request finishes. Some components are created and never marked busy, because components created by pm-components(9P) start out in the idle state.

The pm_busy_component(9F) and pm_idle_component(9F) interfaces notify the power management framework of busy-idle state transitions. The syntax for pm_busy_component(9F) is:

    int pm_busy_component(dev_info_t *dip, int component);

pm_busy_component(9F) marks component as busy. While the component is busy, it will not be powered off. If the component is already powered off, marking it busy does not change its power level; to bring the component back up, the driver must call pm_raise_power(9F). Calls to pm_busy_component(9F) are cumulative: each call must be matched by a call to pm_idle_component(9F) before the component becomes idle.

The syntax for pm_idle_component(9F) is:

    int pm_idle_component(dev_info_t *dip, int component);

pm_idle_component(9F) marks component as idle. An idle component is subject to being powered off. pm_idle_component(9F) must be called once for each call to pm_busy_component(9F) in order to idle the component.
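The cumulative busy/idle pairing can be sketched as follows. This is a minimal userland model, not kernel code: the `dev_info_t` structure and the bodies of `pm_busy_component()` and `pm_idle_component()` are stand-ins that only keep a counter, and the `xx_` routines are hypothetical driver helpers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userland stand-ins (assumptions) so the sketch compiles outside the
 * kernel. A real driver includes <sys/ddi.h> and <sys/sunddi.h> and
 * calls the framework's own routines.
 */
#define DDI_SUCCESS  0
#define DDI_FAILURE  (-1)

typedef struct dev_info {
    int busycnt;    /* models the busy count kept for component 0 */
} dev_info_t;

static int
pm_busy_component(dev_info_t *dip, int component)
{
    assert(dip != NULL);
    if (component != 0)
        return (DDI_FAILURE);
    dip->busycnt++;
    return (DDI_SUCCESS);
}

static int
pm_idle_component(dev_info_t *dip, int component)
{
    assert(dip != NULL);
    if (component != 0)
        return (DDI_FAILURE);
    if (dip->busycnt > 0)   /* stand-in behavior: never drop below zero */
        dip->busycnt--;
    return (DDI_SUCCESS);
}

#define XX_COMPONENT  0     /* hypothetical single component */

/* Hypothetical: called when a request is handed to the device. */
static void
xx_start_request(dev_info_t *dip)
{
    (void) pm_busy_component(dip, XX_COMPONENT);
    /* ... issue the request to the hardware ... */
}

/* Hypothetical: called when the device completes a request. */
static void
xx_request_done(dev_info_t *dip)
{
    /* ... complete the request ... */
    (void) pm_idle_component(dip, XX_COMPONENT);
}

/* Model-only helper: true once every busy call has been matched. */
static int
xx_component_is_idle(const dev_info_t *dip)
{
    return (dip->busycnt == 0);
}
```

With two requests outstanding, the component in this model stays busy after the first completion and becomes idle only after the second.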

Device Power State Transitions

A device driver can call pm_raise_power(9F) to request that a component be set to at least a given power level. This is necessary before using a component that has been powered off. For example, a SCSI disk target driver's read(9E) or write(9E) routine might need to spin up the disk before completing the read or write, if the disk has already been powered off. pm_raise_power(9F) requests the power management framework to initiate a device power state transition to a higher power level. Normally, reductions in component power levels are initiated by the framework. However, a device driver should call pm_lower_power(9F) when detaching, in order to reduce the power consumption of unused devices as much as possible.

Powering down can pose risks for some devices. For example, some tape drives damage tapes when power is removed; likewise, some disk drives have a limited tolerance for power cycles, since each cycle results in a head landing. Such devices should export the no-involuntary-power-cycles(9P) property to notify the system that all power cycles for the device must be under control of a device driver. This prevents power from being removed from a device while the device driver is detached, unless the device was powered off by a driver's call to pm_lower_power(9F).

pm_raise_power(9F) is called when the driver discovers that a component needed for some operation is at a power level less than is needed for that operation. This interface arranges for the driver to be called to raise the current power level of the component at least to the level specified in the request. All the devices that depend on this device are also brought back to full power by this call.
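pm_raise_power(9F) takes the device's dev_info_t pointer, the component number, and the requested level. The following sketch shows one plausible way a driver might use it before starting I/O. It is a userland model: the `dev_info_t` structure and the `pm_*` function bodies are stand-ins, `xx_power()` stands in for the driver's power(9E) entry point, and the `XX_` constants are hypothetical. Locking, which a real driver needs around its power state, is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins (assumptions); a real driver uses the kernel headers. */
#define DDI_SUCCESS  0
#define DDI_FAILURE  (-1)

typedef struct dev_info {
    int level;      /* models the component's current power level */
    int busycnt;    /* models the component's busy count */
} dev_info_t;

/* Hypothetical component number and power levels. */
#define XX_COMPONENT   0
#define XX_LEVEL_OFF   0
#define XX_LEVEL_FULL  1

/* Stand-in for the driver's power(9E) routine, which programs the hardware. */
static int
xx_power(dev_info_t *dip, int component, int level)
{
    (void) component;
    dip->level = level;
    return (DDI_SUCCESS);
}

static int
pm_busy_component(dev_info_t *dip, int component)
{
    (void) component;
    dip->busycnt++;
    return (DDI_SUCCESS);
}

static int
pm_idle_component(dev_info_t *dip, int component)
{
    (void) component;
    if (dip->busycnt > 0)
        dip->busycnt--;
    return (DDI_SUCCESS);
}

/* Stand-in: calls back into the driver only if the level is too low. */
static int
pm_raise_power(dev_info_t *dip, int component, int level)
{
    assert(dip != NULL);
    if (dip->level >= level)
        return (DDI_SUCCESS);
    return (xx_power(dip, component, level));
}

/* Hypothetical helper called from read(9E)/write(9E) before starting I/O. */
static int
xx_prepare_io(dev_info_t *dip)
{
    /* Mark busy first so the component is not powered off underneath us. */
    (void) pm_busy_component(dip, XX_COMPONENT);
    if (pm_raise_power(dip, XX_COMPONENT, XX_LEVEL_FULL) != DDI_SUCCESS) {
        (void) pm_idle_component(dip, XX_COMPONENT);
        return (DDI_FAILURE);
    }
    return (DDI_SUCCESS);
}
```

Marking the component busy before raising power is a design choice in this sketch: it keeps the component from being powered back down between the raise and the I/O.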

pm_lower_power(9F) is called when the device is detaching, once access to the device is no longer needed. It should be called once for each component, setting that component to its lowest power level so that the device uses as little power as possible while it is not in use. pm_lower_power(9F) takes the same arguments as pm_raise_power(9F): the device's dev_info_t pointer, the component number, and the requested power level.
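A detach-time power-down loop might look like the sketch below. It is a userland model under stated assumptions: the `dev_info_t` structure and the body of `pm_lower_power()` are stand-ins, and the two-component layout, the `XX_` constants, and `xx_detach_power_down()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins (assumptions); a real driver uses the kernel headers. */
#define DDI_SUCCESS  0
#define DDI_FAILURE  (-1)

#define XX_NCOMPONENTS   2  /* hypothetical device with two components */
#define XX_LEVEL_LOWEST  0  /* hypothetical lowest level of each component */

typedef struct dev_info {
    int level[XX_NCOMPONENTS];  /* models each component's power level */
} dev_info_t;

/* Stand-in: lowers the recorded level, never raises it. */
static int
pm_lower_power(dev_info_t *dip, int component, int level)
{
    assert(dip != NULL);
    if (component < 0 || component >= XX_NCOMPONENTS)
        return (DDI_FAILURE);
    if (dip->level[component] > level)
        dip->level[component] = level;
    return (DDI_SUCCESS);
}

/* Hypothetical helper called from detach(9E) once the device is unused. */
static int
xx_detach_power_down(dev_info_t *dip)
{
    int component;
    int rv = DDI_SUCCESS;

    for (component = 0; component < XX_NCOMPONENTS; component++) {
        if (pm_lower_power(dip, component, XX_LEVEL_LOWEST) !=
            DDI_SUCCESS)
            rv = DDI_FAILURE;   /* remember the failure, keep going */
    }
    return (rv);
}
```

The loop continues after a failure so that the remaining components are still lowered; whether a real driver should instead fail the detach immediately is a policy decision this sketch does not settle.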

pm_power_has_changed(9F) is called to notify the framework when a device has made a power transition on its own, or to inform the framework of the power level of a device, for example, after a suspend-resume operation. pm_power_has_changed(9F) takes the same arguments as pm_raise_power(9F): the device's dev_info_t pointer, the component number, and the power level.
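The resume case can be sketched as follows. This is a userland model only: the `dev_info_t` structure and the body of `pm_power_has_changed()` are stand-ins that record the reported level, and `struct xx_state`, `xx_resume()`, and the assumption that the hardware comes back at full power are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userland stand-ins (assumptions); a real driver uses the kernel headers. */
#define DDI_SUCCESS  0
#define DDI_FAILURE  (-1)

typedef struct dev_info {
    int fw_level;   /* models the framework's record of the power level */
} dev_info_t;

/* Stand-in: records the level the driver reports. */
static int
pm_power_has_changed(dev_info_t *dip, int component, int level)
{
    assert(dip != NULL);
    if (component != 0)
        return (DDI_FAILURE);
    dip->fw_level = level;
    return (DDI_SUCCESS);
}

/* Hypothetical component number and power levels. */
#define XX_COMPONENT   0
#define XX_LEVEL_OFF   0
#define XX_LEVEL_FULL  1

/* Hypothetical per-instance driver state. */
struct xx_state {
    dev_info_t *dip;
    int hw_level;   /* level the driver believes the hardware is at */
};

/* Hypothetical resume path. */
static int
xx_resume(struct xx_state *sp)
{
    /* Assumption: this hardware returns at full power after a resume. */
    sp->hw_level = XX_LEVEL_FULL;

    /* Report the level so the framework's record matches the device. */
    return (pm_power_has_changed(sp->dip, XX_COMPONENT, sp->hw_level));
}
```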