Writing Device Drivers

Error Handling

This section describes how to use I/O fault services APIs to handle errors within a driver. This section discusses how drivers should indicate and initialize their fault management capabilities, generate error reports, and register the driver's error handler routine.

Excerpts are provided from source code examples that demonstrate the use of the I/O fault services API from the Broadcom 1Gb NIC driver, bge. Follow these examples as a model for how to integrate fault management capability into your own drivers. Take the following steps to study the complete bge driver code:

Drivers that have been instrumented to provide FMA error report telemetry detect errors and determine the impact of those errors on the services provided by the driver. Following the detection of an error, the driver should determine when its services have been impacted and to what degree.

An I/O driver must respond immediately to detected errors. Appropriate responses include:

Errors detected by the driver are communicated to the fault management daemon as an ereport. An ereport is a structured event defined by the FMA event protocol. The event protocol is a specification for a set of common data fields that must be used to describe all possible error and fault events, in addition to the list of suspected faults. Ereports are gathered into a flow of error telemetry and dispatched to the diagnosis engine.

Declaring Fault Management Capabilities

A hardened device driver must declare its fault management capabilities to the I/O Fault Management framework. Use the ddi_fm_init(9F) function to declare the fault management capabilities of your driver.

void ddi_fm_init(dev_info_t *dip, int *fmcap, ddi_iblock_cookie_t *ibcp)

The ddi_fm_init() function can be called from kernel context in a driver attach(9E) or detach(9E) entry point. The ddi_fm_init() function usually is called from the attach() entry point. The ddi_fm_init() function allocates and initializes resources according to fmcap. The fmcap parameter must be set to the bitwise-inclusive-OR of the following fault management capabilities:

A hardened leaf driver generally sets all these capabilities. However, if its parent nexus is not capable of supporting any one of the requested capabilities, the associated bit is cleared and returned as such to the driver. Before returning from ddi_fm_init(9F), the I/O fault services framework creates a set of fault management capability properties: fm-ereport-capable, fm-accchk-capable, fm-dmachk-capable and fm-errcb-capable. The currently supported fault management capability level is observable by using the prtconf(1M) command.

To make your driver support administrative selection of fault management capabilities, export and set the fault management capability level properties to the values described above in the driver.conf(4) file. The fm-capable properties must be set and read prior to calling ddi_fm_init() with the desired capability list.

The following example from the bge driver shows the bge_fm_init() function, which calls the ddi_fm_init(9F) function. The bge_fm_init() function is called in the bge_attach() function.

static void
bge_fm_init(bge_t *bgep)
        ddi_iblock_cookie_t iblk;

        /* Only register with IO Fault Services if we have some capability */
        if (bgep->fm_capabilities) {
                bge_reg_accattr.devacc_attr_access = DDI_FLAGERR_ACC;
                dma_attr.dma_attr_flags = DDI_DMA_FLAGERR;
                 * Register capabilities with IO Fault Services
                ddi_fm_init(bgep->devinfo, &bgep->fm_capabilities, &iblk);
                 * Initialize pci ereport capabilities if ereport capable
                if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities) ||
                 * Register error callback if error callback capable
                if (DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                        bge_fm_error_cb, (void*) bgep);
        } else {
                 * These fields have to be cleared of FMA if there are no
                 * FMA capabilities at runtime.
                bge_reg_accattr.devacc_attr_access = DDI_DEFAULT_ACC;
                dma_attr.dma_attr_flags = 0;

Cleaning Up Fault Management Resources

The ddi_fm_fini(9F) function cleans up resources allocated to support fault management for dip.

void ddi_fm_fini(dev_info_t *dip)

The ddi_fm_fini() function can be called from kernel context in a driver attach(9E) or detach(9E) entry point.

The following example from the bge driver shows the bge_fm_fini() function, which calls the ddi_fm_fini(9F) function. The bge_fm_fini() function is called in the bge_unattach() function, which is called in both the bge_attach() and bge_detach() functions.

static void
bge_fm_fini(bge_t *bgep)
        /* Only unregister FMA capabilities if we registered some */
        if (bgep->fm_capabilities) {
                 * Release any resources allocated by pci_ereport_setup()
                if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities) ||
                 * Un-register error callback if error callback capable
                if (DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                 * Unregister from IO Fault Services

Getting the Fault Management Capability Bit Mask

The ddi_fm_capable(9F) function returns the capability bit mask currently set for dip.

void ddi_fm_capable(dev_info_t *dip)

Reporting Errors

This section provides information about the following topics:

Queueing an Error Event

The ddi_fm_ereport_post(9F) function causes an ereport event to be queued for delivery to the fault manager daemon, fmd(1M).

void ddi_fm_ereport_post(dev_info_t *dip, 
                         const char *error_class, 
                         uint64_t ena, 
                         int sflag, ...)

The sflag parameter indicates whether the caller is willing to wait for system memory and event channel resources to become available.

The ENA indicates the Error Numeric Association (ENA) for this error report. The ENA might have been initialized and obtained from another error detecting software module such as a bus nexus driver. If the ENA is set to 0, it will be initialized by ddi_fm_ereport_post().

The name-value pair (nvpair) variable argument list contains one or more name, type, value pointer nvpair tuples for non-array data_type_t types or one or more name, type, number of element, value pointer tuples for data_type_t array types. The nvpair tuples make up the ereport event payload required for diagnosis. The end of the argument list is specified by NULL.

The ereport class names and payloads described in Reporting Standard I/O Controller Errors for I/O controllers are used as appropriate for error_class. Other ereport class names and payloads can be defined, but they must be registered in the Sun event registry and accompanied by driver specific diagnosis engine software, or the Eversholt fault tree (eft) rules. For more information about the Sun event registry and about Eversholt fault tree rules, see the Fault Management community on the OpenSolaris project .

bge_fm_ereport(bge_t *bgep, char *detail)
        uint64_t ena;
        char buf[FM_MAX_CLASS];
        (void) snprintf(buf, FM_MAX_CLASS, "%s.%s", DDI_FM_DEVICE, detail);
        ena = fm_ena_generate(0, FM_ENA_FMT1);
        if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities)) {
                ddi_fm_ereport_post(bgep->devinfo, buf, ena, DDI_NOSLEEP,

Detecting and Reporting PCI-Related Errors

PCI-related errors, including PCI, PCI-X, and PCI-E, are automatically detected and reported when you use pci_ereport_post(9F).

void pci_ereport_post(dev_info_t *dip, ddi_fm_error_t *derr, uint16_t *xx_status)

Drivers do not need to generate driver-specific ereports for errors that occur in the PCI Local Bus configuration status registers. The pci_ereport_post() function can report data parity errors, master aborts, target aborts, signaled system errors, and much more.

If pci_ereport_post() is to be used by a driver, then pci_ereport_setup(9F) must have been previously called during the driver's attach(9E) routine, and pci_ereport_teardown(9F) must subsequently be called during the driver's detach(9E) routine.

The bge code samples below show the bge driver invoking the pci_ereport_post() function from the driver's error handler. See also Registering an Error Handler.

 * The I/O fault service error handling callback function
static int
bge_fm_error_cb(dev_info_t *dip, ddi_fm_error_t *err, const void *impl_data)
      * as the driver can always deal with an error 
      * in any dma or access handle, we can just return 
      * the fme_status value.
     pci_ereport_post(dip, err, NULL);
     return (err->fme_status);

Reporting Standard I/O Controller Errors

A standard set of device ereports is defined for commonly seen errors for I/O controllers. These ereports should be generated whenever one of the error symptoms described in this section is detected.

The ereports described in this section are dispatched for diagnosis to the eft diagnosis engine, which uses a common set of standard rules to diagnose them. Any other errors detected by device drivers must be defined as ereport events in the Sun event registry and must be accompanied by device specific diagnosis software or eft rules.


The driver has detected that the device is in an invalid state.

A driver should post an error when it detects that the data it transmits or receives appear to be invalid. For example, in the bge code, the bge_chip_reset() and bge_receive_ring() routines generate the ereport.io.device.inval_state error when these routines detect invalid data.

 * The SEND INDEX registers should be reset to zero by the
 * global chip reset; if they're not, there'll be trouble
 * later on.
sx0 = bge_reg_get32(bgep, NIC_DIAG_SEND_INDEX_REG(0));
if (sx0 != 0) {
    BGE_REPORT((bgep, "SEND INDEX - device didn't RESET"));
    bge_fm_ereport(bgep, DDI_FM_DEVICE_INVAL_STATE);
    return (DDI_FAILURE);
/* ... */
 * Sync (all) the receive ring descriptors
 * before accepting the packets they describe
if (*rrp->prod_index_p >= rrp->desc.nslots) {
    bgep->bge_chip_state = BGE_CHIP_ERROR;
    bge_fm_ereport(bgep, DDI_FM_DEVICE_INVAL_STATE);
    return (NULL);

The device has reported a self-corrected internal error. For example, a correctable ECC error has been detected by the hardware in an internal buffer within the device.

This error flag is not used in the bge driver. See the nxge_fm.c file on OpenSolaris for examples that use this error. Take the following steps to study the nxge driver code:

  • Go to OpenSolaris.

  • Click Source Browser in the upper right corner of the page.

  • Enter nxge in the File Path field.

  • Click the Search button.


The device has reported an uncorrectable internal error. For example, an uncorrectable ECC error has been detected by the hardware in an internal buffer within the device.

This error flag is not used in the bge driver. See the nxge_fm.c file on OpenSolaris for examples that use this error.


The driver has detected that data transfer has stalled unexpectedly.

The bge_factotum_stall_check() routine provides an example of stall detection.

dogval = bge_atomic_shl32(&bgep->watchdog, 1);
if (dogval < bge_watchdog_count)
    return (B_FALSE);

BGE_REPORT((bgep, "Tx stall detected, 
watchdog code 0x%x", dogval));
bge_fm_ereport(bgep, DDI_FM_DEVICE_STALL);
return (B_TRUE);

The device is not responding to a driver command.

bge_chip_poll_engine(bge_t *bgep, bge_regno_t regno,
        uint32_t mask, uint32_t val)
        uint32_t regval;
        uint32_t n;

        for (n = 200; n; --n) {
                regval = bge_reg_get32(bgep, regno);
                if ((regval & mask) == val)
                        return (B_TRUE);
        bge_fm_ereport(bgep, DDI_FM_DEVICE_NO_RESPONSE);
        return (B_FALSE);

The device has raised too many consecutive invalid interrupts.

The bge_intr() routine within the bge driver provides an example of stuck interrupt detection. The bge_fm_ereport() function is a wrapper for the ddi_fm_ereport_post(9F) function. See the bge_fm_ereport() example in Queueing an Error Event.

if (bgep->missed_dmas >= bge_dma_miss_limit) {
     * If this happens multiple times in a row,
     * it means DMA is just not working.  Maybe
     * the chip has failed, or maybe there's a
     * problem on the PCI bus or in the host-PCI
     * bridge (Tomatillo).
     * At all events, we want to stop further
     * interrupts and let the recovery code take
     * over to see whether anything can be done
     * about it ...
    goto chip_stop;

Service Impact Function

A fault management capable driver must indicate whether or not an error has impacted the services provided by a device. Following detection of an error and, if necessary, a shutdown of services, the driver should invoke the ddi_fm_service_impact(9F) routine to reflect the current service state of the device instance. The service state can be used by diagnosis and recovery software to help identify or react to the problem.

The ddi_fm_service_impact() routine should be called both when an error has been detected by the driver itself, and when the framework has detected an error and marked an access or DMA handle as faulty.

void ddi_fm_service_impact(dev_info_t *dip, int svc_impact)

The following service impact values (svc_impact) are accepted by ddi_fm_service_impact():


The service provided by the device is unavailable due to a device fault or software defect.


The driver is unable to provide normal service, but the driver can provide a partial or degraded level of service. For example, the driver might have to make repeated attempts to perform an operation before it succeeds, or it might be running at less that its configured speed.


The driver has detected an error, but the services provided by the device instance are unaffected.


All of the device's services have been restored.

The call to ddi_fm_service_impact() generates the following ereports on behalf of the driver, based on the service impact argument to the service impact routine:

In the following bge code, the driver determines that it is unable to successfully restart transmitting or receiving packets as the result of an error. The service state of the device transitions to DDI_SERVICE_LOST.

 * All OK, reinitialize hardware and kick off GLD scheduling
if (bge_restart(bgep, B_TRUE) != DDI_SUCCESS) {
    (void) bge_check_acc_handle(bgep, bgep->cfg_handle);
    (void) bge_check_acc_handle(bgep, bgep->io_handle);
    ddi_fm_service_impact(bgep->devinfo, DDI_SERVICE_LOST);
    return (DDI_FAILURE);

Note –

The ddi_fm_service_impact() function should not be called from the registered callback routine.

Access Attributes Structure

A DDI_FM_ACCCHK_CAPABLE device driver must set its access attributes to indicate that it is capable of handling programmed I/O (PIO) access errors that occur during a register read or write. The devacc_attr_access field in the ddi_device_acc_attr(9S) structure should be set as an indicator to the system that the driver is capable of checking for and handling data path errors. The ddi_device_acc_attr structure contains the following members:

ushort_t devacc_attr_version;
uchar_t devacc_attr_endian_flags;
uchar_t devacc_attr_dataorder;
uchar_t devacc_attr_access;             /* access error protection */

Errors detected in the data path to or from a device can be processed by one or more of the device driver's nexus parents.

The devacc_attr_version field must be set to at least DDI_DEVICE_ATTR_V1. If the devacc_attr_version field is not set to at least DDI_DEVICE_ATTR_V1, the devacc_attr_access field is ignored.

The devacc_attr_access field can be set to the following values:


This flag indicates the system will take the default action (panic if appropriate) when an error occurs. This attribute cannot be used by DDI_FM_ACCCHK_CAPABLE drivers.


This flag indicates that the system will attempt to handle and recover from an error associated with the access handle. The driver should use the techniques described in Defensive Programming Techniques for Solaris Device Drivers and should use ddi_fm_acc_err_get(9F) to regularly check for errors before the driver allows data to be passed back to the calling application.

The DDI_FLAGERR_ACC flag provides:

  • Error notification via the driver callback

  • An error condition observable via ddi_fm_acc_err_get(9F)


The DDI_CAUTIOUS_ACC flag provides a high level of protection for each Programmed I/O access made by the driver.

Note –

Use of this flag will cause a significant impact on the performance of the driver.

The DDI_CAUTIOUS_ACC flag signifies that an error is anticipated by the accessing driver. The system attempts to handle and recover from an error associated with this handle as gracefully as possible. No error reports are generated as a result, but the handle's fme_status flag is set to DDI_FM_NONFATAL. This flag is functionally equivalent to ddi_peek(9F) and ddi_poke(9F).

The use of the DDI_CAUTIOUS_ACC provides:

  • Exclusive access to the bus

  • On trap protection - (ddi_peek() and ddi_poke())

  • Error notification through the driver callback registered with ddi_fm_handler_register(9F)

  • An error condition observable through ddi_fm_acc_err_get(9F)

Generally, drivers should check for data path errors at appropriate junctures in the code path to guarantee consistent data and to ensure that proper error status is presented in the I/O software stack.

DDI_FM_ACCCHK_CAPABLE device drivers must set their devacc_attr_access field to DDI_FLAGERR_ACC or DDI_CAUTIOUS_ACC.

DMA Attributes Structure

As with access handle setup, a DDI_FM_DMACHK_CAPABLE device driver must set the dma_attr_flag field of its ddi_dma_attr(9S) structure to the DDI_DMA_FLAGERR flag. The system attempts to recover from an error associated with a handle that has DDI_DMA_FLAGERR set. The ddi_dma_attr structure contains the following members:

uint_t          dma_attr_version;       /* version number */
uint64_t        dma_attr_addr_lo;       /* low DMA address range */
uint64_t        dma_attr_addr_hi;       /* high DMA address range */
uint64_t        dma_attr_count_max;     /* DMA counter register */
uint64_t        dma_attr_align;         /* DMA address alignment */
uint_t          dma_attr_burstsizes;    /* DMA burstsizes */
uint32_t        dma_attr_minxfer;       /* min effective DMA size */
uint64_t        dma_attr_maxxfer;       /* max DMA xfer size */
uint64_t        dma_attr_seg;           /* segment boundary */
int             dma_attr_sgllen;        /* s/g length */
uint32_t        dma_attr_granular;      /* granularity of device */
uint_t          dma_attr_flags;         /* Bus specific DMA flags */

Drivers that set the DDI_DMA_FLAGERR flag should use the techniques described in Defensive Programming Techniques for Solaris Device Drivers and should use ddi_fm_dma_err_get(9F) to check for data path errors whenever DMA transactions are completed or at significant points within the code path. This ensures consistent data and proper error status presented to the I/O software stack.

Use of DDI_DMA_FLAGERR provides:

Getting Error Status

If a fault has occurred that affects the resource mapped by the handle, the error status structure is updated to reflect error information captured during error handling by a bus or other device driver in the I/O data path.

void ddi_fm_dma_err_get(ddi_dma_handle_t handle, ddi_fm_error_t *de, int version)

void ddi_fm_acc_err_get(ddi_acc_handle_t handle, ddi_fm_error_t *de, int version)

The ddi_fm_dma_err_get(9F) and ddi_fm_acc_err_get(9F)functions return the error status for a DMA or access handle respectively. The version field should be set to DDI_FME_VERSION.

An error for an access handle means that an error has been detected that has affected PIO transactions to or from the device using that access handle. Any data received by the driver, for example via a recent ddi_get8(9F) call, should be considered potentially corrupt. Any data sent to the device, for example via a recent ddi_put32(9F) call might also have been corrupted or might not have been received at all. The underlying fault might, however, be transient, and the driver can therefore attempt to recover by calling ddi_fm_acc_err_clear(9F), resetting the device to get it back into a known state, and retrying any potentially failed transactions.

If an error is indicated for a DMA handle, it implies that an error has been detected that has (or will) affect DMA transactions between the device and the memory currently bound to the handle (or most recently bound, if the handle is currently unbound). Possible causes include the failure of a component in the DMA data path, or an attempt by the device to make an invalid DMA access. The driver might be able to continue by retrying and reallocating memory. The contents of the memory currently (or previously) bound to the handle should be regarded as indeterminate and should be released back to the system. The fault indication associated with the current transaction is lost once the handle is bound or re-bound, but because the fault might persist, future DMA operations might not succeed.

Clearing Errors

The ddi_fm_acc_err_clear() and ddi_fm_dma_err_clear(9F) routines should be called when the driver wants to retry a request after an error was detected by the handle without needing to free and reallocate the handle first.

void ddi_fm_acc_err_clear(ddi_acc_handle_t handle, int version)

void ddi_fm_dma_err_clear(ddi_dma_handle_t handle, int version)

Registering an Error Handler

Error handling activity might begin at the time that the error is detected by the operating system via a trap or error interrupt. If the software responsible for handling the error (the error handler) cannot immediately isolate the device that was involved in the failed I/O operation, it must attempt to find a software module within the device tree that can perform the error isolation. The Solaris device tree provides a structural means to propagate nexus driver error handling activities to children who might have a more detailed understanding of the error and can capture error state and isolate the problem device.

A driver can register an error handler callback with the I/O Fault Services Framework. The error handler should be specific to the type of error and subsystem where error detection has occurred. When the driver's error handler routine is invoked, the driver must check for any outstanding errors associated with device transactions and generate ereport events. The driver must also return error handler status in its ddi_fm_error(9S) structure. For example, if it has been determined that the system's integrity has been compromised, the most appropriate action might be for the error handler to panic the system.

The callback is invoked by a parent nexus driver when an error might be associated with a particular device instance. Device drivers that register error handlers must be DDI_FM_ERRCB_CAPABLE.

void ddi_fm_handler_register(dev_info_t *dip, ddi_err_func_t handler, void *impl_data)

The ddi_fm_handler_register(9F) routine registers an error handler callback with the I/O fault services framework. The ddi_fm_handler_register() function should be called in the driver's attach(9E) entry point for callback registration following driver fault management initialization (ddi_fm_init()).

The error handler callback function must do the following:

Driver error handlers receive the following:

The ddi_fm_handler_register() and ddi_fm_handler_unregister(9F) routines must be called from kernel context in a driver's attach(9E) or detach(9E) entry point. The registered error handler callback can be called from kernel, interrupt, or high-level interrupt context. Therefore the error handler:

A device driver is responsible for:

These actions can be carried out within the error handler function. However, because of the restrictions on locking and because the error handler function does not always know the context of what the driver was doing at the point where the fault occurred, it is more usual for these actions to be carried out following inline calls to ddi_fm_acc_err_get(9F) and ddi_fm_dma_err_get(9F) within the normal paths of the driver as described previously.

 * The I/O fault service error handling callback function
static int
bge_fm_error_cb(dev_info_t *dip, ddi_fm_error_t *err, const void *impl_data)
      * as the driver can always deal with an error 
      * in any dma or access handle, we can just return 
      * the fme_status value.
     pci_ereport_post(dip, err, NULL);
     return (err->fme_status);

Fault Management Data and Status Structure

Driver error handling callbacks are passed a pointer to a data structure that contains common fault management data and status for error handling.

The data structure ddi_fm_error contains an FMA protocol ENA for the current error, the status of the error handler callback, an error expectation flag, and any potential access or DMA handles associated with an error detected by the parent nexus.


This field is initialized by the calling parent nexus and might have been incremented along the error handling propagation chain before reaching the driver's registered callback routine. If the driver detects a related error of its own, it should increment this ENA prior to calling ddi_fm_ereport_post().

fme_acc_handle, fme_dma_handle

These fields contain a valid access or DMA handle if the parent was able to associate an error detected at its level to a handle mapped or bound by the device driver.


The fme_flag is set to DDI_FM_ERR_EXPECTED if the calling parent determines the error was the result of a DDI_CAUTIOUS_ACC protected operation. In this case, the fme_acc_handle is valid and the driver should check for and report only errors not associated with the DDI_CAUTIOUS_ACC protected operation. Otherwise, fme_flag is set to DDI_FM_ERR_UNEXPECTED and the driver must perform the full range of error handling tasks.


Upon return from its error handler callback, the driver must set fme_status to one of the following values:

  • DDI_FM_OK – No errors were detected and the operational state of this device instance remains the same.

  • DDI_FM_FATAL – An error has occurred and the driver considers it to be fatal to the system. For example, a call to pci_ereport_post(9F) might have detected a system fatal error. In this case, the driver should report any additional error information it might have in the context of the driver.

  • DDI_FM_NONFATAL – An error has been detected by the driver but is not considered fatal to the system. The driver has identified the error and has either isolated the error or is committing that it will isolate the error.

  • DDI_FM_UNKNOWN – An error has been detected, but the driver is unable to isolate the device or determine the impact of the error on the operational state of the system.