Writing Device Drivers for Oracle® Solaris 11.2

Exit Print View

Updated: September 2014
 
 

Error Handling

This section describes how to use I/O fault services APIs to handle errors within a driver. This section discusses how drivers should indicate and initialize their fault management capabilities, generate error reports, and register the driver's error handler routine.

Drivers that have been instrumented to provide FMA error report telemetry detect errors and determine the impact of those errors on the services provided by the driver. Following the detection of an error, the driver should determine when its services have been impacted and to what degree.

    An I/O driver must respond immediately to detected errors. Appropriate responses include:

  • Attempt recovery

  • Retry an I/O transaction

  • Attempt fail-over techniques

  • Report the error to the calling application/stack

  • If the error cannot be constrained any other way, then panic

Errors detected by the driver are communicated to the fault management daemon as an ereport. An ereport is a structured event defined by the FMA event protocol. The event protocol is a specification for a set of common data fields that must be used to describe all possible error and fault events, in addition to the list of suspected faults. Ereports are gathered into a flow of error telemetry and dispatched to the diagnosis engine.

Declaring Fault Management Capabilities

A hardened device driver must declare its fault management capabilities to the I/O Fault Management framework. Use the ddi_fm_init(9F) function to declare the fault management capabilities of your driver.

void ddi_fm_init(dev_info_t *dip, int *fmcap, ddi_iblock_cookie_t *ibcp)

The ddi_fm_init() function can be called from kernel context in a driver attach(9E) or detach(9E) entry point. The ddi_fm_init() function usually is called from the attach() entry point. The ddi_fm_init() function allocates and initializes resources according to fmcap. The fmcap parameter must be set to the bitwise-inclusive-OR of the following fault management capabilities:

  • DDI_FM_EREPORT_CAPABLE - Driver is responsible for and capable of generating FMA protocol error events (ereports) upon detection of an error condition.

  • DDI_FM_ACCCHK_CAPABLE - Driver is responsible for and capable of checking for errors upon completion of one or more access I/O transactions.

  • DDI_FM_DMACHK_CAPABLE - Driver is responsible for and capable of checking for errors upon completion of one or more DMA I/O transactions.

  • DDI_FM_ERRCB_CAPABLE - Driver has an error callback function.

A hardened leaf driver generally sets all these capabilities. However, if its parent nexus is not capable of supporting any one of the requested capabilities, the associated bit is cleared and returned as such to the driver. Before returning from ddi_fm_init(9F), the I/O fault services framework creates a set of fault management capability properties: fm-ereport-capable, fm-accchk-capable, fm-dmachk-capable and fm-errcb-capable. The currently supported fault management capability level is observable by using the prtconf(1M) command.

To make your driver support administrative selection of fault management capabilities, export and set the fault management capability level properties to the values described above in the driver.conf(4) file. The fm-capable properties must be set and read prior to calling ddi_fm_init() with the desired capability list.

The following example from the bge driver shows the bge_fm_init() function, which calls the ddi_fm_init(9F) function. The bge_fm_init() function is called in the bge_attach() function.

static void
bge_fm_init(bge_t *bgep)
{
        ddi_iblock_cookie_t iblk;

        /* Only register with IO Fault Services if we have some capability */
        if (bgep->fm_capabilities) {
                bge_reg_accattr.devacc_attr_access = DDI_FLAGERR_ACC;
                dma_attr.dma_attr_flags = DDI_DMA_FLAGERR;
                /* 
                 * Register capabilities with IO Fault Services
                 */
                ddi_fm_init(bgep->devinfo, &bgep->fm_capabilities, &iblk);
                /*
                 * Initialize pci ereport capabilities if ereport capable
                 */
                if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities) ||
                    DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                        pci_ereport_setup(bgep->devinfo);
                /*
                 * Register error callback if error callback capable
                 */
                if (DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                        ddi_fm_handler_register(bgep->devinfo,
                        bge_fm_error_cb, (void*) bgep);
        } else {
                /*
                 * These fields have to be cleared of FMA if there are no
                 * FMA capabilities at runtime.
                 */
                bge_reg_accattr.devacc_attr_access = DDI_DEFAULT_ACC;
                dma_attr.dma_attr_flags = 0;
        }
}

Cleaning Up Fault Management Resources

The ddi_fm_fini(9F) function cleans up resources allocated to support fault management for dip.

void ddi_fm_fini(dev_info_t *dip)

The ddi_fm_fini() function can be called from kernel context in a driver attach(9E) or detach(9E) entry point.

The following example from the bge driver shows the bge_fm_fini() function, which calls the ddi_fm_fini(9F) function. The bge_fm_fini() function is called in the bge_unattach() function, which is called in both the bge_attach() and bge_detach() functions.

static void
bge_fm_fini(bge_t *bgep)
{
        /* Only unregister FMA capabilities if we registered some */
        if (bgep->fm_capabilities) {
                /*
                 * Release any resources allocated by pci_ereport_setup()
                 */
                if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities) ||
                    DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                        pci_ereport_teardown(bgep->devinfo);
                /*
                 * Un-register error callback if error callback capable
                 */
                if (DDI_FM_ERRCB_CAP(bgep->fm_capabilities))
                        ddi_fm_handler_unregister(bgep->devinfo);
                /*
                 * Unregister from IO Fault Services
                 */
                ddi_fm_fini(bgep->devinfo);
        }
}

Getting the Fault Management Capability Bit Mask

The ddi_fm_capable(9F) function returns the capability bit mask currently set for dip.

void ddi_fm_capable(dev_info_t *dip)

Reporting Errors

Queueing an Error Event

The ddi_fm_ereport_post(9F) function causes an ereport event to be queued for delivery to the fault manager daemon, fmd(1M).

void ddi_fm_ereport_post(dev_info_t *dip, 
                         const char *error_class, 
                         uint64_t ena, 
                         int sflag, ...)

The sflag parameter indicates whether the caller is willing to wait for system memory and event channel resources to become available.

The ENA indicates the Error Numeric Association (ENA) for this error report. The ENA might have been initialized and obtained from another error detecting software module such as a bus nexus driver. If the ENA is set to 0, it will be initialized by ddi_fm_ereport_post().

The name-value pair (nvpair) variable argument list contains one or more name, type, value pointer nvpair tuples for non-array data_type_t types or one or more name, type, number of element, value pointer tuples for data_type_t array types. The nvpair tuples make up the ereport event payload required for diagnosis. The end of the argument list is specified by NULL.

The ereport class names and payloads described in Reporting Standard I/O Controller Errors for I/O controllers are used as appropriate for error_class. Other ereport class names and payloads can be defined, but they must be registered in the Oracle event registry and accompanied by driver specific diagnosis engine software, or the Eversholt fault tree (eft) rules.

void
bge_fm_ereport(bge_t *bgep, char *detail)
{
        uint64_t ena;
        char buf[FM_MAX_CLASS];
        (void) snprintf(buf, FM_MAX_CLASS, "%s.%s", DDI_FM_DEVICE, detail);
        ena = fm_ena_generate(0, FM_ENA_FMT1);
        if (DDI_FM_EREPORT_CAP(bgep->fm_capabilities)) {
                ddi_fm_ereport_post(bgep->devinfo, buf, ena, DDI_NOSLEEP,
                    FM_VERSION, DATA_TYPE_UINT8, FM_EREPORT_VERS0, NULL);
        }
}
Detecting and Reporting PCI-Related Errors

PCI-related errors, including PCI, PCI-X, and PCI-E, are automatically detected and reported when you use pci_ereport_post(9F).

void pci_ereport_post(dev_info_t *dip, ddi_fm_error_t *derr, uint16_t *xx_status)

Drivers do not need to generate driver-specific ereports for errors that occur in the PCI Local Bus configuration status registers. The pci_ereport_post() function can report data parity errors, master aborts, target aborts, signaled system errors, and much more.

If pci_ereport_post() is to be used by a driver, then pci_ereport_setup(9F) must have been previously called during the driver's attach(9E) routine, and pci_ereport_teardown(9F) must subsequently be called during the driver's detach(9E) routine.

The bge code samples below show the bge driver invoking the pci_ereport_post() function from the driver's error handler.

/*
 * The I/O fault service error handling callback function
 */
/*ARGSUSED*/
static int
bge_fm_error_cb(dev_info_t *dip, ddi_fm_error_t *err, const void *impl_data)
{
     /*
      * as the driver can always deal with an error 
      * in any dma or access handle, we can just return 
      * the fme_status value.
      */
     pci_ereport_post(dip, err, NULL);
     return (err->fme_status);
}
Reporting Standard I/O Controller Errors

A standard set of device ereports is defined for commonly seen errors for I/O controllers. These ereports should be generated whenever one of the error symptoms described in this section is detected.

The ereports described in this section are dispatched for diagnosis to the eft diagnosis engine, which uses a common set of standard rules to diagnose them. Any other errors detected by device drivers must be defined as ereport events in the Sun event registry and must be accompanied by device specific diagnosis software or eft rules.

DDI_FM_DEVICE_INVAL_STATE

The driver has detected that the device is in an invalid state.

A driver should post an error when it detects that the data it transmits or receives appear to be invalid. For example, in the bge code, the bge_chip_reset() and bge_receive_ring() routines generate the ereport.io.device.inval_state error when these routines detect invalid data.

/*
 * The SEND INDEX registers should be reset to zero by the
 * global chip reset; if they're not, there'll be trouble
 * later on.
 */
sx0 = bge_reg_get32(bgep, NIC_DIAG_SEND_INDEX_REG(0));
if (sx0 != 0) {
    BGE_REPORT((bgep, "SEND INDEX - device didn't RESET"));
    bge_fm_ereport(bgep, DDI_FM_DEVICE_INVAL_STATE);
    return (DDI_FAILURE);
}
/* ... */
/*
 * Sync (all) the receive ring descriptors
 * before accepting the packets they describe
 */
DMA_SYNC(rrp->desc, DDI_DMA_SYNC_FORKERNEL);
if (*rrp->prod_index_p >= rrp->desc.nslots) {
    bgep->bge_chip_state = BGE_CHIP_ERROR;
    bge_fm_ereport(bgep, DDI_FM_DEVICE_INVAL_STATE);
    return (NULL);
}
DDI_FM_DEVICE_INTERN_CORR

The device has reported a self-corrected internal error. For example, a correctable ECC error has been detected by the hardware in an internal buffer within the device. This error flag is not used in the bge driver.

DDI_FM_DEVICE_INTERN_UNCORR

The device has reported an uncorrectable internal error. For example, an uncorrectable ECC error has been detected by the hardware in an internal buffer within the device.

This error flag is not used in the bge driver.

DDI_FM_DEVICE_STALL

The driver has detected that data transfer has stalled unexpectedly.

The bge_factotum_stall_check() routine provides an example of stall detection.

dogval = bge_atomic_shl32(&bgep->watchdog, 1);
if (dogval < bge_watchdog_count)
    return (B_FALSE);

BGE_REPORT((bgep, "Tx stall detected, 
watchdog code 0x%x", dogval));
bge_fm_ereport(bgep, DDI_FM_DEVICE_STALL);
return (B_TRUE);
DDI_FM_DEVICE_NO_RESPONSE

The device is not responding to a driver command.

bge_chip_poll_engine(bge_t *bgep, bge_regno_t regno,
        uint32_t mask, uint32_t val)
{
        uint32_t regval;
        uint32_t n;

        for (n = 200; n; --n) {
                regval = bge_reg_get32(bgep, regno);
                if ((regval & mask) == val)
                        return (B_TRUE);
                drv_usecwait(100);
        }
        bge_fm_ereport(bgep, DDI_FM_DEVICE_NO_RESPONSE);
        return (B_FALSE);
}
DDI_FM_DEVICE_BADINT_LIMIT

The device has raised too many consecutive invalid interrupts.

The bge_intr() routine within the bge driver provides an example of stuck interrupt detection. The bge_fm_ereport() function is a wrapper for the ddi_fm_ereport_post(9F) function. See the bge_fm_ereport() example in Queueing an Error Event.

if (bgep->missed_dmas >= bge_dma_miss_limit) {
    /*
     * If this happens multiple times in a row,
     * it means DMA is just not working.  Maybe
     * the chip has failed, or maybe there's a
     * problem on the PCI bus or in the host-PCI
     * bridge (Tomatillo).
     *
     * At all events, we want to stop further
     * interrupts and let the recovery code take
     * over to see whether anything can be done
     * about it ...
     */
    bge_fm_ereport(bgep,
        DDI_FM_DEVICE_BADINT_LIMIT);
    goto chip_stop;
}
Service Impact Function

A fault management capable driver must indicate whether or not an error has impacted the services provided by a device. Following detection of an error and, if necessary, a shutdown of services, the driver should invoke the ddi_fm_service_impact(9F) routine to reflect the current service state of the device instance. The service state can be used by diagnosis and recovery software to help identify or react to the problem.

The ddi_fm_service_impact() routine should be called both when an error has been detected by the driver itself, and when the framework has detected an error and marked an access or DMA handle as faulty.

void ddi_fm_service_impact(dev_info_t *dip, int svc_impact)

The following service impact values (svc_impact) are accepted by ddi_fm_service_impact():

DDI_SERVICE_LOST

The service provided by the device is unavailable due to a device fault or software defect.

DDI_SERVICE_DEGRADED

The driver is unable to provide normal service, but the driver can provide a partial or degraded level of service. For example, the driver might have to make repeated attempts to perform an operation before it succeeds, or it might be running at less that its configured speed.

DDI_SERVICE_UNAFFECTED

The driver has detected an error, but the services provided by the device instance are unaffected.

DDI_SERVICE_RESTORED

All of the device's services have been restored.

The call to ddi_fm_service_impact() generates the following ereports on behalf of the driver, based on the service impact argument to the service impact routine:

  • ereport.io.service.lost

  • ereport.io.service.degraded

  • ereport.io.service.unaffected

  • ereport.io.service.restored

In the following bge code, the driver determines that it is unable to successfully restart transmitting or receiving packets as the result of an error. The service state of the device transitions to DDI_SERVICE_LOST.

/*
 * All OK, reinitialize hardware and kick off GLD scheduling
 */
mutex_enter(bgep->genlock);
if (bge_restart(bgep, B_TRUE) != DDI_SUCCESS) {
    (void) bge_check_acc_handle(bgep, bgep->cfg_handle);
    (void) bge_check_acc_handle(bgep, bgep->io_handle);
    ddi_fm_service_impact(bgep->devinfo, DDI_SERVICE_LOST);
    mutex_exit(bgep->genlock);
    return (DDI_FAILURE);
}

Note - The ddi_fm_service_impact() function should not be called from the registered callback routine.