ChorusOS 5.0 Board Support Package Developer's Guide

Containment

Once a failure is detected, a hardened driver must ensure that the effects of the fault are contained. The driver maintains an internal, instance-wide state for the device it services (working or failed). This flag should be checked at critical points in the driver code, both to prevent erroneous data or events from propagating outside the driver and to avoid compounding errors by accepting new requests on a failed device.

Code Example 13-1, from the dec21x4x Ethernet driver, shows how the 'failed' field of the driver instance data is used to contain the effects of faults. The field is zeroed in the drv_init() routine, set in fail() (see Code Example 13-2) and on device removal, and checked on entry to each Ethernet DDI down-call routine.


Example 13-1 dec21x4x fault containment

typedef struct Dec21Data {
    DevNode            node;
    char*              path;
    int                pathSize;
    DevRegId           devRegId;    /* Device registry id            */
    DevRegEntry        entry;       /* device registry entry         */
    PciBusEvent        evtState;    /* bus events being processed    */

[...]

    Bool               failed;      /* instance wide failed flag     */

[...]

} Dec21Data;

#define IS_DEV_FAILED(d) ((d)->failed)

    static KnError
open (void* devId, void* clientCookie, EtherCallBack* clientOps)
{
    Dec21Data* dec21 = (Dec21Data*)devId;

    if (IS_DEV_FAILED(dec21)) {
        return K_EFAIL;
    }

[...]

    return K_OK;
}

[...]

    /*
     * PCI bus event handlers.
     * The event handler is invoked by the parent bus driver when a bus
     * event occurs in the bus.
     *
     * The DEC21 driver always supports the PCI_SYS_SHUTDOWN and
     * PCI_DEV_SHUTDOWN events. The PCI_DEV_REMOVAL support is optional and
     * is provided only when DEC21_DEV_REMOVAL is defined.
     */
    static KnError
eventHandler (void* cookie, BusEvent event, void* arg)
{
    Dec21Data* dec21 = (Dec21Data*)cookie;
    KnError    res   = K_OK;

    switch (event) {

[...]

    case PCI_DEV_REMOVAL:
            /*
             * The device removal is processed from either the
             * normal mode or shut down mode. In other words,
             * this event is ignored if the driver already operates in the
             * device removal mode.
             *
             * Here, we flag that the device has entered removal
             * mode (dec21->evtState). We ask the device registry to
             * notify clients about the device removal event.
             * The real shut down procedure will be done by the relHandler()
             * handler. This handler is called by device registry when the
             * reference to the driver instance goes away (i.e. when
             * svDeviceRelease() is called by client).
             */
        if (dec21->evtState != PCI_DEV_REMOVAL) {
            dec21->evtState = PCI_DEV_REMOVAL;
            dec21->failed   = TRUE;
            DKI_MSG(("%s: entered into removal mode\n", dec21->path));
            svDeviceEvent(dec21->devRegId, DEV_EVENT_REMOVAL, NULL);
        }
        break;

[...]
}
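
The containment pattern in Example 13-1 can also be tried outside the kernel. The following is a minimal standalone sketch, not the dec21x4x code: every name in it (SketchDev, sketchOpen(), sketchReceive(), sketchRemoval(), the SK_ constants) is hypothetical and stands in for the corresponding ChorusOS type or routine. It assumes that the failed flag gates both directions: down calls are refused on a failed device, and data coming from a failed device is dropped rather than passed up to the client. The up-call check is an illustration of the "critical points" mentioned above and is not taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ChorusOS types; this is not the real DDI. */
typedef enum { SK_OK = 0, SK_EFAIL = -1 } SketchError;
typedef enum { SK_EVT_NONE, SK_EVT_REMOVAL } SketchEvent;

typedef struct SketchDev {
    SketchEvent evtState;   /* bus event being processed              */
    bool        failed;     /* instance-wide failed flag              */
    int         delivered;  /* frames passed up to the client         */
    int         notified;   /* removal notifications sent to clients  */
} SketchDev;

#define SK_IS_DEV_FAILED(d) ((d)->failed)

/* Initialization: the flag starts cleared, as drv_init() does in the text. */
static void sketchInit(SketchDev* dev)
{
    assert(dev != NULL);
    dev->evtState  = SK_EVT_NONE;
    dev->failed    = false;
    dev->delivered = 0;
    dev->notified  = 0;
}

/* Down call: refuse new requests on a failed device. */
static SketchError sketchOpen(SketchDev* dev)
{
    if (SK_IS_DEV_FAILED(dev)) {
        return SK_EFAIL;
    }
    return SK_OK;
}

/* Up-call path (assumed): drop data from a failed device instead of
 * handing it to the client. */
static void sketchReceive(SketchDev* dev)
{
    if (SK_IS_DEV_FAILED(dev)) {
        return;
    }
    dev->delivered++;
}

/* Removal event: set the flag and notify clients once only. */
static void sketchRemoval(SketchDev* dev)
{
    if (dev->evtState != SK_EVT_REMOVAL) {
        dev->evtState = SK_EVT_REMOVAL;
        dev->failed   = true;
        dev->notified++;    /* stands in for the registry notification */
    }
}
```

After sketchRemoval() has run, sketchOpen() returns SK_EFAIL and sketchReceive() no longer increments the delivered counter. A second removal event leaves the notification count unchanged, mirroring the evtState test in the event handler above.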