ChorusOS 5.0 Board Support Package Developer's Guide

Part IV Driver Hardening

This section contains information on hardening ChorusOS device drivers. It explains what driver hardening is and details the demands of a hardened driver. The ChorusOS Device Driver Framework is explained from a hardened driver perspective.

Chapter 12 Overview of Driver Hardening

This chapter introduces the concept of hardened drivers and highlights issues involved in writing them.

"Hardened Drivers" explains what hardening is, particularly device driver hardening.
"Overview of the Process" is a brief description of the steps involved when developing hardened device drivers.
"Developer Responsibilities" outlines the duties of hardened driver developers.

Hardened Drivers

Hardening refers to the process of ensuring that software is resilient to hardware faults. It concerns the requirement for graceful degradation and failure when the hardware is faulty.

A hardened driver is a driver that is resilient to faults in the I/O device it controls, as well as faults originating outside the system core. The driver does not panic, hang, or allow propagation of corrupted data as a result of these types of faults.

When writing a hardened driver, you must ensure that the driver continues to behave reasonably even when the underlying device hardware is not working properly. The driver must respect the DDI protocols under hardware failure conditions.

All ChorusOS device drivers are held in a hierarchical tree, the ChorusOS device driver framework, to aid referencing and maintenance. A hardened driver must fit into that framework. For further information about the device driver framework, refer to Part III.

To obtain the desired fail-safe behavior of a leaf device driver, the bus/nexus drivers on the path from the DKI to that leaf device driver should also be hardened. Moreover, the operating system as a whole, including the microkernel, drivers, POSIX subsystem and other system actors, must be hardened for the hardening effort at microkernel driver level to be effective.

Overview of the Process

A hardened driver obeys all of the rules of a standard ChorusOS device driver as well as some additional rules:

Each piece of hardware should be controlled by a separate instance of the device driver.
Programmed I/O (PIO) must be performed only through the DDI access functions, using the appropriate data access handler.
The device driver must assume that the data it receives from the device could be corrupt. The driver should check the integrity of the data before using it.
The driver must control the effects of any faults it detects. Data supplied by the device may be checked for integrity before it is released to the rest of the system.
The driver must not be an unlimited drain on system resources if the device locks up. It should timeout if a device claims to be continuously busy. The driver should also detect a pathological (stuck) interrupt request and take appropriate action.
The driver must free up resources after a fault. For example, the system must be able to close all minor devices and detach driver instances, even after the hardware fails.

Developer Responsibilities

In order to develop hardened device drivers you must take responsibility for:

Correct use of the DDI functions
Handling devices with deviant interrupt logic
Detecting any corruption of device I/O

These responsibilities are elaborated in Chapter 13, Hardened Driver Requirements.

Chapter 13 Hardened Driver Requirements

This chapter details the main rules that a hardened driver should satisfy:

"No Panic" describes how a hardened driver must not panic on error.
"Containment" describes how a hardened driver ensures that the effects of the fault are contained.
"Logging" describes how a hardened driver logs driver errors and device failures.
"Notification" describes how a hardened driver notifies its clients of device failure.
"Bus Exceptions" describes how a hardened bus driver handles hardware bus exceptions.
"Corrupt Data Detection" describes how a hardened driver detects corrupt data read from the device.
"Stuck Interrupts" describes how a hardened driver handles persistently asserted interrupts.
"Periodic Health Checks" describes how a hardened driver carries out periodic health checks.

Many of the above requirements are illustrated using code taken from two real hardened device drivers:

The dec21x4x ethernet (PCI) device driver (see the dec21x4x(9DRV) man page).
The Raven PCI host bridge bus driver (see the raven(9DRV) man page).

No Panic

A hardened driver does not panic on error. All detected errors and failures are considered as fault conditions and the system is notified using Driver Framework mechanisms (see "Notification").

Containment

Once a failure is detected, a hardened driver will ensure that the effects of the fault are contained. The driver will maintain internally an instance-wide state of the serviced device (working/failed). This flag should be checked at critical points within the driver code to prevent erroneous data or events from propagating outside the driver, and to avoid multiplying errors by accepting new requests on a failed device.

Code Example 13-1, from the dec21x4x ethernet driver, shows how the 'failed' field of the driver instance data is used to ensure that effects from faults are contained. This field is zeroed in the drv_init() routine. It is set in fail() (see code Example 13-2) and on device removal and is checked on entry to each ethernet DDI down call routine.

Example 13-1 dec21x4x fault containment

typedef struct Dec21Data {
    DevNode            node;
    char*              path;
    int                pathSize;
    DevRegId           devRegId;    /* Device registry id            */
    DevRegEntry        entry;       /* device registry entry         */
    PciBusEvent        evtState;    /* bus events being processed    */

[...]

    Bool               failed;      /* instance wide failed flag     */

[...]

} Dec21Data;

#define IS_DEV_FAILED(d) (d->failed)

    static KnError
open (void* devId, void* clientCookie, EtherCallBack* clientOps)
{
    Dec21Data* dec21 = (Dec21Data*)devId;

    if (IS_DEV_FAILED(dec21)) {
        return K_EFAIL;
    }

[...]

    return K_OK;
}

[...]

    /*
     * PCI bus event handlers.
     * The event handler is invoked by the parent bus driver when a bus
     * event occurs in the bus.
     *
     * The DEC21 driver always supports the PCI_SYS_SHUTDOWN and
     * PCI_DEV_SHUTDOWN events. The PCI_DEV_REMOVAL support is optional and
     * is provided only when DEC21_DEV_REMOVAL is defined.
     */
    static KnError
eventHandler (void* cookie, BusEvent event, void* arg)
{
    Dec21Data* dec21 = (Dec21Data*)cookie;
    KnError    res   = K_OK;

    switch (event) {

[...]

    case PCI_DEV_REMOVAL:
            /*
             * The device removal is processed from either the
             * normal mode or shut down mode. In other words,
             * this event is ignored if the driver already operates in the
             * device removal mode.
             *
             * Here, we flag that the device has entered removal
             * mode (dec21->evtState). We ask the device registry to
             * notify clients about the device removal event.
             * The real shut down procedure will be done by the relHandler()
             * handler. This handler is called by device registry when the
             * reference to the driver instance goes away (i.e. when
             * svDeviceRelease() is called by client).
             */
        if (dec21->evtState != PCI_DEV_REMOVAL) {
            dec21->evtState = PCI_DEV_REMOVAL;
            dec21->failed   = TRUE;
            DKI_MSG(("%s: entered into removal mode\n", dec21->path));
            svDeviceEvent(dec21->devRegId, DEV_EVENT_REMOVAL, NULL);
        }
        break;

[...]
}

Logging

Driver errors and device failures will be logged using the Driver Framework logging mechanism. The DKI_ERR(), DKI_WARN() and DKI_MSG() macros should be used for that purpose.

Code Example 13-2, from the dec21x4x ethernet driver, illustrates how detected errors are turned into event notifications, and how they are logged. In the example, the Dec21Data structure contains driver instance specific data. In this structure, the failed field is the instance-wide flag indicating a device failure. The evtState field is used to record the current state of the driver instance, regarding propagated events.

Example 13-2 dec21x4x error handling

    /*
     * Enter failed mode (called on device failure).
     * On device failure, the REMOVAL event is propagated to clients, 
     * as the device should not be accessed any more.
     * However, internal state (dec21->evtState) is set to 
     * PCI_DEV_SHUTDOWN, as we may need to further access the device 
     * (to reset it for example). The dec21->failed flag is also set 
     * for next client down calls to return a K_EFAIL result.
     */
    static KnError
fail (Dec21Data* dec21)
{
    dec21->failed = TRUE;
    if (!dec21->evtState) {
        dec21->evtState = PCI_DEV_SHUTDOWN;
        DKI_MSG(("%s: entered into failed mode\n", dec21->path));    
        svDeviceEvent(dec21->devRegId, DEV_EVENT_REMOVAL, NULL);
    }
    return K_OK;
}

    /*
     * The I/O error handler is called by the parent bus driver
     * if a bus error occurs when accessing the device registers.
     * This is considered a device failure.
     */
    static void
ioErrHandler (void* cookie, PciBusError* error)
{
    DKI_ERR(("%s: error -- I/O (%d) at 0x%08x\n",
             ((Dec21Data*)cookie)->path, error->code, error->offset));
    fail(cookie);
}

    /*
     * DMA error handler.
     * This is considered a device failure.
     */
    static void
dmaErrHandler (void* cookie, PciBusError* error)
{
    DKI_ERR(("%s: error -- DMA (%d) at 0x%08x\n",
             ((Dec21Data*)cookie)->path, error->code, error->offset));
    fail(cookie);
}

Refer to Part III for details about the DKI_ERR(), DKI_WARN() and DKI_MSG() macros.

Notification

As soon as a device failure is detected, a hardened driver will notify its clients that the device has failed. It will turn the arbitrary and uncontrolled error into a notification event. A nexus driver will notify its child drivers through the bus event handler invocation. A leaf device driver will then notify its clients through the device event mechanism provided by the device registry (svDeviceEvent()).

Note -

If a nexus driver also exports an extra (orthogonal) interface through the device registry, both mechanisms will be used.

The dec21x4x event handler, shown in code Example 13-3, illustrates how a hardened device driver should notify its clients. This handler is called from the bus driver. The fail() routine is called either from the event handler, or directly from the driver on failure detection.

Example 13-3 dec21x4x event handler

    /*
     * PCI bus event handlers.
     * The event handler is invoked by the parent bus driver when a bus
     * event occurs in the bus.
     *
     * The DEC21 driver always supports the PCI_SYS_SHUTDOWN and
     * PCI_DEV_SHUTDOWN events. The PCI_DEV_REMOVAL support is optional and
     * is provided only when DEC21_DEV_REMOVAL is defined.
     */
    static KnError
eventHandler (void* cookie, BusEvent event, void* arg)
{
    Dec21Data* dec21 = (Dec21Data*)cookie;
    KnError    res   = K_OK;

    switch (event) {

[...]

    case PCI_SYS_ERROR: {
        uint32_f csr5;
            /*
             * PCI_SYS_ERROR is considered a fatal bus error.
             * We first check if this event was caused by our device on the
             * PCI bus.
             * If positive, all bus accesses from the device are disabled,
             * thus, we put the device into failed mode.
             */
        if (arg) {
            csr5 = *((uint32_f*)arg);
        } else {
            csr5 = dec21->pciIoOps->load_32(dec21->pciIoId, CSR5);
        }
        if (csr5 & CSR5_FBE) {
            DKI_ERR(("%s: error -- Fatal bus error (csr5=0x%08x)\n",
                     dec21->path, csr5));
            fail(dec21);
        }
        break;
    }

    case PCI_INTR_DEFECTIVE: {
            /*
             * Our interrupt line is defective (stuck interrupt).
             * We put the device into failed mode.
             */
        DKI_ERR(("%s: error -- interrupt line is defective\n", 
                dec21->path));
        res = fail(dec21);
        break;
    }

    default:
        /*
         * Palette events are ignored
         */
        res = K_ENOTIMP;
    }

    return res;
}

Refer to Part III for details about bus and system event notification.

Bus Exceptions

A hardened bus driver (usually the host bridge driver) will handle hardware bus exceptions to identify the bus address at fault. By invoking the associated driver's error handler, the hardened bus driver will propagate the appropriate error message upstream.

A hardened driver will not panic when its error handlers are invoked. This is considered as a device fault condition by a leaf driver, and is propagated to child drivers by a nexus driver.

In the ChorusOS Driver Framework a bus exception is reported by the nexus driver to a child driver through the error handler invocation. The error handler is specific to a mapped region: a driver specifies an error handler when calling its parent nexus driver to map a memory, I/O or DMA region. When an exception occurs, the bus driver analyzes the faulty bus address, detects the fault region and invokes the associated error handler.
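The region lookup described above can be sketched in isolation. This is a minimal illustration, not the framework's implementation: the Region type, dispatch_bus_error() and record_offset() are hypothetical names, and the handler signature is simplified. The point of interest is the half-open range test, which treats a region as covering base up to, but not including, base + size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified error handler type (not the real DDI type). */
typedef void (*RegionErrHandler)(void* cookie, uint32_t offset);

/* Hypothetical record of one mapped region and its error handler. */
typedef struct Region {
    uint32_t         base;        /* first bus address of the region   */
    uint32_t         size;        /* region size in bytes              */
    RegionErrHandler errHandler;  /* handler given when mapping        */
    void*            errCookie;   /* cookie passed back to the handler */
    struct Region*   next;
} Region;

/*
 * Find the region containing faultAddr and invoke its error handler.
 * Returns 1 if a handler was invoked, 0 if no region matched.
 */
static int
dispatch_bus_error (Region* list, uint32_t faultAddr)
{
    Region* r;

    for (r = list; r != NULL; r = r->next) {
        /*
         * (faultAddr - base) < size tests base <= faultAddr < base + size
         * without computing base + size, which could overflow.
         */
        if (faultAddr >= r->base && faultAddr - r->base < r->size) {
            r->errHandler(r->errCookie, faultAddr - r->base);
            return 1;
        }
    }
    return 0;
}

/* Illustration-only handler: records the last reported offset. */
static void
record_offset (void* cookie, uint32_t offset)
{
    *(uint32_t*)cookie = offset;
}
```

With two adjacent regions, an address equal to the end of the first region is dispatched to the second, not the first.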

Code Example 13-4 shows the Raven bus error interrupt handler. This handler manages errors at bus level and reports them to the error (or event) handlers of the affected device drivers.

Example 13-4 Raven bus error handler

        /*
         * PowerPC "machine check" interrupt handler
         * It is called from DKI after the context has been saved on the stack.
         * This handler uses the RAVEN internal register to analyze and
         * dispatch bus errors to device error handlers.
         */
    static CpuIntrStatus
errHandler (RavenData* raven)
{
            PciBusError   error;
            PciMap*       pciMap;
            uint32_f      merad;
            uint16_f      merat;
   volatile uint8_f       merstReg;
            uint8_f       merst;
            uint8_f       overflow;
            KnIntrCtx*    intrCtx;
            CpuIntrStatus status = CPU_INTR_UNCLAIMED;
       /*
        * Read Raven MPC_MERST register ... and clear error bits that will
        * be handled
        */
   merst = merstReg = READ_REG_8(raven->regs.vaddr, MPC_MERST);
   WRITE_REG_8(raven->regs.vaddr, MPC_MERST, merstReg);

   overflow = merst & MPC_MERST_OVF;
   svIntrCtxGet(&intrCtx);
       /*
        * DATA PARITY ERROR and PCI SYSTEM ERROR are propagated as a
        * PCI_SYS_ERROR event, because the RAVEN does not latch any address
        * in these cases.
        */
   if (merst & (MPC_MERST_PERR | MPC_MERST_SERR)) {
       PciDev* pciDev;

       DKI_ERR(("%s: error -- (MC) pc=0x%08x lr=0x%08x sp=0x%08x\n",
                raven->path, intrCtx->pc, intrCtx->lr, intrCtx->r1));
       if (merst & MPC_MERST_PERR) {
           DKI_ERR(("%s: error -- PARITY ERROR detected\n", raven->path));
       }
       if (merst & MPC_MERST_SERR) {
           DKI_ERR(("%s: error -- SYSTEM ERROR detected\n", raven->path));
       }
       pciDev = raven->dev;
       while (pciDev) {
           if (pciDev->evtHandler) {
               pciDev->evtHandler(pciDev->cookie, PCI_SYS_ERROR, NULL);
           }
           pciDev = pciDev->next;
       }
       return CPU_INTR_CLAIMED;
   }

   if (overflow) {
       DKI_WARN(("%s: warning -- error overflow (0x%02x)\n",
                 raven->path, merst));
   }
       /*
        * Read latched fault address and cycle attributes
        */
   merad = READ_REG_32(raven->regs.vaddr, MPC_MERAD);
   merat = READ_REG_16(raven->regs.vaddr, MPC_MERAT);
       /*
        * PowerPC bus error: MERAT/MERAD contains PowerPC cycles attributes
        */
   if (merst & MPC_MERST_MATO) {
       error.code = PCI_ERR_TARGET_ABORT;
           /*
            * Search for an existing map to which the latched address
            * belongs and call the associated error handler.
            * DMA maps are in memMap list.
            */
       pciMap = raven->memMap;
       while (pciMap) {
       if ((pciMap->memChunk.paddr <= merad) &&
           (merad < pciMap->memChunk.paddr + pciMap->memChunk.psize)) {
               error.offset = merad - pciMap->memChunk.paddr;
               pciMap->errHandler(pciMap->errCookie, &error);
               status = CPU_INTR_CLAIMED;
               break;
           }
           pciMap = pciMap->next;
       }
           /*
            * Save fault address in dar to raise a kernel exception
            */
       intrCtx->dar = merad;

       if (status == CPU_INTR_UNCLAIMED) {
         DKI_ERR(("%s: error -- (MC) pc=0x%08x lr=0x%08x sp=0x%08x\n",
                  raven->path, intrCtx->pc, intrCtx->lr, intrCtx->r1));
         DKI_ERR(("%s: error -- PowerPC timed-out 0x%08x (merat=0x%04x)\n",
                  raven->path, merad, merat));
         DKI_ERR(("%s: error -- from %s %s TT=0x%02x TSIZ=0x%01x\n",
                raven->path,
                (MPC_MERAT_MID(merat) == MPC_MID_RAVEN) ? "raven" : "cpu",
                (merat & MPC_MERAT_TBST) ? "burst" : "",
                merat & MPC_MERAT_TT,
                MPC_MERAT_TSIZ(merat)));
       }
       return status;
   }
       /*
        * PCI bus errors: MERAT/MERAD contains PCI cycle attributes
        */
   if (merst & MPC_MERST_RTA) {
       error.code = PCI_ERR_TARGET_ABORT;
   }else if (merst & MPC_MERST_SMA) {
       error.code = PCI_ERR_MASTER_ABORT;
   }
   switch (merat & MPC_MERAT_COMM) {
   case MPC_MERAT_IACK:
       DKI_ERR(("%s: error -- PCI IACK cycle\n", raven->path));
       break;
   case MPC_MERAT_CFG_READ:
   case MPC_MERAT_CFG_WRITE:
           /*
            * An error occurred while accessing PCI configuration space
            */
       if ((merad == (raven->confAddr & ~0x3)) ||
           (merad == CONFIG_ADDR_TO_ADDR(raven->confAddr))) {
               /*
                * The latched error address matches the one currently 
                * accessed through a conf_load_xx operation.
                * Reset the accessed (confAddr) address to indicate
                * the operation failed.
                */
           raven->confAddr = 0;
           status = CPU_INTR_CLAIMED;
       } else {
               /*
                * The error is not due to our conf_xxx() operations !
                */
           DKI_ERR(("%s: error -- PCI Configuration cycle\n", raven->path));
       }
       break;
   case MPC_MERAT_IO_READ:
   case MPC_MERAT_IO_WRITE:
           /*
            * An error occurred while accessing PCI I/O space.
            * Search for an existing map to which the latched address belongs
            * and call the associated error handler
            */
       pciMap = raven->ioMap;
       while (pciMap) {
           if ((pciMap->first <= merad) && (merad <= pciMap->last)) {
               error.offset = merad - pciMap->first;
               pciMap->errHandler(pciMap->errCookie, &error);
               status = CPU_INTR_CLAIMED;
               break;
           }
           pciMap = pciMap->next;
       }
       break;
   case MPC_MERAT_MEM_READ:
   case MPC_MERAT_MEM_WRITE:
   case MPC_MERAT_MEM_READ_MULTI:
   case MPC_MERAT_MEM_READ_LINE:
   case MPC_MERAT_MEM_WRITE_INVAL:
           /*
            * An error occurred while accessing PCI Memory space.
            * Search for an existing map to which the latched address belongs
            * and call the associated error handler
            */
       pciMap = raven->memMap;
       while (pciMap) {
           if ((pciMap->first <= merad) && (merad <= pciMap->last)) {
               error.offset = merad - pciMap->first;
               pciMap->errHandler(pciMap->errCookie, &error);
               status = CPU_INTR_CLAIMED;
               break;
           }
           pciMap = pciMap->next;
       }
   }

   if (status == CPU_INTR_UNCLAIMED) {
     DKI_ERR(("%s: error -- (MC) pc=0x%08x lr=0x%08x sp=0x%08x\n",
               raven->path, intrCtx->pc, intrCtx->lr, intrCtx->r1));
    DKI_ERR(("%s: error -- (%d) PCI at 0x%08x merat=0x%04x merst=0x%02x\n",
               raven->path, error.code, merad, merat, merst));
     DKI_ERR(("%s: error -- from %s %s BYTE_EN=0x%02x\n",
             raven->path,
             (MPC_MERAT_MID(merat) == MPC_MID_RAVEN) ? "raven" : "cpu",
             (merat & MPC_MERAT_TBST) ? "write-posted" : "",
             merat & MPC_MERAT_BYTE_EN));
   }

   return status;
}

Refer to the ChorusOS man pages section 9DDI: Device Driver Interfaces for details about bus error handling interfaces.

Corrupt Data Detection

A hardened driver assumes that any data which it reads from the device may be corrupt. The data should be sanity checked, before use, if undesirable consequences are anticipated from its use or propagation.

In the dec21x4x ethernet driver the DMA buffers are not checked against corruption because this is already done by the client's protocol stack (TCP/IP). However, code Example 13-5 illustrates how you could avoid an infinite loop when reading a register, by adding a break condition to the loop.

Example 13-5 loop on register value

/*
  * Reset the PHY device.
  */
  static void
phy_reset (Dec21Data* dec21)
{
  unsigned int count = 10000;

  mii_write_reg(dec21, MII_CTRL_REG, MII_CTRL_RESET);
  do {
      msecBusyWait(1);
  } while ((mii_read_reg(dec21, MII_CTRL_REG) & MII_CTRL_RESET) && count--);
  msecBusyWait(1);
}

Device Management and Control Data

Hardened drivers must act with extreme caution when using pointers, array indexes or memory offsets which are read or calculated from data retrieved from the device. These values should not be used until they are checked to ensure that they are within an expected range and have legal alignment. These types of pointer mechanisms can become misleading or malignant if the device has developed a fault.
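The range and alignment checks described above might look like the following. This is a minimal sketch under stated assumptions: the ring size, buffer size and alignment constants are hypothetical, as are the function names; a real driver would take these values from its own descriptor layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE  64u    /* hypothetical number of ring descriptors */
#define RX_BUF_SIZE   2048u  /* hypothetical size of one receive buffer */
#define RX_BUF_ALIGN  4u     /* hypothetical required offset alignment  */

/* Accept a ring index read from the device only if it is in range. */
static bool
rx_index_valid (uint32_t index)
{
    return index < RX_RING_SIZE;
}

/*
 * Accept an (offset, length) pair read from the device only if the
 * offset is aligned and the whole span lies inside the receive buffer.
 * The comparison is written so that offset + length is never computed
 * and therefore cannot overflow.
 */
static bool
rx_span_valid (uint32_t offset, uint32_t length)
{
    if (offset % RX_BUF_ALIGN != 0) {
        return false;
    }
    if (offset >= RX_BUF_SIZE) {
        return false;
    }
    return length <= RX_BUF_SIZE - offset;
}
```

A driver would call these before indexing its descriptor array or copying from a buffer, and treat a failed check as a device fault.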

A hardened driver will never loop simply upon a register value. An infinite loop may occur if a device breaks and returns stuck data. The hardened driver must have a method to break this type of loop.
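Example 13-5 bounds its loop but does not tell the caller whether the device actually responded. The sketch below shows one way to make the outcome explicit so that the caller can enter failed mode. The RegRead callback, poll_until_clear(), and the simulated FakeDev device are hypothetical and exist only for illustration; a real driver would read the register through its DDI access functions and delay between reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register read callback standing in for a DDI access. */
typedef uint32_t (*RegRead)(void* cookie);

/*
 * Poll until (register & mask) == 0 or the retry budget is exhausted.
 * Returns true if the bits cleared, false if the device looks stuck.
 */
static bool
poll_until_clear (RegRead readReg, void* cookie, uint32_t mask,
                  unsigned int maxTries)
{
    unsigned int tries;

    for (tries = 0; tries < maxTries; tries++) {
        if ((readReg(cookie) & mask) == 0) {
            return true;
        }
        /* A real driver would wait here before the next read. */
    }
    return false;
}

/* Simulated device: the busy bit stays set for the next readsLeft reads. */
typedef struct {
    unsigned int readsLeft;
} FakeDev;

static uint32_t
fake_read (void* cookie)
{
    FakeDev* dev = (FakeDev*)cookie;

    if (dev->readsLeft > 0) {
        dev->readsLeft--;
        return 0x8000u;   /* busy bit still set */
    }
    return 0;
}
```

When poll_until_clear() returns false, the caller can treat the condition as a device failure instead of continuing with a device in an unknown state.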

Driver state information should be maintained in main memory, not on an I/O card.

Received Data

Device errors can result in corrupt data being placed in receive buffers. This corruption is indistinguishable from corruption occurring beyond the domain of the device, for example within a network. Typically, existing software will already be in place to handle such corruption through, for example, integrity checks at the transport layer of a protocol stack or within the application using the device.

If the received data is not going to be subjected to an integrity check at a higher layer, as in the case of a disk driver, it can be integrity-checked within the driver itself. However, such low-level integrity checking can significantly degrade system performance. If such checks are not performed at device level, the result is, at worst, application failure or file corruption; a total system crash is unlikely.
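As an illustration of an in-driver integrity check, the sketch below verifies a block against a CRC-32 value. The choice of CRC-32 is an assumption made for this example; the guide does not prescribe a particular check, and where the expected value comes from (for example, stored alongside the block) depends on the device. The function names are hypothetical. The bit-by-bit loop also hints at the performance cost mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xEDB88320, initial value and
 * final XOR of 0xFFFFFFFF). Slow but table-free.
 */
static uint32_t
crc32_compute (const uint8_t* data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t   i;
    int      bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            /* Shift right; apply the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Return true if the block matches the CRC recorded for it. */
static bool
block_is_intact (const uint8_t* data, size_t len, uint32_t expectedCrc)
{
    return crc32_compute(data, len) == expectedCrc;
}
```

A driver would release the block to its clients only when block_is_intact() returns true, and otherwise report a device fault.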

DMA

A defective device may be able to falsely initiate a DMA transfer over the bus. This type of data transfer may corrupt the system memory.

Some host bus bridges provide an IOMMU which allows you to map a DMA region (within the bus address space) to the system memory. On such hardware, the bus driver is able to protect the system memory (which is not used for DMA buffers) from corruption caused by a falsely initiated DMA transfer. The bus driver should not use a static one-to-one mapping (from the system memory to the bus space) to handle DMA transfers. Instead, it should manage IOMMU mappings dynamically. The dma_alloc() method maps a memory region to the bus space, enabling DMA transfers. The dma_free() method invalidates any mapping, disabling DMA to the memory region.

Note -

A defective device may still corrupt a DMA buffer managed by another device driver.

Stuck Interrupts

A persistently asserted interrupt will severely affect system performance, almost certainly stalling a single processor board. An interrupt handler needs to be able to identify whether it has been called as a result of a hoax interrupt.

A hardened driver's interrupt handler will return a BUS_INTR_UNCLAIMED result unless it detects that the device legitimately asserted the interrupt. Conceptually, an interrupt is legitimate if the device actually requires the driver to do some useful work.

A hardened bus driver is able to detect whether an interrupt line is defective. It disables the defective interrupt line (through the bus controller) and notifies any attached child drivers by calling their event handler, specifying a BUS_INTR_DEFECTIVE event, and passing the child driver the interrupt identifier as an argument.

To detect a defective interrupt line, a bus driver should maintain a count of unclaimed interrupts for each interrupt line. The bus driver may count unclaimed interrupts occurring between two claimed interrupts, resetting the total when an interrupt is claimed. Alternatively, it may count the unclaimed interrupts occurring during a given, configurable period of time, resetting the counter on a time-out invocation. In both cases, if the counter reaches a predetermined, configurable watermark, the bus driver should consider the interrupt line defective. Note that, in such a model, all devices sharing the same interrupt line will fail if stuck interrupts are detected on that line.
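The time-window policy described above can be sketched as follows. This is an illustration only: the IntrLineStats type and both function names are hypothetical and are not part of the ChorusOS DDI. Unlike the consecutive-count policy, a claimed interrupt does not reset the counter here; only the periodic time-out does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-line bookkeeping kept by the bus driver. */
typedef struct {
    uint32_t unclaimed;   /* unclaimed interrupts in the current window */
    uint32_t watermark;   /* configurable threshold                     */
    bool     defective;   /* line has been declared defective           */
} IntrLineStats;

/*
 * Called after dispatching one interrupt on the line.
 * Returns true exactly once, when the line becomes defective, so the
 * caller can mask the line and notify the attached child drivers.
 */
static bool
intr_account (IntrLineStats* line, bool claimed)
{
    if (line->defective || claimed) {
        return false;
    }
    if (++line->unclaimed >= line->watermark) {
        line->defective = true;
        return true;
    }
    return false;
}

/* Called from the periodic time-out: start a new counting window. */
static void
intr_window_expired (IntrLineStats* line)
{
    if (!line->defective) {
        line->unclaimed = 0;
    }
}
```

The window length and the watermark together set how many unclaimed interrupts per unit of time are tolerated before the line is disabled.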

Code Example 13-6 illustrates how stuck interrupts may be detected by both the bus and device driver interrupt handlers. The Raven handler counts consecutive unclaimed interrupts, and raises a PCI_INTR_DEFECTIVE event when this count reaches a configured value. This handler also forbids enabling defective interrupt lines.

Example 13-6 Raven interrupt handler

#define IS_INTR_DEFECTIVE(raven, l)  (raven->unclaimed[(l)] == (uint32_f)-1)
#define SET_INTR_DEFECTIVE(raven, l) (raven->unclaimed[(l)]  = (uint32_f)-1)

    static void
unmask (PciIntrId intrId)
{
    RavenData* raven = ((PciIntr*)intrId)->devId->pciId;

        /*
         * Check if interrupt line is defective
         */
    if (IS_INTR_DEFECTIVE(raven, ((PciIntr*)intrId)->intrLine)) {
        return;
    }
        /*
         * Mask all PCI interrupts while working on MPIC registers
         */
    raven->intrOps->mask(raven->intrId);
    OPIC_INTR_UNMASK(raven->mpicIoOps,
                     raven->mpicIoId,
                     ((PciIntr*)intrId)->intrLine);
    raven->intrOps->unmask(raven->intrId);
}

    /*
     * Declare an interrupt line as defective
     */
    static void
intrDefective(RavenData* raven, uint32_f line)
{
   PciIntr* intr;
   PciDev*  dev;
        /*
         * Mask defective interrupt line at interrupt controller level
         */
    raven->intrOps->mask(raven->intrId);
    OPIC_INTR_MASK(raven->mpicIoOps, raven->mpicIoId, line);
    SET_INTR_DEFECTIVE(raven, line);
    raven->intrOps->unmask(raven->intrId);
        /*
         * Raise an event to all devices attached to this interrupt.
         * Interrupt identifier is passed as a specific argument.
         */
   for (intr = raven->intr[line] ; intr ; intr = intr->next) {
       dev = intr->devId;
       if (dev->evtHandler) {
          dev->evtHandler(dev->cookie, PCI_INTR_DEFECTIVE, (PciIntrId)intr);
       }
   }
}
        /*
         * PowerPC external interrupts handler.
         * It is called from DKI after the context has been saved on the stack.
         * This handler manages the RAVEN internal MPIC which is OpenPIC
         * compliant.
         */
    static CpuIntrStatus
intrHandler (RavenData* raven)
{
   uint32_f      vector;
   PciIntrStatus intrStatus = PCI_INTR_UNCLAIMED;
   uint32_f      cpu        = mfspr_PIR ();  /* processor id register */
   PciIoOps*     mpicIoOps  = raven->mpicIoOps;
   PciIoId       mpicIoId   = raven->mpicIoId;
   CpuIntrOps*   intrOps    = raven->intrOps;
   CpuIntrId     intrId     = raven->intrId;
   PciIntr*      pciIntr;
   int           claimed    = 0;
       /*
        * Get vector to identify the interrupt source
        */
   vector = OPIC_INTR_ACKNOWLEDGE(mpicIoOps, mpicIoId, cpu);

       /*
        * Ignore spurious interrupt requests
        */
   if (vector == MPIC_SPURIOUS_INTR_VECTOR) {
      raven->spurious++;
      return CPU_INTR_CLAIMED;
   }

       /*
        * Enable external interrupts on CPU
        */
   intrOps->unmask(intrId);
       /*
        * Call device handlers attached to this interrupt vector
        */
   for (pciIntr = raven->intr[vector] ;
        pciIntr ;
        pciIntr = pciIntr->next) {
       intrStatus = pciIntr->intrHandler(pciIntr->intrCookie);
       if (intrStatus != PCI_INTR_UNCLAIMED) {
           claimed++;
       }
   }
       /*
        * Disable external interrupts on CPU
        */
   intrOps->mask(intrId);

   if (intrStatus == PCI_INTR_ACKNOWLEDGED) {
           /*
            * Interrupt handler has already done:
            * - enable()
            * - ....
            * - disable()
            * So we just:
            * - reset task priority to re-enable lower priority interrupts
            * - unmask current interrupt (masked by disable()).
            */
       OPIC_CURRENT_TASK_SET_PRIORITY(mpicIoOps, mpicIoId, cpu,
                                      OPIC_PRIORITY_MIN);
       OPIC_INTR_UNMASK(mpicIoOps, mpicIoId,  vector);
   } else {
           /*
            * Interrupt was just serviced by the handler.
            * Send a non-specific EOI command to open PIC
            */
       OPIC_INTR_EOI(mpicIoOps, mpicIoId, cpu);
   }

   if (claimed == 0) {
           /*
            * Increment unclaimed counter and check against max.
            */
       if (++(raven->unclaimed[vector]) > raven->maxUnclaimed) {
           intrDefective(raven, vector);
       }
   } else {
       raven->unclaimed[vector] = 0; /* Reset unclaimed counter */
   }

   return CPU_INTR_CLAIMED;
}

The dec21x4x interrupt handler, shown in code Example 13-7, checks for unexpected interrupts by masking them from the interrupt status register that is read. If an unexpected interrupt is received, it is considered unclaimed.

Example 13-7 dec21x4x interrupt handler

    /* 
     * The interrupt handler 
     */
    static PciIntrStatus
intrHandler (void* cookie)
{
    Dec21Data* dec21 = (Dec21Data*)cookie;
    uint32_f   csr5;
        /*
         * Get current status and acknowledge all interrupt sources ASAP.
         */
    csr5 = dec21->pciIoOps->load_32(dec21->pciIoId, CSR5);
    dec21->pciIoOps->store_32(dec21->pciIoId, CSR5, csr5);

#ifdef DEBUG_DEC21
    sysLog("%s: intrHandler csr5=0x%08x\n", dec21->path, csr5);
#endif
        /*
         * Check if an unmasked interrupt is pending
         */
    csr5 &= dec21->csr7;
    if (csr5 == 0) {
        return PCI_INTR_UNCLAIMED;
    }

        /*
         * Process Rx interrupt
         */
    if (csr5 & CSR5_RI) {
        CSR7_INTR_MASK(dec21, CSR7_RIE);
        dec21->clientOps->receiptNotify(dec21->clientCookie);
    }
        /*
         * Process Tx interrupt
         */
    if (csr5 & CSR5_TI) {
        CSR7_INTR_MASK(dec21, CSR7_TIE);        
        dec21->clientOps->transmitNotify(dec21->clientCookie);
    }
        /*
         * Process errors, if Abnormal error summary bit is set.
         */
    if (csr5 & CSR5_AIS) {
        intrErr(dec21, csr5);
    }

    return PCI_INTR_CLAIMED;
}

Refer to the ChorusOS man pages section 9DDI: Device Driver Interfaces for details about bus interrupt handling interfaces.

Periodic Health Checks

A latent fault will not show itself until some other action occurs. For example, a hardware failure occurring in a PCI card that is a cold standby could remain undetected until a fault occurs on the master PCI card. Only at that point will it be discovered that the system now contains defective PCI cards. It is essential to identify a failed secondary device so that it can be repaired or replaced before any failure in the primary device occurs. As a general rule, latent faults that are allowed to remain undetected will eventually cause system failure.

A hardened driver must perform periodic health checks on all the devices that it manages. Although this does not directly protect the system from the device, it does allow timely detection of failure during quiet periods. A device may be quiet because it has failed.

Periodic health checks can:

Run a quick access check on the board.
Check a register or memory location on the device whose value the driver expects to have deterministically altered since the last poll.
Time-stamp outgoing requests in order to detect any over-age requests which have not completed.
Initiate an action on the device which should be completed before the next scheduled check.

Note -

These kinds of health checks are intended to be triggered and controlled through the Management DDI. A driver should not start periodic health checks itself, but rather rely on a driver component manager client to trigger the checks at a rate appropriate to the device and the service it provides. Refer to the mngt(1CC) man page for details about the Management DDI.
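One of the checks listed above, time-stamping outgoing requests, can be sketched as follows. This is a minimal illustration only: HcState, hcInit(), hcIssue(), hcComplete(), hcCheck() and the slot limit are hypothetical names invented for this sketch and are not part of any ChorusOS DDI. The caller is assumed to pass in the current time from a monotonic clock, and hcCheck() is assumed to be invoked by the component manager client rather than scheduled by the driver itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HC_MAX_REQUESTS 8   /* hypothetical limit on outstanding requests */

/* One outstanding request, stamped when it is handed to the device. */
typedef struct {
    int      pending;    /* non-zero while the request is outstanding */
    uint64_t issuedAt;   /* time stamp taken when the request was issued */
} HcRequest;

typedef struct {
    HcRequest req[HC_MAX_REQUESTS];
    uint64_t  maxAge;    /* oldest age tolerated before a request is suspect */
} HcState;

/* Reset the bookkeeping and set the age limit (same unit as the clock). */
static void
hcInit (HcState* hc, uint64_t maxAge)
{
    memset(hc, 0, sizeof(*hc));
    hc->maxAge = maxAge;
}

/* Record that the request in 'slot' was issued at time 'now'. */
static void
hcIssue (HcState* hc, size_t slot, uint64_t now)
{
    assert(slot < HC_MAX_REQUESTS);
    hc->req[slot].pending  = 1;
    hc->req[slot].issuedAt = now;
}

/* Record that the request in 'slot' has completed. */
static void
hcComplete (HcState* hc, size_t slot)
{
    assert(slot < HC_MAX_REQUESTS);
    hc->req[slot].pending = 0;
}

/*
 * Periodic check: return the number of requests that are still pending
 * and older than maxAge. 'now' is assumed never to run backwards.
 */
static int
hcCheck (const HcState* hc, uint64_t now)
{
    int    overAge = 0;
    size_t i;

    for (i = 0; i < HC_MAX_REQUESTS; i++) {
        if (hc->req[i].pending &&
            (now - hc->req[i].issuedAt) > hc->maxAge) {
            overAge++;
        }
    }
    return overAge;
}
```

A non-zero result from hcCheck() would typically lead the driver to log the fault and notify its clients, as described in the Logging and Notification sections, instead of waiting indefinitely for the device.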

Chapter 14 Testing Hardened Drivers

Testing the resilience of a hardened driver is crucial to its success. This chapter describes hardened driver testing and validation through fault injection.

"Fault Injection" explains how fault injection is used to test hardened drivers.
"Initializing a Fault Injection Bus Driver" illustrates fault injection testing in a bus driver.

Fault Injection

In general, a device driver is connected to its parent bus driver using the bus driver interface (bus DDI, either common or bus architecture specific) as shown by the dashed line in Figure 14-1.

Typically, a device driver uses its bus DDI to:

Map and access device I/O registers.
Map device memory.
Allocate and manage DMA buffers.
Receive notification of device interrupts.
Receive notification of bus exceptions.
Receive notification of bus and system events.

Figure 14-1 Driver Framework: Fault Injection Bus Drivers

The Driver Framework defines the DDIs and mechanisms to implement fault injection bus drivers which can be transparently inserted between the effective bus driver and the hardened device driver being tested, as shown in Figure 14-1 by continuous lines.

The fault injection (fi) bus driver acts as a filter between the bus and device drivers. The existence of this filter is totally transparent to both. From a bus driver perspective, the fi driver looks like a normal device driver instance started on a child node. From the device driver perspective, the fi driver looks like a normal parent bus driver providing the requested bus DDI (for example, PCI DDI).

The bus fi driver is responsible for:

Binding itself in place of the effective device driver being tested (and retaining the initial binding).
Starting the device driver instance.
Providing its own bus DDI operations to the tested driver instance.
Optionally providing an additional bus FI DDI which defines the fault injection operations allowed on that bus.

This bus FI DDI may be used by a bus fi driver's client to test, or validate, the hardened device driver. The bus DDI operations implemented in the bus fi driver are basically wrapper functions which simply call appropriate methods on the parent bus DDI (in an upstream direction) and call appropriate handlers in the device driver (in a downstream direction). However, internally or through its client, the bus fi driver's behavior can be changed by injecting faults to simulate a device malfunction. For example:

load() may return a corrupted value for a given register.
store() may ignore the write, or corrupt the value to be written, to a given register.
A device driver interrupt handler may be called to simulate spurious, or stuck, interrupts.
An interrupt, delivered by the bus driver, may be ignored to simulate lost interrupts.
A device driver exception handler may be called to simulate bus exceptions.
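The load() and store() faults listed above can be sketched as thin wrappers around the parent bus operations. This is an illustration under stated assumptions, not the pciFi implementation: BusIoOps, FiIo, FiFault, fiLoad32() and fiStore32() are hypothetical names, the two-operation table is far smaller than a real bus DDI, and a small in-memory register file (fakeRegs) stands in for the real parent bus so that the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the register-access operations of a parent bus. */
typedef struct {
    uint32_t (*load_32)(void* id, uint32_t offset);
    void     (*store_32)(void* id, uint32_t offset, uint32_t value);
} BusIoOps;

typedef enum {
    FI_NONE,           /* transparent: pass every access straight through */
    FI_CORRUPT_LOAD,   /* XOR the loaded value with 'mask' */
    FI_IGNORE_STORE,   /* silently drop the write */
    FI_CORRUPT_STORE   /* XOR the value to be written with 'mask' */
} FiFault;

typedef struct {
    const BusIoOps* parentOps;  /* operations of the real parent bus */
    void*           parentId;   /* identifier expected by parentOps */
    FiFault         fault;      /* fault currently armed */
    uint32_t        offset;     /* register targeted by the fault */
    uint32_t        mask;       /* bits flipped by the corrupting faults */
} FiIo;

/* Wrapper given to the tested driver in place of the parent's load_32(). */
static uint32_t
fiLoad32 (void* id, uint32_t offset)
{
    FiIo*    fi    = (FiIo*)id;
    uint32_t value = fi->parentOps->load_32(fi->parentId, offset);

    if (fi->fault == FI_CORRUPT_LOAD && offset == fi->offset) {
        value ^= fi->mask;
    }
    return value;
}

/* Wrapper given to the tested driver in place of the parent's store_32(). */
static void
fiStore32 (void* id, uint32_t offset, uint32_t value)
{
    FiIo* fi = (FiIo*)id;

    if (offset == fi->offset) {
        if (fi->fault == FI_IGNORE_STORE) {
            return;
        }
        if (fi->fault == FI_CORRUPT_STORE) {
            value ^= fi->mask;
        }
    }
    fi->parentOps->store_32(fi->parentId, offset, value);
}

/* In-memory register file standing in for the real parent bus. */
static uint32_t fakeRegs[4];

static uint32_t
fakeLoad32 (void* id, uint32_t offset)
{
    (void)id;
    assert(offset / 4 < 4);
    return fakeRegs[offset / 4];
}

static void
fakeStore32 (void* id, uint32_t offset, uint32_t value)
{
    (void)id;
    assert(offset / 4 < 4);
    fakeRegs[offset / 4] = value;
}

static const BusIoOps fakeOps = { fakeLoad32, fakeStore32 };
```

With the fault set to FI_NONE the wrappers are pure pass-through, which matches the transparent default behavior described for the fi bus driver. Arming a fault for a single register offset keeps every other access unaffected.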

In addition, because the fi driver is able to snoop all requests issued by the device driver, it is able to corrupt memory regions and DMA buffers mapped, or allocated, by the device driver.
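The interrupt-related faults listed above (spurious, stuck, and lost interrupts) can be sketched as a filter placed around the tested driver's interrupt handler. All names here (FiIntr, fiIntrHandler(), fiIntrInject(), FakeDev, fakeHandler()) are hypothetical and the handler signature is simplified. In particular, reporting a swallowed interrupt as claimed to the parent bus is an assumption of this sketch, not documented fi driver behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified handler type: returns non-zero if the interrupt was claimed. */
typedef int (*FiIntrHandler)(void* cookie);

typedef struct {
    FiIntrHandler handler;    /* interrupt handler of the tested driver */
    void*         cookie;     /* cookie expected by that handler */
    unsigned      dropCount;  /* real interrupts still to be swallowed */
} FiIntr;

/*
 * Handler attached to the parent bus in place of the driver's handler.
 * While dropCount is non-zero the interrupt is swallowed, which the
 * tested driver experiences as a lost interrupt.
 */
static int
fiIntrHandler (void* cookie)
{
    FiIntr* fi = (FiIntr*)cookie;

    if (fi->dropCount > 0) {
        fi->dropCount--;
        return 1;   /* assumption: report as claimed to the parent bus */
    }
    return fi->handler(fi->cookie);
}

/*
 * Call the driver's handler 'count' times although the device raised
 * nothing: one call simulates a spurious interrupt, many calls in a row
 * simulate a stuck one. Returns how many calls the driver claimed.
 */
static unsigned
fiIntrInject (FiIntr* fi, unsigned count)
{
    unsigned claimed = 0;

    assert(fi->handler != NULL);
    while (count-- > 0) {
        if (fi->handler(fi->cookie)) {
            claimed++;
        }
    }
    return claimed;
}

/* Stand-in for a tested driver: claims only when a source is pending. */
typedef struct {
    unsigned calls;    /* number of times the handler was entered */
    int      pending;  /* non-zero when the device has raised an interrupt */
} FakeDev;

static int
fakeHandler (void* cookie)
{
    FakeDev* dev = (FakeDev*)cookie;

    dev->calls++;
    if (!dev->pending) {
        return 0;   /* nothing pending: unclaimed */
    }
    dev->pending = 0;
    return 1;
}
```

A hardened handler such as the one in Example 13-7 is expected to return unclaimed for every injected call, so a test client could treat a non-zero result from fiIntrInject() as a failure.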

By watching the device driver requests issued to the bus, the fi driver is able to perform multiple checks on the validity of those requests and their arguments. This allows you to remove all the debugging checks from the real bus driver's code, and to detect many other problems. By monitoring driver requests, a fi driver is able to:

Check if a bus resource is allocated before being used. For example, a region being mapped (I/O, memory) must be already allocated to a bus, as a resource.
Check if a bus resource is not used twice. For example, a region should not be mapped twice.
Check that arguments are in the correct range. For example, the offset of an I/O load and store operation must be within the mapped I/O region.
Check that all bus resources are freed when the tested driver closes its connection with the bus.
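The checks above can be sketched with a small bookkeeping structure per bus region. This is a minimal sketch under assumptions: FiRegion, FiCheck and the fiCheck*() functions are hypothetical names, and a real fi bus driver would derive the allocated regions from the device node's resource properties and track more resource kinds than I/O regions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bookkeeping for one I/O or memory region of the tested device. */
typedef struct {
    int      allocated;  /* region was allocated to the device as a bus resource */
    int      mapped;     /* region is currently mapped by the tested driver */
    uint32_t size;       /* region size in bytes */
} FiRegion;

typedef enum {
    FI_OK,
    FI_ENOTALLOC,    /* mapping a region that was never allocated */
    FI_EMAPPED,      /* mapping a region twice */
    FI_ENOTMAPPED,   /* accessing or unmapping an unmapped region */
    FI_ERANGE        /* access falls outside the mapped region */
} FiCheck;

/* Validate a map request and record it when it is legal. */
static FiCheck
fiCheckMap (FiRegion* r)
{
    if (!r->allocated) {
        return FI_ENOTALLOC;
    }
    if (r->mapped) {
        return FI_EMAPPED;
    }
    r->mapped = 1;
    return FI_OK;
}

/* Validate an unmap request and record it when it is legal. */
static FiCheck
fiCheckUnmap (FiRegion* r)
{
    if (!r->mapped) {
        return FI_ENOTMAPPED;
    }
    r->mapped = 0;
    return FI_OK;
}

/* Validate a load or store of 'width' bytes at 'offset' in the region. */
static FiCheck
fiCheckAccess (const FiRegion* r, uint32_t offset, uint32_t width)
{
    if (!r->mapped) {
        return FI_ENOTMAPPED;
    }
    /* Written this way so that offset + width cannot overflow. */
    if (width > r->size || offset > r->size - width) {
        return FI_ERANGE;
    }
    return FI_OK;
}

/* Close-time check: return the number of regions still mapped (leaks). */
static int
fiCheckClose (const FiRegion* regions, size_t count)
{
    int    leaked = 0;
    size_t i;

    assert(regions != NULL);
    for (i = 0; i < count; i++) {
        if (regions[i].mapped) {
            leaked++;
        }
    }
    return leaked;
}
```

Each wrapper in the fi bus driver would call the matching check before forwarding the request to the parent bus and log any result other than FI_OK against the tested driver.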

There is one instance of the fi bus driver running for each hardened driver tested.

Each instance may have one or no client at all.

A fi bus driver may be either:

Device oriented
Bus oriented (generic)

A fi bus driver may be specifically developed for a given device (connected to a given bus architecture). This type of driver would typically incorporate many embedded fault scenarios which are specific to the device hardware. A driver of this kind could be a self-contained validation test for a given hardened device driver.

On the other hand, you could develop a generic fi bus driver for a given bus architecture. This type of driver would typically provide an FI DDI interface allowing you to dynamically specify a fault to be injected. Then, using such an interface, the fi driver may be driven by a client application. A client of this kind could be a fault scenario interpreter.

In either case, the Driver Framework should define:

A mechanism to allow the bus fi driver to start transparently.
A new bus FI DDI for each existing bus DDI (each class of bus, for example BUS, PCI, ISA and so on).

Initializing a Fault Injection Bus Driver

This section describes one way of starting a fi bus driver, transparently, to test particular devices.

Devices for which fault simulation is required are dynamically identified by the fi bus driver at binding time.

The following property is defined:

PROP_FI_DRIVER: serves as a secondary binding property for a device node. It holds a copy of the initial binding, so that the fi bus driver can be started in place of the driver initially bound. Its value type is a null-terminated string, as for the PROP_DRIVER property.

A fi bus driver registers itself in the driver registry, specifying the bus class it emulates as its required parent class. For example, a fault injection bus driver emulating the PCI DDI bus class requires a PCI class as its parent bus class, like any other PCI device driver.

Probing

A fi bus driver has an empty (NULL) drv_probe() method. As it is a pseudo driver, having no associated physical device, and because it binds itself to tested device nodes (instead of effective device drivers), it does not need to create any device tree node.

Binding

The drv_bind() routine of a fi bus driver is called for each node that is a child node of the emulated bus class. In this routine, the driver gets the opportunity to detect a device requiring fault injection and to bind itself to this device, instead of the initially bound device driver.

Code Example 14-1 illustrates how a generic PCI Fault Injection bus driver (pciFi(9DRV)) may bind itself to all PCI device nodes. By default, this driver is transparent and only becomes active for a particular device when requested by a client.

Example 14-1 pciFi `drv_bind()` routine

    /*
     * Driver bind method
     */
    static void
drv_bind (DevNode node)
{
    DevProperty propDriver;
    DevProperty propNewDriver;
    DevProperty propFiDriver;

    propDriver = dtreePropFind(node, PROP_DRIVER);
        /*
         * If the device node is not bound to any driver
         * try to bind it
         */
    if (! propDriver) {
        DrvRegId     drv_curr;
        DrvRegId     drv_prev;
        DrvRegEntry* entry;
        
        drv_curr = svDriverLookupFirst();
        while (drv_curr) {
            entry = svDriverEntry(drv_curr);
            if (entry != &pciFiDrv) {
                if (entry->drv_bind &&
                    !strcmp(pciFiDrv.bus_class, entry->bus_class) &&
                    (pciFiDrv.bus_version >= entry->bus_version)) {
                        /*
                         * Try to bind the node
                         */
                    entry->drv_bind(node);
                }
            }
            drv_prev = drv_curr;
            drv_curr = svDriverLookupNext(drv_curr);
            svDriverRelease(drv_prev);
        }
        propDriver = dtreePropFind(node, PROP_DRIVER);
    }
            /*
             * If node is bound to a driver, copy its PROP_DRIVER property
             * into a PROP_FI_DRIVER property. Then bind our own driver to
             * the node, in order to be started in place of the original
             * driver.
             */
    if (propDriver && !dtreePropFind(node, PROP_FI_DRIVER)) {
        propFiDriver = dtreePropAdd(node, PROP_FI_DRIVER,
                                    dtreePropValue(propDriver),
                                    dtreePropLength(propDriver));
        if (!propFiDriver) {
            return;
        }
                /*
                 * Replace old PROP_DRIVER with my own driver name
                 */
        propNewDriver = dtreePropAdd(node, PROP_DRIVER, pciFiDrv.drv_name,
                                     strlen(pciFiDrv.drv_name) + 1);
        if (!propNewDriver) {
            dtreePropDetach(propFiDriver);
            dtreePropFree(propFiDriver);
            return;
        }
        dtreePropDetach(propDriver);
        dtreePropFree(propDriver);
    }
}

At the end of the binding phase, all device nodes (children of the emulated bus class node) are bound to the fi bus driver. The standard binding mechanism is not disturbed and the original bindings are duplicated in a PROP_FI_DRIVER property in each node.

Initializing

At initialization, the fi bus driver's drv_init() method is called for each device node tested. The fi bus driver will then get the opportunity to:

Launch an instance of itself, on that node.
Launch an instance of the tested device driver referenced in the PROP_FI_DRIVER property, giving its own bus operations instead of the original bus driver operations.
Optionally register itself in the device registry, to export a bus FI DDI, if it is intended to have a client.

Code Example 14-2 illustrates the initialization of a generic PCI Fault Injection bus driver (pciFi(9DRV)). In this example, the PciFiDev structure contains the fault injection driver instance specific data. This data is allocated and initialized in drv_init(). Fields of interest for the examples are:

node, which contains the bus FI driver's device node.
dev.node, which contains the tested child driver's device node.
entry, which contains data to be registered in the device registry.
devRegId, which contains the identifier of the allocated device registry entry.

Example 14-2 pciFi `drv_init()` routine

    /*
     * Try to start (initialize) tested child driver
     */
    static KnError
childInit(PciFiDev* pciFi)
{
    char*        drv_name;
    DrvRegId     drv_curr;
    DrvRegId     drv_prev;
    DrvRegEntry* entry;
    DevProperty  prop;
    DevNode      node;
        /*
         * Check if not already started
         */
    if (pciFi->dev.node) {
        return K_EBUSY;
    }
    node = pciFi->node;
        /* 
         * Check for PROP_FI_DRIVER property
         */
    prop = dtreePropFind(node, PROP_FI_DRIVER);
    if (prop == NULL) {
        DKI_ERR(("%s: error -- %s required property not found\n",
                 pciFi->path, PROP_FI_DRIVER));
        return K_EFAIL;
    }
        /* 
         * Try to start PROP_FI_DRIVER driver
         */
    drv_name = (char*)dtreePropValue(prop);
    drv_curr = svDriverLookupFirst();
    while (drv_curr) {
        entry = svDriverEntry(drv_curr);
        if (entry->drv_init &&
            !strcmp(pciFiDrv.bus_class, entry->bus_class) &&
            (pciFiDrv.bus_version >= entry->bus_version) &&
            !strcmp(drv_name, entry->drv_name)) {
            entry->drv_init(node, &pciFiPciBusOps, pciFi);
            if (pciFi->dev.node) {
                svDriverRelease(drv_curr);
                break;
            }
        }
        drv_prev = drv_curr;
        drv_curr = svDriverLookupNext(drv_curr);
        svDriverRelease(drv_prev);
    }

    return (pciFi->dev.node ? K_OK : K_EFAIL);
}

    /*
     * Driver initialization method
     */
    static void
drv_init (DevNode node, void* busOps, void* busId)
{
    PciFiDev*      pciFi;
    int            pathSz;
    char*          path;
    KnError        res;
    DevProperty    prop;
    PciPropBusNum  bus;
    PciPropDevNum  dev;
    PciPropFuncNum func;
        /*
         * Get my path name in the device tree (for errors)
         */
    pathSz = dtreePathLeng(node);
    path   = (char*)svMemAlloc(pathSz);
    if (!path) {
        DKI_ERR(("%s: error -- not enough memory\n", pciFiDrv.drv_name));
        return;
    }
    dtreePathGet(node, path);
        /*
         * Get mandatory properties
         */
    [...]
        /*
         * Allocate driver instance data
         */
    pciFi = (PciFiDev*)svMemAlloc(sizeof(PciFiDev));
    if (! pciFi) {
        DKI_ERR(("%s: error -- not enough memory\n", path));
        svMemFree(path, pathSz);
        return;
    }
        /*
         * Initialize driver instance data
         */
    [...]
        /*
         * Allocate objects associated to allocated resources
         */
    [...]
        /*
         * Open parent PCI bus connection
         */
    res = pciFi->pciOps->open(busId,
                              node,
                              eventHandler, /* my event handler   */
                              loadHandler,  /* my load handler    */
                              pciFi,        /* my handlers cookie */
                              &pciFi->pciDevId);
    if (res != K_OK) {
        DKI_ERR(("%s: error -- open() failed (%d)\n", path, res));
        svMemFree(path, pathSz);
        svMemFree(pciFi, sizeof(PciFiDev));
        return;
    }
        /*
         * Allocate PCI_FI instance driver descriptor in the device registry
         */
    pciFi->entry.dev_class = PCIFI_CLASS;
    pciFi->entry.dev_id    = pciFi;
    pciFi->entry.dev_node  = node;
    pciFi->entry.dev_ops   = &pciFiOps;
    pciFi->devRegId        = svDeviceAlloc(&pciFi->entry,
                                           PCIFI_VERSION_INITIAL,
                                           FALSE, /* pciFi cannot be 
                                                   * shared */
                                           relHandler);
    if (! pciFi->devRegId) {
        DKI_ERR(("%s: error -- not enough memory\n", path));
        pciFi->pciOps->close(pciFi->pciDevId);
        svMemFree(pciFi, sizeof(PciFiDev));
        svMemFree(path, pathSz);
        return;
    }
        /*
         * Chain driver instance in list
         */
    pciFi->next = pciFiDevs;
    pciFiDevs   = pciFi;
        /*
         * Finally, register the new device driver instance
         * in the device registry. In case a shut down event
         * has been signaled during the initialization, the device entry
         * remains invalid and the relHandler() handler is invoked
         * to shut down the device driver instance. Otherwise, the device
         * entry becomes valid and therefore visible for driver clients.
         */
    svDeviceRegister(pciFi->devRegId);

    DKI_MSG(("%s: %s driver started\n", path, pciFiDrv.drv_name));

        /*
         * Try to start a tested driver. If this fails, a tested driver
         * may be started later by the loadHandler() handler.
         */
    (void)childInit(pciFi);
}
