ChorusOS 5.0 Board Support Package Developer's Guide

Periodic Health Checks

A latent fault will not show itself until some other action occurs. For example, a hardware failure occurring in a PCI card that is a cold standby could remain undetected until a fault occurs on the master PCI card. Only at that point will it be discovered that the system now contains defective PCI cards. It is essential to identify a failed secondary device so that it can be repaired or replaced before any failure in the primary device occurs. As a general rule, latent faults that are allowed to remain undetected will eventually cause system failure.

A hardened driver must perform periodic health checks on all the devices that it manages. Although this does not directly protect the system from a faulty device, it does allow timely detection of failure during quiet periods. This matters because a device may be quiet precisely because it has failed.
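As a minimal sketch of the idea, the following C fragment probes every managed device, including idle standbys, and marks any device that no longer responds correctly. All names here (struct device, device_probe, health_check_all, EXPECTED_ID) are hypothetical and are not part of any ChorusOS interface. The ID register is simulated by a plain structure field, so this illustrates the approach only and is not a real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; none of these names come from ChorusOS. */
#define EXPECTED_ID 0xC0DEu   /* value a healthy device returns from its ID register */

enum dev_role  { ROLE_PRIMARY, ROLE_STANDBY };
enum dev_state { DEV_OK, DEV_FAILED };

struct device {
    const char    *name;
    enum dev_role  role;     /* informational only: standbys are probed too */
    enum dev_state state;
    uint16_t       id_reg;   /* stands in for a memory-mapped ID register */
};

/* Probe one device: a healthy device must return the expected ID. */
static bool device_probe(const struct device *dev)
{
    assert(dev != NULL);
    return dev->id_reg == EXPECTED_ID;
}

/*
 * Check every managed device, whether or not it is currently in use.
 * Returns the number of devices newly marked as failed by this pass.
 */
static size_t health_check_all(struct device *devs, size_t n)
{
    size_t newly_failed = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].state == DEV_OK && !device_probe(&devs[i])) {
            devs[i].state = DEV_FAILED;
            newly_failed++;
        }
    }
    return newly_failed;
}
```

Because the loop ignores the device role, a failed cold standby is reported on the next pass rather than at the moment the primary fails.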

Periodic health checks can:


Note -

This kind of health check is intended to be triggered and controlled through the Management DDI. A driver should not start periodic health checks itself, but should instead rely on a driver component manager client to trigger the checks at a rate appropriate to the device and the service it provides. Refer to the mngt(1CC) man page for details about the Management DDI.
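The division of responsibility described in this note might be sketched as follows. The driver exposes only a check routine, while a separate manager decides how often to call it for each device. This is not the Management DDI: the types and functions shown (health_check_fn, struct managed_dev, manager_tick, demo_check) are assumptions made for illustration, and time is modelled as abstract ticks. The mngt(1CC) man page remains the reference for the real interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical driver-side interface: the driver supplies a check
 * routine but owns no timer and never schedules the check itself.
 */
typedef bool (*health_check_fn)(void *dev_cookie);

struct managed_dev {
    health_check_fn check;     /* entry point supplied by the driver */
    void           *cookie;    /* driver-private device handle */
    unsigned        period;    /* check interval in ticks, chosen by the manager */
    unsigned        elapsed;   /* ticks since the last check */
    unsigned        failures;  /* failed checks seen so far */
};

/* Manager-side tick handler: runs each device's check when its period expires. */
static void manager_tick(struct managed_dev *devs, size_t n)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].period == 0)
            continue;                      /* checks disabled for this device */
        if (++devs[i].elapsed >= devs[i].period) {
            devs[i].elapsed = 0;
            if (!devs[i].check(devs[i].cookie))
                devs[i].failures++;
        }
    }
}

/*
 * Example driver check: the cookie points at a call counter, and the
 * simulated device always reports itself healthy.
 */
static bool demo_check(void *cookie)
{
    unsigned *calls = cookie;

    (*calls)++;
    return true;
}
```

Keeping the period in the manager's table rather than in the driver lets the check rate be tuned per device without changing driver code.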