The watchdog timer (WDT) feature provides a two-step watchdog mechanism on hardware. It consists of a lower-level system layer, provided by the driver, that exposes a DDI, and a higher-level layer that hides the DDI and offers a simpler API to user programs. The watchdog itself operates in two steps:
1. If the watchdog is not patted within a certain delay, an interrupt handler provided by the system is invoked. This interrupt handler attempts to shut down the system and to perform a system dump of the node to collect evidence of the problem.
2. If the interrupt step gets stuck or lasts too long, the watchdog resets the board, causing it to reboot.
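The two-step escalation described above can be pictured as a small state model. The following is only an illustrative sketch, not the driver's code: the type `wd_model_t`, its fields, and the functions `wd_model_pat()` and `wd_model_stage()` are hypothetical names, and the model assumes that the time budget of the interrupt step starts when the patting delay expires.

```c
#include <stdint.h>

/* Hypothetical model of the two-step escalation; not part of the WDT API. */
typedef enum { WD_OK, WD_INTERRUPT, WD_RESET } wd_stage_t;

typedef struct {
    uint32_t pat_delay_ms;      /* step 1: maximum delay allowed between pats   */
    uint32_t handler_budget_ms; /* step 2: time allowed for shut down and dump  */
    uint32_t last_pat_ms;       /* time of the most recent pat                  */
} wd_model_t;

/* Record a pat at time now_ms. */
static void wd_model_pat(wd_model_t *wd, uint32_t now_ms)
{
    wd->last_pat_ms = now_ms;
}

/* Report which stage the watchdog would be in at time now_ms. */
static wd_stage_t wd_model_stage(const wd_model_t *wd, uint32_t now_ms)
{
    uint32_t idle_ms = now_ms - wd->last_pat_ms;

    if (idle_ms < wd->pat_delay_ms)
        return WD_OK;            /* patted in time: nothing happens          */
    if (idle_ms < wd->pat_delay_ms + wd->handler_budget_ms)
        return WD_INTERRUPT;     /* step 1: interrupt handler is running     */
    return WD_RESET;             /* step 2: handler overran, board is reset  */
}
```

With a 1000 ms patting delay and a 5000 ms handler budget, this model reports `WD_OK` up to 999 ms after the last pat, `WD_INTERRUPT` from 1000 ms, and `WD_RESET` from 6000 ms. It does not model the interrupt handler completing the shut down successfully.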
The watchdog is started either by the system at system initialization or, possibly, by the boot loader. A dedicated user-level process is expected to pat the watchdog throughout the normal life of the system. A failure of this patting process triggers the interrupt step of the watchdog mechanism.
To cope gracefully with the transitions at system initialization and system shut down, the system is designed to pat the watchdog by itself for a configurable amount of time during each of them. During these periods, when a user-mode patting process might not be available, the system plays that role implicitly. However, the duration of the initialization and shut-down periods is bounded by system-configurable values: if initialization does not reach the point where the user-level patting process takes over within the configured time, the watchdog interrupt occurs. Similarly, shut down must complete within its configured bound, or the watchdog interrupt occurs.
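The bounded implicit patting can be illustrated with a short decision function. This is a sketch under stated assumptions, not the system's implementation: `wd_grace_t`, its fields, and `system_should_pat()` are hypothetical names, and the `committed` flag stands in for the moment at which the user-level process takes over.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one implicit-patting period (initialization or shut down). */
typedef struct {
    uint32_t grace_ms;   /* configured bound on implicit patting                 */
    uint32_t start_ms;   /* time at which the period began                       */
    bool     committed;  /* true once the user-level patting process took over   */
} wd_grace_t;

/* Return true while the system itself should keep patting the watchdog. */
static bool system_should_pat(const wd_grace_t *g, uint32_t now_ms)
{
    if (g->committed)
        return false;    /* patting is now the user-level process's job */

    /* Past the bound, implicit patting stops and the watchdog is left to expire. */
    return (now_ms - g->start_ms) < g->grace_ms;
}
```

In this model, once the bound is exceeded without a takeover, nothing pats the watchdog any more, so the interrupt step follows after the normal patting delay.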
Some hardware supports more than one watchdog. The API copes with such situations by associating a handle with each watchdog. The WDT feature API is similar to the watchdog API for the Solaris operating environment.
For details on the watchdog timer, see the WDT(5FEA) man page.
The watchdog timer API is summarized in the following table:
Function | Description |
---|---|
wdt_pat() | Pat (reload) the watchdog timer |
wdt_alloc() | Allocate a watchdog timer |
wdt_realloc() | Reallocate a watchdog timer |
wdt_free() | Disarm and free a watchdog timer |
wdt_get_maxinterval() | Get the maximum interval (hardware limit) of a watchdog |
wdt_set_interval() | Set the interval duration of a watchdog |
wdt_get_interval() | Get the interval duration of a watchdog |
wdt_arm() | Arm a watchdog |
wdt_disarm() | Disarm a watchdog |
wdt_is_armed() | Check whether a watchdog is armed |
wdt_startup_commit() | Tell the system that the initialization phase is over |
wdt_shutdown() | Tell the system to start patting for shut down |
The wdt_realloc() function enables a process to regain control over a watchdog allocated by a possibly dead process.
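A typical life cycle (allocate, set the interval, arm, pat, and recover a watchdog after its owner dies) might look like the following sketch. The real signatures are documented in the WDT(5FEA) man page and are not reproduced here; every `mock_wdt_*` function below is a hypothetical in-memory stand-in so that the example is self-contained. The owner identifier, the return codes, and the assumption that reallocation leaves the armed state and interval unchanged are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock watchdog state; a stand-in for whatever the real handle refers to. */
typedef struct {
    bool     allocated;
    bool     armed;
    int      owner;            /* hypothetical owner (process) identifier */
    uint32_t interval_ms;
    uint32_t max_interval_ms;  /* hardware limit                          */
    uint32_t pats;             /* number of accepted pats                 */
} mock_wdt_t;

/* Claim a free watchdog; fails if it already has an owner. */
static int mock_wdt_alloc(mock_wdt_t *w, int owner)
{
    if (w->allocated) return -1;
    w->allocated = true;
    w->owner = owner;
    return 0;
}

/* Take over a watchdog whose previous owner may be dead.
 * Assumption: armed state and interval are left unchanged. */
static int mock_wdt_realloc(mock_wdt_t *w, int new_owner)
{
    if (!w->allocated) return -1;
    w->owner = new_owner;
    return 0;
}

/* Set the interval, rejecting values beyond the hardware limit. */
static int mock_wdt_set_interval(mock_wdt_t *w, uint32_t ms)
{
    if (ms == 0 || ms > w->max_interval_ms) return -1;
    w->interval_ms = ms;
    return 0;
}

/* Arm the watchdog; requires an allocated watchdog with an interval set. */
static int mock_wdt_arm(mock_wdt_t *w)
{
    if (!w->allocated || w->interval_ms == 0) return -1;
    w->armed = true;
    return 0;
}

/* Pat the watchdog; in this mock only the current owner may do so. */
static int mock_wdt_pat(mock_wdt_t *w, int owner)
{
    if (!w->armed || owner != w->owner) return -1;
    w->pats++;
    return 0;
}

/* Disarm and free the watchdog, as wdt_free() is described in the table. */
static void mock_wdt_free(mock_wdt_t *w)
{
    w->armed = false;
    w->allocated = false;
}
```

In this mock, a second process that calls `mock_wdt_alloc()` on an already allocated watchdog is refused, but it can call `mock_wdt_realloc()` and then continue patting, which mirrors the recovery scenario described for wdt_realloc().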