ChorusOS 5.0 Application Developer's Guide

Watchdog Timer

The watchdog timer API provides a set of system calls that enable a ChorusOS actor to manage a two-stage watchdog timer. The two-stage watchdog timer can be used by ChorusOS applications or middleware to monitor their sanity.

A user or supervisor actor takes control of the timer by invoking the watchdog timer API. When under the control of an application, the two-stage watchdog timer must be reloaded periodically. If the first-stage timer expires, an interrupt is triggered, enabling the collection of diagnostic information by means of the system dump feature. The system is then restarted.

If the system freezes before the completion of stage one, it will be reset when the second-stage timer expires.
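The two-stage behavior described above can be illustrated with a small tick-based model. This is only a sketch under stated assumptions: the names two_stage_wd, wd_init(), wd_reload(), wd_tick(), and wd_run_unreloaded() are hypothetical and are not part of the watchdog timer API, timeouts are counted in abstract ticks, and a frozen system is modelled as one that cannot service the stage-one interrupt.

```c
#include <assert.h>

/* Hypothetical model of a two-stage watchdog; not the ChorusOS implementation. */
typedef enum { WD_RUNNING, WD_DUMP_RESTART, WD_HARD_RESET } wd_outcome;

typedef struct {
    unsigned stage1_timeout;  /* ticks before the interrupt-mode device fires */
    unsigned stage2_timeout;  /* ticks before the reset-mode device fires */
    unsigned stage1_left;
    unsigned stage2_left;
} two_stage_wd;

/* Reload both stages, as a periodic reload by the controlling actor would. */
void wd_reload(two_stage_wd *wd)
{
    wd->stage1_left = wd->stage1_timeout;
    wd->stage2_left = wd->stage2_timeout;
}

void wd_init(two_stage_wd *wd, unsigned stage1, unsigned stage2)
{
    assert(stage1 <= stage2);  /* assumed: stage two never fires before stage one */
    wd->stage1_timeout = stage1;
    wd->stage2_timeout = stage2;
    wd_reload(wd);
}

/* Advance the model by one tick. 'frozen' is non-zero when the system
 * cannot service the stage-one interrupt. */
wd_outcome wd_tick(two_stage_wd *wd, int frozen)
{
    if (wd->stage1_left > 0) wd->stage1_left--;
    if (wd->stage2_left > 0) wd->stage2_left--;

    if (wd->stage1_left == 0 && !frozen)
        return WD_DUMP_RESTART;   /* interrupt handled: dump, then restart */
    if (wd->stage2_left == 0)
        return WD_HARD_RESET;     /* reset-mode device expired */
    return WD_RUNNING;
}

/* Tick without reloading until the watchdog acts; report the elapsed ticks. */
wd_outcome wd_run_unreloaded(two_stage_wd *wd, int frozen, unsigned *ticks)
{
    wd_outcome out = WD_RUNNING;
    *ticks = 0;
    while (out == WD_RUNNING) {
        out = wd_tick(wd, frozen);
        (*ticks)++;
    }
    return out;
}
```

In this model, with a stage-one timeout of 3 ticks and a stage-two timeout of 5 ticks, a responsive system that is never reloaded reaches the dump-and-restart outcome after 3 ticks, whereas a frozen system is reset after 5 ticks.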

The two-stage watchdog timer uses two watchdog devices. The first device operates in interrupt mode (stage one), while the second operates in reset mode (stage two). The first-stage interrupt handler is system-wide and is therefore not exposed to the user, as shown in Figure 14-1.

Figure 14-1 Two-Stage Watchdog Timer


When the watchdog timer is enabled, the second-stage, reset-mode watchdog device is armed and runs continuously, even when the two-stage watchdog timer is unallocated and not controlled by an application.

In this situation, the running watchdog device is reloaded silently by the watchdog API. It is never subsequently disarmed. This ensures that the system is guarded continually against system lockups.
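The silent reload can be pictured with the following sketch. It assumes a simplified model in which the reset-mode device counts down in abstract ticks and the system reloads it on every tick unless an application has both allocated and armed the timer. The reset_device type and the rd_* functions are hypothetical names introduced for illustration only; they are not part of the watchdog timer API.

```c
#include <stdbool.h>

/* Hypothetical model of the always-armed reset-mode device. */
typedef struct {
    unsigned timeout;    /* reset timeout, in abstract ticks */
    unsigned remaining;  /* ticks left before the device resets the system */
    bool allocated;      /* an application holds the watchdog timer */
    bool armed;          /* that application has armed the timer */
} reset_device;

void rd_init(reset_device *rd, unsigned timeout)
{
    rd->timeout = timeout;
    rd->remaining = timeout;
    rd->allocated = false;
    rd->armed = false;
}

/* The system reloads silently unless an application has taken over. */
bool rd_system_reloads(const reset_device *rd)
{
    return !(rd->allocated && rd->armed);
}

/* Reload performed by the controlling application. */
void rd_app_reload(reset_device *rd)
{
    if (rd->allocated && rd->armed)
        rd->remaining = rd->timeout;
}

/* Advance one tick; returns true if the device resets the system. */
bool rd_tick(reset_device *rd)
{
    if (rd_system_reloads(rd)) {
        rd->remaining = rd->timeout;  /* silent reload: never expires */
        return false;
    }
    if (rd->remaining > 0)
        rd->remaining--;
    return rd->remaining == 0;
}
```

In this model the device never expires while the timer is unallocated. Once an application allocates and arms the timer, the device expires after timeout ticks unless rd_app_reload() is called in time.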


Note -

If a process is unscheduled for a long time, for example, as the result of a debugging session, a system dump will occur (if implemented). Following the system dump, the system will be reset on timeout of the first and second-stage watchdog devices.


The watchdog timer API includes the following system calls. Note that in this table, the watchdog timer refers to the watchdog timer named by handle.

Table 14-1 Watchdog Timer API System Calls

| System Call | Purpose |
| --- | --- |
| wdt_alloc() | Returns a valid handle that identifies the allocated watchdog timer. |
| wdt_realloc() | Reallocates the watchdog timer that was returned by the last call to wdt_alloc(). This enables a new actor created in a highly available (HA) environment to take over watchdog management from an actor that has died. If the watchdog timer is armed, this call also reloads it. |
| wdt_free() | Disarms and releases the watchdog timer. |
| wdt_get_maxinterval() | Returns the maximum timeout interval that can be set for the timer. |
| wdt_set_interval() | Sets the timeout interval for the watchdog timer. If both components of interval are zero, the timer is disarmed. |
| wdt_get_interval() | Returns the current timeout interval set for the watchdog timer. |
| wdt_arm() | Starts a new timeout interval, the duration of which is set by wdt_set_interval(). |
| wdt_disarm() | Disarms the watchdog timer. |
| wdt_is_armed() | Returns the state of the watchdog timer. A positive value is returned if the timer is armed. |
| wdt_pat() | Reloads the watchdog timer. The timer begins a new timeout interval, the duration of which is set by wdt_set_interval(). |
| wdt_startup_commit() | Indicates whether the watchdog timer startup sequence has completed successfully. Successful completion denotes that the system no longer needs to be rebooted if the reset-mode watchdog device armed by the boot framework expires. Because the reset-mode watchdog device must not be disarmed, the system continues to reload it silently until the system is shut down, or until the watchdog timer is explicitly allocated and armed by the HA framework (or by another application). |
| wdt_shutdown() | Indicates whether the system is being shut down. During shutdown, the reset-mode watchdog device must not continue to be reloaded beyond the configured timeout interval. This ensures that the system is reset even if the shutdown sequence does not complete within the expected time period. This call fails if it is invoked while the watchdog device is armed. |
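The call semantics listed in Table 14-1 can be summarized with a small in-memory mock. This is a sketch only. The mock_wdt_* functions, the mock_wdt and mock_interval types, their signatures, and their error returns are all assumptions made for illustration; they are not taken from the ChorusOS headers, and the two components of the interval are assumed here to be seconds and nanoseconds. The mock covers only the state changes stated in the table: a zero interval disarms the timer, arming and reloading each start a new timeout interval, reallocation reloads an armed timer, and freeing disarms and releases it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: NOT the ChorusOS types or signatures. */
typedef struct { long sec; long nsec; } mock_interval;  /* component names assumed */

typedef struct {
    bool allocated;          /* a handle has been given out */
    bool armed;
    mock_interval interval;
    unsigned starts;         /* number of timeout intervals started */
} mock_wdt;                  /* must be zero-initialized before first use */

static bool is_zero(mock_interval iv) { return iv.sec == 0 && iv.nsec == 0; }

int mock_wdt_alloc(mock_wdt *t)
{
    if (t->allocated) return -1;         /* assumed: one owner at a time */
    t->allocated = true;
    t->armed = false;
    t->interval.sec = 0;
    t->interval.nsec = 0;
    t->starts = 0;
    return 0;
}

int mock_wdt_set_interval(mock_wdt *t, mock_interval iv)
{
    if (!t->allocated || iv.sec < 0 || iv.nsec < 0) return -1;
    t->interval = iv;
    if (is_zero(iv)) t->armed = false;   /* a zero interval disarms the timer */
    return 0;
}

int mock_wdt_arm(mock_wdt *t)
{
    /* Assumed: arming requires a previously set, non-zero interval. */
    if (!t->allocated || is_zero(t->interval)) return -1;
    t->armed = true;
    t->starts++;                         /* a new timeout interval begins */
    return 0;
}

int mock_wdt_pat(mock_wdt *t)
{
    if (!t->allocated || !t->armed) return -1;  /* assumed error case */
    t->starts++;                         /* reload: a new interval begins */
    return 0;
}

int mock_wdt_is_armed(const mock_wdt *t) { return t->armed ? 1 : 0; }

int mock_wdt_realloc(mock_wdt *t)
{
    if (!t->allocated) return -1;        /* assumed: nothing to take over */
    if (t->armed) t->starts++;           /* an armed timer is also reloaded */
    return 0;
}

int mock_wdt_free(mock_wdt *t)
{
    if (!t->allocated) return -1;
    t->armed = false;                    /* disarm, then release */
    t->allocated = false;
    return 0;
}
```

A typical sequence against this mock is to allocate, set a non-zero interval, arm, reload periodically, and finally free. A replacement actor in an HA configuration would call the reallocation function instead of allocating again.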