Oracle® Server Management Agents User's Guide

Updated: January 2017
 
 

HMP Watchdog Agent Overview

When it is installed and configured, the HMP Watchdog Agent periodically checks the host and/or Oracle ILOM and performs a user-configured action if either proves unresponsive. The actions can include posting a warning to a log file, resetting the corresponding device, and in the case of the host, power cycling or powering off the host.

The Oracle HMP watchdog agent is an optional Oracle HMP component that you can install using the Oracle HMP installer. For information on how to install this component, refer to the Oracle Hardware Management Pack Installation Guide.

Note the following important information regarding the HMP watchdog agent:

  • Your system must meet the following requirements to run the HMP watchdog agent:

    • A Linux operating system installed

    • Oracle HMP 2.3.0.0 or later installed

    • Oracle ILOM 3.2.2 or later on the SP

  • The agent must be started after it is installed, and it must be restarted after the host is reset.

  • The host or ILOM watchdog configuration is preserved after the host is reset.

  • You can configure the HMP watchdog agent using the ilomconfig CLI command, or by editing the hmp_watchdogd.conf file. The ilomconfig command is the preferred method. For instructions, see Configuring the HMP Watchdog Agent.

The HMP watchdog agent provides two services, ILOM watchdog and host watchdog.

ILOM Watchdog Overview

The ILOM watchdog periodically queries Oracle ILOM. If Oracle ILOM becomes unresponsive, the host either posts a warning or resets Oracle ILOM.

The ILOM watchdog also logs an appropriate message to the HMP watchdog log file and to the host's system log at /var/log/messages.

You can control the ILOM watchdog parameters using the ilomconfig command, or by editing the HMP watchdog agent's configuration file. The ilomconfig command is the preferred method for controlling the ILOM watchdog. See ILOM Watchdog Parameters for more information on the ILOM watchdog parameters that you can modify.
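
The polling behavior described above can be sketched roughly as follows. This is an illustrative model only, not the agent's implementation: the function name run_ilom_watchdog, the probe callable, and the max_failures threshold are all hypothetical, and the source does not state how many failed queries count as "unresponsive". Only the two actions (warning and reset) come from the text.

```python
from typing import Callable, List


def run_ilom_watchdog(
    probe: Callable[[], bool],   # hypothetical: returns True if Oracle ILOM answered
    checks: int,                 # number of polling cycles to simulate
    max_failures: int,           # assumption: consecutive misses that mean "unresponsive"
    action: str,                 # "warning" or "reset", per the overview text
    log: List[str],              # stand-in for the watchdog log file
) -> bool:
    """Poll the probe; return True if the configured action was triggered."""
    if action not in ("warning", "reset"):
        raise ValueError(f"unsupported ILOM watchdog action: {action}")

    failures = 0
    for _ in range(checks):
        # A real agent would wait for its polling interval between queries.
        if probe():
            failures = 0          # any successful reply clears the failure count
            continue
        failures += 1
        if failures >= max_failures:
            log.append(
                f"ILOM unresponsive after {failures} failed queries; action={action}"
            )
            return True
    return False


# Simulated ILOM: answers once, then stops responding.
responses = iter([True, False, False, False])
log: List[str] = []
triggered = run_ilom_watchdog(lambda: next(responses), 4, 3, "reset", log)
print(triggered, log)
```

Passing the probe in as a callable keeps the sketch independent of how Oracle ILOM is actually queried, which the overview does not describe.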

Host Watchdog Overview

The host watchdog causes Oracle ILOM to watch the host. If the host becomes unresponsive, Oracle ILOM performs a customer-configured action, which can be: Warning, Reset, Power Off, or Power Cycle.

You can control the host watchdog parameters using the ilomconfig command, or by editing the HMP watchdog agent's configuration file. The ilomconfig command is the preferred method for controlling the host watchdog.

See Host Watchdog Parameters for more information on the host watchdog parameters that you can modify.

Host watchdog is built on top of standard IPMI watchdog timer capability. The host watchdog interacts with the IPMI watchdog timer as follows:

  • When the user enables the host watchdog, it first checks whether the IPMI watchdog timer is already started. If the timer is already started, the host watchdog logs a message to that effect and remains in the Disable state.

  • Once the host watchdog is enabled, it periodically resets the IPMI watchdog timer to the host watchdog's configured values. This handles the case where the configuration is changed outside of the HMP watchdog agent.

  • When the OS hangs, the IPMI watchdog timer is not reset before it expires. This causes Oracle ILOM to perform the action specified by the timer-action parameter.
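
The three interactions above can be modeled with a small simulation. This is a sketch under stated assumptions, not the agent's code and not a real IPMI API: the class names IpmiWatchdogTimer and HostWatchdog, the heartbeat method, and the timeout values are hypothetical. Only the overall behavior (check before enabling, periodic reset to configured values, action on expiry) is taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IpmiWatchdogTimer:
    """Toy stand-in for the IPMI watchdog timer managed by the SP."""
    running: bool = False
    countdown: int = 0
    action: str = "None"
    fired: List[str] = field(default_factory=list)  # actions performed on expiry

    def set_and_start(self, countdown: int, action: str) -> None:
        self.running = True
        self.countdown = countdown
        self.action = action

    def tick(self, seconds: int) -> None:
        """Advance simulated time; on expiry, record the configured action."""
        if not self.running:
            return
        self.countdown -= seconds
        if self.countdown <= 0:
            self.running = False
            self.fired.append(self.action)


class HostWatchdog:
    """Hypothetical model of the host-side behavior described in the list."""

    def __init__(self, timer: IpmiWatchdogTimer, timeout: int, action: str):
        self.timer = timer
        self.timeout = timeout    # assumed parameter name
        self.action = action      # plays the role of timer-action
        self.enabled = False
        self.log: List[str] = []

    def enable(self) -> bool:
        # First bullet: refuse to take over a timer that is already started.
        if self.timer.running:
            self.log.append("IPMI watchdog timer is already started")
            return False
        self.timer.set_and_start(self.timeout, self.action)
        self.enabled = True
        return True

    def heartbeat(self) -> None:
        # Second bullet: re-apply the configured values on every reset,
        # overwriting any change made outside the agent.
        if self.enabled:
            self.timer.set_and_start(self.timeout, self.action)


timer = IpmiWatchdogTimer()
wd = HostWatchdog(timer, timeout=5, action="Reset")
wd.enable()

# Healthy host: the timer is reset before it can expire.
for _ in range(3):
    timer.tick(2)
    wd.heartbeat()

# Third bullet: the OS hangs, resets stop, and the timer expires.
for _ in range(3):
    timer.tick(2)

print(timer.fired)
```

In the healthy phase the countdown never reaches zero because each heartbeat restores the full timeout. Once the heartbeats stop, the third tick takes the countdown to zero or below and the configured action is recorded.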