Sun Cluster 3.1 Data Service for NetBackup

Fault Monitoring Sun Cluster HA for NetBackup

When the application starts, NetBackup starts three daemons: vmd, bprd, and bpdbm. The Sun Cluster HA for NetBackup fault monitor monitors these processes. While the START method runs, the fault monitor waits until the daemons are online before it begins monitoring the application. The Probe_timeout extension property specifies the amount of time that the fault monitor waits.
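The wait-until-online behavior can be pictured as a polling loop bounded by a timeout. The following is a minimal sketch in Python, not the fault monitor's actual code. The function name wait_until_online, the is_online callback, and the polling interval are hypothetical; the timeout argument stands in for the value of Probe_timeout.

```python
import time
from typing import Callable


def wait_until_online(is_online: Callable[[], bool],
                      timeout: float,
                      interval: float = 1.0) -> bool:
    """Poll is_online() until it returns True or `timeout` seconds elapse.

    Hypothetical helper: `timeout` plays the role of Probe_timeout.
    Returns True if the check succeeded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_online():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A monotonic clock is used so that changes to the system time do not lengthen or shorten the wait.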

After the daemons are online, the fault monitor uses kill(pid, 0) to determine whether the daemons are running. Signal 0 delivers no signal; the call only checks whether the process exists. If any daemon is not running, the fault monitor initiates the following actions, in order, until all of the probes succeed.

  1. Restarts the resource on the current node.

  2. Restarts the resource group on the current node.

  3. Fails over the resource group to the next node on the resource group's nodelist.
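The probe and the escalation order above can be sketched as follows. This is an illustrative Python sketch for a UNIX-like system, not the fault monitor's implementation. os.kill(pid, 0) is the Python equivalent of kill(pid, 0); the names pid_is_running, all_running, and recover are hypothetical, and the recovery actions are passed in as placeholder callables rather than real cluster commands.

```python
import os
from typing import Callable, Iterable, Sequence


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists, using signal 0."""
    if pid <= 0:
        # PIDs <= 0 address process groups, not a single process.
        raise ValueError("pid must be a positive integer")
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except ProcessLookupError:
        return False     # ESRCH: no such process
    except PermissionError:
        return True      # EPERM: process exists but is owned by another user
    return True


def all_running(pids: Iterable[int]) -> bool:
    """Return True only if every PID in `pids` is alive."""
    return all(pid_is_running(pid) for pid in pids)


def recover(probe: Callable[[], bool],
            actions: Sequence[Callable[[], None]]) -> bool:
    """Run recovery actions in order, stopping as soon as probe() succeeds.

    Hypothetical escalation ladder: `actions` would be, in order, restart
    the resource, restart the resource group, fail over the resource group.
    """
    for action in actions:
        if probe():
            return True
        action()
    return probe()
```

Each action runs only if the probe still fails after the previous one, which mirrors the in-order escalation described in the list.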

All process IDs (PIDs) are stored in a temporary file, /var/run/.netbackup_master.