The Sun Cluster HA for Solaris Containers fault monitors verify that the following components are running correctly:
Zone boot resource
Zone script resource
Zone SMF resource
Each Sun Cluster HA for Solaris Containers fault monitor is contained in the resource that represents the corresponding Solaris Zones component. You create these resources when you register and configure Sun Cluster HA for Solaris Containers. For more information, see Registering and Configuring Sun Cluster HA for Solaris Containers.
System properties and extension properties of these resources control the behavior of the fault monitor. The default values of these properties determine the preset behavior of the fault monitor. The preset behavior should be suitable for most Sun Cluster installations. Therefore, you should tune the Sun Cluster HA for Solaris Containers fault monitor only if you need to modify this preset behavior.
Tuning the Sun Cluster HA for Solaris Containers fault monitors involves the following tasks:
Setting the interval between fault monitor probes
Setting the time-out for fault monitor probes
Defining the criteria for persistent faults
Specifying the failover behavior of a resource
For more information, see Tuning Fault Monitors for Sun Cluster Data Services in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
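The relationship between the criteria for persistent faults and the failover behavior can be illustrated with a small model. The following Python sketch is an assumption about how such a policy might work, not the Sun Cluster implementation: the class FaultHistory and the parameters retry_count and retry_interval are hypothetical names, loosely modeled on the idea of tolerating a fixed number of restarts within a time window before failing over.

```python
from collections import deque


class FaultHistory:
    """Hypothetical model: restart on a fault until too many restarts
    have occurred within a sliding time window, then fail over."""

    def __init__(self, retry_count, retry_interval):
        self.retry_count = retry_count        # restarts tolerated within the window
        self.retry_interval = retry_interval  # window length, in seconds
        self.restarts = deque()               # timestamps of recent restarts

    def action_for_fault(self, now):
        # Forget restarts that have aged out of the window.
        while self.restarts and now - self.restarts[0] >= self.retry_interval:
            self.restarts.popleft()
        if len(self.restarts) < self.retry_count:
            self.restarts.append(now)
            return "restart"
        # The fault persists: the restart budget for this window is used up.
        return "failover"


history = FaultHistory(retry_count=2, retry_interval=300)
for t in (0, 60, 120, 400):
    print(t, history.action_for_fault(t))
```

In this model, a larger window or a smaller restart budget makes the monitor fail over sooner, which is the trade-off to weigh when you define the criteria for persistent faults.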
The Sun Cluster HA for Solaris Containers zone boot and zone script resources use a parameter file to pass parameters to the start, stop, and probe commands. Changes to these parameters take effect each time the resource is restarted, enabled, or disabled.
The fault monitor for the zone boot component ensures that all of the following requirements for the zone boot component to run are met:
The zsched process of the zone is running.
If this process is not running, the fault monitor restarts the zone. If this fault persists, the fault monitor fails over the resource group that contains the resource for the zone boot component.
Every host that is managed by a SUNW.LogicalHostname resource is operational.
If a host is not operational, the fault monitor fails over the resource group that contains the resource for the zone boot component.
The specified milestone is either online or degraded.
If the milestone is neither online nor degraded, the fault monitor restarts the zone. If this fault persists, the fault monitor fails over the resource group that contains the resource for the zone boot component.
To verify the state of the milestone, the fault monitor connects to the zone. If the fault monitor cannot connect to the zone, the fault monitor retries every five seconds for approximately 60% of the probe time-out. If the attempt to connect still fails, then the fault monitor restarts the zone.
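The checks for the zone boot component, and the retry schedule for connecting to the zone, can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the actual probe code: the function names, the order in which the checks are evaluated, and the assumption that the first connection attempt starts at offset 0 are all hypothetical.

```python
def connect_retry_times(probe_timeout, retry_delay=5, budget_percent=60):
    """Return the offsets, in seconds, at which connection attempts start.

    Assumption: attempts begin at offset 0 and repeat every retry_delay
    seconds while the offset is within budget_percent of probe_timeout.
    """
    budget = probe_timeout * budget_percent / 100
    times = []
    t = 0
    while t < budget:
        times.append(t)
        t += retry_delay
    return times


def zone_boot_action(zsched_running, hosts_operational, milestone_state,
                     fault_persists):
    """Map probe observations to an action, following the text above.

    milestone_state is None when the monitor could not connect to the zone
    within the retry budget. The evaluation order is an assumption.
    """
    if not zsched_running:
        return "failover" if fault_persists else "restart_zone"
    if not all(hosts_operational):
        return "failover"
    if milestone_state is None:
        return "restart_zone"
    if milestone_state not in ("online", "degraded"):
        return "failover" if fault_persists else "restart_zone"
    return "ok"


print(connect_retry_times(30))
print(zone_boot_action(True, [True, True], "degraded", False))
```

With a probe time-out of 30 seconds, this model allows about 18 seconds for connection attempts, so a longer probe time-out also lengthens the period before an unreachable zone is restarted.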
The fault monitor for the zone script component runs the script that you specify for the component. The value that this script returns to the fault monitor determines the action that the fault monitor performs. For more information, see Table 3.
The fault monitor for the zone SMF component verifies that the SMF service is not disabled. If the service is disabled, the fault monitor restarts the SMF service. If this fault persists, the fault monitor fails over the resource group that contains the resource for the zone SMF component.
If the service is not disabled, the fault monitor runs the SMF service probe that you specify for the component. The value that this probe returns to the fault monitor determines the action that the fault monitor performs. For more information, see Table 4.
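The two-stage check for the zone SMF component can be sketched in Python as follows. This is a minimal model, not the actual fault monitor: the function smf_component_action and its parameters are hypothetical, and the mapping from probe return values to actions is deliberately passed in by the caller because it is defined in Table 4 and is not reproduced here.

```python
def smf_component_action(service_disabled, run_probe, action_for,
                         fault_persists=False):
    """Decide what the monitor does for the zone SMF component.

    run_probe:  callable that runs the user-specified SMF service probe
                and returns its return value.
    action_for: callable that maps a probe return value to an action
                (a placeholder for the mapping in Table 4).
    """
    if service_disabled:
        # A disabled service is handled before the probe is ever run.
        return "failover" if fault_persists else "restart_service"
    return action_for(run_probe())


# Placeholder mapping for illustration only; it does not reflect Table 4.
print(smf_component_action(False, lambda: 0, lambda rc: f"lookup:{rc}"))
```

The point of the sketch is the ordering: the user-specified probe runs only when the service is not disabled.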