To minimize the disruption that transient faults in a resource cause, a fault monitor restarts the resource in response to such faults. For persistent faults, more disruptive action than restarting the resource is required:
For the SAP DB resource, the fault monitor fails over the resource to another node. The SAP DB resource is a failover resource.
For the SAP xserver resource, the fault monitor takes the resource offline. The SAP xserver is a scalable resource.
The fault monitors treat a fault as persistent if the number of attempts to restart a resource exceeds a specified threshold within a specified retry interval. Defining the criteria for persistent faults enables you to set the threshold and the retry interval to accommodate the performance characteristics of your cluster and your availability requirements.
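The rule above can be illustrated with a small sketch. This is not the cluster framework's implementation, only a model of the stated criterion under the assumption that restart attempts are counted in a sliding window as long as the retry interval. The class name `RestartTracker` and its method are hypothetical.

```python
from collections import deque


class RestartTracker:
    """Hypothetical model of the persistent-fault criterion.

    A fault counts as persistent when the number of restart attempts
    inside the retry interval exceeds the threshold.
    """

    def __init__(self, threshold: int, retry_interval: float) -> None:
        self.threshold = threshold            # allowed restart attempts
        self.retry_interval = retry_interval  # window length in seconds
        self._restarts = deque()              # timestamps of recent attempts

    def record_restart(self, now: float) -> bool:
        """Record a restart attempt at time `now` (seconds).

        Returns True if the fault should now be treated as persistent.
        """
        self._restarts.append(now)
        # Discard attempts that have aged out of the retry interval.
        while now - self._restarts[0] > self.retry_interval:
            self._restarts.popleft()
        return len(self._restarts) > self.threshold


# Illustrative values only: threshold of 2 restarts in a 300-second interval.
tracker = RestartTracker(threshold=2, retry_interval=300)
results = [tracker.record_restart(t) for t in (0, 100, 200)]
```

In this example the third restart attempt falls inside the same 300-second window as the first two, so the count exceeds the threshold and the fault is classified as persistent.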
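The maximum duration of a single restart of a resource is the sum of the values of the following properties:
Thorough_probe_interval system property
Probe_timeout extension property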
To ensure that you allow enough time for the threshold to be reached within the retry interval, use the following expression to calculate values for the retry interval and the threshold:
retry-interval ≥ threshold × (thorough-probe-interval + probe-timeout)
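As a worked example of the expression above, the following sketch computes the smallest retry interval that satisfies it and checks a proposed setting. The function names are hypothetical, and the numbers are illustrative values, not product defaults.

```python
def min_retry_interval(threshold: int,
                       thorough_probe_interval: int,
                       probe_timeout: int) -> int:
    """Smallest retry interval (seconds) that satisfies
    retry-interval >= threshold * (thorough-probe-interval + probe-timeout)."""
    if threshold < 1 or thorough_probe_interval < 0 or probe_timeout < 0:
        raise ValueError("threshold must be >= 1 and the intervals non-negative")
    return threshold * (thorough_probe_interval + probe_timeout)


def is_consistent(retry_interval: int,
                  threshold: int,
                  thorough_probe_interval: int,
                  probe_timeout: int) -> bool:
    """True if the retry interval leaves enough time for the threshold
    to be reached."""
    return retry_interval >= min_retry_interval(
        threshold, thorough_probe_interval, probe_timeout)


# Illustrative values only: threshold 2, probe interval 60 s, probe timeout 90 s.
required = min_retry_interval(2, 60, 90)   # 2 * (60 + 90) = 300
ok = is_consistent(300, 2, 60, 90)
too_short = is_consistent(299, 2, 60, 90)
```

With these values, a retry interval of 300 seconds is the minimum that satisfies the expression. A shorter interval would expire before the threshold could be reached.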
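To set the threshold and the retry interval, set the following system properties:
To set the threshold, set the Retry_count system property.
To set the retry interval, set the Retry_interval system property.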
Set these properties for each resource that contains a Sun Cluster HA for SAP DB fault monitor that you need to tune. The resource types of these resources are shown in Table 1–3.
Besides defining a criterion for persistent faults, the retry interval affects the response of a fault monitor to the following faults:
Unavailability of SAP xserver that the SAP DB fault monitor detects. If the SAP DB fault monitor detects that SAP xserver is unavailable twice within the retry interval, the SAP DB fault monitor restarts SAP xserver.
Persistent system errors. A persistent system error is a system error that occurs four times within the retry interval. If a persistent system error occurs, the fault monitor restarts SAP xserver.