Tuning the HA for JD Edwards EnterpriseOne Enterprise Server Fault Monitors

This section describes the probing algorithm and functionality of the HA for JD Edwards EnterpriseOne Enterprise Server fault monitor, and the conditions, messages, and recovery actions associated with unsuccessful probing.

For conceptual information about fault monitors, see the Oracle Solaris Cluster 4.3 Concepts Guide.

Resource Properties

The HA for JD Edwards EnterpriseOne Enterprise Server fault monitor uses the resource properties that are specified in the resource type ORCL.JDE_Enterprise_Server. See the r_properties(5) man page for a list of general resource properties used. See HA for JD Edwards EnterpriseOne Enterprise Server Extension Properties for a list of resource properties for this resource type.

Probing Algorithm and Functionality

The HA for JD Edwards EnterpriseOne Enterprise Server fault monitor is controlled by extension properties that determine the probing frequency. The default values of these properties determine the preset behavior of the fault monitor and are suitable for most Oracle Solaris Cluster installations.

You can change this preset behavior by modifying the following settings:

The interval between fault monitor probes (Thorough_probe_interval).
The timeout for fault monitor probes (Probe_timeout).
The number of times the fault monitor attempts to restart the resource (Retry_count).

The HA for JD Edwards EnterpriseOne Enterprise Server fault monitor checks the server status within an infinite loop. During each cycle, the fault monitor checks the server state and reports failure or success.

If the fault monitor reports success, it returns to its infinite loop and continues with the next cycle of probing and sleeping.
If the fault monitor reports a failure, a request is made to the cluster to restart the resource. Each subsequent failure results in another restart request. If the number of successive restarts exceeds the Retry_count within the Retry_interval, a request is made to fail over the resource group to a different node.
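
The restart and failover behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the fault monitor's actual implementation: the names RestartTracker and simulate are hypothetical, and the sketch assumes that restarts are counted in a sliding window of Retry_interval seconds, as in the probe description later in this section.

```python
from collections import deque


class RestartTracker:
    """Hypothetical helper: counts restart requests inside a sliding time window."""

    def __init__(self, retry_count, retry_interval):
        self.retry_count = retry_count        # restarts allowed per window
        self.retry_interval = retry_interval  # window length in seconds
        self.restarts = deque()               # timestamps of earlier restart requests

    def on_probe_failure(self, now):
        """Return 'restart' or 'failover' for a probe failure seen at time `now`."""
        # Forget restart requests that are older than the window.
        while self.restarts and now - self.restarts[0] > self.retry_interval:
            self.restarts.popleft()
        # Assumption: once retry_count restarts already fall inside the window,
        # the next failure escalates to a failover request.
        if len(self.restarts) >= self.retry_count:
            return "failover"
        self.restarts.append(now)
        return "restart"


def simulate(probe_results, thorough_probe_interval, retry_count, retry_interval):
    """Replay probe results (True = healthy) and list the action requested per cycle."""
    tracker = RestartTracker(retry_count, retry_interval)
    actions = []
    now = 0
    for healthy in probe_results:
        if healthy:
            actions.append("none")
        else:
            actions.append(tracker.on_probe_failure(now))
        # Simulated sleep until the next probe cycle.
        now += thorough_probe_interval
    return actions


if __name__ == "__main__":
    # One healthy probe followed by three failures, with Retry_count=2.
    print(simulate([True, False, False, False], 60, 2, 300))
```

With a probe interval of 60 seconds, a Retry_count of 2, and a Retry_interval of 300 seconds, the first two failures each produce a restart request and the third produces a failover request. Failures that are spaced further apart than Retry_interval never accumulate enough restarts to trigger a failover.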

Operation of the HA for JD Edwards EnterpriseOne Enterprise Server Probe

The following list explains how the HA for JD Edwards EnterpriseOne Enterprise Server probe operates:

If the control_app_server script for the resource is still running with the start option, the probe returns 100. This return value implements the wait-for-online behavior during start. Otherwise, the probe continues.
If the output from jdejobs shows no running processes, the probe returns 100 to indicate a failed start. Otherwise, the probe continues.
If porttest returns a non-zero value, the status of the resource is changed to Degraded with the status message porttest failed, and the probe returns 0.
On a subsequent probe, if porttest returns 0, the probe returns 0 and the status and status message of the resource are changed to Online and Service is Online, respectively.
If the HA for JD Edwards EnterpriseOne Enterprise Server resource is repeatedly restarted and subsequently exhausts the Retry_count within the Retry_interval, and if Failover_enabled is set to TRUE, a failover to another node is initiated for the resource group.
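
The decision sequence of the probe can be summarized as a single function. The following is an illustrative sketch only: ProbeResult and probe are hypothetical names, the three inputs stand in for the checks of control_app_server, jdejobs, and porttest, and the sketch assumes that the status is set to Online whenever porttest returns 0, not only after a Degraded state.

```python
from typing import NamedTuple, Optional


class ProbeResult(NamedTuple):
    """Hypothetical container for the outcome of one probe cycle."""
    exit_code: int            # 0 = healthy, 100 = failure
    status: Optional[str]     # None means the resource status is left unchanged
    message: Optional[str]    # None means the status message is left unchanged


def probe(start_in_progress: bool, running_processes: int, porttest_rc: int) -> ProbeResult:
    """Apply the checks in the order given in the list above."""
    # 1. control_app_server is still running with the start option:
    #    report 100 so that the start keeps waiting for the server to come online.
    if start_in_progress:
        return ProbeResult(100, None, None)
    # 2. jdejobs shows no running processes: the start failed.
    if running_processes == 0:
        return ProbeResult(100, None, None)
    # 3. porttest failed: the resource is degraded, but the probe still returns 0.
    if porttest_rc != 0:
        return ProbeResult(0, "Degraded", "porttest failed")
    # 4. porttest succeeded: the resource is online.
    return ProbeResult(0, "Online", "Service is Online")


if __name__ == "__main__":
    print(probe(start_in_progress=False, running_processes=4, porttest_rc=1))
```

Note that a porttest failure alone does not cause a restart in this model, because the probe still returns 0. Only the first two checks return 100 and therefore count toward Retry_count.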