Sun Cluster Data Service for SAP Web Application Server Guide for Solaris OS

Tuning the Sun Cluster HA for SAP Web Application Server Fault Monitors

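Fault monitoring for the Sun Cluster HA for SAP Web Application Server data service is provided by the following fault monitors:

  • SAP enqueue server fault monitor

  • SAP replica server fault monitor

  • SAP message server fault monitor

  • SAP web application server component fault monitor

  • SAP J2EE engine fault monitor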

Each fault monitor is contained in a resource whose resource type is shown in the following table.

Table 1–3 Resource Types for the Fault Monitors of Sun Cluster HA for SAP Web Application Server

Fault Monitor                            Resource Type

SAP enqueue server                       SUNW.sapenq
SAP replica server                       SUNW.saprepl
SAP message server                       SUNW.sapscs
SAP web application server component     SUNW.sapwebas
SAP J2EE engine                          SUNW.gds

System properties and extension properties of the resource types control the behavior of the fault monitors. The default values of these properties determine the preset behavior of the fault monitors. The preset behavior should be suitable for most Sun Cluster installations. Therefore, you should tune the fault monitors only if you need to modify this preset behavior.

Tuning these fault monitors involves several tasks. Perform these tasks when you register and configure Sun Cluster HA for SAP Web Application Server, as described in Registering and Configuring Sun Cluster HA for SAP Web Application Server.

For detailed information about these tasks, see “Tuning Fault Monitors for Sun Cluster Data Services” in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.

Operation of the Fault Monitor for the SAP Enqueue Server Resource Type

To determine whether the SAP enqueue server and the SAP replica server are operating correctly, the fault monitor for the SAP enqueue server resource type probes these resources periodically.

The probe uses the SAP utility ensmon to check the health of the SAP enqueue server and the SAP replica server.


# ensmon -H localhost -S port option

-H localhost

Specifies that the name of the host is localhost.

-S port

Specifies the enqueue port.

option

Specifies the resources that the probe should check. The possible values of this option are as follows:

  • 1 – Check the SAP enqueue server only.

  • 2 – Check both the SAP enqueue server and the SAP replica server.

If you run this command from the command line, the command reports its result as a return code.
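As a minimal sketch of how the return code might be inspected interactively, the following helper runs any command and prints the code that the command returns. The helper name `report_rc` and the variable `ENQ_PORT` are hypothetical and are not part of the SAP or Sun Cluster software. The sketch also assumes that the return code mentioned above is the exit status of the ensmon process.

```shell
#!/bin/sh
# Hypothetical helper: run the given command, then print its return code.
report_rc() {
    rc=0
    "$@" || rc=$?          # capture the exit status without aborting the shell
    echo "return code: $rc"
    return "$rc"
}

# Example (assumes that ensmon is in PATH and that ENQ_PORT, a placeholder,
# holds the enqueue port):
# report_rc ensmon -H localhost -S "$ENQ_PORT" 2
```

The helper only reports the value. Interpreting the value is left to the operator.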

During a probe, the fault monitor first determines whether both the SAP enqueue server and the SAP replica server are online by running the ensmon command with the option argument set to 2.


# ensmon -H localhost -S port 2

The result of this command determines the action of the probe, as follows:

  1. If the command times out, the SAP enqueue server fault monitor checks whether only the SAP enqueue server is online by running the ensmon command with the option set to 1.


    # ensmon -H localhost -S port 1
    
    • If this command times out, the SAP enqueue server fault monitor reports a partial failure. If this timeout occurs one more time within the probe interval period, a failover occurs.

    • If this command succeeds, the SAP enqueue server fault monitor logs a warning message to explain that the SAP enqueue server is online but the status of the SAP replica server is unknown.

    • If this command causes a system error, the SAP enqueue server fault monitor reports a less serious partial failure. If a system error occurs three more times within the probe interval period, a failover occurs.

    • For all other unsuccessful conditions, the SAP enqueue server fault monitor triggers a failover.

  2. If the command does not time out, the probe checks the value of the return code from the ensmon command, as follows:

    • A return code value of 0 indicates that the command is successful, and no further action is taken until the next probe.

    • A return code value of 4 indicates that the SAP enqueue server is running and that the SAP replica server is configured but not running. The probe logs a warning message to indicate that the replica server is not running.

    • A return code value of 8 indicates that the SAP enqueue server is not running, and the probe triggers a failover.

    • A return code value of 12 indicates an invalid parameter for the command, and the probe triggers a failover.

    • All other return codes are treated as a partial failure. If such a failure occurs three more times within the probe interval period, a failover occurs.

Note that the values for the number of timeouts and the probe interval period are assigned by the SAP enqueue server fault monitor. You cannot change these values.
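The decision logic of the probe can be summarized in a short shell sketch. The function names `classify_full_probe` and `classify_enqueue_only_probe`, the action labels, and the input words `timeout`, `ok`, and `syserr` are hypothetical and serve only as illustration. The fault monitor itself is internal to the data service, and this sketch does not claim to reproduce its implementation or its failure counting.

```shell
#!/bin/sh
# Illustrative only: map the outcome of "ensmon -H localhost -S port 2"
# to the action described in the text. Pass the return code, or the
# word "timeout" if the command timed out.
classify_full_probe() {
    case "$1" in
        timeout) echo "retry-enqueue-only" ;;  # rerun ensmon with option 1
        0)       echo "ok" ;;                  # success, wait for the next probe
        4)       echo "warn-replica-down" ;;   # enqueue up, replica configured but not running
        8)       echo "failover" ;;            # enqueue server not running
        12)      echo "failover" ;;            # invalid parameter for the command
        *)       echo "partial-failure" ;;     # all other return codes
    esac
}

# Illustrative only: map the outcome of the follow-up check
# "ensmon -H localhost -S port 1". Pass "timeout", "ok", "syserr"
# (system error), or any other word for other unsuccessful conditions.
classify_enqueue_only_probe() {
    case "$1" in
        timeout) echo "partial-failure" ;;        # a repeated timeout leads to failover
        ok)      echo "warn-replica-unknown" ;;   # enqueue online, replica status unknown
        syserr)  echo "minor-partial-failure" ;;  # less serious partial failure
        *)       echo "failover" ;;
    esac
}
```

For example, `classify_full_probe 4` prints `warn-replica-down`, which corresponds to the warning that the probe logs when the replica server is configured but not running.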

Operation of the Fault Monitor for the SAP Replica Server Resource Type

Fault monitoring for the SAP replica server resource type is currently performed by the Process Monitor Facility (PMF) in Sun Cluster.

Operation of the Fault Monitor for the SAP Message Server Resource Type

The fault monitor probe for the SAP message server resource type requires the msprot program. See Configuration Requirements.

The msprot program is not configurable.

Operation of the Fault Monitor for the SAP Web Application Server Component Resource Type

Fault monitoring for the SAP web application server component resource type is performed by the dpmon program.

The dpmon program is not configurable.

Operation of the Fault Monitor for the SAP J2EE Engine Resource Type

The fault monitor probe for the SAP J2EE engine resource type requires the sap_j2ee_probe program.

The sap_j2ee_probe program is not configurable.