Oracle® Solaris Cluster Data Service for Oracle iPlanet Web Server Guide

Updated: July 2015
 
 

Operations by the Fault Monitor During a Probe

The probe for HA for Oracle iPlanet Web Server uses a request to the server to query the health of that server. Before the probe actually queries the server, a check is made to confirm that network resources are configured for this web server resource. If no network resources are configured, an error message (No network resources found for resource) is logged, and the probe exits with failure.

The probe must address the following two configurations of Oracle iPlanet Web Server.

  • SSL-based instance (secure)

  • Non-SSL based instance (insecure)

If the web server is in SSL-based mode and the probe cannot obtain the secure ports from the configuration file, the error message Unable to parse configuration file is logged, and the probe exits with failure. The probes for SSL-based and non-SSL based instances share common steps.

The Resource_dependencies resource-property setting on the Oracle iPlanet Web Server resource determines the set of IP addresses that the web server uses. The Port_list resource-property setting determines the list of port numbers that Oracle iPlanet Web Server uses. The fault monitor assumes that the web server is listening on every combination of IP address and port. The fault monitor attempts to probe all such combinations, and the probe might fail if the web server is not listening on a particular IP address and port combination.
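The set of combinations that the fault monitor probes can be pictured as a Cartesian product. The following is a minimal sketch, not the monitor's actual code: the function name probe_targets and the sample addresses and ports are illustrative assumptions.

```python
from itertools import product


def probe_targets(addresses, ports):
    """Return every (address, port) pair that would be probed.

    `addresses` stands in for the IP addresses derived from
    Resource_dependencies, `ports` for the numbers in Port_list.
    """
    return list(product(addresses, ports))


# Illustrative values only: two addresses and two ports give four probe targets.
targets = probe_targets(["192.0.2.10", "192.0.2.11"], [80, 8080])
for address, port in targets:
    print(f"probe {address}:{port}")
```

Because every pair is probed, a web server that listens on only three of these four combinations would still produce a probe failure for the fourth.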

If the probe fails to connect to the web server using a specified IP address and port combination, a complete failure occurs. The probe records the failure and takes appropriate action.

If the probe successfully connects, the probe checks whether the web server is running in SSL-based mode. If so, the probe disconnects and returns with a success status. No further checks are performed for an SSL-based Oracle iPlanet Web Server.

However, if the web server is running in non-SSL based mode, the probe sends an HTTP 1.0 HEAD request to the web server and waits for the response. The request can be unsuccessful for various reasons, including heavy network traffic, heavy system load, and misconfiguration.

Misconfiguration can occur when the web server is not configured to listen on all IP address and port combinations that are being probed. The web server should service every port for every IP address specified for this resource.

Misconfigurations can also result if the Resource_dependencies and Port_list resource properties were not set correctly when you created the resource.

If the reply to the query is not received within the time limit set by the Probe_timeout resource property, the probe treats the attempt as a failure of HA for Oracle iPlanet Web Server. The failure is recorded in the probe's history.
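The connect, SSL check, and HEAD request steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the fault monitor's implementation: the function name head_probe, the ssl_mode flag, and the request path / are hypothetical, and the sketch reports only success or failure.

```python
import socket


def head_probe(host, port, probe_timeout, ssl_mode=False):
    """Probe one address and port; return True on success, False on failure."""
    try:
        # The timeout applies to the connect and to later reads on the socket.
        with socket.create_connection((host, port), timeout=probe_timeout) as sock:
            if ssl_mode:
                # SSL-based instance: a successful connect is sufficient.
                return True
            # Non-SSL based instance: send an HTTP 1.0 HEAD request.
            sock.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
            reply = sock.recv(1024)
            # An empty read means the server closed without replying.
            return len(reply) > 0
    except OSError:
        # Covers refused connections, send errors, and timeouts.
        return False
```

A caller would invoke head_probe once for each IP address and port combination, passing the value of Probe_timeout in seconds.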

A probe failure can be a complete or partial failure. The following probe failures are considered complete failures.

  • Failure to connect to the server. The following error message is sent, where %s indicates the host name and %d indicates the port number.

    Failed to connect to %s port %d

  • Timeout (exceeding the resource-property timeout Probe_timeout) after trying to connect to the server.

  • Failure to successfully send the probe string to the server. The following error message is sent, where the first %s indicates the host name, %d indicates the port number, and the second %s indicates further details about the error.

    Failed to communicate with server %s port %d: %s

The following probe failures are considered partial failures.

  • Timeout (exceeding the resource-property timeout Probe_timeout) while trying to read the reply from the server to the probe's query.

  • Failing to read data from the server for other reasons. The following error message is sent, where the first %s indicates the host name, %d indicates the port number, and the second %s indicates further details about the error.

    Failed to communicate with server %s port %d: %s

The monitor accumulates two partial failures within the resource-property interval Retry_interval and counts them as one failure.
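The way partial failures accumulate can be illustrated with a small tally. This is a sketch under assumptions, not the monitor's real bookkeeping: the class name FailureHistory, its fields, and the use of plain second counts for timestamps are all hypothetical.

```python
from collections import deque


class FailureHistory:
    """Hypothetical tally of probe failures."""

    def __init__(self, retry_interval):
        self.retry_interval = retry_interval  # seconds, stands in for Retry_interval
        self.count = 0                        # failures counted so far
        self._partials = deque()              # timestamps of uncounted partial failures

    def record(self, kind, now):
        """Record a 'complete' or 'partial' failure seen at time `now` (seconds)."""
        if kind == "complete":
            self.count += 1
            return
        # Forget partial failures that have fallen outside the retry interval.
        while self._partials and now - self._partials[0] > self.retry_interval:
            self._partials.popleft()
        self._partials.append(now)
        if len(self._partials) == 2:
            # Two partial failures inside the interval count as one failure.
            self._partials.clear()
            self.count += 1


history = FailureHistory(retry_interval=300)
history.record("partial", now=0)
history.record("partial", now=100)
print(history.count)  # prints 1
```

In this sketch a partial failure that is older than the retry interval is dropped, so two partial failures that are far apart in time never combine.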

The probe connects to the Oracle iPlanet Web Server instance and performs an HTTP 1.1 GET check by sending an HTTP request to each of the URIs in Monitor_Uri_List. If the HTTP server return code is 500 (Internal Server Error) or if the connection fails, the probe takes action.

Each HTTP request results in either failure or success. If all of the requests successfully receive a reply from the web server, the probe returns and continues with the next cycle of probing and sleeping.
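The URI check can be sketched along the following lines. The function names reply_is_failure, check_uri, and check_uri_list are hypothetical, and the sketch assumes plain http URIs and ignores query strings. It follows the text literally in treating only return code 500 and connection problems as failures.

```python
import http.client
from urllib.parse import urlsplit


def reply_is_failure(status):
    """Per the description, only return code 500 counts as a failed reply."""
    return status == 500


def check_uri(uri, timeout):
    """Send an HTTP 1.1 GET to one URI; return True on success, False on failure."""
    parts = urlsplit(uri)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return False  # the connect failed or no usable reply arrived
    finally:
        conn.close()
    return not reply_is_failure(status)


def check_uri_list(monitor_uri_list, timeout):
    """Return True only if every URI in the list passes the check."""
    return all(check_uri(uri, timeout) for uri in monitor_uri_list)
```

A monitor loop built on this sketch would call check_uri_list once per probe cycle and sleep between cycles.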

Heavy network traffic, heavy system load, and misconfiguration can cause the HTTP GET probe to fail. Misconfiguration of the Monitor_Uri_List property can cause a failure if a URI in the list includes an incorrect port or host name. For example, if the web server instance is listening on logical host schost-1 and the URI was specified as http://schost-2/servlet/monitor, the probe tries to contact schost-2, not schost-1, to request /servlet/monitor.
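A misconfigured URI of this kind can be caught before the resource is created by comparing each URI against the host names and ports that the instance actually serves. The helper below is a hypothetical pre-check, not part of the data service. The name mismatched_uris and its parameters are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def mismatched_uris(monitor_uri_list, logical_hosts, ports):
    """Return the URIs whose host name or port the web server does not serve."""
    bad = []
    for uri in monitor_uri_list:
        parts = urlsplit(uri)
        # Fall back to the scheme's default port when the URI names none.
        port = parts.port or (443 if parts.scheme == "https" else 80)
        if parts.hostname not in logical_hosts or port not in ports:
            bad.append(uri)
    return bad


uris = ["http://schost-1/servlet/monitor", "http://schost-2/servlet/monitor"]
print(mismatched_uris(uris, logical_hosts={"schost-1"}, ports={80}))
```

With the sample values, only the schost-2 URI is reported, which matches the example in the preceding paragraph.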

Based on the history of failures, a failure can cause either a local restart or a failover of the data service. This action is further described in Tuning Fault Monitors for Oracle Solaris Cluster Data Services in Oracle Solaris Cluster Data Services Planning and Administration Guide.