The Sun Cluster HA for Sun Java System Web Server fault monitor is contained in the resource that represents Sun Java System Web Server. You create this resource when you register and configure Sun Cluster HA for Sun Java System Web Server. For more information, see Registering and Configuring Sun Cluster HA for Sun Java System Web Server.
System properties and extension properties of this resource control the behavior of the fault monitor. The default values of these properties determine the preset behavior of the fault monitor. The preset behavior should be suitable for most Sun Cluster installations. Therefore, you should tune the Sun Cluster HA for Sun Java System Web Server fault monitor only if you need to modify this preset behavior.
For more information, see the following sections.
The probe for Sun Cluster HA for Sun Java System Web Server uses a request to the server to query the health of that server. Before the probe actually queries the server, a check is made to confirm that network resources are configured for this web server resource. If no network resources are configured, an error message (No network resources found for resource) is logged, and the probe exits with failure.
The probe for Sun Cluster HA for Sun Java System Web Proxy Server uses a mechanism similar to that of Sun Cluster HA for Sun Java System Web Server.
The probe must address the following two configurations of Sun Java System Web Server.
Secure instance
Insecure instance
If the web server is in secure mode and the probe cannot get the secure ports from the configuration file, the error message Unable to parse configuration file is logged, and the probe exits with failure. The secure and insecure instance probes involve common steps.
The Resource_dependencies resource-property setting on the Sun Java System Web Server resource determines the set of IP addresses that the web server uses. The Port_list resource-property setting determines the list of port numbers that Sun Java System Web Server uses. The fault monitor assumes that the web server is listening on all combinations of IP and port. The fault monitor attempts to probe all such combinations and might fail if the web server is not listening on a particular IP address and port combination.
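The set of combinations that the fault monitor probes can be sketched as a cross product. This is only an illustration: the function name probe_targets and the sample addresses are hypothetical, and the "port/protocol" entry format for Port_list is an assumption.

```python
from itertools import product

def probe_targets(ip_addresses, port_list):
    """Return every (ip, port) pair that would be probed.

    ip_addresses: addresses supplied by the network resources (sample data here).
    port_list: entries assumed to look like "80/tcp".
    """
    ports = [int(entry.split("/")[0]) for entry in port_list]
    # The monitor assumes the server listens on ALL combinations.
    return list(product(ip_addresses, ports))

targets = probe_targets(["192.0.2.10", "192.0.2.11"], ["80/tcp", "443/tcp"])
for ip, port in targets:
    print(ip, port)
```

With two addresses and two ports, four combinations are probed. If the web server is not listening on one of them, the probe of that combination fails.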
If the probe fails to connect to the web server using a specified IP address and port combination, a complete failure occurs. The probe records the failure and takes appropriate action.
If the probe successfully connects, the probe checks whether the web server is running in secure mode. If so, the probe disconnects and returns with a success status. No further checks are performed for a secure Sun Java System Web Server.
However, if the web server is running in insecure mode, the probe sends an HTTP 1.0 HEAD request to the web server and waits for the response. The request can be unsuccessful for various reasons, including heavy network traffic, heavy system load, and misconfiguration.
Misconfiguration can occur when the web server is not configured to listen on all IP address and port combinations that are being probed. The web server should service every port for every IP address specified for this resource.
Misconfigurations can also result if the Resource_dependencies and Port_list resource properties were not set correctly when you created the resource.
If the reply to the query is not received within the time limit set by the Probe_timeout resource property, the probe treats this as a failure of Sun Cluster HA for Sun Java System Web Server. The failure is recorded in the probe's history.
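The connect-then-query sequence described above might look roughly like the following sketch. It is not the fault monitor's actual code: the function name probe_instance and its parameters are hypothetical, and the timeout is applied to each socket operation rather than to the probe as a whole.

```python
import socket

def probe_instance(host, port, probe_timeout, secure=False):
    """Return True if one IP address and port combination passes the probe.

    secure=True models a secure instance: connect, disconnect, report success.
    Otherwise an HTTP 1.0 HEAD request is sent and a reply is awaited.
    """
    try:
        # The timeout given here also applies to the later send and receive.
        with socket.create_connection((host, port), timeout=probe_timeout) as conn:
            if secure:
                return True  # no further checks for a secure instance
            conn.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
            reply = conn.recv(1024)
    except OSError:
        # Refused connection, timeout, or send/receive error.
        return False
    return reply.startswith(b"HTTP/")
```

A caller would run this once per IP address and port combination and record every False result in the probe history.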
A probe failure can be a complete or partial failure. The following probe failures are considered complete failures.
Failure to connect to the server. The following error message is sent, where %s indicates the host name and %d indicates the port number.
Failed to connect to %s port %d
Timeout (exceeding the resource-property timeout Probe_timeout) after trying to connect to the server.
Failure to successfully send the probe string to the server. The following error message is sent, where the first %s indicates the host name, %d indicates the port number, and the second %s indicates further details about the error.
Failed to communicate with server %s port %d: %s
Partial failures are weighted less heavily than complete failures. The monitor accumulates two partial failures within the resource-property interval Retry_interval and counts them as one failure.
The following probe failures are considered partial failures.
Timeout (exceeding the resource-property timeout Probe_timeout) while trying to read the reply from the server to the probe's query.
Failing to read data from the server for other reasons. The following error message is sent, where the first %s indicates the host name, %d indicates the port number, and the second %s indicates further details about the error.
Failed to communicate with server %s port %d: %s
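The way complete and partial failures add up within Retry_interval can be illustrated with a small weighting scheme. The weights, stage names, and function name below are hypothetical. The text states only that two partial failures count as one failure.

```python
# Hypothetical weights: two partial failures add up to one failure.
COMPLETE = 1.0
PARTIAL = 0.5

# Stage at which the probe failed, mapped to the failure kind listed above.
FAILURE_KIND = {
    "connect": COMPLETE,          # failure to connect to the server
    "connect_timeout": COMPLETE,  # Probe_timeout exceeded while connecting
    "send": COMPLETE,             # failure to send the probe string
    "read_timeout": PARTIAL,      # Probe_timeout exceeded while reading the reply
    "read": PARTIAL,              # failure to read data for other reasons
}

def failures_in_window(history, now, retry_interval):
    """Sum the weights of failures recorded in the last retry_interval seconds.

    history: list of (timestamp_in_seconds, stage) tuples.
    """
    return sum(FAILURE_KIND[stage] for when, stage in history
               if now - when <= retry_interval)

history = [(0, "read"), (100, "read_timeout")]
print(failures_in_window(history, now=120, retry_interval=300))
```

In this example both partial failures fall inside the interval, so together they count as one failure. Once the older entry ages out of the interval, only half a failure remains.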
The probe connects to the Sun Java System Web Server and performs an HTTP 1.1 GET check by sending an HTTP request to each of the URIs in Monitor_Uri_List. If the HTTP server return code is 500 (Internal Server Error) or if the connection fails, the probe takes action.
The result of the HTTP requests is either failure or success. If all of the requests successfully receive a reply from the Sun Java System Web Server, the probe returns and continues the next cycle of probing and sleeping.
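A minimal sketch of such a check follows, written with Python's standard http.client module, which sends HTTP 1.1 requests by default. The function names and the timeout parameter are hypothetical, and only plain http:// URIs are handled.

```python
import http.client
from urllib.parse import urlsplit

def uri_ok(uri, timeout):
    """Return False if the connection fails or the server answers 500."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80,
                                      timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return False  # connection failure, timeout, or malformed reply
    finally:
        conn.close()
    return status != 500

def probe_monitor_uri_list(monitor_uri_list, timeout):
    """Succeed only if every URI in the list passes the check."""
    return all(uri_ok(uri, timeout) for uri in monitor_uri_list)
```

Because all() stops at the first failing URI, one unreachable or failing URI is enough for the whole check to report failure.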
Heavy network traffic, heavy system load, and misconfiguration can cause the HTTP GET probe to fail. Misconfiguration of the Monitor_Uri_List property can cause a failure if a URI in the Monitor_Uri_List includes an incorrect port or hostname. For example, if the web server instance is listening on logical host schost-1 and the URI was specified as http://schost-2/servlet/monitor, the probe will try to contact schost-2 to request /servlet/monitor.
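A mismatch of this kind can be spotted before the resource is created by comparing each URI with the logical hosts and ports of the resource. The helper below is a hypothetical illustration, not part of Sun Cluster. It assumes plain HTTP, so a URI without an explicit port is taken to use port 80.

```python
from urllib.parse import urlsplit

def misconfigured_uris(monitor_uri_list, logical_hosts, ports):
    """Return the URIs whose host or port the web server is not expected to serve.

    logical_hosts: set of lowercase logical host names for the resource.
    ports: set of port numbers taken from Port_list.
    """
    bad = []
    for uri in monitor_uri_list:
        parts = urlsplit(uri)
        port = parts.port or 80  # assumption: default HTTP port
        if parts.hostname not in logical_hosts or port not in ports:
            bad.append(uri)
    return bad

uris = ["http://schost-1/servlet/monitor", "http://schost-2/servlet/monitor"]
print(misconfigured_uris(uris, {"schost-1"}, {80}))
```

Here the second URI is reported because the web server instance listens on schost-1, not schost-2.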
Based on the history of failures, a failure can cause either a local restart or a failover of the data service. This action is further described in Tuning Fault Monitors for Sun Cluster Data Services in Sun Cluster Data Services Planning and Administration Guide for Solaris OS.