This section describes the Sun Cluster HA for BEA WebLogic Server Fault Monitor.
The Fault Monitor detects failures and acts on them. When the monitor detects a failure in a BEA WebLogic Server, it first restarts that server. If the BEA WebLogic Server fails a configured number of times within a configured time window (both values are set by the administrator), the resource group containing the BEA WebLogic Server is failed over to another surviving cluster node and restarted there.
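The restart-then-failover policy can be illustrated with a sliding window of failure timestamps. This is a minimal sketch, not the monitor's actual implementation: the class name, the parameter names retry_count and retry_interval, and the rule that failover happens once the number of failures inside the window exceeds retry_count are all assumptions made for illustration.

```python
from collections import deque


class FailureWindow:
    """Tracks recent failures and chooses between restart and failover.

    retry_count and retry_interval are hypothetical names for the two
    administrator-configured values described in the text.
    """

    def __init__(self, retry_count, retry_interval):
        self.retry_count = retry_count        # failures tolerated per window
        self.retry_interval = retry_interval  # window length in seconds
        self.failures = deque()               # timestamps of recent failures

    def record_failure(self, now):
        # Forget failures that are older than the time window.
        while self.failures and now - self.failures[0] > self.retry_interval:
            self.failures.popleft()
        self.failures.append(now)
        # Assumed threshold: fail over once the window holds more
        # failures than retry_count; otherwise restart in place.
        if len(self.failures) > self.retry_count:
            return "failover"
        return "restart"
```

With retry_count=2 and retry_interval=300, failures at 0, 10, and 20 seconds yield "restart", "restart", and "failover". Failures spaced further apart than the window keep producing "restart".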
By default, the Fault Monitor method probes the server URL set in the extension property Server_url. The probe connects to the hostname and port, then issues an HTTP GET request on the URL. If the connection fails, the probe treats it as a complete failure, and the resource group containing the BEA WebLogic Server is either restarted or failed over to another surviving cluster node and restarted there. If the connection succeeds but the HTTP response code is 500 (Internal Server Error), the probe also treats it as a complete failure and the resource group is restarted or failed over. All other HTTP response codes are considered a success.
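The probe rules above can be sketched in a few lines using Python's standard http.client module. This is only an illustration of the described logic, not the data service's own probe: the function names and the 10-second timeout are hypothetical, and treating a malformed HTTP response as a failure is an assumption, because the text covers only connection failures and response code 500.

```python
import http.client
from urllib.parse import urlsplit


def classify_response(status):
    """Return 'failure' for HTTP 500 and 'success' for any other code."""
    return "failure" if status == 500 else "success"


def probe_url(url, timeout=10.0):
    """Connect to the host and port in url and issue an HTTP GET."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = http.client.HTTPConnection(
        parts.hostname, parts.port or 80, timeout=timeout
    )
    try:
        conn.request("GET", target)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # A refused or timed-out connection is a complete failure.
        # Assumption: an unparseable response is treated the same way.
        return "failure"
    finally:
        conn.close()
    return classify_response(status)
```

Note that under these rules a response such as 404 or 503 still counts as a success; only a failed connection or a 500 response triggers action.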
If the monitor_uri_list extension property is set, the probe method also performs an HTTP GET on each URI in the list and takes action if any of them fails.
If the probe detects a complete failure (URL or URI probe) of the BEA WebLogic Server instance, and a database probe script is specified in the extension property db_probe_script, the probe method probes the database before taking any action on the BEA WebLogic Server resource. If the database probe script returns success (the database is up), action is taken on the BEA WebLogic Server resource. If the database probe script returns failure (the database is down), the BEA WebLogic Server probe takes no action (neither restart nor failover) until the database is up.
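The database check acts as a gate in front of the restart or failover decision. The following sketch shows that ordering under stated assumptions: decide_action and its return values are hypothetical names, and db_probe stands in for running the configured database probe script and interpreting its result as up (True) or down (False).

```python
def decide_action(wls_probe_failed, db_probe=None):
    """Decide what the probe does after checking the WebLogic Server.

    wls_probe_failed: True if the URL or URI probe saw a complete failure.
    db_probe: optional callable returning True when the database is up;
              None models an unset database probe script property.
    """
    if not wls_probe_failed:
        # Healthy server: the database is not probed at all.
        return "none"
    if db_probe is not None and not db_probe():
        # Database is down: hold off until it comes back.
        return "wait_for_database"
    # No database probe configured, or the database is up.
    return "restart_or_failover"
```

The database probe runs only after a complete failure has been detected, so a healthy server never incurs the cost of the database check.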
Before the BEA WebLogic Server configured in the resource is started, the BEA WebLogic Server configuration and the resource extension properties are validated. If the db_probe_script extension property is set, the database is probed by invoking the script named in that property. If the database is up, the BEA WebLogic Server is started by running the START script configured in the extension property Start_script under the Process Monitor Facility (PMF). If the database is not up, the START method returns success and leaves the starting of the BEA WebLogic Server to the probe method. The probe method waits until the database is up before starting the BEA WebLogic Server, as explained in Probing Algorithm and Functionality.
After launching the START script under PMF, the START method waits until the BEA WebLogic Server is in RUNNING mode before reporting success. While waiting for the BEA WebLogic Server to start, the probe method repeatedly tries to connect to the server to check whether it is up. Messages are displayed on the console during startup. The message “Failed to connect to host logical-host-1 and port 7001: Connection refused” continues to be displayed until the BEA WebLogic Server has started completely. Once the BEA WebLogic Server is in RUNNING mode, the START method sets the status to “Started Successfully”.
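The START sequence described above can be summarized as a small decision flow. This is a hedged sketch rather than the actual method: every name here (start_method, the callables passed in, max_polls, wait) is hypothetical, and the "validation_failed" and "timed_out" outcomes are assumptions, since the text does not say what happens in those cases.

```python
def start_method(validate, launch, is_running, db_probe=None,
                 max_polls=60, wait=lambda: None):
    """Model the START flow with caller-supplied callables.

    validate:   returns True if configuration and properties are valid.
    launch:     starts the server (the START script, in the text).
    is_running: returns True once the server is in RUNNING mode.
    db_probe:   optional; returns True when the database is up.
    wait:       pause between polls, e.g. lambda: time.sleep(5).
    """
    if not validate():
        return "validation_failed"  # assumed outcome
    if db_probe is not None and not db_probe():
        # Reported as success; the probe method starts the server later.
        return "deferred_to_probe"
    launch()
    for _ in range(max_polls):
        if is_running():
            return "started"
        wait()
    return "timed_out"  # assumed outcome
```

When the database is down, launch is never called, which mirrors the text: the START method returns success and the probe method takes over once the database is available.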
The STOP method stops the BEA WebLogic Server configured in the resource. By default, the STOP method kills the BEA WebLogic Server by sending SIGKILL to the BEA WebLogic process. If the smooth_shutdown extension property is set to TRUE, the STOP method first tries to bring down the BEA WebLogic instance by running the following command:
java weblogic.Admin -url hostname:port -username $WLS_USER -password $WLS_PW SHUTDOWN
If this command fails, the BEA WebLogic Server is shut down by using SIGKILL. Even if the command succeeds, the STOP method sends SIGKILL to make sure the BEA WebLogic process is stopped.
If the smooth_shutdown extension property is set to TRUE, make sure that WLS_USER and WLS_PW are set in the BEA WebLogic Server START script. While smooth_shutdown is set to TRUE, WLS_USER and WLS_PW must not be removed from the START script; remove them only after the extension property has been set back to FALSE.
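The STOP sequence can be sketched as follows. The argument list mirrors the shutdown command shown above; the function names, the callables, and the returned step labels are hypothetical and serve only to show the order of operations, in particular that SIGKILL is sent in every case.

```python
def build_shutdown_command(host, port, user, password):
    """Build the argument list for the shutdown command shown in the text."""
    return [
        "java", "weblogic.Admin",
        "-url", f"{host}:{port}",
        "-username", user,
        "-password", password,
        "SHUTDOWN",
    ]


def stop_method(smooth_shutdown, run_shutdown_command, send_sigkill):
    """Model the STOP flow and return the steps that were taken.

    run_shutdown_command: returns True if the shutdown command succeeded.
    send_sigkill:         delivers SIGKILL to the WebLogic process.
    """
    steps = []
    if smooth_shutdown:
        ok = run_shutdown_command()
        steps.append("shutdown_ok" if ok else "shutdown_failed")
    # SIGKILL is sent whether or not the smooth shutdown succeeded.
    send_sigkill()
    steps.append("sigkill")
    return steps
```

Because the command takes the user name and password from WLS_USER and WLS_PW, a smooth shutdown cannot work unless both are defined, which is why they must stay in the START script while the property is TRUE.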