BEA WebLogic Server Release 1.1


 

Troubleshooting WebLogic Server Clusters

 

Applying Service Packs

If you experience cluster-related problems with WebLogic Server versions 4.5 or 5.1, try applying the latest service pack for your release before contacting BEA Technical Support. The latest service packs address many of the earlier cluster-related problems with these WebLogic Server versions, especially problems related to cluster deadlocks.

See WebLogic Server Updates for more information about obtaining and installing WebLogic Server service packs.

Collecting Diagnostic Information

Before contacting BEA Technical Support for help with cluster-related problems, make sure you follow the steps in this section to collect the required diagnostic information for your system. The primary diagnostic information for cluster-related problems is a log file that contains multiple thread dumps (if applicable) from the clustered server. This log file can be helpful in diagnosing a variety of cluster-related problems, but it is especially important for addressing problems related to cluster "freezes" and deadlocks.

Note: If you experience a cluster problem that involves a deadlock between server instances or otherwise causes your cluster to "hang", a log file that contains multiple thread dumps is a prerequisite for diagnosing your problem.

To create the required log file, follow these steps:

  1. Remove or back up any log files you currently have. In practice, you should create a new log file each time you boot a WebLogic Server instance, rather than appending new sessions to a historical log file.

  2. Turn on verbose garbage collection (GC) output for your Java VM when you start WebLogic Server. The GC output is recorded alongside the thread dumps and helps in diagnosing cluster problems. See the next step for an example command line.

  3. Redirect both the standard error and standard output to a log file. Doing so places thread dump information in the proper context with WebLogic Server informational and error messages, and provides a more useful log for diagnostic purposes. For example:

    % java -ms64m -mx64m -verbosegc -classpath $CLASSPATH -Dweblogic.class.path=./license:./classes:./lib/weblogicaux.jar:./myserver/serverclasses -Dweblogic.home=. -Dweblogic.security.policy=./weblogic.policy weblogic.Server >> logfile.txt 2>&1

  4. Continue running the WebLogic Server cluster until you have reproduced the problem and the log file contains multiple thread dumps on each server, with distinct intervals between thread dumps.
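On most UNIX Java VMs, a thread dump is written to the server's console output when the VM receives the SIGQUIT signal; on Windows, pressing Ctrl-Break in the server's console window has the same effect. The sketch below shows one possible way to request several dumps at a fixed interval so that they land in the redirected log file. The function name `request_thread_dumps` and its arguments are hypothetical, and the sketch assumes a POSIX shell and that you already know the process ID of the server's Java VM.

```shell
#!/bin/sh
# request_thread_dumps PID COUNT INTERVAL
# Hypothetical helper: sends SIGQUIT to process PID a total of COUNT times,
# sleeping INTERVAL seconds between signals. A Java VM normally responds to
# SIGQUIT by printing a thread dump to its console output.
request_thread_dumps() {
    pid=$1
    count=$2
    interval=$3
    sent=0
    while [ "$sent" -lt "$count" ]; do
        # Stop with a non-zero status if the process cannot be signalled.
        kill -s QUIT "$pid" || return 1
        sent=$((sent + 1))
        # Wait between dumps, but not after the last one.
        if [ "$sent" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, `request_thread_dumps 12345 3 10` would request three dumps ten seconds apart from the process with ID 12345 (a placeholder value). Run it on each server in the cluster while the problem is occurring.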

Providing diagnostics to BEA Technical Support

After you have created a diagnostic log file (with thread dumps, if applicable), use the following guidelines to provide the information to your BEA Technical Support representative:

  1. Compress the log file using an operating system compression utility:

    % tar czf logfile.tar.gz logfile.txt

  2. Attach the compressed log file to an email message to your Technical Support representative.

    Note: Always include the compressed log file as an attachment to the message. Do not cut and paste the log file into the body of the email.

  3. If the compressed log file is too large to attach to an email message, you can use the BEA Customer Support FTP site.
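The packaging step can also be scripted. The following is a minimal sketch, not a BEA-supplied tool: the function name `package_log` is hypothetical, and it uses `gzip` directly rather than `tar` because only a single file is involved. It prints the archive size so you can judge whether the file is small enough to attach to an email message.

```shell
#!/bin/sh
# package_log LOGFILE
# Hypothetical helper: writes LOGFILE.gz next to LOGFILE (leaving the original
# in place) and prints the archive name followed by its size in bytes.
package_log() {
    log=$1
    [ -f "$log" ] || { echo "no such file: $log" >&2; return 1; }
    gzip -c "$log" > "$log.gz" || return 1
    # Some wc implementations pad the count with spaces; strip them.
    size=$(wc -c < "$log.gz" | tr -d ' ')
    echo "$log.gz $size"
}
```

For example, `package_log logfile.txt` creates `logfile.txt.gz` in the same directory and reports its size.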

Addressing Common Problems

The following sections provide solutions to common cluster-related problems. They also explain how to diagnose non-specific problems, such as poor cluster performance.

Tuning client connection timeouts with TIME_WAIT

Version 4.5 and 5.1 WebLogic proxy plug-ins do not use connection pooling to access clusters in the presentation tier. If you use a two-tier cluster, each request that a proxy plug-in makes to the servlet/JSP cluster opens an IP socket. After the client closes the socket, the socket remains open on the WebLogic Server for the configured timeout period.

On most systems, the default timeout period is too long to support the numerous, brief socket connections used by clients of a web application. If you have a large number of users accessing your cluster via a proxy plug-in, you may find that the system frequently has a large number of open (but inactive) sockets waiting to time out.

The timeout period for sockets is determined by the IP implementation of your operating system. There are no WebLogic Server-specific configuration parameters that affect socket timeouts. To reduce the length of time that inactive client sockets remain open, reduce the IP timeout value for the operating system that hosts the WebLogic Server cluster. The applicable configuration parameters depend on the operating system; consult your operating system documentation for their names and default values.
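Before and after changing the operating system timeout, it can help to measure how many sockets are actually waiting in the TIME_WAIT state on a server host. The sketch below is one simple way to do that, assuming a POSIX shell and a `netstat` command that prints the connection state on each line. The function name `count_time_wait` is hypothetical.

```shell
#!/bin/sh
# count_time_wait
# Hypothetical helper: reads netstat-style output on standard input and prints
# the number of lines that mention the TIME_WAIT state.
count_time_wait() {
    awk '/TIME_WAIT/ { n++ } END { print n + 0 }'
}
```

A typical invocation would be `netstat -an | count_time_wait`. A count that stays high while client load is steady suggests that the timeout period is too long for your traffic pattern.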

Server fails to join a cluster

A WebLogic Server may fail to join a cluster on startup for several reasons, including general network availability and WebLogic-specific configuration problems. Use this checklist to check your configuration and startup process.

  1. Check your command-line parameters for typographical errors.

  2. Verify that there are no physical problems with your network connection. You can verify network connections using the dbping utilities discussed in Testing Connections.

  3. Verify that no other application is using the cluster multicast address.

  4. Run the utils.MulticastTest utility to verify that multicast is working.

  5. Run the utils.system utility to verify CLASSPATH consistency if you are not using a shared file system classpath.
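As a manual complement to the CLASSPATH check in the last step (not a replacement for the utils.system utility), you can list the classpath entries on each machine and compare the listings. The following is a minimal sketch that assumes a POSIX shell and a colon-separated UNIX classpath; the function name `list_classpath_entries` is hypothetical.

```shell
#!/bin/sh
# list_classpath_entries CLASSPATH_STRING
# Hypothetical helper: prints each entry of a colon-separated classpath on its
# own line, in the original order, so that listings from two machines can be
# compared with diff.
list_classpath_entries() {
    printf '%s\n' "$1" | tr ':' '\n'
}
```

For example, run `list_classpath_entries "$CLASSPATH" > classpath.txt` on each server and compare the resulting files with `diff`. The entries are deliberately not sorted, because the order of classpath entries affects which classes are loaded.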

Other problems that require troubleshooting include general configuration errors and communication errors, such as:

  1. Incompatible version numbers. All WebLogic Servers in the cluster must be the same version. If a WebLogic Server whose version does not match the other servers attempts to join the cluster, an error message is generated.

  2. Unable to find a license for clustering. Your WebLogic license does not include the clustering feature. Contact your sales representative.

  3. Unable to send service announcement. This could indicate a general network problem, or a misconfigured DNS. Clustered servers communicate among themselves over multicast and must share the same (exclusive) multicast address.

  4. Cannot set default clusterAddress properties value. This could mean that another server with the same IP address has already joined the cluster. Check to make sure you do not have duplicate IP addresses assigned to multiple machines.

  5. Unable to create a multicast socket for clustering, Multicast socket send error, or Multicast socket receive error. These communications errors are most likely caused by an incorrect or bad multicast address.

    Note that each operating system has specific requirements for configuring multicast; check your OS documentation for help in correcting this error.
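One quick check for a bad multicast address is to confirm that it falls inside the IPv4 multicast range, 224.0.0.0 through 239.255.255.255. The sketch below performs only that range check; it cannot tell whether another application is already using the address or whether the network forwards multicast traffic. It assumes a POSIX shell, and the function name `is_multicast_address` is hypothetical.

```shell
#!/bin/sh
# is_multicast_address ADDRESS
# Hypothetical helper: succeeds if ADDRESS is a dotted-quad IPv4 address whose
# first octet is in the multicast range 224-239 and whose other octets are 0-255.
is_multicast_address() {
    # Reject anything that is not exactly four dot-separated decimal numbers.
    printf '%s\n' "$1" | grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' >/dev/null || return 1
    # Split the address on dots into $1..$4.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$1" -ge 224 ] && [ "$1" -le 239 ] &&
        [ "$2" -le 255 ] && [ "$3" -le 255 ] && [ "$4" -le 255 ]
}
```

For example, `is_multicast_address 237.0.0.1 && echo ok` prints `ok`, while an ordinary host address such as 192.168.1.10 fails the check.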