Using WebLogic Server Clusters
Troubleshooting Common Problems
This chapter provides guidelines on how to prevent cluster problems or troubleshoot them if they do occur.
Before You Start the Cluster
You can do a number of things to help prevent problems before you boot the cluster.
Check for a Cluster License
Your WebLogic Server license must include the clustering feature. If you try to start a cluster without a clustering license, you will see the error message "Unable to find a license for clustering."
Check the Server Version Numbers
All servers in the cluster must have the same major version number, but can have different minor version numbers and service packs.
The cluster's Administration Server is typically not configured as a cluster member, but it should run the same major version of WebLogic Server used on the managed servers.
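The version rule above can be checked mechanically once you have collected each server's version string. The following is a minimal sketch, not a WebLogic Server API: the class `MajorVersionCheck` is hypothetical, and it assumes versions are written as dotted strings (for example "7.0.1") whose first field is the major version.

```java
// Hypothetical helper, not part of WebLogic Server.
class MajorVersionCheck {

    // Assumption: the version is a dotted string such as "7.0.1",
    // and the first field is the major version number.
    static int majorOf(String version) {
        String trimmed = version.trim();
        int dot = trimmed.indexOf('.');
        return Integer.parseInt(dot < 0 ? trimmed : trimmed.substring(0, dot));
    }

    // True when every version string shares the major version of the first one.
    static boolean sameMajor(String... versions) {
        for (String v : versions) {
            if (majorOf(v) != majorOf(versions[0])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Pass the version strings of all cluster members as arguments.
        System.out.println(sameMajor(args) ? "major versions match" : "major version mismatch");
    }
}
```

Minor version and service pack fields are deliberately ignored, because only the major version has to match.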
Check the Multicast Address
A problem with the multicast address is one of the most common reasons a cluster does not start or a server fails to join a cluster.
A multicast address is required for each cluster. The multicast address can be an IP address between 224.0.0.0 and 239.255.255.255, or a host name that resolves to an IP address within that range.
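If you want to confirm that a candidate address really is a multicast address before configuring it, the standard Java library can do the range check. This is a minimal sketch; the class name `MulticastAddressCheck` is hypothetical and is not a WebLogic utility. It relies on `InetAddress.isMulticastAddress()`, which for IPv4 is true for 224.0.0.0 through 239.255.255.255.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of WebLogic Server.
class MulticastAddressCheck {

    // Returns true if the IP literal or host name resolves to a multicast address.
    static boolean isMulticast(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).isMulticastAddress();
        } catch (UnknownHostException e) {
            // A name that cannot be resolved cannot serve as the cluster's multicast address.
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass one or more candidate addresses or host names as arguments.
        for (String candidate : args) {
            System.out.println(candidate + " -> "
                    + (isMulticast(candidate) ? "multicast" : "NOT multicast"));
        }
    }
}
```

An IP literal is checked without a DNS lookup; a host name is resolved first, so the result for a name depends on the resolver of the machine running the check.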
You can check a cluster's multicast address and port using the WebLogic Server Console.
For each cluster on a network, the combination of multicast address and port must be unique. If two clusters on a network use the same multicast address, they should use different ports. If the clusters use different multicast addresses, they can use the same port or accept the default port, 7001.
Before booting the cluster, make sure the cluster's multicast address and port are correct and do not conflict with the multicast address and port of any other clusters on the network.
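When several clusters share a network, the uniqueness rule for address and port pairs can be checked from a simple inventory. The sketch below assumes you have gathered each cluster's name, multicast address, and port by hand (for example from the console); the class `MulticastConflictCheck` and the input layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of WebLogic Server.
class MulticastConflictCheck {

    // Each row is {clusterName, multicastAddress, port}.
    // Returns one message for every cluster that reuses an address:port pair seen earlier.
    static List<String> findConflicts(String[][] clusters) {
        Map<String, String> firstOwner = new HashMap<>();
        List<String> conflicts = new ArrayList<>();
        for (String[] cluster : clusters) {
            String key = cluster[1] + ":" + cluster[2];
            String earlier = firstOwner.putIfAbsent(key, cluster[0]);
            if (earlier != null) {
                conflicts.add(earlier + " and " + cluster[0] + " both use " + key);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        String[][] clusters = {
            {"clusterA", "239.192.0.1", "7001"},
            {"clusterB", "239.192.0.1", "7002"}, // same address, different port: allowed
            {"clusterC", "239.192.0.1", "7001"}  // same address and port as clusterA: conflict
        };
        for (String message : findConflicts(clusters)) {
            System.out.println(message);
        }
    }
}
```

Only the combination is compared, so two clusters may share an address or a port as long as they do not share both.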
The errors you are most likely to see if the multicast address is bad are:
Check the CLASSPATH Value
Make sure the value of CLASSPATH is the same on all managed servers in the cluster. CLASSPATH is set by the setEnv script, which you run before you run startManagedWebLogic to start the managed servers.
By default, setEnv sets this value for CLASSPATH (as represented on Windows systems):
If you change the value of CLASSPATH on one managed server, or change how setEnv sets CLASSPATH, you must change it on all managed servers in the cluster.
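One way to confirm that two managed servers use the same CLASSPATH is to compare the values entry by entry, which also shows where they diverge. This is a minimal sketch with a hypothetical class name, `ClasspathCompare`; it assumes you have copied the CLASSPATH value from each machine yourself. Entry order is treated as significant.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WebLogic Server.
class ClasspathCompare {

    // Returns the zero-based index of the first entry that differs,
    // or -1 if the two CLASSPATH values contain the same entries in the same order.
    static int firstDifference(String a, String b, String separator) {
        String[] left = a.split(Pattern.quote(separator));
        String[] right = b.split(Pattern.quote(separator));
        int shared = Math.min(left.length, right.length);
        for (int i = 0; i < shared; i++) {
            if (!left[i].equals(right[i])) {
                return i;
            }
        }
        // All shared entries match; any difference is an extra entry on one side.
        return left.length == right.length ? -1 : shared;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java ClasspathCompare <classpath1> <classpath2>");
            return;
        }
        int index = firstDifference(args[0], args[1], File.pathSeparator);
        System.out.println(index < 0
                ? "CLASSPATH values match"
                : "first difference at entry " + index);
    }
}
```

`File.pathSeparator` is the separator of the machine running the check (`;` on Windows, `:` on UNIX), so compare values taken from the same kind of platform.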
Check the Thread Count
Each server in the cluster is allocated an execution thread count that you can check in the console (click Servers > server name > Monitoring > Monitor All Active Queues > Configure Execute Queue; then click default in the server list).
Figure 8-1 Checking the Execute Queue Thread Count
Before starting a managed server, check its Thread Count attribute. The default value is 15, and the minimum value is 5. If the value of Thread Count is below 5, change it to a higher value so that the managed server doesn't hang on startup.
After You Start the Cluster
Check Your Commands
If the cluster fails to start, or a server fails to join the cluster, the first step is to check any commands you have entered, such as startManagedWebLogic or a java interpreter command, for errors and misspellings.
Remember that in this release, the server starts with the system name and password weblogic.
Generate a Log File
Before contacting BEA Technical Support for help with cluster-related problems, you need to collect some diagnostic information. The information you need most is a log file with multiple thread dumps from a managed server. The log file is especially important for addressing cluster freezes and deadlocks.
Remember: a log file that contains multiple thread dumps is a prerequisite for diagnosing your problem.
To create the log file, follow these steps on an administration server or managed server:
% java -ms64m -mx64m -verbose:gc -classpath $CLASSPATH \
    weblogic.Server >> logfile.txt 2>&1
Redirecting both standard error and standard output places thread dump information in the proper context with server informational and error messages, which makes the log more useful. The 2>&1 form is Bourne shell syntax; in csh, use >>& logfile.txt instead.
Compress the log file using a UNIX utility:
% tar czf logfile.tar.gz logfile.txt
or zip it using a Windows utility.
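Because the log is only useful if it contains multiple thread dumps, it can be worth counting them before you send the file. The sketch below is not a BEA tool. It assumes a HotSpot-style JVM, where each dump begins with a line starting "Full thread dump"; other JVMs may use a different marker. The class name `ThreadDumpCounter` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of WebLogic Server.
class ThreadDumpCounter {

    // Assumption: each thread dump starts with a line beginning "Full thread dump".
    static int countDumps(String logText) {
        int count = 0;
        for (String line : logText.split("\\r?\\n")) {
            if (line.startsWith("Full thread dump")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the log file name used in this section.
        String path = args.length > 0 ? args[0] : "logfile.txt";
        String text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        System.out.println("thread dumps found: " + countDumps(text));
    }
}
```

A count of zero or one suggests the log does not yet meet the prerequisite described above.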
Check Garbage Collection
If you are experiencing cluster problems, you should also check the garbage collection on the managed servers. If garbage collection is taking too long, the servers will not be able to make the frequent heartbeat signals that tell the other cluster members they are running and available.
If garbage collection (either first or second generation) is taking 10 or more seconds, you need to tune heap allocation (the -ms and -mx parameters) on your system.
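If the server was started with -verbose:gc, long collections can be picked out of the log automatically. The following is a minimal sketch under a stated assumption: it expects the classic -verbose:gc line format, such as "[GC 325407K->83000K(776768K), 0.2300771 secs]" or "[Full GC ... , 12.5 secs]". Other JVMs and newer logging formats will not match. The class name `GcPauseScan` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of WebLogic Server.
class GcPauseScan {

    // Assumption: classic -verbose:gc output, where each entry ends in ", <seconds> secs]".
    private static final Pattern GC_LINE =
            Pattern.compile("\\[(?:Full GC|GC)[^\\]]*?,\\s*([0-9]+(?:\\.[0-9]+)?) secs\\]");

    // Returns the duration, in seconds, of every collection at or above the threshold.
    static List<Double> slowCollections(String logText, double thresholdSecs) {
        List<Double> slow = new ArrayList<>();
        Matcher matcher = GC_LINE.matcher(logText);
        while (matcher.find()) {
            double secs = Double.parseDouble(matcher.group(1));
            if (secs >= thresholdSecs) {
                slow.add(secs);
            }
        }
        return slow;
    }

    public static void main(String[] args) throws Exception {
        String path = args.length > 0 ? args[0] : "logfile.txt";
        String text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        // 10 seconds is the threshold given in this section.
        System.out.println("collections of 10 seconds or more: " + slowCollections(text, 10.0));
    }
}
```

An empty result means no matching entry reached the threshold; it does not prove garbage collection is healthy if the log uses a format the pattern does not recognize.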
Run utils.MulticastTest
You can verify that multicast is working by running utils.MulticastTest from one of the managed servers. See "Using the WebLogic Server Java Utilities" in WebLogic Server Command Reference.