This chapter describes the tools, methods, and information sources available for troubleshooting the Sun Java System Application Server 9.1. Guidelines for evaluating and investigating a problem are included.
As applications get deployed, undeployed, and redeployed, and as you experiment with different server configuration settings, there may be times when your server gets into a confused or unstable state. In such cases, it is useful to have a previously saved working configuration on which to fall back. This is not problem solving, per se, but rather a way to avoid problems in the first place.
The Application Server asadmin command includes a backup-domain option that backs up the domain(s) you specify. Use this option to take periodic “snapshots” of your server configuration. Then, if necessary, use the restore-domain option to restore one or more domains to a known working state.
Refer to the Application Server Administration Guide for complete instructions on using the asadmin backup-domain and restore-domain options. Briefly, however, for the purposes of this Troubleshooting Guide, use the following procedure to back up and restore a server configuration:
1. Start the Application Server.
   install_dir/bin/asadmin start-domain domain_name
2. Stop the domain.
   install_dir/bin/asadmin stop-domain domain_name
3. Back up the domain.
   install_dir/bin/asadmin backup-domain domain_name
   Backed up directories are stored by default in the install_dir/backups directory.
4. Make changes to the Application Server configuration and/or domain(s), as desired.
5. If necessary, restore the server and/or domain configuration to the state saved in Step 3, above.
   install_dir/bin/asadmin restore-domain --filename backup_file domain_name
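The backup and restore steps above lend themselves to scripting when snapshots are taken regularly. The following is a minimal sketch, not a supported tool: it only assembles the asadmin command lines shown in the procedure and, by default, prints them instead of running them. The function names, the /opt/appserver install directory, and the backup.zip file name are hypothetical placeholders.

```python
import subprocess


def asadmin_command(install_dir, subcommand, domain_name, *options):
    """Build the argument list for one asadmin invocation."""
    asadmin = f"{install_dir}/bin/asadmin"
    return [asadmin, subcommand, *options, domain_name]


def backup_commands(install_dir, domain_name):
    """Commands for the snapshot steps: stop the domain, then back it up."""
    return [
        asadmin_command(install_dir, "stop-domain", domain_name),
        asadmin_command(install_dir, "backup-domain", domain_name),
    ]


def restore_command(install_dir, domain_name, backup_file):
    """Command for the restore step, using the --filename option."""
    return asadmin_command(
        install_dir, "restore-domain", domain_name, "--filename", backup_file
    )


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical install directory and domain name.
    run_all(backup_commands("/opt/appserver", "domain1"))
```

Keeping the dry run as the default makes it possible to review the exact commands before anything touches a running domain.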
J2EE application servers are typically deployed in complex and highly sophisticated operating environments. The Sun Java System Application Server covers a broad range of technologies, including Java, Java servlets, XML, JSP, JDBC data sources, EJB technology, and more. Other products and tools associated with the Application Server are LDAP, Web Server, SunONE Message Queue, deployment and migration tools, and so on. Understanding and diagnosing complex issues involving so many disparate components requires thorough knowledge and a careful diagnostic process.
Gathering any or all of the following information will make it easier to classify a problem and search for solutions. Note that operating system utilities, such as pkginfo and showrev on Solaris and rpm on Linux, are helpful in gathering system information.
What are the exact version numbers of the operating system and products installed?
Have any patches been applied? If so, specify product and operating system patch numbers.
How is the system configured?
What system resources does the system have (memory, disk, swap space, and so on)?
How many application servers, web servers, and directory servers are installed?
How is the web server connected to the Application Server? Are they on the same machine?
How is the Application Server connected to the directory server?
Are application servers in a cluster or not?
Was any upgrade done? If so, what were source and target versions?
Was a migration done? If so, what were source and target versions?
Have any new applications been deployed?
Is SSL enabled or not?
What versions of the HADB and the backend database are being used?
What JDBC driver is being used to access the database?
What JDK version is being used?
What are the JVM heap, stack, and garbage collection-related parameters set to?
What are the JVM options?
What third-party technologies are being used in the installation?
Are the interoperating component versions in compliance with the compatibility matrix specified in the release notes?
After gathering this information:
Collect web server error and access log data (web server instance-specific).
Collect any Application Server stack traces. Note that a fresh set of logs associated with the specific problem should be generated. This avoids scanning gigabytes of irrelevant log information.
Determine the sequence of events that occurred when the problem first appeared, including any steps that may already have been taken to resolve the problem.
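Part of the system information listed above can be captured in a repeatable way with a short script. The sketch below is illustrative only and covers a small subset of the checklist (operating system version, disk space, and JDK version). It assumes that a java executable on the PATH is the one the Application Server uses, which may not be true for a given installation. Package and patch details still need the operating system utilities mentioned earlier, such as pkginfo, showrev, or rpm.

```python
import platform
import shutil
import subprocess


def java_version():
    """Return the first line of `java -version`, or None if java is not on the PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its report to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    lines = (result.stderr or result.stdout).splitlines()
    return lines[0] if lines else None


def gather_system_info(path="/"):
    """Collect basic facts about the host that are useful in a problem report."""
    disk = shutil.disk_usage(path)
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "os_version": platform.version(),
        "machine": platform.machine(),
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
        "java_version": java_version(),
    }


if __name__ == "__main__":
    for key, value in gather_system_info().items():
        print(f"{key}: {value}")
```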
Sometimes the most obvious solutions are overlooked, and so the first step is to verify the system configuration. Refer to the Sun Java System Application Server 9.1 Release Notes for the most up-to-date system requirements and dependencies.
Messages generally include information about the attempted action, the outcome of the action, and, if applicable, the cause of jeopardy or failure.
The log files contain the following general types of message entries:
Error – These messages mark critical failures that cause status to be reported as Failed. Error messages generally provide detailed information about the nature and the cause of the problem that occurred.
Warning – These messages mark non-critical failures. Warning messages generally contain information about the cause and the nature of the failure, and also provide possible remedies.
Information – These messages mark normal completion of particular tasks.
In some cases, the message is very clear about what is wrong and what needs to be done, if anything, to fix it. For example, if you start a domain using the asadmin start-domain command, then inadvertently issue the same command again after the domain has started, the following message is displayed:
userD:\Sun\studio5_se\appserver8\bin>asadmin start-domain
Domain already started : domain1
Domain domain1 Started.
In this case, the message gives clear guidance and the problem can be disregarded.
Sometimes an error message gives only general information about the problem or solution, or suggests multiple possibilities. For example:
[16/Jun/2003:22:20:50] SEVERE ( 2204): WEB0200: Configuration error in web module [JAXBProjectStudio] (while initializing virtual server [server1]) com.iplanet.ias.config.ConfigException: Failed to load deployment descriptor for: JAXBProjectStudio cause: java.io.FileNotFoundException:
In this case, the problem is not obvious, or there might be multiple things wrong. You might have to consider various possibilities and perhaps a number of solutions. If the proposed fix is time consuming or costly, take steps to ensure that the fix is likely to be correct before actually doing anything.
Some error messages are either not helpful or provide little guidance; for example:
[23/Jun/2003:16:50:45] WARNING ( 1972): for host 127.0.0.1 trying to GET /SupplierServiceClient1/SupplierServiceClient1_SOAP.html, send-file reports: HTTP4144: error sending D:/Sun/studio5_se/appserver8/domains/domain1/server1/applications/j2ee-modules/SupplierServiceClient1_1/SupplierServiceClient1_SOAP.html (Overlapped I/O operation is in progress.) status=1:5
In this case, there is very little information to go on. It is especially important to identify the exact situation that caused the error, and what the symptoms are before proceeding.
For descriptions of all the Application Server error messages, refer to the Sun Java System Application Server 9.1 Error Message Reference.
In addition to the message text, a logged message provides the following information:
Date and time of the event
Log level for the event — Application Server-specified log level ID or name
Process identifier (PID) — PID of the Application Server process
(optional) Virtual server identifier (VSID) — VSID that generated the message
Message identifier (MID) — subsystem and a four digit integer
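When many log entries must be sorted by level or message ID, the fields listed above can be pulled out with a regular expression. The sketch below is an assumption-laden illustration: the pattern is inferred only from the two sample entries quoted earlier in this chapter, not from a documented format specification, and it makes no attempt to extract a VSID because none of the samples shows one. The parse_log_entry name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the sample entries in this chapter, for example:
# [16/Jun/2003:22:20:50] SEVERE ( 2204): WEB0200: Configuration error ...
ENTRY_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"      # date and time of the event
    r"(?P<level>[A-Z]+)\s+"               # log level name
    r"\(\s*(?P<pid>\d+)\):\s+"            # process identifier
    r"(?:(?P<mid>[A-Z]+\d{4}):\s+)?"      # optional leading message ID
    r"(?P<text>.*)$"                      # remaining message text
)


def parse_log_entry(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = ENTRY_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Month abbreviations are parsed according to the current locale.
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%d/%b/%Y:%H:%M:%S")
    fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    sample = ("[16/Jun/2003:22:20:50] SEVERE ( 2204): WEB0200: "
              "Configuration error in web module [JAXBProjectStudio]")
    print(parse_log_entry(sample))
```

An entry whose message ID appears later in the text, as in the HTTP4144 sample, is still parsed, but its mid field is None and the ID remains part of the message text.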
The specific logs associated with each Application Server problem area are discussed in the associated chapters of this manual.
The Application Server has nine log levels that can be set in the Administration GUI. From most to least verbose, they are FINEST, FINER, FINE, CONFIG, INFO, WARNING, SEVERE, ALERT, and FATAL. All messages are logged when the log level is set to FINEST, and only the most serious error messages appear when the log level is set to FATAL.
Note that the more detailed log levels (FINEST, FINER, FINE) can generate high volumes of log information for certain events, which may make it appear at first glance that there is an error condition when in fact there is not.
All messages with a log level lower than the default level of INFO (FINEST, FINER, FINE, and CONFIG) provide information related to debugging and must be specifically enabled. Instructions for doing this are contained in the Sun Java System Application Server Administration Guide.
In addition to the standard JDK log levels, the Application Server has added log levels designed to map more intuitively to the Application Server log file (server.log) and to tightly integrate with Solaris. The log levels ALERT and FATAL are specific to the Application Server and are not implemented in the JDK 1.4 logging API.
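The effect of a configured log level can be modeled as a threshold on an ordered list. The sketch below is purely illustrative and is not Application Server code. It assumes the levels are ordered exactly as this section lists them, with FINEST the most verbose and FATAL the most restrictive, and that a message is logged when its level is at or above the configured level.

```python
# Levels in the order given in this section, most verbose first.
LOG_LEVELS = [
    "FINEST", "FINER", "FINE", "CONFIG", "INFO",
    "WARNING", "SEVERE", "ALERT", "FATAL",
]


def is_logged(message_level, configured_level="INFO"):
    """Return True if a message at message_level passes the configured threshold."""
    return LOG_LEVELS.index(message_level) >= LOG_LEVELS.index(configured_level)


def levels_logged_at(configured_level):
    """List every level that is written when the server uses configured_level."""
    return [level for level in LOG_LEVELS if is_logged(level, configured_level)]


if __name__ == "__main__":
    for threshold in ("FINEST", "INFO", "FATAL"):
        print(threshold, "->", ", ".join(levels_logged_at(threshold)))
```

Under this model, the default level of INFO suppresses FINEST, FINER, FINE, and CONFIG, which matches the statement that those levels must be specifically enabled.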
For information on the event log mechanism used in the Microsoft Windows operating environment, refer to the Windows help system index using the keywords Event Logging. If you choose to send logs to the Windows Event Log, only messages with a log level of INFO, WARNING, SEVERE, ALERT, or FATAL are logged there.
The Administration GUI provides the following two logging options:
Option 1 — Log stdout (System.out.print) content to the event log
Option 2 — Log stderr (System.err.print) content to the event log
If the above options are not set:
Anything written to stdout or stderr (that is, using System.out or System.err) will not appear in the logs.
Messages logged with the JDK logger will appear in the logs.
When the options are set, messages written to stdout or stderr appear with the INFO level, but do not have a message ID.
The Application Client Container (ACC) has its own logging infrastructure and can log only to a local file. The ACC typically runs in its own process, on a different host from the Application Server, and writes to its own log file. The sun-acc.xml file contains the ACC configuration. Refer to the Sun Java System Application Server Developer's Guide to Clients for more information.
The following procedure describes how to obtain a server thread dump on UNIX.
Verify that the server.xml file for the affected server instance does not include the -Xrs java-option flag. Remove the -Xrs java-option flag if it exists.
If the option is changed, restart the server instance.
Use the ps command to determine the java and/or appservDAS processes under which the application server is running.
Run the following command on the application server instance:
kill -3 pid
The kill -3 command sends the SIGQUIT signal to the process, which causes the JVM to write a thread dump. The thread dump is redirected to the server.log file for the server instance.
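The same signal can be sent from a script, which is convenient when several dumps are needed at intervals. The following is a minimal sketch for UNIX systems only. The function names are hypothetical, and the PID must be the one identified with ps, because SIGQUIT terminates most processes that are not JVMs.

```python
import os
import signal


def process_exists(pid):
    """Return True if a process with this PID exists. Signal 0 only performs the check."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def request_thread_dump(pid):
    """Send SIGQUIT (signal 3, the equivalent of `kill -3 pid`) to a JVM process."""
    if not process_exists(pid):
        raise ValueError(f"no such process: {pid}")
    # A JVM started without -Xrs responds by writing a thread dump and keeps running.
    os.kill(pid, signal.SIGQUIT)
```

The function returns as soon as the signal is delivered. The dump itself is written by the JVM and has to be read from the log afterward.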
The following procedure describes how to obtain a server thread dump on Windows.
Verify that the server.xml file for your server instance does not include the -Xrs java-option flag. Remove the -Xrs java-option flag if it exists.
If the option was changed, restart your Application Server.
Press Ctrl+Break in the Application Server window. The thread dump will be redirected to the server.log file for the instance.
A good initial step is to scan this Troubleshooting Guide to see if the problem is addressed here. If so, select the appropriate solution. Many of the solutions contain references to other documents in the Application Server document collection for additional details, explanations, or examples.
Start by reading the Release Notes for the version of the product you are troubleshooting.
Descriptions of the Application Server manuals are listed in Application Server Documentation Set.
Go to SunSolve.
Under SunSolve Collections, click the Search Collections link.
Select the checkbox for the collection(s) to search.
Enter the search criteria.
Browse directly in any of the online forums, or register and log in to start posting messages. The Application Server online forum is available at: http://forum.java.sun.com/index.jspa
When necessary, gather together the information you have acquired and contact technical support at http://www.sun.com/service/contacting.