Sun Java System Application Server 7 2004Q2 Troubleshooting Guide
Chapter 3
Startup and Login Problems
This chapter addresses common problems that can occur when the Application Server or Admin Server is starting, or when a user is logging in.
The following sections are contained in this chapter:
Can’t access server’s initial screen
When you visit the start page of the Application Server, the initial screen does not appear. Consider the following:
Is the Application Server running?
Use one of the following commands to determine if the Admin Server has been started:
Solution
If the Application Server is not running, start the initially configured administrative domain by running the following command:
As the command completes, you should observe the following results:
If other problems occur, you can use the following command to stop both the Admin Server and the Application Server instance of the initially configured domain, domain1:
As the command completes, you should observe the following results:
Now restart the domain as explained above.
The syntax of the asadmin command is documented in the Application Server man pages and the Sun Java System Application Server Administrator’s Guide.
During installation, did the initial server startup run successfully?
If the console window is still open, it should display a message like this:
where domain1 is the name of the default domain. This indicates that the default domain was started successfully.
If you have already closed the console window, you can check for messages in the Application Server log file here:
If startup was successful, you should see a message similar to the following at the end of the log file:
[INFO][...][..][date&time][Application server startup complete .]
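If you prefer to check the log from a shell, the following sketch searches the last lines of a log file for the startup message. It assumes a POSIX shell and utilities; the function name startup_complete and the number of lines examined are illustrative choices, and the exact message text may differ in your release.

```shell
# Succeeds (exit status 0) if one of the last 20 lines of the log file
# given as $1 contains the startup-complete message. The matched text is
# an assumption taken from the sample line above.
startup_complete() {
    tail -n 20 "$1" | grep "Application server startup complete" >/dev/null
}

# Hypothetical usage; substitute the path to your own server log:
# startup_complete /path/to/server.log && echo "startup completed"
```

Only the tail of the file is examined because the message is expected at the end of the log after a successful start.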
Is the server available locally?
To verify that the server is running locally:
Situation 1: If the start page does not appear on the local machine, it is most likely that the application server isn't running or didn't start normally.
Solution 1
In addition to checking the server logs for any errors during startup, check the following:
Situation 2: If the start page appears locally but not on remote machines, there is a networking problem from the remote clients. For example, DNS might be set incorrectly (so the request is being sent to the wrong machine), the network configuration on the remote machine could be incorrect, a network router could be down, and so on.
Solution 2
This is probably not an Application Server issue. Check your network.
Was the server started at the expected port?
The server could be running at a different port number than the one you expect, either because it was intentionally installed there, or because another server was already running on the default port when the server was installed.
To determine which port number the server is actually using:
How the expected port number can change during installation: The server's default port number is 80, but you can specify a different port number during installation. If the specified port number is already taken by another application when you start the server, the port number rolls forward to the next available number. For example, if a server was already running on the default port 80, the Application Server would run on port 81. If two servers were already running on ports 80 and 81, the port number would be 82, and so on.
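The roll-forward rule can be illustrated with a short shell sketch. This models only the rule as described above and is not the installer's actual code; the function name next_free_port is hypothetical, and the ports in use are passed in by hand rather than detected.

```shell
# Prints the first port at or above $1 that is not among the remaining
# arguments (the ports already in use).
next_free_port() {
    port=$1
    shift
    while :; do
        in_use=no
        for used in "$@"; do
            if [ "$used" -eq "$port" ]; then
                in_use=yes
                break
            fi
        done
        [ "$in_use" = no ] && break
        port=$((port + 1))
    done
    echo "$port"
}

next_free_port 80          # prints 80
next_free_port 80 80       # prints 81
next_free_port 80 80 81    # prints 82
```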
If http-listener is running at a port that is in use, you may see a message similar to the following:
[21/Jan/2003:01:41:15] WARNING (10364): ADM0011: Could not reregister HttpListener with DomainRegistry.
Application Server and HADB port assignments must not conflict with other port assignments on the same machine. Default and recommended port assignments are as follows:
Solution 1
Kill any other process that is running under the same port, or change the port number of the http-listener as follows:
- Open the Administration interface (hostname:admin_port).
- Browse to the HTTP Server.
- Browse to the HTTP Listeners.
The default listener is http-listener-1.
- Click that default listener and find the port number (default is 80).
- Change it to any unused port.
- Save the settings.
You should no longer receive this warning.
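Before choosing a new port, it can help to confirm which ports already have a listener. The sketch below assumes that the output of netstat -an is piped in and that listening sockets are marked LISTEN, with the local address written as *.80 or 0.0.0.0:80; the function name port_listening is hypothetical.

```shell
# Reads netstat -an style output on stdin and succeeds if a line in the
# LISTEN state has a local address ending in the port given as $1.
port_listening() {
    grep LISTEN | grep "[.:]$1[[:space:]]" >/dev/null
}

# Hypothetical usage:
# netstat -an | port_listening 80 && echo "port 80 is already in use"
```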
Solution 2
Change to another port and be sure to enter the correct port number when invoking the server.
Is your proxy setting causing a problem?
You should be able to access the server directly from your local system (localhost) as follows (for the default port 80):
You may not be able to access your local system if your browser connects to the web through a proxy. (A proxy is a program that looks like a direct web connection, but which is actually a separate program that makes that connection for you.)
A typical error message is:
The requested item could not be loaded by the proxy.
Netscape Proxy's network connection was refused by the server: localhost:4848 The server may not be accepting connections or may be busy. Try connecting again later
Solution
To solve this problem, do one of the following:
Has an ungraceful shutdown occurred on a previously-running server?
If a crash has occurred, the server could be in an inconsistent state.
Solution
Use the asadmin stop-domain command to stop the Application Server, then restart the server using the asadmin start-domain command.
Refer to Is the Application Server running? for guidelines.
Can't access the Admin Server
The Admin Server provides the administration facilities for the Application Server (one Admin Server per domain). The Application Server log file, at domains/domain1/admin-server/logs/server.log, may be helpful in determining the reason the Admin Server is not running.
If you cannot access the Admin Server, consider the following:
Has the Admin Server been started?
See Is the Application Server running?.
Are you the user who installed the Application Server?
The start-domain or stop-domain command fails with the following error:
Could not start the domain.
You don’t have permission to access
<install_dir>/domains/domain1/admin-server/config
The error indicates that you are not logged on as the user who installed the system.
Solution
Start the domain's Admin Server while logged in as the user who installed the Application Server. Once the Admin Server is running, you can start other server instances from the Administration interface, but only the user who installed the server can start the Admin Server itself.
Is the Admin Server running at the expected port?
The default port number for the Admin Server is 4848. However, the server could be running at a different port number than the one you expect, either because it was intentionally installed there, or because another server was already running on the installation port when the server was started.
Solution
Refer to Was the server started at the expected port? for guidelines on checking the port your Admin Server is actually running on.
Be sure to enter the correct port number when invoking the Admin Server.
Can’t access a server application
If you are unable to access a particular application, find the application’s context root in the deployed application’s application.xml file in domains/domain1/server1/applications/j2ee-apps.
Then consider the following:
Is the Application Server running, and is the application deployed?
The server must be running before an application can be accessed.
Solution
Use the asadmin command to determine if the application server is running. If it is, and if the application is deployed, the following command will list the deployed applications and components:
asadmin list-components --user admin --password password server1
You can then look for your application in the listing, which will look something like this:
hello1 <application>
dukesbook <application>
There are no standalone WAR modules
There are no standalone EJB modules
There are no connector modules
For more information, see Can’t access server’s initial screen.
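To avoid scanning a long listing by eye, the output can be filtered in the shell. This sketch assumes that the listing has the two-column form shown above (name, then the type in angle brackets); the function name app_listed is hypothetical.

```shell
# Reads a component listing on stdin and succeeds if it contains a line
# of the form "<name> <application>" for the name given as $1.
app_listed() {
    awk -v name="$1" '
        $1 == name && $2 == "<application>" { found = 1 }
        END { exit !found }
    '
}

# Hypothetical usage with the command shown above:
# asadmin list-components --user admin --password password server1 |
#     app_listed dukesbook && echo "dukesbook is deployed"
```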
Is the application enabled?
Use the following command to see if the application is enabled:
asadmin show-component-status --user admin --password password dukesbook
where dukesbook is the application (component) name.
Was application deployment successful?
An application must be successfully deployed before it can be accessed.
Solution
To verify that deployment was successful, do the following:
- Check install_dir/domains/domain1/server/server.log for Admin Server. You may see entries similar to the following:
[20/Jul/2003:11:41:41] INFO ( 1600): DPL5109: EJBC - START of EJBC for [stateless-converter]
[20/Jul/2003:11:41:41] INFO ( 1600): CORE3282: stdout: Remote message: Processing beans ....
[20/Jul/2003:11:41:42] INFO ( 1600): DPL5108: EJBC - Generated code for remote home and EJBObject implementations for [stateless-converter]
[20/Jul/2003:11:41:42] INFO ( 1600): CORE3282: stdout: Remote message: Compiling wrapper code ....
[20/Jul/2003:11:41:46] INFO ( 1600): CORE3282: stdout: Remote message: Compiling RMI-IIOP code ....
[20/Jul/2003:11:41:55] INFO ( 1600): DPL5110: EJBC - END of EJBC for [stateless-converter]
[20/Jul/2003:11:41:56] INFO ( 1600): Total Deployment Time: 17605 msec, Total EJB Compiler Module Time: 14100 msec, Portion spent EJB Compiling: 80%
Breakdown of EJBC Module Time: Total Time for EJBC: 14100 msec, CMP Generation: 0 msec (0%), Java Compilation: 10 msec (0%), RMI Compilation: 13239 msec (93%),
[20/Jul/2003:11:41:56] INFO ( 1600): ADM1041:Sent the event to instance:[ApplicationDeployEvent -- deploy stateless-converter]
[20/Jul/2003:11:42:03] INFO ( 1600): ADM1042:Status of event to instance:[success]
- Check the file system hierarchy under your server (such as server1) and look for your new application directory under j2ee-apps. If it was a module you deployed, look under the j2ee-modules directory to see your new module directory.
- Check the server.xml file in the instance's config directory. Look for an entry similar to the following for your application or module:
<j2ee-application enabled="true" location="/Sun/studio5_se/appserver7/domains/domain1/server1/applications/j2ee-apps/stateless-converter_1" name="stateless-converter" virtual-servers="server1"/>
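The server.xml check can be scripted as a rough test. This sketch assumes that each j2ee-application entry sits on a single line, as in the sample above; it is a line-based search, not a real XML parser, and the function name app_enabled is hypothetical.

```shell
# Succeeds if the file given as $2 has a <j2ee-application> entry whose
# name attribute equals $1 and whose enabled attribute is "true".
app_enabled() {
    grep '<j2ee-application' "$2" |
        grep "name=\"$1\"" |
        grep 'enabled="true"' >/dev/null
}

# Hypothetical usage; substitute the path to the instance's server.xml:
# app_enabled stateless-converter /path/to/config/server.xml && echo "enabled"
```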
Forgot the user name or password
See Don’t know the admin username/password.
Forgot the Admin Server port number
See Don’t know the Admin Server port number.
Server won’t start
Some possible startup failure scenarios include:
File parsing failure: loadbalancer.xml not found
The following error occurs when the configuration file has an invalid pointer to a file:
LBConfigParser...: reports: ... Parsing of file : Failed ...
Message:The primary document entity could not be opened.
For example, when the path to the load balancer is specified incorrectly, the remainder of the message looks like this:
Id=/...path.../config/loadbalancer.xml
This error indicates that the path to the load balancer is invalid. You need to specify the absolute path to loadbalancer.xml in the Web Server configuration file.
Invalid password(s)
The server instance log reports that startup was unsuccessful because of incorrect security password(s).
Solution 1
If you typed the wrong password three times during startup as a local user, you’ll need to reinitiate the startup process.
Solution 2
If the wrong password was provided by a remote GUI/CLI instance startup, the procedure must be modified to supply the correct password.
Solution 3
If the security attribute in the init.conf file was incorrectly set to "off" for a secured instance, correct it manually.
Solution 4
If the password.conf file is present in the config directory and contains the wrong passwords, correct it manually, or delete it so that the server prompts for the passwords instead.
Abnormal subprocess termination / core dump
The following error messages occur when attempting to start the server:
Could not start the instance: domain1:admin-server
server failed to start: abnormal subprocess termination
Could not start the instance: domain1:server1
server failed to start: abnormal subprocess termination
Could not start one or more instances in the domain : domain1
Could not start one or more domains
Subsequent attempts to start the server may be accompanied by a core dump like the following, with errors recorded in the server.log file:
CORE1116: Application Server
INFO: CORE3016: daemon is running as super-user
Bad System Call - core dumped
This error can occur when the Application Server runs out of file descriptors.
Solution: Increase the number of file descriptors
On UNIX, you can use the ulimit command to determine the number of available file descriptors or to set limits on the system’s available file descriptors. The ulimit command displays the limits for the current shell and its descendants.
For the sh shell, the ulimit -a command lists all the current resource limits. The ulimit -n command lists the maximum file descriptors plus 1.
Check the file descriptor limit on your Solaris system with the following UNIX command:
ulimit -n
256
where 256 is the number of file descriptors returned by the command.
To successfully start the servers, the file descriptor count should be set to 1024.
Edit the /etc/system file and add the following two lines:
set rlim_fd_max=4086
set rlim_fd_cur=1024
After adding these lines, reboot your system and check the file descriptor value again. It should now be as follows:
ulimit -n
1024
You should now be able to start the admin server and server1 instance.
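The comparison against the recommended value can be scripted so that it can run before each server start. This is a minimal sketch assuming a shell whose ulimit supports the -n option, as used above; the function name fd_limit_ok is hypothetical.

```shell
# Succeeds if the limit given as $1 is at least the required value $2.
# The value "unlimited", which ulimit may report, counts as sufficient.
fd_limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Compare the current shell's limit with the 1024 recommended above.
if fd_limit_ok "$(ulimit -n)" 1024; then
    echo "file descriptor limit is sufficient"
else
    echo "file descriptor limit is below 1024"
fi
```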
CGI error
If the Application Server won’t start, you may receive the following error:
This message indicates that the system requires additional resources.
Solution 1: Set adequate limits for file descriptors
For more details, see Solution: Increase the number of file descriptors.
Solution 2: Change kernel parameters
On UNIX, increase the system resources by modifying the /etc/system file to include the following entries:
Reboot the system for the new kernel parameters to take effect.
After you have set the shell resources, the Application Server should start.
Load balancer won’t start
This section covers cases in which the load balancer fails to initialize:
Parser can’t open loadbalancer.xml
The following error is reported:
CNFG1000: Parsing of file : ...
Message:The primary document entity could not be opened.
Id=<path>/loadbalancer.xml
...
lb.configurator: CNFG1014 : Error occured while
initializing Loadbalancer config Parser.
This error occurs when the configuration file does not specify an absolute path to the location of the Load Balancer plugin.
Solution
Ensure that the configuration file contains the correct, absolute value.
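A configured path can be checked from the shell before restarting. This sketch assumes that you have already copied the path value out of the configuration file by hand; the function name config_path_ok is hypothetical.

```shell
# Succeeds only if the path given as $1 is absolute (starts with "/")
# and names a file that the current user can read.
config_path_ok() {
    case "$1" in
        /*) [ -r "$1" ] ;;
        *)  return 1 ;;
    esac
}

# Hypothetical usage:
# config_path_ok /path/to/config/loadbalancer.xml || echo "check the path"
```

A relative path such as config/loadbalancer.xml fails this check even when the file exists, which matches the requirement for an absolute value.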
Identical instance names
A message similar to the following might appear in the load balancer log file when you try to start the load balancer:
lb.configurator: CNFG1008 : Multiple instances with the same name : are not allowed for the cluster cluster1
The most likely problem is that the load balancer configuration file, loadbalancer.xml, is not configured correctly.
Solution
Verify your loadbalancer.xml file and make sure that each instance name is unique.
Load balancer / web server won’t start: listener ports
This problem occurs when two instances have the same listener value, for example, if instance foo has listener value bar:80 and instance spam has listener value bar:80.
The error messages that result look like this:
[04/Sep/2003:13:01:08] warning ( 2938): reports: lb.runtime: RNTM2029: DaemonMonitor :http://hostname:81 : could be because of connection saturation
[04/Sep/2003:13:01:08] failure ( 2938): ServerInstance.cpp@265: reports: lb.runtime:RNTM3002 : Failed to add listener multiple times: <instance name>
[04/Sep/2003:13:01:08] failure ( 2938): FailoverGroup.cpp@102: reports:lb.failovermanager: FGRP1002: Instance <instance name> could not be added to the FailoverGroup: cluster1
[04/Sep/2003:13:01:08] failure ( 2938): LBConfigurator.cpp@209: reports:lb.cofigurator: CNFG1007 :ServerInstance <instance name> could not be added on FailoverGroup cluster1
[04/Sep/2003:13:01:08] failure ( 2938): lbplugin.cpp@168: reports: lb.runtime:RNTM3004 : Failed to initialise load balancing subsystem
Solution
Make sure that the listener values for each instance are unique.
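Duplicate listener values are easy to spot once the instance and listener pairs are written out one per line. The sketch below assumes that you have extracted those pairs from your configuration by hand into "instance listener" lines; it does not parse loadbalancer.xml itself. The function name duplicate_listeners and the instance eggs are hypothetical.

```shell
# Reads "instance listener" pairs on stdin, one per line, and prints each
# listener value that is assigned to more than one instance.
duplicate_listeners() {
    awk '{ print $2 }' | sort | uniq -d
}

printf '%s\n' 'foo bar:80' 'spam bar:80' 'eggs bar:81' | duplicate_listeners
# prints: bar:80
```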
Restart operation fails
When an attempted restart fails, consider the following:
SSL/TLS are enabled
Restart does not work if SSL/TLS are enabled.
Solution
Stop and then start the instance.
JMS failed to start
If the JMS service fails to start, consider the following:
Are you attempting to start the instance as a non-root user?
When attempting to start the application server instance as a non-root user, the command fails and the following message is displayed:
Could not start the instance
In the log file for the instance (server.log), the following error message occurs:
JMS5035: Timed out after 30000 milliseconds while trying to verify if the JMS service startup succeeded.
When started as root, the application server instance starts normally.
Solution
Verify the correct user owns the JMS broker instance by running the following command:
ls -l /opt/imq/var/instances/
For example, the broker files for server1 in domain1 will be in the domain1_server1 directory. If this directory is owned by root, the ownership of the broker files must be changed to the appropriate user. For example, the following command changes the ownership of these files to the UNIX user greg in the staff group:
chown -R greg:staff /opt/imq/var/instances/domain1_server1
Unless this change is made, it is not possible for the Application Server to access these files, so the JMS broker (and ultimately the Application Server) cannot start.
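The ownership check can also be scripted. This sketch compares numeric user IDs taken from the third column of ls -ldn output, which avoids differences in how user names are displayed; the function names owner_uid and owned_by_uid are hypothetical.

```shell
# Prints the numeric user ID that owns the file or directory given as $1.
owner_uid() {
    ls -ldn "$1" | awk '{ print $3 }'
}

# Succeeds if the path given as $1 is owned by the numeric user ID $2.
owned_by_uid() {
    [ "$(owner_uid "$1")" = "$2" ]
}

# Hypothetical usage for the user greg from the example above:
# owned_by_uid /opt/imq/var/instances/domain1_server1 "$(id -u greg)" ||
#     echo "broker files are not owned by greg"
```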
Do Solaris bundled and unbundled domains and instances have the same names?
If your machine has the Solaris 9 bundled version of the Application Server software installed, and you then install the unbundled version of the Application Server, the Message Queue broker for these application server installations will be shared.
Note
In general, only one or the other type of bundle should be used. It is not necessary to install an unbundled Application Server if a bundled version is already available.
If you do not uniquely name your new domains and instances, you may receive the following errors when starting up the second instance with the same domain or instance name:
Solution
Give the (unbundled) domains and instances names that are different from the instances and domains in the bundled installation.
To avoid these errors, refer to the JMS Support chapter in the Application Server Administrator’s Guide for guidance.
Do the imq logs have out of memory errors?
If the imq logs show out of memory errors, system tuning is necessary.
Solution
1. Upgrade system memory.
2. Decrease the Application Server’s heap size.
3. Add more swap space.
Note
Adding more swap space will increase the number of applications that can run, which may adversely affect system performance, as more swapping will occur.
For more information on optimizing your system, consult the Performance and Tuning Guide.