Sun Java System Application Server Enterprise Edition 8.1 2005Q2 Troubleshooting Guide

HADB Administration Problems

The hadbm command and its many subcommands and options are provided for administering the high-availability database (HADB). The hadbm command is located in the install_dir/SUNWhadb/4/bin directory.

Refer to the chapter on Configuring the High Availability Database in the Sun Java System Application Server Administrator's Guide for a full explanation of this command. Specifics on the various hadbm subcommands are explained in the hadbm man pages.

The following problems are addressed in this section:

hadbm Command Fails: The agents could not be reached

Description

The command fails with the error:

The agents <url> could not be reached.

The hosts in the URL could be unreachable either because the hosts are down, because the communication pathway has not been established, because the port number in the URL is wrong, or because the management agents are down.

Solution

Verify that the URL is correct. If the URL is correct, verify that the hosts are up and running and are ready to accept communications; for example:

ping hostname1
ping hostname2
...
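The reachability check can be scripted when the URL names several hosts. The following is a minimal sketch, assuming the agent URL has the form host1,host2:port. The function names list_agent_hosts and ping_agent_hosts are hypothetical, and the ping options vary by platform.

```shell
#!/bin/sh
# Hypothetical helper: print each host from an agent URL, one per line.
# Assumes the URL looks like "host1,host2:port"; the port part is optional.
list_agent_hosts() {
    url=$1
    hosts=${url%%:*}          # drop ":port" if present
    printf '%s\n' "$hosts" | tr ',' '\n'
}

# Hypothetical helper: ping every host in the URL and report the result.
# "ping -c 1" is the Linux/BSD form; on Solaris use "ping host" instead.
ping_agent_hosts() {
    for host in $(list_agent_hosts "$1"); do
        if ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host: reachable"
        else
            echo "$host: NOT reachable"
        fi
    done
}
```

For example, ping_agent_hosts "hostname1,hostname2:1862" reports one line per host. A host that answers ping may still have a stopped management agent or a wrong port number in the URL, so check those separately.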

hadbm Command Fails: command not found

Description

The hadbm command can be run from the directory that contains it, or you can set the search PATH so that the HADB commands are accessible from anywhere, which is much more convenient. The error “hadbm: Command not found” indicates that neither of these conditions has been met.

Solution 1

cd to the directory that contains the hadbm command and run it from there:

cd install_dir/SUNWhadb/4/bin/
./hadbm

Solution 2

Use the full path to invoke the hadbm command:

install_dir/SUNWhadb/4/bin/hadbm

Solution 3

You can use the hadbm command from anywhere by setting the PATH variable. Instructions for setting the PATH variable are contained in the “Preparing for HADB Setup” chapter of the Sun Java System Application Server 8.1 Installation Guide.

To verify that the PATH settings are correct, run the following commands:

which asadmin
which hadbm

These commands should echo the paths to the utilities.
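The PATH setting and the verification step can be combined in a small script. This is a sketch for a Bourne-compatible shell. The function names add_to_path and check_on_path are hypothetical, and install_dir stands for your actual installation directory.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to PATH for the current shell.
add_to_path() {
    PATH="$1:$PATH"
    export PATH
}

# Hypothetical helper: report whether each named command is found on PATH.
# Returns non-zero if any command is missing.
check_on_path() {
    missing=0
    for cmd in "$@"; do
        if location=$(command -v "$cmd"); then
            echo "$cmd: $location"
        else
            echo "$cmd: not found on PATH"
            missing=1
        fi
    done
    return $missing
}
```

A typical use would be add_to_path install_dir/SUNWhadb/4/bin followed by check_on_path asadmin hadbm. The change lasts only for the current shell session unless you add it to your shell startup file.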

hadbm Command Fails: JAVA_HOME not defined

Description

The message “hadbm: <path>: Invalid Java home location” indicates that the JAVA_HOME environment variable has not been set properly.

Solution

If multiple Java versions are installed on the system, ensure that the JAVA_HOME environment variable points to the correct Java version (1.4.1_03 or above for Enterprise Edition).

Instructions for setting the PATH variable are contained in the “Preparing for HADB Setup” chapter of the Sun Java System Application Server 8.1 Installation Guide.
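A quick sanity check of the JAVA_HOME setting can be sketched as follows. The function name check_java_home is hypothetical. The check only confirms that an executable bin/java exists under the directory; it does not compare version numbers.

```shell
#!/bin/sh
# Hypothetical helper: check that a directory looks like a usable Java home,
# that is, it contains an executable bin/java.
check_java_home() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java"
        return 1
    fi
    echo "JAVA_HOME looks valid: $dir"
}
```

Running check_java_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version then prints the version of the Java installation that JAVA_HOME points to, which you can compare against the required minimum.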

hadbm createdomain fails, but two split domains are created

Description

If you run the HADB management agent on a host with multiple network interfaces, the createdomain command may fail when the interfaces are not all on the same subnet:

hadbm:Error 22020: The management agents could not establish a domain, 
please check that the hosts can communicate with UDP multicast.

By default, the management agents use the “first” interface for UDP multicasts (“first” as returned by java.net.NetworkInterface.getNetworkInterfaces()).

Solution

The best solution is to tell the management agent which subnet to use by setting ma.server.mainternal.interfaces in the configuration file; for example:

ma.server.mainternal.interfaces=10.11.100.0

Alternatively, one may configure the router between the subnets to route multicast packets. By default, the management agent uses multicast address 228.8.8.8.

create Fails: path does not exist on a host

Description

After issuing the hadbm create command, an error similar to the following appears on the console:

./hadbm create ...
...
hadbm: Error 22022: Path path does not exist on host host

This error message can also appear when new nodes are added and the specified paths do not exist on the machines.

Solution

Log in to the host and create the paths for the HADB devices and HADB history files. Then run hadbm create with the --devicepath and --historypath options set to the paths you created. Also make sure that the user running the management agent on the host has read and write access to these directories.
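Creating the directories and confirming access can be done in one step on each host. The following is a minimal sketch. The function name prepare_hadb_dir is hypothetical, and the directory paths you pass to it are your own choice.

```shell
#!/bin/sh
# Hypothetical helper: create a directory if needed and confirm that the
# current user can read and write it. Run it as the user that runs the
# management agent.
prepare_hadb_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    if [ -r "$dir" ] && [ -w "$dir" ]; then
        echo "$dir: ok"
    else
        echo "$dir: missing read or write access"
        return 1
    fi
}
```

Call the function once for the device directory and once for the history directory, and then pass the same two paths to the --devicepath and --historypath options of hadbm create.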


Note –

HADB executables cannot be installed on different paths on different hosts.


Database Does Not Start

The create or start command fails with the console error message:

hadbm: Error 22095: Database could not be started...

Consider the following possibilities:

Was there a shared memory get segment failure?

Description

The start may fail if, after a node is stopped, the resources (shared memory, disk space) allocated to that node are taken by other processes.

Solution

Refer to Problems Related to Shared Memory for suggestions on resolving this issue.

Do the History Files Contain Errors?

Description

If the problem persists, inspect the HADB history files. Some of the more likely error messages to look for are:

Solutions

After verifying that none of the above errors have occurred, try the following remedies, in order:

For more information, refer to the Error Message Reference.

clear Command Failed

The clear command reinitializes the database device files residing on disk. The command may fail because of problems with the disks or with disk access. Check whether any error message from hadbm indicates such a problem. If not, examine the ma.log files and check whether devinit has generated any error messages.
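Searching the log for devinit messages can be sketched as follows. The function name find_devinit_messages is hypothetical. Because the exact message format is not described here, the sketch simply matches the word devinit without regard to case.

```shell
#!/bin/sh
# Hypothetical helper: print lines that mention devinit from a log file.
# The message format is an unknown here, so this matches the word
# "devinit" case-insensitively and leaves interpretation to the reader.
find_devinit_messages() {
    grep -i 'devinit' "$1"
}
```

Pass the path of each ma.log file as the argument. No output means that the file contains no line mentioning devinit.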

create-session-store Failed

The asadmin create-session-store command could fail for one of these reasons:

Invalid user name or password

This error occurs when the --dbsystempassword supplied to the create-session-store command is not the same password as the one given at the time of database creation.

Solution 1

Try the command again with the correct password.

Solution 2

If you cannot remember the dbsystem password, you need to clear the database using hadbm clear and provide a new dbsystem password.

SQLException: No suitable driver

The create-session-store command produces the error: SessionStoreException: java.sql.SQLException: No suitable driver.

Solution 1

This error can occur when asadmin cannot find hadbjdbc4.jar under the AS_HADB path defined in the asenv.conf file in the Application Server config directory.

The solution is to change AS_HADB to point to the location of the HADB installation.

Here is a sample AS_HADB entry from an asenv.conf file:

AS_HADB=/export/home0/hercules/0815/SUNWhadb/4.4.0-8
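A check of the AS_HADB entry can be sketched as follows. The function names read_as_hadb and check_hadb_driver are hypothetical, and the sketch assumes that hadbjdbc4.jar resides in a lib subdirectory of the HADB installation; adjust the path if your installation differs.

```shell
#!/bin/sh
# Hypothetical helper: read the AS_HADB value from an asenv.conf file.
# Assumes a line of the form AS_HADB=/some/path, optionally quoted.
read_as_hadb() {
    sed -n 's/^AS_HADB=//p' "$1" | tr -d '"' | tail -n 1
}

# Hypothetical helper: look for hadbjdbc4.jar under the AS_HADB directory.
# Assumption: the JAR sits in a lib subdirectory of the HADB installation.
check_hadb_driver() {
    hadb_home=$(read_as_hadb "$1")
    if [ -f "$hadb_home/lib/hadbjdbc4.jar" ]; then
        echo "driver found: $hadb_home/lib/hadbjdbc4.jar"
    else
        echo "driver not found under: $hadb_home"
        return 1
    fi
}
```

Pass the full path of your asenv.conf file to check_hadb_driver. If the driver is reported as not found, correct the AS_HADB entry and run the create-session-store command again.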

Solution 2

This error can also occur if you provide an incorrect value for --storeUrl. To solve this problem, obtain the correct URL using hadbm get jdbcURL.

hadbm Command Hangs

If the management agent with which hadbm communicates dies before the operation finishes, the hadbm process may hang. Check whether all the agents are running.

Cannot Restart the HADB

Description

HADB restart does not work after a double node failure. Additional recovery actions are needed before HADB can be restarted.

Symptoms of a double node failure include:

This problem occurs when mirror HADB host machines fail or are rebooted (typically after a power outage), when a machine is rebooted without first stopping the HADB (in a single-machine installation), or when a pair of mirror machines from both Data Redundancy Units (DRUs) is rebooted.

HADB cannot heal itself automatically in such “double failure” situations because the part of the data that resided on the pair nodes is lost. In such cases, the hadbm start command does not succeed, and the hadbm status command shows that HADB is in a non-operational state.

For more information on DRUs and HADB configuration, see “Administering the High Availability Database” in the Administration Guide, and the Deployment Guide.


Tip –

If the HADB exhibits strange behavior (for example, consistent timeout problems), and you want to check whether a restart cures the problem, use the hadbm restart command.

When the HADB is restarted in this manner, HADB services remain available. Conversely, if HADB is started and stopped in separate operations using hadbm stop and hadbm start, HADB services are unavailable while HADB is stopped.


Solution

Verify that the node states show Starting/Recovering, then reset the database. Follow the instructions under “Recovering from Session Data Corruption” in the “Administering the High Availability Database” chapter of the Administration Guide.