This section describes known high-availability database (HADB) issues and associated solutions.
The management system checks resource availability when you create databases or add nodes. However, it does not check whether sufficient resources are available when you change device or main-memory buffer sizes by using hadbm set.
Verify that enough free disk or memory space is available on all hosts before increasing any of the devicesize or buffersize configuration attributes.
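The disk part of this check can be scripted. The following is a minimal sketch, not part of HADB: the function name has_enough_free_space and the 10% safety margin are assumptions made for illustration. It would have to be run on every host, and the requested size must first be converted to bytes.

```python
import shutil


def has_enough_free_space(path, required_bytes, safety_margin=0.10):
    """Return True if the filesystem holding `path` can fit `required_bytes`.

    `safety_margin` is a hypothetical extra fraction of the requested size
    that is kept free; it is not an HADB setting.
    """
    if required_bytes < 0:
        raise ValueError("required_bytes must be non-negative")
    free = shutil.disk_usage(path).free  # free bytes on that filesystem
    return free >= required_bytes * (1 + safety_margin)
```

Call it with the directory that holds the HADB devices and the additional space the new devicesize value would need.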
You cannot register a software package under the same name with different locations on different hosts. For example:
hadbm registerpackage test --packagepath=/var/install1 --hosts europa11
Package successfully registered.
hadbm registerpackage test --packagepath=/var/install2 --hosts europa12
hadbm:Error 22171: A software package has already been registered with the package name test.
HADB does not support heterogeneous paths across nodes in a database cluster. Make sure that the HADB server installation directory (--packagepath) is the same across all participating hosts.
If you run the management agent on a host with multiple network interfaces, the createdomain command might fail if the interfaces are not all on the same subnet:
hadbm:Error 22020: The management agents could not establish a domain, please check that the hosts can communicate with UDP multicast.
If not configured, the management agents use the "first" interface for UDP multicasts. "First" is defined by the result from java.net.NetworkInterface.getNetworkInterfaces().
The best solution is to tell the management agent which subnet to use by setting ma.server.mainternal.interfaces in the configuration file, for example, ma.server.mainternal.interfaces=10.11.100.0. Alternatively, you can configure the router between the subnets to route multicast packets (the management agent uses multicast address 228.8.8.8).
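To choose a subnet value, it can help to confirm which of a host's addresses actually fall inside the candidate subnet. The sketch below is only an illustration: the helper name addresses_on_subnet and the sample addresses are hypothetical, and the /24 prefix length is an assumption, because the example above gives only the network address.

```python
import ipaddress


def addresses_on_subnet(addresses, subnet):
    """Return the addresses (as given) that belong to `subnet`.

    `subnet` uses CIDR notation such as "10.11.100.0/24"; the prefix
    length is an assumption, not a value taken from the HADB example.
    """
    net = ipaddress.ip_network(subnet, strict=True)
    return [a for a in addresses if ipaddress.ip_address(a) in net]
```

For example, addresses_on_subnet(["10.11.100.17", "192.168.1.5"], "10.11.100.0/24") keeps only the first address, so only that interface could serve the chosen subnet.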
Before retrying with a new configuration of the management agents, you might have to clean up the management agent repository. Stop all agents in the domain, and delete all files and directories in the repository directory, identified by repository.dr.path in the management agent configuration file. This cleanup must be performed on all hosts before restarting the agents with a new configuration file.
After deleting an HADB instance, subsequent attempts to create new instances with the configure-ha-cluster command fail. The problem is that old directories are left from the original HADB instance in ha_install_dir/rep/* and ha_install_dir/config/hadb/instance_name.
Be sure to manually delete these directories after deleting an HADB instance.
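Before deleting anything, it may be useful to list what is left over. The following sketch only reports candidate paths and removes nothing. The function name leftover_hadb_paths is hypothetical, and ha_install_dir and instance_name stand for the placeholders used above.

```python
from pathlib import Path


def leftover_hadb_paths(ha_install_dir, instance_name):
    """List paths that may remain after an HADB instance is deleted.

    Covers the entries under ha_install_dir/rep and the directory
    ha_install_dir/config/hadb/instance_name. Nothing is deleted here.
    """
    root = Path(ha_install_dir)
    leftovers = []
    rep = root / "rep"
    if rep.is_dir():
        leftovers.extend(sorted(rep.iterdir()))  # everything matching rep/*
    instance_dir = root / "config" / "hadb" / instance_name
    if instance_dir.exists():
        leftovers.append(instance_dir)
    return leftovers
```

Review the returned list and delete the paths manually once you have confirmed that they belong to the removed instance.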
A bug in the 64-bit version of Red Hat Enterprise Linux 3.0 forces the clu_trans_srv process into an uninterruptible mode when performing asynchronous I/O. This means that the kill -9 command does not work and the operating system must be rebooted.
Use a 32-bit version of Red Hat Enterprise Linux 3.0.
Capital letters in passwords are converted to lowercase when the password is stored in HADB.
Do not use passwords containing capital letters.
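A script that sets passwords could reject affected values up front. This is a minimal sketch with a hypothetical helper name, has_uppercase; it checks only for capital letters and is not a general password policy.

```python
def has_uppercase(password):
    """Return True if the password contains any capital letter.

    Such passwords are affected by the lowercase conversion described
    above, so they should be rejected before being set.
    """
    return any(ch.isupper() for ch in password)
```

For instance, has_uppercase("Secret1") is True, while has_uppercase("secret1") is False.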
Sometimes a resource contention problem on the server causes a management client to become disconnected. When the client reconnects, the following misleading error message might be returned: "hadbm:Error 22184: A password is required to connect to the management agent".
Check whether a resource problem has occurred on the server, take proper action (for example, add more resources), and retry the operation.
Special-use interfaces with IP addresses like 0.0.0.0 should not be registered in the Management Agent as valid interfaces for HADB nodes. Registering such interfaces can cause problems if a user sets up HADB nodes on them by issuing a hadbm create command that uses host names instead of IP addresses. The nodes are then unable to communicate, which causes the create command to hang.
When using hadbm create on hosts with multiple interfaces, always specify the IP addresses explicitly in dotted decimal notation (DDN).
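A host list can be checked before it is passed to hadbm create. The sketch below is an illustration under stated assumptions: the helper name is_explicit_ipv4 is hypothetical. It accepts only dotted decimal IPv4 addresses and rejects host names and the unspecified address 0.0.0.0.

```python
import ipaddress


def is_explicit_ipv4(host):
    """Return True only for a dotted decimal IPv4 address other than 0.0.0.0."""
    try:
        addr = ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        return False  # a host name or malformed address
    return not addr.is_unspecified
```

With this check, "10.11.100.17" passes, whereas "europa11" and "0.0.0.0" are rejected.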
On the Windows platform, with certain configurations and loads, a large number of reassembly failures might occur in the operating system. The problem has been seen with configurations of more than 20 nodes when running several table scans (select *) in parallel. Possible symptoms include transactions aborting frequently, repair or recovery taking a long time to complete, and frequent timeouts occurring in various parts of the system.
To fix the problem, the Windows registry variable HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters can be set to a value higher than the default 100. For best results, increase this value to 0x1000 (4096). For more information, see article 811003 from the Microsoft support pages.
When a machine is overloaded, the masking mechanism fails and some characters from the typed password can be exposed. This exposure poses a minor security risk. The password should always be masked.
Put the passwords in their own password files (the method recommended since Application Server 8.1) and refer to these files with the --adminpasswordfile or --dbpasswordfile option.