This section lists additional information about HADB deployment and upgrading.
Store device, log, and history files on local disks only; do not use remote-mounted file systems.
If more than one node is placed on a host, keep the devices belonging to each node on different disks; otherwise, disk contention reduces performance. Symptoms of this problem appear in the history files as messages such as “BEWARE - last flush/fputs took too long.” When a single node has more than one data device file, use a separate disk for each device file.
Install the HADB binaries on local disks on the HADB hosts, preferably on a different disk from the one used for data devices. NFS delays or disk contention may cause node restarts with the warning “Process blocked for nnn, max block time is nnn” in the history files.
Do not place the HADB devices, history files, management agent directories, or agent configuration files in the HADB package path. Doing so causes problems when upgrading to a newer version and deleting the old package path.
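To spot the two warnings quoted above, a small shell helper can count their occurrences in the history files. This is only a sketch: the function name count_contention_warnings is hypothetical, the history file paths must be supplied by the caller, and the match strings are just the message prefixes quoted in these notes.

```shell
# Count contention-related warnings in one or more HADB history files.
# The patterns are the message prefixes quoted in the notes above.
count_contention_warnings() {
    for f in "$@"; do
        # Skip paths that cannot be read instead of aborting.
        [ -r "$f" ] || continue
        # grep -c prints the number of matching lines; it exits non-zero
        # when there are no matches, hence the "|| true".
        flush=$(grep -cF 'BEWARE - last flush/fputs took too long' "$f" || true)
        blocked=$(grep -cF 'Process blocked for' "$f" || true)
        printf '%s: slow_flush=%s process_blocked=%s\n' "$f" "$flush" "$blocked"
    done
}
```

Call it with the history files of the nodes on a host as arguments. Non-zero counts suggest that devices or binaries share a disk with a contended workload.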
This release of HADB is officially supported for a maximum of 28 nodes: 24 active data nodes with 4 spares.
We recommend using the same version for the JDBC driver and the HADB server.
HADB supports IPv4 only; IPv6 is not supported.
The command line length on Windows is restricted to 2048 bytes.
The network must be configured for UDP multicast.
Due to excessive swapping observed in RedHat Enterprise Linux 3.0, updates 1 through 3, we do not recommend it as a deployment platform. The problem is fixed in RedHat Enterprise Linux 3.0 update 4.
Possibility of running NSUP with real time priority:
The node supervisor (NSUP) processes (clu_nsup_srv) ensure the high availability of HADB by exchanging “heartbeat” messages in a timely manner. The timing is affected when an NSUP is colocated with other processes that cause resource starvation. The consequence is false network partitioning and node restarts (preceded by the warning “Process blocked for n seconds” in the history files), resulting in aborted transactions and other exceptions.
To solve this problem, clu_nsup_srv (found in installpath/lib/server) must have the suid bit set and the file must be owned by root. This is achieved manually by the commands:
# chown root clu_nsup_srv
# chmod u+s clu_nsup_srv
This causes the clu_nsup_srv process to run as the user root when started, which in turn allows the process to give itself real-time priority automatically after startup. To avoid any security impact from using setuid, the real-time priority is set at the very beginning, and the process drops root privileges and falls back to the invoking user's uid once the priority has been changed. Other HADB processes lower their priority to timeshare priority.
If NSUP cannot set the real-time priority, it writes the warning “Could not set realtime priority” to the ma.log file (on UNIX, errno is set to EPERM) and continues without real-time priority.
There are cases where it is not possible to set real-time priorities; for example:
When installed in Solaris 10 non-global zones
When PRIV_PROC_LOCK_MEMORY (Allow a process to lock pages in physical memory) and/or PRIV_PROC_PRIOCNTL privileges are revoked in Solaris 10
When users turn off the setuid permission
When users install the software as tar files (the non-root install option for the Application Server)
The clu_nsup_srv process does not consume much CPU and has a small footprint, so running it with real-time priority does not impact performance.
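Before relying on real-time priority, it can help to confirm that the ownership and setuid preconditions described above actually hold. The following is a minimal sketch, assuming a POSIX shell; the function name check_nsup_suid is hypothetical, and the path to clu_nsup_srv must be supplied by the caller.

```shell
# Report whether a file is owned by root (uid 0) and has the setuid bit set.
# Returns 0 when both hold, 1 otherwise.
check_nsup_suid() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # The third field of `ls -ln` output is the numeric owner uid.
    owner=$(ls -ldn "$file" | awk '{print $3}')
    if [ "$owner" != "0" ]; then
        echo "not owned by root: $file"
        return 1
    fi
    # The -u test is true when the setuid bit is set.
    if [ ! -u "$file" ]; then
        echo "setuid bit not set: $file"
        return 1
    fi
    echo "ok: $file"
}
```

Run it against installpath/lib/server/clu_nsup_srv on each host. A successful check does not guarantee real-time priority, because the zone and privilege restrictions listed above can still prevent it.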
Configuring IP network multipathing for HADB for Solaris (tested on Solaris 9 only):
Sun recommends that Solaris hosts running HADB be set up with network multipathing in order to ensure the highest possible network availability. Network multipathing setup is covered in detail in the IP Network Multipathing Administration Guide. If you decide to use multipathing with HADB, refer to the Administering Network Multipathing section of the IP Network Multipathing Administration Guide in order to set up multipathing before you proceed with adapting the multipathing setup for HADB as described below. The IP Network Multipathing Administration Guide is part of the Solaris 9 System Administrator Collection, and can be downloaded from http://docs.sun.com.
Set network interface failure detection time
For HADB to properly support multipathing failover, the network interface failure detection time must not exceed 1000 milliseconds as specified by the FAILURE_DETECTION_TIME parameter in /etc/default/mpathd. Edit the file and change the value of this parameter to 1000 if the original value is higher:
FAILURE_DETECTION_TIME=1000
In order for the change to take effect, issue the following command:
pkill -HUP in.mpathd
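The setting can be verified with a short shell check before creating the database. This is a sketch under the assumption that the parameter appears as a plain KEY=value line in the file; the function name check_failure_detection_time is hypothetical.

```shell
# Print the FAILURE_DETECTION_TIME value from an mpathd-style config file and
# return 0 only if it is a number no greater than 1000 (milliseconds).
check_failure_detection_time() {
    conf=${1:-/etc/default/mpathd}
    # Use the last uncommented assignment, in case the key appears twice.
    value=$(sed -n 's/^[[:space:]]*FAILURE_DETECTION_TIME=\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    if [ -z "$value" ]; then
        echo "FAILURE_DETECTION_TIME not found in $conf"
        return 1
    fi
    echo "FAILURE_DETECTION_TIME=$value"
    [ "$value" -le 1000 ]
}
```

With no argument the function reads /etc/default/mpathd. A non-zero return status means the value is missing or higher than 1000 and the file still needs editing.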
IP addresses to use with HADB
As described in the Solaris IP Network Multipathing Administration Guide, multipathing involves grouping physical network interfaces into multipath interface groups. Each physical interface in such a group has two IP addresses associated with it: a physical interface address and a test address. Only the physical interface address can be used for transmitting data, while the test address is for Solaris internal use only. When hadbm create --hosts is run, each host should be specified with only one physical interface address from the multipath group.
Example
Assume that Host 1 and Host 2 have two physical network interfaces each. On each host, these two interfaces are set up as a multipath group, and running ifconfig -a yields the following:
Host 1:
bge0: flags=1000843<mtu 1500 index 5
        inet 129.159.115.10 netmask ffffff00 broadcast 129.159.115.255
        groupname mp0
bge0:1: flags=9040843<mtu 1500 index 5
        inet 129.159.115.11 netmask ffffff00 broadcast 129.159.115.255
bge1: flags=1000843<mtu 1500 index 6
        inet 129.159.115.12 netmask ffffff00 broadcast 129.159.115.255
        groupname mp0
bge1:1: flags=9040843<mtu 1500 index 6
        inet 129.159.115.13 netmask ff000000 broadcast 129.159.115.255
Host 2:
bge0: flags=1000843<mtu 1500 index 3
        inet 129.159.115.20 netmask ffffff00 broadcast 129.159.115.255
        groupname mp0
bge0:1: flags=9040843<mtu 1500 index 3
        inet 129.159.115.21 netmask ff000000 broadcast 129.159.115.255
bge1: flags=1000843<mtu 1500 index 4
        inet 129.159.115.22 netmask ffffff00 broadcast 129.159.115.255
        groupname mp0
bge1:1: flags=9040843<mtu 1500 index 4
        inet 129.159.115.23 netmask ff000000 broadcast 129.159.115.255
Here, the physical network interfaces on both hosts are the ones listed as bge0 and bge1. The ones listed as bge0:1 and bge1:1 are multipath test interfaces (they are thus marked as DEPRECATED in the ifconfig output), as described in the IP Network Multipathing Administration Guide.
To set up HADB in this environment, select one physical interface address from each host. In this example, we choose 129.159.115.10 from host 1 and 129.159.115.20 from host 2. To create a database with one database node per host, use the following argument to hadbm create:
--hosts 129.159.115.10,129.159.115.20
To create a database with two database nodes on each host, use the following argument:
--hosts 129.159.115.10,129.159.115.20,129.159.115.10,129.159.115.20
In both cases, the ma.server.mainternal.interfaces variable on both hosts should be set to 129.159.115.0/24.
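The host list for hadbm create follows a simple pattern: the sequence of chosen addresses is repeated once per database node on each host. The sketch below generates such a list; build_hosts_arg is a hypothetical helper name, not part of HADB.

```shell
# Build a comma-separated host list with the given number of database nodes
# per host, repeating the whole host sequence once per node.
build_hosts_arg() {
    nodes_per_host=$1
    shift
    list=""
    i=0
    while [ "$i" -lt "$nodes_per_host" ]; do
        for h in "$@"; do
            # Prefix a comma only when the list is already non-empty.
            list="${list:+$list,}$h"
        done
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}
```

For example, build_hosts_arg 2 129.159.115.10 129.159.115.20 prints 129.159.115.10,129.159.115.20,129.159.115.10,129.159.115.20, which matches the two-nodes-per-host argument above.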
It is not possible to upgrade online from 4.2 or 4.3 to 4.4. However, 4.4 supports online upgrade to future versions. To upgrade from 4.4.1 to 4.4.2, apply the following steps:
Install 4.4.2 on all HADB hosts, on a path other than that of 4.4.1 (for instance, /opt/SUNWhadb/4.4.2-6).
Install the new version on the hadbm client hosts.
Stop all management agents running on the HADB hosts.
Start the management agent processes using the new version's software, but with the old configuration files. In the remaining steps, please use the hadbm command found in the new version's bin directory.
Register the package in the management domain (the default package name here would be V4.4, so another package name may be required to avoid conflicts with an existing package of the same name):
hadbm registerpackage --packagepath=/opt/SUNWhadb/4.4.2-6 V4.4.2
Restart the database with the new version (the following command does a rolling restart of the nodes):
hadbm set packagename=V4.4.2 database_name
Check that the database status is “running” (using the command hadbm status) and that the database functions normally, serving client transactions.
If everything works, the old installation can be removed later:
Before unregistering the old package, remove all references to it from the ma repository. Otherwise, hadbm unregisterpackage fails with “package in use.” A dummy reconfiguration operation (for instance, hadbm set connectiontrace=<same_as_previous_value>) removes all references to the old package. Then unregister the old package:
hadbm unregisterpackage [--hosts=<host_list>] <old_package_name>
Remove the old installation from the file system, as described in the HADB installation instructions.
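The hadbm commands from the steps above can be collected into a dry-run helper that only prints them, so the sequence can be reviewed before anything is executed. This is a sketch rather than a supported procedure: print_upgrade_commands is a hypothetical name, passing the database name to hadbm status is an assumption made by analogy with hadbm set, and the installation and management agent restart steps are not covered.

```shell
# Print (do not execute) the hadbm commands for the package switch.
# Arguments: new package path, new package name, database name, old package name.
print_upgrade_commands() {
    new_path=$1
    new_pkg=$2
    db=$3
    old_pkg=$4
    echo "hadbm registerpackage --packagepath=$new_path $new_pkg"
    # Rolling restart of the nodes onto the new package.
    echo "hadbm set packagename=$new_pkg $db"
    # Assumption: the database name is passed as the operand to status.
    echo "hadbm status $db"
    echo "# before unregistering: hadbm set connectiontrace=<same_as_previous_value>"
    echo "hadbm unregisterpackage $old_pkg"
}
```

For instance, print_upgrade_commands /opt/SUNWhadb/4.4.2-6 V4.4.2 database_name V4.4.1 prints the sequence for the example in this section, where V4.4.1 stands in for whatever name the old package was registered under.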