Sun GlassFish Enterprise Server 2.1 High Availability Administration Guide

Preparing for HADB Setup

This section discusses the following topics:

After performing these tasks, see Chapter 3, Administering High Availability Database.

For the latest information on HADB, see Sun GlassFish Enterprise Server 2.1 Release Notes.

Prerequisites and Restrictions

Before setting up and configuring HADB, make sure your network and hardware environment meets the requirements described in the Sun GlassFish Enterprise Server 2.1 Release Notes. Additionally, there are restrictions with certain file systems; for example, with Veritas. For more information, see the Release Notes.

HADB uses Intimate Shared Memory (SHM_SHARE_MMU flag) when it creates and attaches to its shared memory segments. The use of this flag essentially locks the shared memory segments into physical memory and prevents them from being paged out. Therefore, HADB's shared memory is locked into physical memory, which can easily impact installations on low-end machines. Ensure you have the recommended amount of memory when co-locating Application Server and HADB.

Configuring Network Redundancy

Configuring a redundant network will enable HADB to remain available, even if there is a single network failure. You can configure a redundant network in two ways:

On Solaris 9, by setting up network multipathing.
On all platforms except Windows Server 2003, by configuring a double network.

Setting Up Network Multipathing

Before setting up network multipathing, refer to the Administering Network Multipathing section in the IP Network Multipathing Administration Guide on docs.sun.com.

To Configure HADB Host Machines that Already Use IP Multipathing

Set network interface failure detection time.

For HADB to properly support multipathing failover, the network interface failure detection time must not exceed one second (1000 milliseconds), as specified by the FAILURE_DETECTION_TIME parameter in /etc/default/mpathd. Edit the file and change the value of this parameter to 1000 if the original value is higher:
FAILURE_DETECTION_TIME=1000
To put the change into effect, use this command:
pkill -HUP in.mpathd

Set up IP addresses to use with HADB.

As described in the IP Network Multipathing Administration Guide, multipathing involves grouping physical network interfaces into multipath interface groups. Each physical interface in such a group has two IP addresses associated with it:
- a physical interface address used for transmitting data.
- a test address for Solaris internal use only.
Specify only one physical interface address from the multipath group when you use hadbm create --hosts.

Example 2–1 Setting up Multipathing

Suppose there are two host machines named host1 and host2. If they each have two physical network interfaces, then set up the two interfaces as a multipath group. Run ifconfig -a on each host.

The output on host1 is:

bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4>
mtu 1500 index 5 inet 129.159.115.10 netmask ffffff00 broadcast 129.159.115.255 
groupname mp0

bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER>
mtu 1500 index 5 inet 129.159.115.11 netmask ffffff00 broadcast 129.159.115.255

bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> 
mtu 1500 index 6 inet 129.159.115.12 netmask ffffff00 broadcast 129.159.115.255 
groupname mp0

bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> 
mtu 1500 index 6 inet 129.159.115.13 netmask ff000000 broadcast 129.159.115.255

The output on host2 is:

bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> 
mtu 1500 index 3 inet 129.159.115.20 netmask ffffff00 broadcast 129.159.115.255 
groupname mp0

bge0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> 
mtu 1500 index 3 inet 129.159.115.21 netmask ff000000 broadcast 129.159.115.255

bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> 
mtu 1500 index 4 inet 129.159.115.22 netmask ffffff00 broadcast 129.159.115.255 
groupname mp0

bge1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> 
mtu 1500 index 4 inet 129.159.115.23 netmask ff000000 broadcast 129.159.115.255

In this example, the physical network interfaces on both hosts are listed after bge0 and bge1. Those listed after bge0:1 and bge1:1 are multipath test interfaces (marked DEPRECATED in the ifconfig output), as described in the IP Network Multipathing Administration Guide.

To set up HADB in this environment, select one physical interface address from each host. In this example, HADB uses IP address 129.159.115.10 from host1 and 129.159.115.20 from host2. To create a database with one database node per host, use the command hadbm create --hosts. For example

hadbm create --hosts 129.159.115.10,129.159.115.20

To create a database with two database nodes on each host, use the command:

hadbm create --hosts 129.159.115.10,129.159.115.20,
129.159.115.10,129.159.115.20

In both cases, you must configure the agents on host1 and host2 with separate parameters to specify which interface on the machines the agents should use. So, on host1 use:

ma.server.mainternal.interfaces=129.159.115.10

And on host2 use:

ma.server.mainternal.interfaces=129.159.115.20

For information on the ma.server.mainternal.interfaces variable, see Configuration File.

Configuring Double Networks

To enable HADB to tolerate single network failures, use IP multipathing if the operating system (for example, Solaris) supports it. Do not configure HADB with double networks on Windows Server 2003—the operating system does not work properly with double networks.

If your operating system is not configured for IP multipathing, and HADB hosts are equipped with two NICs, you can configure HADB to use double networks. For every host, the IP addresses of each of the network interface card (NIC) must be on separate IP subnets.

Within a database, all nodes must be connected to a single network, or all nodes must be connected to two networks.

Note –

Routers between the subnets must be configured to forward UDP multicast messages between subnets.

When creating an HADB database, use the –hosts option to specify two IP addresses or host names for each node: one for each NIC IP address. For each node, the first IP address is on net-0 and the second on net-1. The syntax is as follows, with host names for the same node separated by a plus sign (+):

--hosts=node0net0name+node0net1name
,node1net0name+node1net1name
,node2net0name+node2net1name
, ...

For example, the following argument creates two nodes, each with two network interfaces. The following host option is used to create these nodes:

--hosts 10.10.116.61+10.10.124.61,10.10.116.62+10.10.124.62

Thus, the network addresses

For node0 are 10.10.116.61 and 10.10.124.61
For node1 are 10.10.116.62 and 10.10.124.62

Notice that 10.10.116.61 and 10.10.116.62 are on the same subnet, and 10.10.124.61 and 10.10.124.62 are on the same subnet.

In this example, the management agents must use the same subnet. Thus, the configuration variable ma.server.mainternal.interfaces must be set to, for example, 10.10.116.0/24. This setting can be used on both agents in this example.

Configuring Shared Memory and Semaphores

You must configure shared memory and semaphores before installing HADB. The procedure depends on your operating system.

If you run other applications than HADB on the hosts, calculate these applications' use of shared memory and semaphores, and add them to the values required by HADB. The values recommended in this section are sufficient for running up to six HADB nodes on each host. You need only increase the values if you either run more than six HADB nodes, or the hosts are running applications that require additional shared memory and semaphores.

If the number of semaphores is too low, HADB can fail and display this error message: No space left on device. This can occur either while starting the database, or during run time.

To configure shared memory and semaphores on Solaris

Since the semaphores are a global operating system resource, the configuration depends on all processes running on the host, and not HADB alone. On Solaris, configure the semaphore settings by editing the /etc/system file.

Configure shared memory.
- Set shminfo_shmmax, which specifies the maximum size of a single shared memory segment on the host. Set this value to the total amount of RAM installed on the HADB host machine, expressed as a hexadecimal value, but no more than 2 GB.
  
  For example, for 2 GB of RAM, set the value as follows in the /etc/system file:
  set shmsys:shminfo_shmmax=0x80000000
  Note –
  To determine a host machine’s memory, use this command:
  prtconf | grep Memory
- On Solaris 8 or earlier, set shminfo_shmseg, the maximum number of shared memory segments to which one process can attach. Set the value to six times the number of nodes per host. For up to six nodes per host, add the following to the /etc/system file:
  set shmsys:shminfo_shmseg=36
  On Solaris 9 and later, shmsys:shminfo_shmseg is obsolete.
- Set shminfo_shmmni, the maximum number of shared memory segments in entire system. Since each HADB node allocates six shared memory segments, the value required by HADB must be at least six times the number of nodes per host. On Solaris 9, for up to six nodes per host, there is no need to change the default value.

Configure semaphores.

Check the /etc/system file for the following semaphore configuration entries, for example:
set semsys:seminfo_semmni=10 set semsys:seminfo_semmns=60 set semsys:seminfo_semmnu=30
If the entries are present, increment the values as indicated below.

If the /etc/system file does not these entries, add them at the end of the file:
- Set seminfo_semmni, the maximum number of semaphore identifiers. Each HADB node needs one semaphore identifier. On Solaris 9, for up to six nodes per host, there is no need to change the default value. For example:
  set semsys:seminfo_semmni=10
- Set seminfo_semmns, the maximum number of semaphores in the entire system. Each HADB node needs eight semaphores. On Solaris 9, or up to six nodes per host, there is no need to change the default value. For example:
  set semsys:seminfo_semmns=60
- Set seminfo_semmnu, the maximum number of undo structures in the system. One undo structure is needed for each connection (configuration variable NumberOfSessions, default value 100). For up to six nodes per host, set it to 600:
  set semsys:seminfo_semmnu=600

Reboot the machine.

To configure shared memory on Linux

On Linux, you must configure shared memory settings. You do not need to adjust the default semaphore settings.

Edit the file /etc/sysctl.conf.

With Redhat Linux, you can also modify sysctl.conf to set the kernel parameters.

Set the values of kernel.shmax and kernel.shmall, as follows:
echo MemSize > /proc/sys/shmmax echo MemSize > /proc/sys/shmall
where MemSize is the number of bytes.

The kernel.shmax parameter defines the maximum size in bytes for a shared memory segment. The kernel.shmall parameter sets the total amount of shared memory in pages that can be used at one time on the system. Set the value of both of these parameters to the amount physical memory on the machine. Specify the value as a decimal number of bytes.

For example, to set both values to 2GB, use the following:
echo 2147483648 > /proc/sys/kernel/shmmax echo 2147483648 > /proc/sys/kernel/shmall

Reboot the machine using this command:

sync; sync; reboot

Procedure for Windows

Windows does not require any special system settings. However, if you want to use an existing J2SE installation, set the JAVA_HOME environment variable to the location where the J2SE is installed.

Synchronizing System Clocks

You must synchronize clocks on HADB hosts, because HADB uses time stamps based on the system clock. HADB uses the system clock to manage timeouts and to time stamp events logged to history files. For troubleshooting, you must analyze all the history files together, since HADB is a distributed system. So, it is important that all the hosts’ clocks be synchronized

Do not adjust system clocks on a running HADB system. Doing so can cause problems in the operating system or other software components that can in turn cause problems such as hangs or restarts of HADB nodes. Adjusting the clock backward can cause some HADB server processes to hang as the clock is adjusted.

To synchronize clocks:

On Solaris, use xntpd (network time protocol daemon).
On Linux, use ntpd.
On Windows, use NTPTime on Windows

If HADB detects a clock adjustment of more than one second, it logs it to the node history file, for example:

NSUP INF 2003-08-26 17:46:47.975 Clock adjusted.
 Leap is +195.075046 seconds.