The need of Java EE applications for session persistence was previously described in Session Persistence. The Application Server uses the high-availability database (HADB) as a highly available session store. HADB is included with the Sun GlassFish Enterprise Server with HADB, but it can be deployed on hosts separate from the Application Server. HADB provides a highly available data store for HTTP session and stateful session bean data.
The advantages of this decoupled architecture include:
Server instances in a highly available cluster are loosely coupled and act as high performance Java EE containers.
Stopping and starting server instances does not affect other servers or their availability.
HADB can run on a different set of less expensive machines (for example, machines with single or dual processors). Several clusters can share these machines. Depending upon your deployment needs, you can run HADB on the same machines as the Application Server (co-located) or on different machines (separate tier). For more information on the two options, see Co-located Topology.
As state management requirements change, you can add resources to the HADB system without affecting existing clusters or their applications.
HADB is optimized for use by Application Server and is not meant to be used by applications as a general purpose database.
For HADB hardware and network system requirements, see Hardware and Software Requirements in Sun GlassFish Enterprise Server v2.1.1 Release Notes.
The system requirements for HADB hosts are:
At least one CPU per HADB node.
At least 512 MB of memory per node.
For additional requirements for very high availability, see Mitigating Double Failures.
HADB is a distributed system consisting of pairs of nodes. Nodes are divided into two data redundancy units (DRUs), with a node from each pair in each DRU, as illustrated in Data Redundancy Units.
Each node consists of:
A set of processes for transactional state replication.
A dedicated area of shared memory used for communication among the processes.
One or more secondary storage devices (disks).
A set of HADB nodes can host one or more session databases. Each session database is associated with a distinct application server cluster. Deleting a cluster also deletes the associated session database.
For HADB hardware requirements, see Hardware and Software Requirements in Sun GlassFish Enterprise Server v2.1.1 Release Notes.
There are two types of HADB nodes:
Active nodes that store the session data and service requests.
Spare nodes that initially contain no data, but take over as active nodes if an active node becomes unavailable. Spare nodes are optional, but they are useful for achieving higher availability.
Each node has a parent process and several child processes. The parent process, called the node supervisor (NSUP), is started by the management agent. It is responsible for creating the child processes and keeping them running.
The child processes are:
Transaction server process (TRANS), which coordinates transactions on distributed nodes and manages data storage.
Relational algebra server process (RELALG), which coordinates and executes complex relational algebra queries such as sorts and joins.
SQL shared memory server process (SQLSHM), which maintains the SQL dictionary cache.
SQL server process (SQLC), which receives client queries, compiles them into local HADB instructions, sends the instructions to TRANS, receives the results, and conveys them to the client. Each node has one main SQL server and one sub-server for each client connection.
Node manager server process (NOMAN), which management agents use to execute management commands issued by the hadbm management client.
As previously described, an HADB instance contains a pair of DRUs. Each DRU has the same number of active and spare nodes as the other DRU in the pair. Each active node in a DRU has a mirror node in the other DRU. Due to mirroring, each DRU contains a complete copy of the database.
The following figure shows an example HADB architecture with six nodes: four active nodes and two spare nodes. Nodes 0 and 1 are a mirror pair, as are nodes 2 and 3. In this example, each host has one node. In general, a host can have more than one node if it has sufficient system resources (see System Requirements).
You must add machines that host HADB nodes in pairs, with one machine in each DRU.
HADB achieves high availability by replicating data and services. The data replicas on mirror nodes are designated as primary replicas and hot standby replicas. The primary replica performs operations such as inserts, deletes, updates, and reads. The hot standby replica receives log records of the primary replica’s operations and redoes them within the transaction’s lifetime. Read operations are performed only by the primary replica and are therefore not logged. Each node contains both primary and hot standby replicas and plays both roles.
The database is fragmented and distributed over the active nodes in a DRU. Each node in a mirror pair contains the same set of data fragments. Duplicating data on a mirror node is known as replication. Replication enables HADB to provide high availability: when a node fails, its mirror node takes over almost immediately (within seconds). Replication masks node failures or DRU failures without loss of data or services.
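The general idea of fragmenting the database across mirror pairs can be sketched as follows. This is an illustrative model only, assuming simple hash partitioning; HADB’s actual partitioning scheme is internal and may differ. The function name locate and the node numbering are hypothetical.

```python
import zlib

# Mirror pairs, one node from each DRU, as in the six-node example:
# nodes 0 and 1 mirror each other, as do nodes 2 and 3.
ACTIVE_PAIRS = [(0, 1), (2, 3)]  # (node in DRU 0, node in DRU 1)

def locate(session_id: str) -> dict:
    """Map a session ID to its data fragment and the mirror pair
    holding its two replicas (a deterministic hash keeps lookups stable)."""
    fragment = zlib.crc32(session_id.encode()) % len(ACTIVE_PAIRS)
    dru0_node, dru1_node = ACTIVE_PAIRS[fragment]
    return {"fragment": fragment, "replicas": (dru0_node, dru1_node)}
```

Because each fragment is stored on one node in each DRU, each DRU ends up with a complete copy of the database, which is what allows an entire DRU to fail without loss of data.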
When a mirror node takes over the functions of a failed node, it has to perform double work: its own and that of the failed node. If the mirror node does not have sufficient resources, the overload will reduce its performance and increase its failure probability. When a node fails, HADB attempts to restart it. If the failed node does not restart (for example, due to hardware failure), the system continues to operate but with reduced availability.
HADB tolerates failure of a node, an entire DRU, or multiple nodes, but not a “double failure” when both a node and its mirror fail. For information on how to reduce the likelihood of a double failure, see Mitigating Double Failures
When a node fails, its mirror node takes over for it. If no spare node is available, the surviving node then operates without a mirror, so a subsequent failure of that node would be a double failure. When a spare node is available, it automatically takes the failed node’s place as the new mirror. Having a spare node therefore reduces the time the system functions without a mirror node.
A spare node does not normally contain data, but it constantly monitors for failure of active nodes in the DRU. When a node fails and does not recover within a specified timeout period, the spare node copies data from the mirror node and synchronizes with it. The time this takes depends on the amount of data copied and on system and network capacity. After synchronizing, the spare node automatically takes over for the failed node without manual intervention, relieving the mirror node of its double load and rebalancing load across the mirror pair. This is known as failback or self-healing.
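The failover and self-healing sequence described above can be simulated in a few lines. This is a simplified sketch of the sequence of events, not HADB code; the function and variable names are illustrative.

```python
def handle_failure(failed, mirror_of, spares, log):
    """Simulate HADB failover: the mirror takes over immediately; if a
    spare exists, it copies the data and replaces the failed node,
    restoring the mirror pair (failback / self-healing)."""
    mirror = mirror_of[failed]
    log.append(f"node {failed} down; mirror {mirror} takes over (double load)")
    if spares:
        spare = spares.pop(0)
        log.append(f"spare {spare} copies data from node {mirror}")
        # The spare becomes the new mirror partner of the surviving node.
        mirror_of[mirror] = spare
        mirror_of[spare] = mirror
        del mirror_of[failed]
        log.append(f"spare {spare} replaces node {failed}; load rebalanced")
    else:
        log.append(f"node {mirror} continues without a mirror")
```

For example, with mirror pair (0, 1) and spare node 4, a failure of node 0 leaves node 1 paired with node 4; when the repaired host rejoins, its nodes become the new spares.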
When a failed host is repaired (by replacing the hardware or upgrading the software) and restarted, the node or nodes running on it join the system as spare nodes, because the original spare nodes are now active.
Spare nodes are not required, but they enable a system to maintain its overall level of service even if a machine fails. Spare nodes also make it easy to perform planned maintenance on machines hosting active nodes. Allocate one machine for each DRU to act as a spare machine, so that if one of the machines fails, the HADB system continues without adversely affecting performance and availability.
As a general rule, have a spare machine with enough Application Server instances and HADB nodes to replace any machine that becomes unavailable.
The following examples illustrate using spare nodes in HADB deployments. There are two possible deployment topologies: co-located, in which HADB and Application Servers reside on the same hosts, and separate tier, in which they reside on separate hosts. For more information on deployment topologies, see Chapter 3, Selecting a Topology.
As an example of a spare node configuration, suppose you have a co-located topology with four Sun Fire V480 servers, where each server has one Application Server instance and two HADB data nodes.
For spare nodes, allocate two more servers (one machine per DRU). Each spare machine runs one application server instance and two spare HADB nodes.
Suppose you have a separate-tier topology where the HADB tier has two Sun Fire 280R servers, each running two HADB data nodes. To maintain this system at full capacity, even if one machine becomes unavailable, configure one spare machine for the Application Server instances tier and one spare machine for the HADB tier.
The spare machine for the Application Server instances tier must have as many instances as the other machines in the Application Server instances tier. Similarly, the spare machine for the HADB tier must have as many HADB nodes as the other machines in the HADB tier.
HADB’s built-in data replication enables it to tolerate the failure of a single node or an entire DRU. By default, however, HADB does not survive a double failure, when a mirror node pair or both DRUs fail. In such cases, HADB becomes unavailable.
In addition to using spare nodes as described in the previous section, you can minimize the likelihood of a double failure by taking the following steps:
Providing independent power supplies: For optimum fault tolerance, the servers that support one DRU must have independent power (through uninterruptible power supplies), processing units, and storage. If a power failure occurs in one DRU, the nodes in the other DRU continue servicing requests until the power returns.
Providing double interconnections: To tolerate single network failures, replicate the lines and switches between DRUs.
These steps are optional, but will increase the overall availability of the HADB instance.
The HADB management system provides built-in security and facilitates multi-platform management. As illustrated in the following figure, the HADB management architecture contains the following components:
Management client
Management agent
Management domain
Repository
As shown in the figure, one HADB management agent runs on every machine that runs the HADB service. Each machine typically hosts one or more HADB nodes. An HADB management domain contains many machines, similar to an Application Server domain. At least two machines are required in a domain for the database to be fault tolerant, and in general there must be an even number of machines to form the DRU pairs. Thus, a domain contains many management agents.
As shown in the figure, a domain can contain one or more database instances. One machine can contain one or more nodes belonging to one or more database instances.
The HADB management client is a command-line utility, hadbm, for managing the HADB domain and its database instances. HADB services can run continuously, even when the associated Application Server cluster is stopped, but they must be shut down carefully if they are to be deleted.
You can use the asadmin command line utility to create and delete the HADB instance associated with a highly available cluster. For more information, see Chapter 9, Configuring High Availability Session Persistence and Failover, in Sun GlassFish Enterprise Server v2.1.1 High Availability Administration Guide.
The management agent is a server process (named ma) that can access resources on a host; for example, it can create devices and start database processes. The management agent coordinates and performs management client commands such as starting or stopping a database instance.
A management client connects to a management agent by specifying the address and port number of the agent. Once connected, the management client sends commands to HADB through the management agent. The agent receives requests and executes them. Thus, a management agent must be running on a host before issuing any hadbm management commands to that host. The management agent can be configured as a system service that starts up automatically.
The management agent process ensures the availability of the HADB node supervisor processes by restarting them if they fail. Thus, for deployment, you must ensure the availability of the ma process to maintain the overall availability of HADB. After restarting, the management agent recovers the domain and database configuration data from the other agents in the domain.
Use the host operating system (OS) to ensure the availability of the management agent. On Solaris or Linux, init.d ensures the availability of the ma process after a process failure and reboot of the operating system. On Windows, the management agent runs as a Windows service. Thus, the OS restarts the management agent if the agent fails or the OS reboots.
An HADB management domain is a set of hosts, each of which has a management agent running on the same port number. The hosts in a domain can contain one or more HADB database instances. A management domain is defined by the common port number the agents use and an identifier (called a domainkey) generated when you create the domain or add an agent to it. The domainkey provides a unique identifier for the domain, which is crucial because management agents communicate using multicast. You can set up an HADB management domain to match an Application Server domain.
Having multiple database instances in one domain can be useful in a development environment, since it enables different developer groups to use their own database instance. In some cases, it may also be useful in production environments.
All agents belonging to a domain coordinate their management operations. When you change the database configuration through an hadbm command, all agents will change the configuration accordingly. You cannot stop or restart a node unless the management agent on the node’s host is running. However, you can execute hadbm commands that read HADB state or configuration variable values even if some agents are not available.
Use the following management client commands to work with management domains:
hadbm createdomain: creates a management domain with the specified hosts.
hadbm extenddomain: adds hosts to an existing management domain.
hadbm deletedomain: removes a management domain.
hadbm reducedomain: removes hosts from the management domain.
hadbm listdomain: lists all hosts defined in the management domain.
Management agents store the database configuration in a repository. The repository is highly fault-tolerant, because it is replicated over all the management agents. Keeping the configuration on the server enables you to perform management operations from any computer that has a management client installed.
A majority of the management agents in a domain must be running to perform any changes to the repository. Thus, if there are M agents in a domain, at least M/2 + 1 agents (rounded down to a whole number) must be running to make a change to the repository.
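The majority rule above reduces to a one-line calculation. The sketch below is only an illustration of the arithmetic (floor(M/2) + 1, a strict majority); it is not part of the hadbm tooling.

```python
def repository_quorum(num_agents: int) -> int:
    """Minimum number of running management agents required to change
    the repository: a strict majority, floor(M/2) + 1."""
    return num_agents // 2 + 1
```

For example, a domain with 4 agents needs 3 running to change the repository, and a domain with 5 agents also needs 3; with the minimum of 2 machines, both agents must be running.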
If some of the hosts in a domain are unavailable, for example due to hardware failures, and you cannot perform some management commands because you don’t have a quorum, use the hadbm disablehost command to remove the failed hosts from the domain.