Data Redundancy Units (Sun GlassFish Enterprise Server 2.1 Deployment Planning Guide)

Sun GlassFish Enterprise Server 2.1 Deployment Planning Guide

Data Redundancy Units

As previously described, an HADB instance contains a pair of DRUs. Each DRU has the same number of active and spare nodes as the other DRU in the pair. Each active node in a DRU has a mirror node in the other DRU. Due to mirroring, each DRU contains a complete copy of the database.

The following figure shows an example HADB architecture with six nodes: four active nodes and two spare nodes. Nodes 0 and 1 are a mirror pair, as are nodes 2 and 3. In this example, each host has one node. In general, a host can have more than one node if it has sufficient system resources (see System Requirements).

Note –

You must add machines that host HADB nodes in pairs, with one machine in each DRU.

HADB achieves high availability by replicating data and services. The data replicas on mirror nodes are designated as primary replicas and hot standby replicas. The primary replica performs operations such as inserts, deletes, updates, and reads. The hot standby replica receives log records of the primary replica’s operations and redoes them within the transaction life time. Read operations are performed only by the primary node and thus not logged. Each node contains both primary and hot standby replicas and plays both roles. The database is fragmented and distributed over the active nodes in a DRU. Each node in a mirror pair contains the same set of data fragments. Duplicating data on a mirror node is known as replication. Replication enables HADB to provide high availability: when a node fails, its mirror node takes over almost immediately (within seconds). Replication ensures availability and masks node failures or DRU failures without loss of data or services.

When a mirror node takes over the functions of a failed node, it has to perform double work: its own and that of the failed node. If the mirror node does not have sufficient resources, the overload will reduce its performance and increase its failure probability. When a node fails, HADB attempts to restart it. If the failed node does not restart (for example, due to hardware failure), the system continues to operate but with reduced availability.

HADB tolerates failure of a node, an entire DRU, or multiple nodes, but not a “double failure” when both a node and its mirror fail. For information on how to reduce the likelihood of a double failure, see Mitigating Double Failures