The Sun Cluster HA for Sun Java System Application Server makes the DAS component highly available by configuring it as a failover data service. The DAS must be configured to listen on a failover IP address. When Sun Cluster HA for Sun Java System Application Server detects a failed DAS, the data service restarts the DAS locally or fails it over to another node, depending on the values of the retry count and retry interval.
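The restart-or-failover choice described above can be sketched as a small model. This is only an illustration, assuming that the data service keeps restarting the DAS locally until the number of restarts inside the retry interval reaches the retry count, and fails over after that. The class name `RestartPolicy` and its methods are hypothetical and are not part of Sun Cluster.

```python
from collections import deque


class RestartPolicy:
    """Toy model of a restart-or-failover decision (hypothetical, not a Sun Cluster API)."""

    def __init__(self, retry_count, retry_interval):
        self.retry_count = retry_count        # local restarts allowed per interval (assumed meaning)
        self.retry_interval = retry_interval  # length of the sliding window, in seconds
        self.restarts = deque()               # timestamps of recent local restarts

    def on_failure(self, now):
        """Return 'restart' or 'failover' for a failure detected at time `now`."""
        # Forget restarts that fall outside the retry interval.
        while self.restarts and now - self.restarts[0] > self.retry_interval:
            self.restarts.popleft()
        if len(self.restarts) < self.retry_count:
            self.restarts.append(now)
            return "restart"
        return "failover"


if __name__ == "__main__":
    policy = RestartPolicy(retry_count=2, retry_interval=300)
    for t in (0, 10, 20):
        print(t, policy.on_failure(t))
```

With a retry count of 2 and a retry interval of 300 seconds, the first two failures in this model lead to local restarts and a third failure inside the same window leads to a failover.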
The Node Agent (NA) component is also configured as a failover data service. A Node Agent may be configured to manage a number of Application Server instances, and the Sun Cluster HA for Sun Java System Application Server data service indirectly manages all of these instances. All the Application Server instances are associated with the Node Agents and the Node Agents are configured to listen on a failover IP address.
If the Application Server instances are down, the Node Agents restart them. Any lost transactions are recovered while the instances are restarting. If the Sun Cluster node on which the Node Agents and the Application Server instances are running crashes, the Sun Cluster HA for Sun Java System Application Server data service fails over the Node Agents and the Application Server instances to another Sun Cluster node.
Only one Node Agent resource is created for all the Node Agents configured for one failover IP address. The data service automatically detects the Node Agents that are configured on the failover IP address that is created in the failover resource group.
The following example is a configuration that comprises four Node Agents.
Node Agent NA1 and its associated server instances I1 and I2 are configured to listen on failover IP address IP1.
Node Agent NA2 and its associated server instances I3 and I4 are also configured to listen on failover IP address IP1.
Node Agent NA3 and its associated server instances I5 and I6 are configured to listen on failover IP address IP2.
Node Agent NA4 and its associated server instances I7 and I8 are also configured to listen on failover IP address IP2.
In this example, you create one resource for Node Agents NA1 and NA2 and all their server instances, and another resource for Node Agents NA3 and NA4 and all their server instances.
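The grouping rule in this example, one resource per failover IP address, can be expressed as a short sketch. The dictionary layout and the function name `plan_resources` are made up for illustration. The real data service detects the Node Agents itself, so this only mirrors the resulting grouping.

```python
from collections import defaultdict

# Configuration from the example above: Node Agent -> failover IP address and instances.
NODE_AGENTS = {
    "NA1": {"ip": "IP1", "instances": ["I1", "I2"]},
    "NA2": {"ip": "IP1", "instances": ["I3", "I4"]},
    "NA3": {"ip": "IP2", "instances": ["I5", "I6"]},
    "NA4": {"ip": "IP2", "instances": ["I7", "I8"]},
}


def plan_resources(node_agents):
    """Group Node Agents and their instances by failover IP address.

    Each key of the returned dict corresponds to one Node Agent resource.
    """
    plan = defaultdict(lambda: {"node_agents": [], "instances": []})
    for name in sorted(node_agents):
        cfg = node_agents[name]
        plan[cfg["ip"]]["node_agents"].append(name)
        plan[cfg["ip"]]["instances"].extend(cfg["instances"])
    return dict(plan)


if __name__ == "__main__":
    for ip, members in sorted(plan_resources(NODE_AGENTS).items()):
        print(ip, members["node_agents"], members["instances"])
```

Four Node Agents on two failover IP addresses yield two entries, matching the two resources described in the example.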
A detailed example of creating resources for four Node Agents is provided in Example of Creating the Failover Node Agent Component in the Sun Cluster HA for Sun Java System Application Server.
In the following sections, only two Node Agents are illustrated.
The following figure illustrates the failover DAS and failover Node Agent configuration before any node failure occurs.
The figure illustrates the following setup.
There are two physical nodes, Node1 and Node2.
The DAS is contained in the failover resource group RG1 on Node1 and listens on failover IP address IP1.
The Node Agent NA1 and the Application Server instances I1 and I2 that the Node Agent manages are contained in the resource group RG2 on Node1 and listen on failover IP address IP2.
The Node Agent NA2 and the Application Server instances I3 and I4 that the Node Agent manages are contained in the resource group RG3 on Node2 and listen on failover IP address IP3.
There is one domain, Domain1, which contains the DAS and the two Node Agents, as well as all the instances managed by the Node Agents.
The Application Server is installed on the global file system (GFS) and is accessible to the components on both Node1 and Node2.
Bringing these resource groups online starts the Node Agents, which in turn start the Application Server instances that they manage.
The following figure illustrates the failover DAS and failover Node Agent configuration after a node failure.
After a failure on Node1, resource groups RG1 and RG2 fail over to Node2. Resource group RG1 contains the DAS and its failover address IP1. Resource group RG2 contains Node Agent NA1, instances I1 and I2, and their failover address IP2.
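The before and after placement of the resource groups can be modeled in a few lines. This is a simplified sketch of the two-node setup described above. The function `fail_over` is hypothetical and ignores everything the cluster framework actually does during a failover, such as moving the failover IP addresses and restarting the components.

```python
# Placement before any failure, as described for the first figure.
BEFORE = {"RG1": "Node1", "RG2": "Node1", "RG3": "Node2"}


def fail_over(placement, failed_node, surviving_node):
    """Return a new placement with every group on failed_node moved to surviving_node."""
    return {
        group: (surviving_node if node == failed_node else node)
        for group, node in placement.items()
    }


AFTER = fail_over(BEFORE, failed_node="Node1", surviving_node="Node2")

if __name__ == "__main__":
    for group in sorted(AFTER):
        print(group, BEFORE[group], "->", AFTER[group])
```

In this model RG1 and RG2 move to Node2, while RG3 stays where it was.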
The Node Agent probe relies on the DAS to obtain the status of the Node Agent. If the DAS fails, the probe has no way to determine the status of the Node Agent. To know the status of the Node Agent, you must ensure that the DAS is running at all times.
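The dependency of the probe on the DAS can be illustrated with a minimal sketch. The names `NodeAgentStatus`, `probe_node_agent`, and the `ask_das` callback are assumptions made for this illustration and do not reflect how the actual probe is implemented.

```python
from enum import Enum
from typing import Callable


class NodeAgentStatus(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    UNKNOWN = "unknown"


def probe_node_agent(das_reachable: bool, ask_das: Callable[[], bool]) -> NodeAgentStatus:
    """Model of a probe that can learn the Node Agent status only through the DAS.

    ask_das is a hypothetical callback that returns True if the DAS reports
    the Node Agent as running.
    """
    if not das_reachable:
        # Without the DAS the status cannot be determined at all.
        return NodeAgentStatus.UNKNOWN
    return NodeAgentStatus.RUNNING if ask_das() else NodeAgentStatus.STOPPED


if __name__ == "__main__":
    print(probe_node_agent(True, lambda: True))
    print(probe_node_agent(False, lambda: True))
```

The second call shows the case described above: when the DAS is down, the model reports an unknown status even if the Node Agent itself is healthy.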