Solaris Resource Manager 1.3 System Administration Guide

Configuring Solaris Resource Manager in Sun Cluster 3.0 Update Environments

Valid Topologies

You can install Solaris Resource Manager on any valid Sun Cluster 3.0 Update topology. See Sun Cluster 3.0 12/01 Concepts for descriptions of valid topologies.

Determining Requirements

Before you configure the Solaris Resource Manager product in a Sun Cluster environment, you must decide how you want to control and track resources across switchovers or failovers. If you configure all cluster nodes identically, usage limits will be enforced identically on primary and backup nodes.

While the configuration parameters need not be identical for all applications in the configuration files on all nodes, all applications must at least be represented in the configuration files on all potential masters of that application. For example, if Application 1 is mastered by phys-schost-1 but could potentially be switched or failed-over to phys-schost-2 or phys-schost-3, then Application 1 must be included in the configuration files on all three nodes (phys-schost-1, phys-schost-2, and phys-schost-3).

Solaris Resource Manager is very flexible with regard to configuration of usage and accrual parameters, and few restrictions are imposed by Sun Cluster. Configuration choices depend on the needs of the site. Consider the general guidelines in the following sections before configuring your systems.

Configuring Memory Limits Parameters

When using the Solaris Resource Manager product with Sun Cluster, you should configure memory limits appropriately to prevent unnecessary failover of applications and a ping-pong effect of applications. In general:

Do not set memory limits too low.

When an application reaches its memory limit, it might fail over. This is especially important for database applications, when reaching a virtual memory limit can have unexpected consequences.
Do not set memory limits identically on primary and backup nodes.

Identical limits can cause a ping-pong effect when an application hits its memory limit and fails over to a backup node with an identical memory limit. Set the memory limit slightly higher on the backup node. The applications, resources, and preferences at the site determine how much higher the limit is set. The difference in memory limits helps prevent the ping-pong scenario and gives you a period of time in which to adjust the parameters as necessary.
Do use the Solaris Resource Manager memory limits for coarse-grained problem scenario load-balancing.

For example, you can use memory limits to prevent an errant application from consuming excess resources.

Using Accrued Usage Parameters

Several Solaris Resource Manager parameters are used for keeping track of system resource usage accrual: CPU shares, number of logins, and connect-time. However, in the case of switchover or failover, usage accrual data (CPU usage, number of logins, and connect-time) will restart at zero by default on the new master for all applications that were switched or failed over. Accrual data is not transferred dynamically across nodes.

To avoid invalidating the accuracy of the Solaris Resource Manager usage accrual reporting feature, you can create scripts to gather accrual information from the cluster nodes. Because an application might run on any of its potential masters during an accrual period, the scripts should gather accrual information from all possible masters of a given application. For more information, see Chapter 9, Usage Data.

Failover Scenarios

On Sun Cluster, Solaris Resource Manager can be configured so that the resource allocation configuration described in the lnode configuration (/var/srm/srmDB) remains the same in normal cluster operation and in switchover or failover situations. For more information, see Sample Share Allocation.

The following sections are example scenarios.

The first two sections, Two-Node Cluster With Two Applications and Two-Node Cluster With Three Applications, show failover scenarios for entire nodes.
The section Failover of Resource Group Only illustrates failover operation for an application only.

In a cluster environment, an application is configured as part of a resource group (RG). When a failure occurs, the resource group, along with its associated applications, fails over to another node. In the following examples, Application 1 (App-1) is configured in resource group RG-1, Application 2 (App-2) is configured in resource group RG-2, and Application 3 (App-3) is configured in resource group RG-3.

Although the numbers of assigned shares remain the same, the percentage of CPU resources allocated to each application will change after failover, depending on the number of applications running on the node and the number of shares assigned to each active application.

In these scenarios, assume the following configurations.

All applications are configured under a common parent lnode.
The applications are the only active processes on the nodes.
The limits databases are configured the same on each node of the cluster.

Two-Node Cluster With Two Applications

You can configure two applications on a two-node cluster such that each physical host (phys-schost-1, phys-schost-2) acts as the default master for one application. Each physical host acts as the backup node for the other physical host. All applications must be represented in the Solaris Resource Manager limits database files on both nodes. When the cluster is running normally, each application is running on its default master, where it is allocated all CPU resources by Solaris Resource Manager.

After a failover or switchover occurs, both applications run on a single node where they are allocated shares as specified in the configuration file. For example, this configuration file specifies that Application 1 is allocated 80 shares and Application 2 is allocated 20 shares.

# limadm set cpu.shares=80 App-1 
# limadm set cpu.shares=20 App-2 
...

The following diagram illustrates the normal and failover operations of this configuration. Note that although the number of shares assigned does not change, the percentage of CPU resources available to each application can change, depending on the number of shares assigned to each process demanding CPU time.

The preceding context describes the graphic.

Two-Node Cluster With Three Applications

On a two-node cluster with three applications, you can configure it such that one physical host (phys-schost-1) is the default master of one application and the second physical host (phys-schost-2) is the default master for the remaining two applications. Assume the following example limits database file on every node. The limits database file does not change when a failover or switchover occurs.

# limadm set cpu.shares=50	 App-1
# limadm set cpu.shares=30	 App-2
# limadm set cpu.shares=20	 App-3 
...

When the cluster is running normally, Application 1 is allocated 50 shares on its default master, phys-schost-1. This is equivalent to 100 percent of CPU resources because it is the only application demanding CPU resources on that node. Applications 2 and 3 are allocated 30 and 20 shares, respectively, on their default master, phys-schost-2. Application 2 would receive 60 percent and Application 3 would receive 40 percent of CPU resources during normal operation.

If a failover or switchover occurs and Application 1 is switched over to phys-schost-2, the shares for all three applications remain the same, but the percentages of CPU resources are re-allocated according to the limits database file.

Application 1, with 50 shares, receives 50 percent of CPU.
Application 2, with 30 shares, receives 30 percent of CPU.
Application 3, with 20 shares, receives 20 percent of CPU.

The following diagram illustrates the normal and failover operations of this configuration.

Failover of Resource Group Only

In a configuration in which multiple resource groups have the same default master, it is possible for a resource group (and its associated applications) to fail over or be switched over to a backup node, while the default master remains up and running in the cluster.

Note -

During failover, the application that fails over will be allocated resources as specified in the configuration file on the backup node. In this example, the limits database files on the primary and backup nodes have the same configurations.

For example, this sample configuration file specifies that Application 1 is allocated 30 shares, Application 2 is allocated 60 shares, and Application 3 is allocated 60 shares.

# limadm set cpu.shares=30 App-1
# limadm set cpu.shares=60 App-2
# limadm set cpu.shares=60 App-3
...

The following diagram illustrates the normal and failover operations of this configuration, where RG-2, containing Application 2, fails over to phys-schost-2. Note that although the number of shares assigned does not change, the percentage of CPU resources available to each application can change, depending on the number of shares assigned to each application demanding CPU time.