Sun Cluster 3.1 Data Service Planning and Administration Guide

Chapter 1 Planning for Sun Cluster Data Services

This chapter provides planning information and guidelines for installing and configuring Sun Cluster data services.

See the Sun Cluster 3.1 Concepts Guide document for conceptual information about data services, resource types, resources, and resource groups.

If your applications are not currently offered as Sun Cluster data services, see the Sun Cluster 3.1 Data Services Developer's Guide for information about how to make other applications highly available as Sun Cluster data services.

Sun Cluster Data Services Installation and Configuration Tasks

The following table lists the books that describe the installation and configuration of Sun Cluster data services.

Table 1–1 Task Map: Installing and Configuring Sun Cluster Data Services

Install and configure Sun Cluster HA for Oracle: Sun Cluster 3.1 Data Service for Oracle
Install and configure Sun Cluster HA for Sun Open Net Environment (Sun ONE) Web Server: Sun Cluster 3.1 Data Service for Sun ONE Web Server
Install and configure Sun Cluster HA for Sun ONE Directory Server: Sun Cluster 3.1 Data Service for Sun ONE Directory Server
Install and configure Sun Cluster HA for Apache: Sun Cluster 3.1 Data Service for Apache
Install and configure Sun Cluster HA for DNS: Sun Cluster 3.1 Data Service for Domain Name Service (DNS)
Install and configure Sun Cluster HA for NFS: Sun Cluster 3.1 Data Service for Network File System (NFS)
Install and configure Sun Cluster Support for Oracle Parallel Server/Real Application Clusters: Sun Cluster 3.1 Data Service for Oracle Parallel Server/Real Application Clusters
Install and configure Sun Cluster HA for SAP: Sun Cluster 3.1 Data Service for SAP
Install and configure Sun Cluster HA for Sybase ASE: Sun Cluster 3.1 Data Service for Sybase ASE
Install and configure Sun Cluster HA for BroadVision One-To-One Enterprise: Sun Cluster 3.1 Data Service for BroadVision One-To-One Enterprise
Install and configure Sun Cluster HA for NetBackup: Sun Cluster 3.1 Data Service for NetBackup
Install and configure Sun Cluster HA for SAP liveCache: Sun Cluster 3.1 Data Service for SAP liveCache
Install and configure Sun Cluster HA for Siebel: Sun Cluster 3.1 Data Service for Siebel

Configuration Guidelines for Sun Cluster Data Services

This section provides configuration guidelines for Sun Cluster data services.

Identifying Data Service Special Requirements

Identify requirements for all of the data services before you begin Solaris and Sun Cluster installation. Failure to do so might result in installation errors that require that you completely reinstall the Solaris and Sun Cluster software.

For example, the Oracle Parallel Fail Safe/Real Application Clusters Guard option of Sun Cluster Support for Oracle Parallel Server/Real Application Clusters has special requirements for the hostnames that you use in the cluster. Sun Cluster HA for SAP also has special requirements. You must accommodate these requirements before you install Sun Cluster software because you cannot change hostnames after you install Sun Cluster software.

Determining the Location of the Application Binaries

You can install the application software and application configuration files in one of the following locations: on the local disks of each cluster node, or on the cluster file system.

Verifying the nsswitch.conf File Contents

The nsswitch.conf file is the configuration file for name-service lookups. This file determines which databases within the Solaris environment to use for name-service lookups and the order in which to consult those databases.

Some data services require that you direct “group” lookups to “files” first. For these data services, change the “group” line in the nsswitch.conf file so that the “files” entry is listed first. See the chapter for the data service that you plan to configure to determine whether you need to change the “group” line.
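For example, the following hypothetical “group” entry lists “files” before the nis name service. The nis entry is only an illustration; your nsswitch.conf file might reference a different name service.

    group: files nis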

See the planning chapter in the Sun Cluster 3.1 Software Installation Guide for additional information on how to configure the nsswitch.conf file for the Sun Cluster environment.

Planning the Cluster File System Configuration

Depending on the data service, you might need to configure the cluster file system to meet Sun Cluster requirements. See the chapter for the data service that you plan to configure to determine whether any special considerations apply.

The resource type HAStoragePlus enables you to use a highly available local file system in a Sun Cluster environment configured for failover. See Enabling Highly Available Local File Systems for information on setting up the HAStoragePlus resource type.

See the planning chapter of the Sun Cluster 3.1 Software Installation Guide for information on how to create cluster file systems.

Relationship Between Resource Groups and Disk Device Groups

Sun Cluster uses the concept of node lists for disk device groups and resource groups. A node list is an ordered list of primary nodes, which are potential masters of the disk device group or resource group. Sun Cluster uses a failback policy to determine what happens when a node that has been down rejoins the cluster and the rejoining node appears earlier in the node list than the current primary node. If failback is set to True, the device group or resource group is switched off the current primary node and onto the rejoining node, which becomes the new primary.

To ensure high availability of a failover resource group, make the resource group's node list match the node list of associated disk device groups. For a scalable resource group, the resource group's node list cannot always match the device group's node list because, currently, a device group's node list must contain exactly two nodes. For a greater-than-two-node cluster, the node list for the scalable resource group can have more than two nodes.

For example, assume that you have a disk device group, disk-group-1, that has nodes phys-schost-1 and phys-schost-2 in its node list, with the failback policy set to Enabled. Assume that you also have a failover resource group, resource-group-1, which uses disk-group-1 to hold its application data. When you set up resource-group-1, also specify phys-schost-1 and phys-schost-2 for the resource group's node list, and set the failback policy to True.
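The following sketch shows one way that you might configure this example with the scconf and scrgadm commands. The group, node, and policy names come from the example above, and the commands assume that disk-group-1 has already been registered as a disk device group.

    # Set the failback policy on the existing disk device group
    scconf -c -D name=disk-group-1,failback=enabled

    # Create the failover resource group with a matching node list and failback policy
    scrgadm -a -g resource-group-1 -h phys-schost-1,phys-schost-2 -y Failback=True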

To ensure high availability of a scalable resource group, make the scalable resource group's node list a superset of the node list for the disk device group. Doing so ensures that the nodes that are directly connected to the disks are also nodes that can run the scalable resource group. The advantage is that, when at least one cluster node connected to the data is up, the scalable resource group runs on that same node, making the scalable services available also.

See the Sun Cluster 3.1 Software Installation Guide for information on how to set up disk device groups. See the Sun Cluster 3.1 Concepts Guide document for more details on the relationship between disk device groups and resource groups.

Understanding HAStorage and HAStoragePlus

The HAStorage and HAStoragePlus resource types can be used to coordinate the startup of resource groups with the disk device groups on which those resource groups depend and, when AffinityOn is set to True, to force a resource group and its disk device groups to be colocated on the same node.

In addition, HAStoragePlus is capable of mounting any global file system found to be in an unmounted state. See Planning the Cluster File System Configuration for more information.


Note –

If the device group is switched to another node while the HAStorage or HAStoragePlus resource is online, AffinityOn has no effect and the resource group does not migrate along with the device group. On the other hand, if the resource group is switched to another node, AffinityOn being set to True causes the device group to follow the resource group to the new node.


See Synchronizing the Startups Between Resource Groups and Disk Device Groups for information about the relationship between disk device groups and resource groups. The SUNW.HAStorage(5) and SUNW.HAStoragePlus(5) man pages provide additional details.

See Enabling Highly Available Local File Systems for procedures for mounting file systems such as VxFS in local mode. The SUNW.HAStoragePlus(5) man page provides additional details.
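The following sketch illustrates how you might register the HAStoragePlus resource type and create an HAStoragePlus resource that mounts a file system and enforces colocation with its device group. The resource group name, resource name, and mount point are placeholders; see Enabling Highly Available Local File Systems for the complete procedure.

    # Register the resource type (needed only once per cluster)
    scrgadm -a -t SUNW.HAStoragePlus

    # Create an HAStoragePlus resource; /global/app-data is a placeholder mount point
    scrgadm -a -j hasp-resource -g resource-group-1 -t SUNW.HAStoragePlus \
        -x FilesystemMountPoints=/global/app-data -x AffinityOn=True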

Determining Whether Your Data Service Requires HAStorage or HAStoragePlus

Choosing Between HAStorage and HAStoragePlus

To determine whether to create HAStorage or HAStoragePlus resources within a data service resource group, consider the following criteria.

Considerations

Use the information in this section to plan the installation and configuration of any data service, and to consider how your decisions affect that installation and configuration. For specific considerations that apply to your data service, see the chapter in this book that applies to your data service.

Node List Properties

You can specify the following three node lists when configuring data services.

  1. installed_nodes – A property of the resource type. This property is a list of the cluster node names on which the resource type is installed and enabled to run.

  2. nodelist – A property of a resource group that specifies a list of cluster node names on which the group can be brought online, in order of preference. These nodes are known as the potential primaries, or masters, of the resource group. For failover services, configure only one resource group node list. For scalable services, configure two resource groups, and therefore two node lists. One resource group and its node list identify the nodes on which the shared addresses are hosted. This group is a failover resource group on which the scalable resources depend. The other resource group and its node list identify the nodes on which the application resources are hosted. The application resources depend on the shared addresses. Therefore, the node list for the resource group that contains the shared addresses must be a superset of the node list for the application resources. For an example of how these two resource groups might be created, see the sketch after this list.

  3. auxnodelist – A property of a shared address resource. This property is a list of physical node IDs that identify cluster nodes that can host the shared address but never serve as primary in the case of failover. These nodes are mutually exclusive with the nodes identified in the node list of the resource group. This list pertains to scalable services only. See the scrgadm(1M) man page for details.
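For illustration, the following sketch shows how the two resource groups for a scalable service might be created with scrgadm. The node names, resource group names, shared address, and property values are placeholders only.

    # Failover resource group that hosts the shared address
    scrgadm -a -g sa-rg -h phys-schost-1,phys-schost-2,phys-schost-3
    # An auxnodelist, if needed, can be supplied with the -X option of scrgadm -a -S
    scrgadm -a -S -g sa-rg -l shared-addr-hostname

    # Scalable resource group for the application resources; it depends on sa-rg
    scrgadm -a -g app-rg -y Maximum_primaries=3 -y Desired_primaries=3 \
        -y RG_dependencies=sa-rg -h phys-schost-1,phys-schost-2,phys-schost-3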

Overview of the Installation and Configuration Process

Use the following procedures to install and configure data services.

Before you install and configure data services, see the Sun Cluster 3.1 Software Installation Guide, which includes procedures on how to install the data service software packages and how to configure the Internet Protocol (IP) Network Multipathing groups that the network resources use.


Note –

You can use SunPlex Manager to install and configure the following data services: Sun Cluster HA for Oracle, Sun Cluster HA for Sun ONE Web Server, Sun Cluster HA for Sun ONE Directory Server, Sun Cluster HA for Apache, Sun Cluster HA for DNS, and Sun Cluster HA for NFS. See the SunPlex Manager online help for more information.


Installation and Configuration Task Flow

The following table shows a task map of the procedures to install and configure a Sun Cluster failover data service.

Table 1–2 Task Map: Sun Cluster Data Service Installation and Configuration

Install the Solaris and Sun Cluster software: Sun Cluster 3.1 Software Installation Guide
Set up IP Network Multipathing groups: Sun Cluster 3.1 Software Installation Guide
Set up multihost disks: Sun Cluster 3.1 Software Installation Guide
Plan resources and resource groups: Sun Cluster 3.1 Release Notes
Decide the location for application binaries, and configure the nsswitch.conf file: Chapter 1, Planning for Sun Cluster Data Services
Install and configure the application software: the chapter for each data service in this book
Install the data service software packages: Sun Cluster 3.1 Software Installation Guide or the chapter for each data service in this book
Register and configure the data service: the chapter for each data service in this book

Example

The example in this section shows how you might set up the resource types, resources, and resource groups for an Oracle application that has been instrumented to be a highly available failover data service.

The main difference between this example and an example of a scalable data service is that, in addition to the failover resource group that contains the network resources, a scalable data service requires a separate resource group (called a scalable resource group) for the application resources.

The Oracle application has two components, a server and a listener. Sun supplies the Sun Cluster HA for Oracle data service, and therefore these components have already been mapped into Sun Cluster resource types. Both of these resource types are associated with resources and resource groups.

Because this example is a failover data service, the example uses logical hostname network resources, which are the IP addresses that fail over from a primary node to a secondary node. Place the logical hostname resources into a failover resource group, and then place the Oracle server resources and listener resources into the same resource group. This grouping enables all of the resources to fail over together.

For Sun Cluster HA for Oracle to run on the cluster, you must define the following objects: a failover resource group that contains the application and network resources, the logical hostname resources, the Oracle server and listener resources, and the resource types for those resources.
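The following command sequence sketches how such a configuration might be created with scrgadm and scswitch. The resource group name, resource names, logical hostname, and Oracle extension property values are placeholders, and the set of extension properties that the data service actually requires is larger than what is shown here; see Sun Cluster 3.1 Data Service for Oracle for the real registration and configuration procedure.

    # Register the Oracle server and listener resource types
    scrgadm -a -t SUNW.oracle_server
    scrgadm -a -t SUNW.oracle_listener

    # Create the failover resource group and add the logical hostname resource
    scrgadm -a -g oracle-rg -h phys-schost-1,phys-schost-2
    scrgadm -a -L -g oracle-rg -l oracle-lh

    # Add the Oracle server and listener resources (property values are placeholders)
    scrgadm -a -j oracle-server-res -g oracle-rg -t SUNW.oracle_server \
        -x ORACLE_SID=ora1 -x ORACLE_HOME=/oracle
    scrgadm -a -j oracle-listener-res -g oracle-rg -t SUNW.oracle_listener \
        -x ORACLE_HOME=/oracle -x LISTENER_NAME=LISTENER

    # Bring the resource group, and therefore all of its resources, online
    scswitch -Z -g oracle-rg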

Tools for Data Service Resource Administration

This section describes the tools that you can use to perform installation and configuration tasks.

The SunPlex Manager Graphical User Interface (GUI)

SunPlex Manager is a web-based tool that enables you to perform the following tasks.

See the Sun Cluster 3.1 Software Installation Guide for instructions on how to use SunPlex Manager to install cluster software. SunPlex Manager provides online help for most administrative tasks.

The Sun Cluster Module for the Sun Management Center GUI

The Sun Cluster module enables you to monitor clusters and to perform some operations on resources and resource groups from the Sun Management Center GUI. See the Sun Cluster 3.1 Software Installation Guide for information about installation requirements and procedures for the Sun Cluster module. Go to http://docs.sun.com to access the Sun Management Center software documentation set, which provides additional information about Sun Management Center.

The scsetup Utility

The scsetup(1M) utility is a menu-driven interface that you can use for general Sun Cluster administration. You can also use this utility to configure data service resources and resource groups. Select option 2 from the scsetup main menu to launch the Resource Group Manager submenu.

The scrgadm Command

You can use the scrgadm command to register and configure data service resources. See the procedure on how to register and configure your data service in the applicable chapter of this book. If, for example, you use Sun Cluster HA for Oracle, see “Installing and Configuring Sun Cluster HA for Oracle” in Sun Cluster 3.1 Data Service for Oracle. Chapter 2, Administering Data Service Resources also contains information on how to use the scrgadm command to administer data service resources. Finally, see the scrgadm(1M) man page for additional information.
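For example, to list the resource types, resource groups, and resources that are currently configured, you can run the scrgadm command in display mode:

    # Display configuration information; add v options for progressively more detail
    scrgadm -p
    scrgadm -pvv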

Data Service Resource Administration Tasks

The following table lists which tool you can use in addition to the command line for different data service resource administration tasks. See Chapter 2, Administering Data Service Resources for more information about these tasks and for details on how to use the command line to complete related procedures.

Table 1–3 Tools You Can Use for Data Service Resource Administration Tasks

Register a resource type: SunPlex Manager, the scsetup utility
Create a resource group: SunPlex Manager, the scsetup utility
Add a resource to a resource group: SunPlex Manager, the scsetup utility
Bring a resource group online: SunPlex Manager, Sun Management Center
Remove a resource group: SunPlex Manager, Sun Management Center
Remove a resource: SunPlex Manager, Sun Management Center
Switch the current primary of a resource group: SunPlex Manager
Disable a resource: SunPlex Manager, Sun Management Center
Move the resource group of a disabled resource into the unmanaged state: SunPlex Manager
Display resource type, resource group, and resource configuration information: SunPlex Manager, Sun Management Center
Change resource properties: SunPlex Manager
Clear the STOP_FAILED error flag on resources: SunPlex Manager
Add a node to a resource group: SunPlex Manager

Sun Cluster Data Service Fault Monitors

This section provides general information about data service fault monitors. The Sun-supplied data services contain fault monitors that are built into the package. The fault monitor (or fault probe) is a process that probes the health of the data service.

Fault Monitor Invocation

The RGM starts the fault monitor when you bring a resource group and its resources online. The RGM does so by internally calling the MONITOR_START method for the data service.
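You do not start the fault monitor directly. For example, when you bring a resource group online with scswitch, the fault monitor starts for each enabled resource in the group. You can also disable and re-enable monitoring for an individual resource, as the following sketch shows; the group and resource names are placeholders.

    # Bring the resource group online; fault monitors start automatically
    scswitch -Z -g resource-group-1

    # Disable, and later re-enable, only the fault monitor for one resource
    scswitch -n -M -j app-server-res
    scswitch -e -M -j app-server-res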

The fault monitor performs the following two functions: monitoring for the abnormal exit of the server process, and checking the health of the data service.

Monitoring of the Abnormal Exit of the Server Process

The Process Monitor Facility (PMF) monitors the data service processes.

The data service fault probe runs in an infinite loop and sleeps for an adjustable amount of time that the resource property Thorough_probe_interval sets. While sleeping, the probe checks with the PMF to determine whether the process has exited. If the process has exited, the probe sets the status of the data service to “Service daemon not running” and takes action. The action can involve restarting the data service locally or failing over the data service to a secondary cluster node. To decide whether to restart or to fail over the data service, the probe checks the values that are set in the resource properties Retry_count and Retry_interval for the data service application resource.
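For example, you might tune how aggressively the probe restarts or fails over a resource by changing these properties with scrgadm. The resource name and the values shown are placeholders only.

    # Probe every 60 seconds; allow 2 restarts within a 600-second history window
    scrgadm -c -j app-server-res -y Thorough_probe_interval=60 \
        -y Retry_count=2 -y Retry_interval=600

    # Probe_timeout is an extension property, so it is set with -x
    scrgadm -c -j app-server-res -x Probe_timeout=120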

Checking the Health of the Data Service

Typically, communication between the probe and the data service occurs through a dedicated command or a successful connection to the specified data service port.

The logic that the probe uses is roughly as follows.

  1. Sleep (Thorough_probe_interval).

  2. Perform health checks, bounded by the time-out value that the property Probe_timeout specifies. Probe_timeout is a resource extension property of each data service that you can set.

  3. If the health checks in Step 2 succeed, that is, the service is healthy, the probe updates the success/failure history. To update the success/failure history, the probe purges any history records that are older than the value that is set for the resource property Retry_interval. The probe sets the status message for the resource to “Service is online” and returns to Step 1.

    If Step 2 resulted in a failure, the probe updates the failure history. The probe then computes the total number of times that the health check failed.

    The result of the health check can range from a complete failure to success. The interpretation of the result depends on the specific data service. Consider a scenario where the probe can successfully connect to the server and send a handshake message to the server, but the probe receives only a partial response before it times out. This scenario is most likely a result of system overload. If some action is taken (such as restarting the service), the clients reconnect to the service, thus further overloading the system. If this event occurs, a data service fault monitor can decide not to treat this “partial” failure as fatal. Instead, the monitor can track this failure as a partial, nonfatal failure of the service. These partial failures are still accumulated over the interval that the Retry_interval property specifies.

    However, if the probe cannot connect to the server at all, the failure can be considered fatal. Partial failures lead to incrementing the failure count by a fractional amount. Every time the failure count reaches total failure (either by a fatal failure or by accumulation of partial failures), the probe restarts or fails over the data service in an attempt to correct the situation.

  4. If the result of the computation in Step 3 (the number of failures in the history interval) is less than the value of the resource property Retry_count, the probe attempts to correct the situation locally (for example, by restarting the service). The probe sets the status message of the resource to “Service is degraded” and returns to Step 1.

  5. If the number of failures within Retry_interval equals or exceeds Retry_count, the probe calls scha_control with the “giveover” option to request failover of the service (a sketch of this call appears after this list). If this request succeeds, the fault probe stops on this node. The probe sets the status message for the resource to “Service has failed.”

  6. The Sun Cluster framework can deny the scha_control request that is issued in the previous step for various reasons. The return code of scha_control identifies the reason, and the probe checks it. If the request is denied, the probe resets the failure/success history and starts afresh. The probe resets the history because the number of failures is already at or above Retry_count, so the fault probe would otherwise issue scha_control in each subsequent iteration and be denied again. These repeated requests would place additional load on the system and would increase the likelihood of further service failures.

    The probe then returns to Step 1.
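For reference, a fault monitor that is implemented as a script could request the giveover described in Step 5 with the scha_control command. A minimal sketch, with placeholder group and resource names:

    # Request that the RGM fail the resource group over to another node
    scha_control -O GIVEOVER -G resource-group-1 -R app-server-res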