This chapter provides planning information and guidelines to install and configure Sun Cluster data services. This chapter contains the following sections.
For information about data services, resource types, resources, and resource groups, see Sun Cluster Concepts Guide for Solaris OS.
Sun Cluster software can provide service only for those data services that are either supplied with the Sun Cluster product or created with the Sun Cluster data services application programming interfaces (APIs).
If a Sun Cluster data service is not provided for your application, consider developing a custom data service for the application. To develop a custom data service, use the Sun Cluster data services APIs. For more information, see Sun Cluster Data Services Developer’s Guide for Solaris OS.
Sun Cluster does not provide a data service for the sendmail(1M) subsystem. The sendmail subsystem can run on the individual cluster nodes, but the sendmail functionality is not highly available. This restriction applies to all sendmail functionality, including mail delivery, mail routing, queuing, and retry.
This section provides configuration guidelines for Sun Cluster data services.
Identify requirements for all of the data services before you begin Solaris and Sun Cluster installation. Failure to do so might result in installation errors that require that you completely reinstall the Solaris and Sun Cluster software.
For example, the Oracle Real Application Clusters Guard option of Sun Cluster Support for Oracle Real Application Clusters has special requirements for the hostnames that you use in the cluster. Sun Cluster HA for SAP also has special requirements. You must accommodate these requirements before you install Sun Cluster software because you cannot change hostnames after you install Sun Cluster software.
Some Sun Cluster data services are not supported for use in x86 based clusters. For more information, see the release notes for your release of Sun Cluster at http://docs.sun.com.
You can install the application software and application configuration files in one of the following locations.
The local disks of each cluster node – Placing the software and configuration files on the individual cluster nodes provides the following advantage. You can upgrade application software later without shutting down the service.
The disadvantage is that you then have several copies of the software and configuration files to maintain and administer.
The cluster file system – If you put the application binaries on the cluster file system, you have only one copy to maintain and manage. However, you must shut down the data service in the entire cluster to upgrade the application software. If you can spare a short period of downtime for upgrades, place a single copy of the application and configuration files on the cluster file system.
For information about how to create cluster file systems, see Planning the Global Devices, Device Groups, and Cluster File Systems in Sun Cluster Software Installation Guide for Solaris OS.
Highly available local file system – Using HAStoragePlus, you can integrate your local file system into the Sun Cluster environment, making the local file system highly available. HAStoragePlus provides additional file system capabilities such as checks, mounts, and unmounts that enable Sun Cluster to fail over local file systems. To fail over, the local file system must reside on global disk groups with affinity switchovers enabled.
For information about how to use the HAStoragePlus resource type, see Enabling Highly Available Local File Systems.
The nsswitch.conf file is the configuration file for name-service lookups. This file determines the following information.
The databases within the Solaris environment to use for name-service lookups
The order in which the databases are to be consulted
Some data services require that you direct “group” lookups to “files” first. For these data services, change the “group” line in the nsswitch.conf file so that the “files” entry is listed first. See the documentation for the data service that you plan to configure to determine whether you need to change the “group” line.
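As an illustration, the relevant line of the /etc/nsswitch.conf file might look as follows after the change. The second entry, nis, is an assumption; your configuration might use a different name service.

```
# /etc/nsswitch.conf (fragment)
# "files" is listed first so that group lookups consult the local
# files before consulting the network name service.
group:  files nis
```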
For additional information about how to configure the nsswitch.conf file for the Sun Cluster environment, see Planning the Sun Cluster Environment in Sun Cluster Software Installation Guide for Solaris OS.
Depending on the data service, you might need to configure the cluster file system to meet Sun Cluster requirements. To determine whether any special considerations apply, see the documentation for the data service that you plan to configure.
For information about how to create cluster file systems, see Planning the Global Devices, Device Groups, and Cluster File Systems in Sun Cluster Software Installation Guide for Solaris OS.
The resource type HAStoragePlus enables you to use a highly available local file system in a Sun Cluster environment that is configured for failover. For information about setting up the HAStoragePlus resource type, see Enabling Highly Available Local File Systems.
The Service Management Facility (SMF) enables you to automatically start and restart SMF services during a node boot or service failure. This feature is similar to the Sun Cluster Resource Group Manager (RGM), which facilitates high availability and scalability for cluster applications. SMF services and RGM features are complementary to each other.
Sun Cluster includes three new SMF proxy resource types that can be used to enable SMF services to run with Sun Cluster in a failover, multi-master, or scalable configuration. The SMF proxy resource types enable you to encapsulate a set of interrelated SMF services into a single resource, an SMF proxy resource, that Sun Cluster manages. In this feature, SMF manages the availability of SMF services on a single node. Sun Cluster provides cluster-wide high availability and scalability of the SMF services.
For information on how you can encapsulate these services, see Enabling Solaris SMF Services to Run With Sun Cluster.
You might require Sun Cluster to make highly available an application other than NFS or DNS that is integrated with the Solaris Service Management Facility (SMF). To ensure that Sun Cluster can restart or fail over the application correctly after a failure, you must disable SMF service instances for the application as follows:
For any application other than NFS or DNS, disable the SMF service instance on all potential primary nodes for the Sun Cluster resource that represents the application.
If multiple instances of the application share any component that you require Sun Cluster to monitor, disable all service instances of the application. Examples of such components are daemons, file systems, and devices.
If you do not disable the SMF service instances of the application, both the Solaris SMF and Sun Cluster might attempt to control the startup and shutdown of the application. As a result, the behavior of the application might become unpredictable.
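To illustrate, the following is a hedged sketch that uses the Solaris svcadm(1M) and svcs(1) commands. The Apache service FMRI is only an example; substitute the SMF service instance for your own application, and repeat the commands on every potential primary node.

```shell
# Disable the SMF service instance so that only Sun Cluster
# controls the startup and shutdown of the application.
svcadm disable svc:/network/http:apache2

# Verify that the service instance is now disabled.
svcs svc:/network/http:apache2
```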
For more information, see the following documentation:
Sun Cluster uses the concept of node lists for device groups and resource groups. Node lists are ordered lists of primary nodes, which are potential masters of the disk device group or resource group. Sun Cluster uses a failback policy to determine the behavior of Sun Cluster in response to the following set of conditions:
A node that has failed and left the cluster rejoins the cluster.
The node that is rejoining the cluster appears earlier in the node list than the current primary node.
If failback is set to True, the device group or resource group is switched off the current primary and switched onto the rejoining node, making the rejoining node the new primary.
For example, assume that you have a disk device group, disk-group-1, that has nodes phys-schost-1 and phys-schost-2 in its node list, with the failback policy set to Enabled. Assume that you also have a failover resource group, resource-group-1, which uses disk-group-1 to hold its application data. When you set up resource-group-1, also specify phys-schost-1 and phys-schost-2 for the resource group's node list, and set the failback policy to True.
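The configuration in this example might be sketched with the Sun Cluster maintenance commands as follows. The group names and node names are taken from the example; the exact command options assume the Sun Cluster 3.2 command set.

```shell
# Set the failback policy on the existing device group.
cldevicegroup set -p failback=enabled disk-group-1

# Create the failover resource group with the same node list,
# in the same order of preference, and with failback enabled.
clresourcegroup create -n phys-schost-1,phys-schost-2 \
    -p Failback=True resource-group-1
```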
To ensure high availability of a scalable resource group, make the scalable resource group's node list a superset of the node list for the disk device group. This setting ensures that the nodes that are directly connected to the disks are also nodes that can run the scalable resource group. The advantage is that, when at least one cluster node connected to the data is up, the scalable resource group runs on that same node, making the scalable services available also.
For more information about the relationship between device groups and resource groups, see Device Groups in Sun Cluster Overview for Solaris OS.
For information about how to set up device groups, see the following documentation:
The HAStoragePlus resource type can be used to configure the following options.
Coordinate the boot order of disk devices and resource groups. Other resources in the resource group that contains the HAStoragePlus resource are brought online only after the disk device resources become available.
With AffinityOn set to True, enforce collocation of resource groups and device groups on the same node. This enforced collocation enhances the performance of disk-intensive data services.
In addition, HAStoragePlus is capable of mounting local and global file systems. For more information, see Planning the Cluster File System Configuration.
If the device group is switched to another node while the HAStoragePlus resource is online, AffinityOn has no effect. The resource group does not migrate with the device group. However, if the resource group is switched to another node, the setting of AffinityOn to True causes the device group to follow the resource group to the new node.
See Synchronizing the Startups Between Resource Groups and Device Groups for information about the relationship between device groups and resource groups.
See Enabling Highly Available Local File Systems for procedures for mounting file systems such as VxFS and Solaris ZFS (Zettabyte File System) in a local mode. The SUNW.HAStoragePlus(5) man page provides additional details.
The following types of data services require HAStoragePlus:
Data services with nodes that are not directly connected to storage
Data services that are disk intensive
Some nodes in the node list of a data service's resource group might not be directly connected to the storage. In this situation, you must coordinate the boot order between the storage and the data service. To meet this requirement, configure the resource group as follows:
Configure HAStoragePlus resources in the resource group.
Set the dependency of the other data service resources to the HAStoragePlus resource.
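These two steps might be sketched as follows with the Sun Cluster maintenance commands. The resource group, resource, and mount-point names are hypothetical, and the commands assume the Sun Cluster 3.2 command set.

```shell
# One-time registration of the HAStoragePlus resource type.
clresourcetype register SUNW.HAStoragePlus

# Add an HAStoragePlus resource to the data service's resource group.
clresource create -g app-rg -t SUNW.HAStoragePlus \
    -p FilesystemMountPoints=/global/app-data hastp-rs

# Make the application resource depend on the storage resource so
# that it is brought online only after the storage is available.
clresource set -p Resource_dependencies=hastp-rs app-rs
```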
Some data services, such as Sun Cluster HA for Oracle and Sun Cluster HA for NFS, are disk intensive. If your data service is disk intensive, ensure that the resource groups and device groups are collocated on the same node. To meet this requirement, perform the following tasks.
Adding an HAStoragePlus resource to your data service resource group
Switching the HAStoragePlus resource online
Setting the dependency of your data service resources to the HAStoragePlus resource
Setting AffinityOn to True
The failback settings must be identical for both the resource group and the device group.
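For a disk-intensive data service, the tasks in the preceding list might be sketched as follows. The group, resource, and mount-point names are hypothetical, and the commands assume the Sun Cluster 3.2 command set.

```shell
# Add an HAStoragePlus resource with AffinityOn=True so that the
# device group is kept on the same node as the resource group.
clresource create -g oracle-rg -t SUNW.HAStoragePlus \
    -p FilesystemMountPoints=/global/oracle-data \
    -p AffinityOn=True oracle-hastp-rs

# Bring the storage resource online.
clresource enable oracle-hastp-rs

# Make the data service resource depend on the storage resource.
clresource set -p Resource_dependencies=oracle-hastp-rs oracle-rs
```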
Some data services are not disk intensive. For example, Sun Cluster HA for DNS, which reads all of its files at startup, is not disk intensive. If your data service is not disk intensive, configuring the HAStoragePlus resource type is optional.
Use the information in this section to plan the installation and configuration of any data service. The information in this section encourages you to think about the impact your decisions have on the installation and configuration of any data service. For specific considerations for a data service, see the documentation for the data service.
Retries within the I/O subsystem during disk failures might cause applications whose data services are disk intensive to experience delays. Disk-intensive data services are I/O intensive and have a large number of disks configured in the cluster. An I/O subsystem might require several minutes to retry and recover from a disk failure. This delay can cause Sun Cluster to fail over the application to another node, even though the disk might have eventually recovered on its own. To avoid failover during these instances, consider increasing the default probe timeout of the data service. If you need more information or help with increasing data service timeouts, contact your local support engineer.
For better performance, install and configure your data service on the cluster nodes with direct connection to the storage.
Client applications that run on cluster nodes should not map to logical IP addresses of an HA data service. After a failover, these logical IP addresses might no longer exist, leaving the client without a connection.
You can specify the following node list properties when configuring data services.
installed_nodes property
nodelist property
auxnodelist property
The installed_nodes property is a property of the resource type for the data service. This property is a list of the cluster node names on which the resource type is installed and enabled to run.
The nodelist property is a property of a resource group. This property specifies a list of cluster node names where the group can be brought online, in order of preference. These nodes are known as the potential primaries or masters of the resource group. For failover services, configure only one resource group node list. For scalable services, configure two resource groups and thus two node lists. One resource group and its node list identify the nodes on which the shared addresses are hosted. This list is a failover resource group on which the scalable resources depend. The other resource group and its list identify nodes on which the application resources are hosted. The application resources depend on the shared addresses. Therefore, the node list for the resource group that contains the shared addresses must be a superset of the node list for the application resources.
The auxnodelist property is a property of a shared address resource. This property is a list of node IDs that identify cluster nodes that can host the shared address but never serve as primary in the case of failover. These nodes are mutually exclusive with the nodes that are identified in the node list of the resource group. This list pertains to scalable services only. For details, see the clressharedaddress(1CL) man page.
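As an illustration, the two node lists that a scalable service requires might be configured as follows. All resource, group, and hostname values are hypothetical, and the commands assume the Sun Cluster 3.2 command set.

```shell
# Failover resource group that hosts the shared address.
clresourcegroup create -n phys-schost-1,phys-schost-2 sa-rg
clressharedaddress create -g sa-rg -h www-server sa-rs

# Scalable resource group for the application resources. Its node
# list must be a subset of the shared-address group's node list,
# and it depends on the shared-address resource group.
clresourcegroup create -S -p RG_dependencies=sa-rg app-rg
```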
Use the following procedures to install and configure a data service.
Install the data service packages from the installation medium on which the packages are supplied.
Sun Cluster 3.0 5/02 CD-ROM
Sun Cluster 3.0 5/02 Agents CD-ROM
Install and configure the application to run in the cluster environment.
Configure the resources and resource groups that the data service uses. When you configure a data service, specify the resource types, resources, and resource groups that the Resource Group Manager (RGM) is to manage. The documentation for the individual data services describes these procedures.
Before you install and configure data services, see Sun Cluster Software Installation Guide for Solaris OS, which includes instructions for the following tasks:
Installing the data service software packages
Configuring IP network multipathing groups that the network resources use
You can use Sun Cluster Manager to install and configure the following data services: Sun Cluster HA for Oracle, Sun Cluster HA for Sun Java System Web Server, Sun Cluster HA for Apache, Sun Cluster HA for DNS, and Sun Cluster HA for NFS. See the Sun Cluster Manager online help for more information.
The following table summarizes the tasks for installing and configuring Sun Cluster data services. The table also provides cross-references to detailed instructions for performing the tasks.
Table 1–1 Tasks for Installing and Configuring Sun Cluster Data Services

| Task | Instructions |
|---|---|
| Install the Solaris and Sun Cluster software | Sun Cluster Software Installation Guide for Solaris OS |
| Set up IPMP groups | Sun Cluster Software Installation Guide for Solaris OS |
| Set up multihost disks | Sun Cluster Software Installation Guide for Solaris OS |
| Plan resources and resource groups | Appendix D, Data Service Configuration Worksheets and Examples |
| Decide the location for application binaries, and configure the nsswitch.conf file | Configuration Guidelines for Sun Cluster Data Services (this chapter) |
| Install and configure the application software | The appropriate Sun Cluster data services book |
| Install the data service software packages | Sun Cluster Software Installation Guide for Solaris OS or the appropriate Sun Cluster data services book |
| Register and configure the data service | The appropriate Sun Cluster data services book |
This example summarizes how to set up the resource types, resources, and resource groups that a failover data service for the Oracle application requires. For complete instructions for configuring the data service for the Oracle application, see Sun Cluster Data Service for Oracle Guide for Solaris OS.
The principal difference between this example and an example of a scalable data service is as follows: In addition to the failover resource group that contains the network resources, a scalable data service requires a separate resource group (scalable resource group) for the application resources.
The Oracle application has two components, a server and a listener. Sun supplies the Sun Cluster HA for Oracle data service, and therefore these components have already been mapped into Sun Cluster resource types. Both of these resource types are associated with resources and resource groups.
Because this example is a failover data service, the example uses logical hostname network resources, which are the IP addresses that fail over from a primary node to a secondary node. Place the logical hostname resources into a failover resource group, and then place the Oracle server resources and listener resources into the same resource group. This ordering enables all of the resources to fail over as a group.
For Sun Cluster HA for Oracle to run on the cluster, you must define the following objects.
LogicalHostname resource type – This resource type is built in, and therefore you do not need to explicitly register the resource type.
Oracle resource types – Sun Cluster HA for Oracle defines two Oracle resource types—a database server and a listener.
Logical hostname resources – These resources host the IP addresses that fail over in a node failure.
Oracle resources – You must specify two resource instances for Sun Cluster HA for Oracle — a server and a listener.
Failover resource group – This container is composed of the Oracle server and listener and logical hostname resources that will fail over as a group.
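The objects in the preceding list might be created with commands along these lines. The logical hostname, resource names, and Oracle-specific property values are hypothetical; see Sun Cluster Data Service for Oracle Guide for Solaris OS for the complete procedure.

```shell
# Register the Oracle resource types. LogicalHostname is built in
# and does not need to be registered.
clresourcetype register SUNW.oracle_server SUNW.oracle_listener

# Create the failover resource group that will contain all of the
# resources.
clresourcegroup create -n phys-schost-1,phys-schost-2 oracle-rg

# Create the logical hostname resource that fails over with the group.
clreslogicalhostname create -g oracle-rg -h oracle-lh oracle-lh-rs

# Create the Oracle server and listener resources in the same group.
clresource create -g oracle-rg -t SUNW.oracle_server \
    -p ORACLE_HOME=/oracle -p ORACLE_SID=orcl oracle-server-rs
clresource create -g oracle-rg -t SUNW.oracle_listener \
    -p ORACLE_HOME=/oracle -p LISTENER_NAME=LISTENER oracle-listener-rs

# Bring the resource group online, placing it in the managed state.
clresourcegroup online -M oracle-rg
```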
This section describes the tools that you can use to perform installation and configuration tasks.
SunPlex Manager is a web-based tool that enables you to perform the following tasks.
Installing a cluster
Administering a cluster
Creating and configuring resources and resource groups
Configuring data services with the Sun Cluster software
Sun Cluster Manager provides wizards to automate the configuration of Sun Cluster data services for the following applications.
Apache Web Server
NFS
Oracle
Oracle Real Application Clusters
SAP Web Application Server
Each wizard enables you to configure Sun Cluster resources that the data service requires. The wizard does not automate the installation and configuration of the application software to run in a Sun Cluster configuration. To install and configure application software to run in a Sun Cluster configuration, use utilities of the application and Sun Cluster maintenance commands. For more information, see your application documentation and the Sun Cluster documentation set. Each wizard supports only a limited subset of configuration options for a data service. To configure options that a wizard does not support, use Sun Cluster Manager or Sun Cluster maintenance commands to configure the data service manually. For more information, see the Sun Cluster documentation.
Sun Cluster Manager provides wizards to automate the configuration of the following Sun Cluster resources.
Logical hostname resource
Shared address resource
Highly available storage resource
You can use a resource that you create by using a wizard with any data service regardless of how you configure the data service.
For instructions for using SunPlex Manager to install cluster software, see Sun Cluster Software Installation Guide for Solaris OS. SunPlex Manager provides online help for most administrative tasks.
The Sun Cluster module enables you to monitor clusters and to perform some operations on resources and resource groups from the Sun Management Center GUI. See the Sun Cluster Software Installation Guide for Solaris OS for information about installation requirements and procedures for the Sun Cluster module. Go to http://docs.sun.com to access the Sun Management Center software documentation set, which provides additional information about Sun Management Center.
The clsetup(1CL) utility is a menu-driven interface that you can use for general Sun Cluster administration. You can also use this utility to configure data service resources and resource groups. Select option 2 from the clsetup main menu to launch the Resource Group Manager submenu.
You can use the Sun Cluster maintenance commands to register and configure data service resources. See the procedure for how to register and configure your data service in the book for the data service. If, for example, you are using Sun Cluster HA for Oracle, see Registering and Configuring Sun Cluster HA for Oracle in Sun Cluster Data Service for Oracle Guide for Solaris OS.
For more information about how to use the commands to administer data service resources, see Chapter 2, Administering Data Service Resources.
The following table summarizes by task which tool you can use to administer data service resources. For more information about these tasks and for details about how to use the command line to complete related procedures, see Chapter 2, Administering Data Service Resources.
Table 1–2 Tools for Administering Data Service Resources

| Task | SunPlex Manager | SPARC: Sun Management Center | clsetup Utility |
|---|---|---|---|
| Register a resource type | + | - | + |
| Create a resource group | + | - | + |
| Add a resource to a resource group | + | - | + |
| Suspend the automatic recovery actions of a resource group | + | - | + |
| Resume the automatic recovery actions of a resource group | + | - | + |
| Bring a resource group online | + | + | + |
| Remove a resource group | + | + | + |
| Remove a resource | + | + | + |
| Switch the current primary of a resource group | + | - | + |
| Enable a resource | + | + | + |
| Disable a resource | + | + | + |
| Move a resource group to the unmanaged state | + | - | + |
| Display resource type, resource group, and resource configuration information | + | + | + |
| Change resource properties | + | - | + |
| Clear the STOP_FAILED error flag on resources | + | - | + |
| Clear the START_FAILED resource state for a resource | + | - | + |
| Add a node to a resource group | + | - | + |