Synchronizing the Startups Between Resource Groups and Device Groups

After a cluster boots or services fail over to another node, global devices and local and cluster file systems might require time to become available. However, a data service can run its START method before global devices and local and cluster file systems come online. If the data service depends on global devices or local and cluster file systems that are not yet online, the START method times out. In this situation, you must reset the state of the resource groups that the data service uses and restart the data service manually.

To avoid these additional administrative tasks, use the HAStoragePlus resource type. Add an instance of HAStoragePlus to each resource group whose data service resources depend on global devices or local and cluster file systems. An instance of this resource type forces the START methods of the other resources in the same resource group to wait until global devices and local and cluster file systems become available.
If an application resource is configured on top of an HAStoragePlus resource, the application resource must define an offline restart dependency on the underlying HAStoragePlus resource. This dependency ensures that the application resource comes online after the HAStoragePlus resource comes online, and goes offline before the HAStoragePlus resource goes offline.

The following command creates an offline restart dependency from an application resource to an HAStoragePlus resource:

# clresource set -p Resource_dependencies_offline_restart=hasp_rs application_rs
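To confirm that the dependency is in place, you can display the dependency property of the application resource. This is a minimal check, using the same placeholder resource names as the preceding example:

# clresource show -p Resource_dependencies_offline_restart application_rs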
To create an HAStoragePlus resource, see How to Set Up the HAStoragePlus Resource Type for New Resources.
Managed Entity Monitoring by HAStoragePlus

All entities that are managed by the HAStoragePlus resource type are monitored. The SUNW.HAStoragePlus resource type provides a fault monitor that checks the health of the entities that the resource manages, including global devices, file systems, and ZFS storage pools. The fault monitor runs fault probes on a regular basis. If one of the entities becomes unavailable, the resource is restarted or failed over to another node. Ensure that all configuration changes to the managed entities are complete before you enable monitoring.
Note - Version 9 of the HAStoragePlus resource fault monitor probes the devices and file systems that it manages by reading from and writing to the file systems. If a read operation can be blocked by any software on the I/O stack and the HAStoragePlus resource must remain online, you must disable the fault monitor. For example, you must unmonitor the HAStoragePlus resource that manages the Availability Suite remote replication volumes, because Availability Suite from Oracle blocks reads from any bitmap volume or any data volume in the NEED SYNC state, and the HAStoragePlus resource that manages the Availability Suite volumes must be online at all times.
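As a sketch of how monitoring can be turned off and on again for such a resource, the following commands use the clresource monitoring subcommands; hasp_rs is a placeholder resource name:

# clresource unmonitor hasp_rs
# clresource monitor hasp_rs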
For more information on the properties that enable monitoring for managed entities, see the SUNW.HAStoragePlus(5) man page.
For instructions on enabling and disabling monitoring for managed entities, see How to Enable a Resource Fault Monitor and How to Disable a Resource Fault Monitor.
Depending on the type of managed entity, the fault monitor probes the target by reading from or writing to it. If more than one entity is monitored, the fault monitor probes them all at the same time.
Table 2-2 What the Fault Monitor Verifies
For instructions on enabling a resource fault monitor, see How to Enable a Resource Fault Monitor.
Troubleshooting Monitoring for Managed Entities

If monitoring is not enabled on the managed entities, perform the following troubleshooting steps:
Ensure that the hastorageplus_probe process is running.
Look for error messages on the console.
Enable debug messages to the syslog file:
# mkdir -p /var/cluster/rgm/rt/SUNW.HAStoragePlus:9
# echo 9 > /var/cluster/rgm/rt/SUNW.HAStoragePlus:9/loglevel
You should also check the /etc/syslog.conf file to ensure that messages with the daemon.debug facility level are logged to the /var/adm/messages file. Add the daemon.debug entry to the /var/adm/messages action if it is not already present.
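For example, an entry resembling the following in /etc/syslog.conf sends daemon.debug messages to /var/adm/messages; the selector and action fields must be separated by tabs. After editing the file, refresh the system log service so that the change takes effect:

daemon.debug    /var/adm/messages

# svcadm refresh svc:/system/system-log:default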
Additional Administrative Tasks to Configure HAStoragePlus Resources for a Zone Cluster

When you configure HAStoragePlus resources for a zone cluster, you must perform the following additional tasks before you perform the steps for the global cluster:
When you configure file systems, such as UFS, in file system mount points, the file systems must be configured to the zone cluster. For more information about configuring a file system to a zone cluster, see How to Add a Local File System to a Specific Zone-Cluster Node in Oracle Solaris Cluster Software Installation Guide. A sketch of this task follows this list.

When you configure global devices in global device paths, the devices must be configured to the zone cluster. For more information about configuring global devices to a zone cluster, see Adding Storage Devices to a Zone Cluster in Oracle Solaris Cluster Software Installation Guide.

When you configure ZFS file systems by using the Zpools extension property, the ZFS pool must be configured to the zone cluster. For more information about configuring a ZFS file system to a zone cluster, see How to Add a ZFS Storage Pool to a Zone Cluster in Oracle Solaris Cluster Software Installation Guide.
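As a hedged sketch of the first task, the following clzonecluster session adds a UFS file system to a zone cluster. The zone-cluster name sczone and the device paths are hypothetical; see the installation guide for the authoritative procedure:

# clzonecluster configure sczone
clzc:sczone> add fs
clzc:sczone:fs> set dir=/global/apps
clzc:sczone:fs> set special=/dev/md/ds1/dsk/d0
clzc:sczone:fs> set raw=/dev/md/ds1/rdsk/d0
clzc:sczone:fs> set type=ufs
clzc:sczone:fs> end
clzc:sczone> commit
clzc:sczone> exit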
How to Set Up the HAStoragePlus Resource Type for New Resources

In the following example, the resource group resource-group-1 contains the following data services:
HA for Oracle iPlanet Web Server (formerly Sun Java System Web Server), which depends on /global/resource-group-1
HA for Oracle, which depends on /dev/global/dsk/d5s2
HA for NFS, which depends on dsk/d6
Note - To create an HAStoragePlus resource with Oracle Solaris ZFS as a highly available local file system, see How to Set Up the HAStoragePlus Resource Type to Make a Local Solaris ZFS File System Highly Available.
To create an HAStoragePlus resource hastorageplus-1 for new resources in resource-group-1, read Synchronizing the Startups Between Resource Groups and Device Groups and then perform the following steps. To create an HAStoragePlus resource for a highly available local file system, see Enabling Highly Available Local File Systems.
Create the resource group resource-group-1:

# clresourcegroup create resource-group-1
Determine whether the resource type SUNW.HAStoragePlus is registered. The following command prints a list of registered resource types:

# clresourcetype show | egrep Type

If necessary, register the resource type:

# clresourcetype register SUNW.HAStoragePlus
Create the HAStoragePlus resource hastorageplus-1 and define its global device paths and file system mount points:

# clresource create -g resource-group-1 -t SUNW.HAStoragePlus \
-p GlobalDevicePaths=/dev/global/dsk/d5s2,dsk/d6 \
-p FileSystemMountPoints=/global/resource-group-1 hastorageplus-1
GlobalDevicePaths can contain the following values.
Global device group names, such as nfs-dg, dsk/d5
Paths to global devices, such as /dev/global/dsk/d1s2, /dev/md/nfsdg/dsk/d10
FileSystemMountPoints can contain the following values.
Mount points of local or cluster file systems, such as /local-fs/nfs, /global/nfs
Note - HAStoragePlus has a Zpools extension property that is used to configure ZFS file system storage pools and a ZpoolsSearchDir extension property that is used to specify the location to search for the devices of ZFS file system storage pools. The default value for the ZpoolsSearchDir extension property is /dev/dsk. The ZpoolsSearchDir extension property is similar to the -d option of the zpool(1M) command.
The resource is created in the enabled state.
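A resource that manages only a ZFS storage pool could be created by setting the Zpools property instead, as in the following sketch; the pool name hapool and the resource name hastorageplus-2 are hypothetical:

# clresource create -g resource-group-1 -t SUNW.HAStoragePlus \
-p Zpools=hapool hastorageplus-2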
Add the application resources to resource-group-1, and set each application resource's offline restart dependency to hastorageplus-1. For example, for Oracle iPlanet Web Server (formerly Sun Java System Web Server), run the following command, where resource is the name of the application resource:

# clresource create -g resource-group-1 -t SUNW.iws \
-p Confdir_list=/global/iws/schost-1 -p Scalable=False \
-p Resource_dependencies=schost-1 -p Port_list=80/tcp \
-p Resource_dependencies_offline_restart=hastorageplus-1 resource
The resource is created in the enabled state.
Verify that you have correctly configured the resource dependencies:

# clresource show -v resource | egrep Resource_dependencies_offline_restart
Bring resource-group-1 online in a managed state:

# clresourcegroup online -M resource-group-1
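To confirm the result, you can check the status of the group; this check is illustrative rather than part of the original procedure:

# clresourcegroup status resource-group-1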
The HAStoragePlus resource type contains another extension property, AffinityOn, which is a Boolean that specifies whether HAStoragePlus must perform an affinity switchover for the global devices that are defined in the GlobalDevicePaths and FileSystemMountPoints extension properties. For details, see the SUNW.HAStoragePlus(5) man page.
Note - The setting of the AffinityOn flag is ignored for scalable services. Affinity switchovers are not possible with scalable resource groups.
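As a sketch, AffinityOn can be changed on an existing HAStoragePlus resource with clresource set; the value shown is illustrative:

# clresource set -p AffinityOn=False hastorageplus-1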
How to Set Up the HAStoragePlus Resource Type for Existing Resources

Before You Begin

Read Synchronizing the Startups Between Resource Groups and Device Groups.
Determine whether the resource type SUNW.HAStoragePlus is registered. The following command prints a list of registered resource types:

# clresourcetype show | egrep Type

If necessary, register the resource type:

# clresourcetype register SUNW.HAStoragePlus
Create the HAStoragePlus resource hastorageplus-1:

# clresource create -g resource-group \
-t SUNW.HAStoragePlus -p GlobalDevicePaths=… \
-p FileSystemMountPoints=... -p AffinityOn=True hastorageplus-1
The resource is created in the enabled state.
Set the offline restart dependency of each existing application resource on hastorageplus-1, where resource is the name of the application resource:

# clresource set -p Resource_dependencies_offline_restart=hastorageplus-1 resource
Verify that you have correctly configured the resource dependencies:

# clresource show -v resource | egrep Resource_dependencies_offline_restart
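If the dependency is set, the egrep output should include a line that names hastorageplus-1, along the lines of the following; the exact formatting depends on the release:

  Resource_dependencies_offline_restart: hastorageplus-1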