This chapter provides information on preparing to administer the cluster and the procedures for using Sun Cluster administration tools.
Sun Cluster's highly available environment ensures that critical applications are available to end users. The system administrator's job is to make sure that Sun Cluster is stable and operational.
Familiarize yourself with the planning information in the Sun Cluster Software Installation Guide for Solaris OS and the Sun Cluster Concepts Guide for Solaris OS before beginning administration tasks. Sun Cluster administration tasks are organized among the following manuals.
Standard tasks, used to administer and maintain the cluster on a regular, perhaps daily basis. These tasks are described in this guide.
Data service tasks, such as installation, configuration, and changing properties. These tasks are described in the Sun Cluster Data Services Planning and Administration Guide for Solaris OS.
Service tasks, such as adding or repairing storage or network hardware. These tasks are described in the Sun Cluster Hardware Administration Manual for Solaris OS.
For the most part, you can perform Sun Cluster administration tasks while the cluster is operational, with the impact on cluster operation limited to a single node. For procedures that require the entire cluster to be shut down, schedule downtime during off hours to minimize the impact on the system. If you plan to take down the cluster or a cluster node, notify users ahead of time.
You can perform administrative tasks on Sun Cluster by using a Graphical User Interface (GUI) or by using the command line. The following sections provide an overview of the GUI and command-line tools.
Sun Cluster supports Graphical User Interface (GUI) tools that you can use to perform various administrative tasks on your cluster. These GUI tools are SunPlex™ Manager and, if you are using Sun Cluster on a SPARC based system, Sun Management Center. See Chapter 10, Administering Sun Cluster With the Graphical User Interfaces for more information and for procedures about configuring SunPlex Manager and Sun Management Center. For specific information about how to use these tools, see the online help for each GUI.
You can perform most Sun Cluster administration tasks interactively through the scsetup(1M) utility. Whenever possible, administration procedures in this guide are described using scsetup.
You can administer the following Main Menu items through the scsetup utility.
Quorum
Resource groups
Cluster interconnect
Device groups and volumes
Private hostnames
New nodes
Other cluster properties
You can administer the following Resource Group Menu items through the scsetup utility. A sketch of equivalent command-line operations follows the list.
Create a resource group
Add a network resource to a resource group
Add a data service resource to a resource group
Online/Offline or Switchover a resource group
Enable/Disable a resource
Change properties of a resource group
Change properties of a resource
Remove a resource from a resource group
Remove a resource group
Clear the stop_failed error flag from a resource
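These menu items correspond to operations that the scrgadm and scswitch commands perform. The following lines are a hedged sketch of command-line equivalents for a few of the items, assuming a hypothetical resource group test-rg, logical hostname schost-lh, and node phys-schost-2; verify the exact options in the scrgadm(1M) and scswitch(1M) man pages for your release.

# scrgadm -a -g test-rg                      (create a resource group)
# scrgadm -a -L -g test-rg -l schost-lh      (add a network resource to the resource group)
# scswitch -z -g test-rg -h phys-schost-2    (switch the resource group over to another node)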
Table 1-1 lists other commands that you use to administer Sun Cluster. See the man pages for more detailed information.
Table 1–1 Sun Cluster Command-Line Interface Commands

Command | Description
---|---
ccp(1M) | Starts remote console access to the cluster.
if_mpadm(1M) | Switches IP addresses from one adapter to another in an IP Network Multipathing group.
sccheck(1M) | Checks and validates the Sun Cluster configuration to ensure the basic configuration required for a cluster to be functional.
scconf(1M) | Updates a Sun Cluster configuration. The -p option lists cluster configuration information.
scdidadm(1M) | Provides administrative access to the device ID configuration.
scgdevs(1M) | Runs the global device namespace administration script.
scinstall(1M) | Installs and configures Sun Cluster software. The command can be run interactively or non-interactively. The -p option displays release and package version information for the Sun Cluster software.
scrgadm(1M) | Manages the registration of resource types, the creation of resource groups, and the activation of resources within a resource group. The -p option displays information on installed resources, resource groups, and resource types. Note – Resource type, resource group, and resource property names are case insensitive when executing scrgadm.
scsetup(1M) | Runs the interactive cluster configuration utility, which generates the scconf command and its various options.
scshutdown(1M) | Shuts down the entire cluster.
scstat(1M) | Provides a snapshot of the cluster status.
scswitch(1M) | Performs changes that affect node mastery and states for resource groups and disk device groups.
In addition, you use commands to administer the volume manager portion of Sun Cluster. These commands depend on the specific volume manager that your cluster uses: Solstice DiskSuite™, VERITAS Volume Manager, or Solaris Volume Manager™.
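For example, you might check volume status with the native volume manager commands. The following lines are a hedged sketch, assuming a hypothetical Solstice DiskSuite/Solaris Volume Manager diskset named schost-1 and a hypothetical VERITAS disk group named schost-dg; see your volume manager documentation for the exact syntax.

# metastat -s schost-1     (Solstice DiskSuite/Solaris Volume Manager diskset status)
# vxprint -g schost-dg     (VERITAS Volume Manager disk group configuration)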
This section describes what to do to prepare for administering your cluster.
As your Sun Cluster configuration grows, document the hardware aspects that are unique to your site. Referring to this documentation when you change or upgrade the cluster saves administration labor. Labeling the cables and connections between the various cluster components can also make administration easier.
Reduce the time required by a third-party service provider when servicing your cluster by keeping records of your original cluster configuration, and subsequent changes.
You can use a dedicated SPARC workstation, known as the administrative console, to administer the active cluster. Typically, you install and run the Cluster Control Panel (CCP) and graphical user interface (GUI) tools on the administrative console. For more information on the CCP, see How to Log In to Sun Cluster Remotely. For instructions on installing the Cluster Control Panel module for Sun Management Center and SunPlex Manager GUI tools, see the Sun Cluster Software Installation Guide for Solaris OS.
The administrative console is not a cluster node. The administrative console is used for remote access to the cluster nodes, either over the public network or through a network-based terminal concentrator.
If your SPARC cluster consists of a Sun Enterprise™ 10000 server, you must log in from the administrative console to the System Service Processor (SSP). Connect by using the netcon(1M) command. The default method for netcon to connect with a Sun Enterprise 10000 domain is through the network interface. If the network is inaccessible, you can use netcon in “exclusive” mode by setting the -f option. You can also send ~* during a normal netcon session. Either of the previous solutions gives you the option of toggling to the serial interface if the network becomes unreachable.
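For example, a session from the administrative console might look like the following hedged sketch. The SSP host name and login method are hypothetical and depend on your site; see the netcon(1M) man page for details.

admin-console# rlogin ssp-host -l ssp     (log in to the System Service Processor)
ssp% netcon                               (connect to the domain console over the network interface)
ssp% netcon -f                            (or force “exclusive” mode if the network is inaccessible)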
Sun Cluster does not require a dedicated administrative console, but using a console provides these benefits:
Enables centralized cluster management by grouping console and management tools on the same machine
Provides potentially quicker problem resolution by Enterprise Services or your service provider
Back up your cluster on a regular basis. Even though Sun Cluster provides an HA environment, with mirrored copies of data on the storage devices, Sun Cluster is not a replacement for regular backups. Sun Cluster can survive multiple failures, but does not protect against user or program error, or catastrophic failure. Therefore, you must have a backup procedure in place to protect against data loss.
The following information should be included as part of your backup. A sketch of commands that capture some of this information follows the list.
All file system partitions
All database data if you are running DBMS data services
Disk partition information for all cluster disks
The md.tab file if you are using Solstice DiskSuite/Solaris Volume Manager as your volume manager
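The following commands are a hedged sketch of how you might capture disk partition information and the md.tab file before a backup. The disk name c1t0d0 and the backup directory are hypothetical, and the md.tab location shown is the usual Solaris 9 path; Solstice DiskSuite on earlier Solaris releases typically keeps the file under /etc/opt/SUNWmd instead.

# prtvtoc /dev/rdsk/c1t0d0s2 > /backup/c1t0d0.vtoc     (save the disk partition table)
# cp /etc/lvm/md.tab /backup/md.tab                    (save the volume manager md.tab file)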
Table 1–2 provides a starting point for administering your cluster.
Table 1–2 Sun Cluster 3.1 4/04 Administration Tools
Task | Tool | Documentation
---|---|---
Log in to the Cluster Remotely | Use the ccp command to launch the Cluster Control Panel (CCP). Then select one of the following icons: cconsole(1M), crlogin(1M), or ctelnet(1M). | How to Log In to Sun Cluster Remotely
Configure the Cluster Interactively | Launch the scsetup(1M) utility. |
Display Sun Cluster Release Number and Version Information | Use the scinstall(1M) command with either the -p or -pv options. |
Display Installed Resources, Resource Groups, and Resource Types. Note – Resource type, resource group, and resource property names are case insensitive when executing scrgadm. | Use the scrgadm(1M) -p command. | How to Display Configured Resource Types, Resource Groups, and Resources
Monitor Cluster Components Graphically | Use SunPlex Manager or the Sun Cluster module for Sun Management Center (available with Sun Cluster on SPARC based systems only). | SunPlex Manager or Sun Cluster module for Sun Management Center online help
Administer Some Cluster Components Graphically | Use SunPlex Manager or the Sun Cluster module for Sun Management Center (available with Sun Cluster on SPARC based systems only). | SunPlex Manager or Sun Cluster module for Sun Management Center online help
Check the Status of Cluster Components | Use the scstat(1M) command. |
Check the Status of IP Network Multipathing Groups on the Public Network | Use the scstat(1M) command with the -i option. |
View the Cluster Configuration | Use the scconf(1M) -p command. |
Check Global Mount Points | Use the sccheck(1M) command. |
Look at Sun Cluster System Messages | Examine the /var/adm/messages file. | “Viewing System Messages” in System Administration Guide: Advanced Administration (Solaris 9 System Administrator Collection)
Monitor the Status of Solstice DiskSuite | Use the metastat commands. | Solstice DiskSuite/Solaris Volume Manager documentation
Monitor the Status of VERITAS Volume Manager if running Solaris 8 | Use the vxstat or vxva commands. | VERITAS Volume Manager documentation
Monitor the Status of Solaris Volume Manager if running Solaris 9 | Use the svmstat command. |
The Cluster Control Panel (CCP) provides a launch pad for the cconsole(1M), crlogin(1M), and ctelnet(1M) tools. All three tools start a multiple-window connection to a set of specified nodes. The multiple-window connection consists of a host window for each of the specified nodes and a common window. Input to the common window is sent to each of the host windows, allowing you to run commands simultaneously on all nodes of the cluster. See the ccp(1M) and cconsole(1M) man pages for more information.
Verify that the following prerequisites are met before starting the CCP.
Install the SUNWccon package on the administrative console.
Make sure the PATH variable on the administrative console includes the Sun Cluster tools directory, /opt/SUNWcluster/bin, and /usr/cluster/bin. You can specify an alternate location for the tools directory by setting the $CLUSTER_HOME environment variable.
Configure the clusters file, the serialports file, and the nsswitch.conf file if you are using a terminal concentrator. The files can be either /etc files or NIS/NIS+ databases. See clusters(4) and serialports(4) for more information. A sketch of example settings and file entries follows this list.
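The following hedged sketch shows how these prerequisites might look on the administrative console. The cluster name, node names, terminal concentrator name, and port numbers are hypothetical; see clusters(4) and serialports(4) for the authoritative file formats.

A fragment for a Bourne or Korn shell profile (CLUSTER_HOME is only needed for a non-default tools location):

CLUSTER_HOME=/opt/SUNWcluster
PATH=$PATH:/opt/SUNWcluster/bin:/usr/cluster/bin
export CLUSTER_HOME PATH

Example /etc/clusters entry (cluster name followed by its node names):

sc-cluster phys-schost-1 phys-schost-2

Example /etc/serialports entries (node name, terminal concentrator, and TCP port):

phys-schost-1 tc1 5002
phys-schost-2 tc1 5003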
Determine if you have a Sun Enterprise 10000 server platform.
If no, proceed to Step 3.
If yes, log in to the System Service Processor (SSP) and connect by using the netcon command. After the connection is made, type Shift~@ to unlock the console and gain write access.
Start the CCP launch pad.
From the administrative console, type the following command.
# ccp clustername
The CCP launch pad is displayed.
To start a remote session with the cluster, click either the cconsole, crlogin, or ctelnet icon in the CCP launch pad.
You can also start cconsole, crlogin, or ctelnet sessions from the command line.
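For example, to open console windows to every node of a cluster directly, you might type the following from the administrative console, where clustername is the name defined in your clusters file:

# cconsole clustername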
The scsetup(1M) utility enables you to interactively configure quorum, resource group, cluster transport, private hostname, device group, and new node options for the cluster.
Become superuser on any node in the cluster.
Enter the scsetup utility.
# scsetup
The Main Menu is displayed.
Make your configuration selection from the menu. Follow the onscreen instructions to complete a task.
See the scsetup online help for more information.
You do not need to be logged in as superuser to perform these procedures.
Display the Sun Cluster patch numbers.
Sun Cluster update releases are identified by the main product patch number plus the update version.
% showrev -p
Display the Sun Cluster release number and version strings for all Sun Cluster packages.
% scinstall -pv
The following example displays the cluster's release number.
% showrev -p | grep 110648
Patch: 110648-05 Obsoletes:  Requires:  Incompatibles:  Packages:
The following example displays the cluster's release information and version information for all packages.
% scinstall -pv
SunCluster 3.1
SUNWscr:       3.1.0,REV=2000.10.01.01.00
SUNWscdev:     3.1.0,REV=2000.10.01.01.00
SUNWscu:       3.1.0,REV=2000.10.01.01.00
SUNWscman:     3.1.0,REV=2000.10.01.01.00
SUNWscsal:     3.1.0,REV=2000.10.01.01.00
SUNWscsam:     3.1.0,REV=2000.10.01.01.00
SUNWscvm:      3.1.0,REV=2000.10.01.01.00
SUNWmdm:       4.2.1,REV=2000.08.08.10.01
You can also accomplish this procedure by using the SunPlex Manager GUI. Refer to Chapter 10, Administering Sun Cluster With the Graphical User Interfaces. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
Display the cluster's configured resource types, resource groups, and resources.
% scrgadm -p
The following example shows the resource types (RT Name), resource groups (RG Name), and resources (RS Name) configured for the cluster schost.
% scrgadm -p
RT Name: SUNW.SharedAddress
  RT Description: HA Shared Address Resource Type
RT Name: SUNW.LogicalHostname
  RT Description: Logical Hostname Resource Type
RG Name: schost-sa-1
  RG Description:
    RS Name: schost-1
      RS Description:
      RS Type: SUNW.SharedAddress
      RS Resource Group: schost-sa-1
RG Name: schost-lh-1
  RG Description:
    RS Name: schost-3
      RS Description:
      RS Type: SUNW.LogicalHostname
      RS Resource Group: schost-lh-1
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
Check the status of cluster components.
% scstat -p
The following example provides a sample of status information for cluster components returned by scstat(1M).
% scstat -p
-- Cluster Nodes --
                   Node name        Status
                   ---------        ------
  Cluster node:    phys-schost-1    Online
  Cluster node:    phys-schost-2    Online
  Cluster node:    phys-schost-3    Online
  Cluster node:    phys-schost-4    Online
------------------------------------------------------------------

-- Cluster Transport Paths --
                   Endpoint             Endpoint             Status
                   --------             --------             ------
  Transport path:  phys-schost-1:qfe1   phys-schost-4:qfe1   Path online
  Transport path:  phys-schost-1:hme1   phys-schost-4:hme1   Path online
...
------------------------------------------------------------------

-- Quorum Summary --
  Quorum votes possible:  6
  Quorum votes needed:    4
  Quorum votes present:   6

-- Quorum Votes by Node --
                   Node Name        Present  Possible  Status
                   ---------        -------  --------  ------
  Node votes:      phys-schost-1    1        1         Online
  Node votes:      phys-schost-2    1        1         Online
...

-- Quorum Votes by Device --
                   Device Name          Present  Possible  Status
                   -----------          -------  --------  ------
  Device votes:    /dev/did/rdsk/d2s2   1        1         Online
  Device votes:    /dev/did/rdsk/d8s2   1        1         Online
...

-- Device Group Servers --
                          Device Group   Primary         Secondary
                          ------------   -------         ---------
  Device group servers:   rmt/1          -               -
  Device group servers:   rmt/2          -               -
  Device group servers:   schost-1       phys-schost-2   phys-schost-1
  Device group servers:   schost-3       -               -

-- Device Group Status --
                          Device Group   Status
                          ------------   ------
  Device group status:    rmt/1          Offline
  Device group status:    rmt/2          Offline
  Device group status:    schost-1       Online
  Device group status:    schost-3       Offline
------------------------------------------------------------------

-- Resource Groups and Resources --
               Group Name         Resources
               ----------         ---------
  Resources:   test-rg            test_1
  Resources:   real-property-rg   -
  Resources:   failover-rg        -
  Resources:   descript-rg-1      -
...

-- Resource Groups --
               Group Name   Node Name       State
               ----------   ---------       -----
  Group:       test-rg      phys-schost-1   Offline
  Group:       test-rg      phys-schost-2   Offline
...

-- Resources --
               Resource Name   Node Name       State     Status Message
               -------------   ---------       -----     --------------
  Resource:    test_1          phys-schost-1   Offline   Offline
  Resource:    test_1          phys-schost-2   Offline   Offline
-----------------------------------------------------------------

-- IPMP Groups --
               Node Name       Group      Status   Adapter   Status
               ---------       -----      ------   -------   ------
  IPMP Group:  phys-schost-1   sc_ipmp0   Online   qfe1      Online
  IPMP Group:  phys-schost-2   sc_ipmp0   Online   qfe1      Online
------------------------------------------------------------------
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
To check the status of the IP Network Multipathing groups, use the scstat(1M) command.
The following example provides a sample of status information for cluster components returned by scstat -i.
% scstat -i
-----------------------------------------------------------------

-- IPMP Groups --
               Node Name       Group      Status   Adapter   Status
               ---------       -----      ------   -------   ------
  IPMP Group:  phys-schost-1   sc_ipmp1   Online   qfe2      Online
  IPMP Group:  phys-schost-1   sc_ipmp0   Online   qfe1      Online
  IPMP Group:  phys-schost-2   sc_ipmp1   Online   qfe2      Online
  IPMP Group:  phys-schost-2   sc_ipmp0   Online   qfe1      Online
------------------------------------------------------------------
You can also accomplish this procedure by using the SunPlex Manager GUI. See the SunPlex Manager online help for more information.
You do not need to be logged in as superuser to perform this procedure.
View the cluster configuration.
% scconf -p
To display more information using the scconf command, use the verbose options. See the scconf(1M) man page for details.
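For example, the following commands display progressively more detail. The -pv and -pvv verbose forms shown here are assumptions, so confirm them against the scconf(1M) man page for your release.

% scconf -pv
% scconf -pvv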
The following example lists the cluster configuration.
% scconf -p
Cluster name:                                      cluster-1
Cluster ID:                                        0x3908EE1C
Cluster install mode:                              disabled
Cluster private net:                               172.16.0.0
Cluster private netmask:                           255.255.0.0
Cluster new node authentication:                   unix
Cluster new node list:                             <NULL - Allow any node>
Cluster nodes:                                     phys-schost-1 phys-schost-2 phys-schost-3 phys-schost-4

Cluster node name:                                 phys-schost-1
  Node ID:                                         1
  Node enabled:                                    yes
  Node private hostname:                           clusternode1-priv
  Node quorum vote count:                          1
  Node reservation key:                            0x3908EE1C00000001
  Node transport adapters:                         hme1 qfe1 qfe2

Node transport adapter:                            hme1
  Adapter enabled:                                 yes
  Adapter transport type:                          dlpi
  Adapter property:                                device_name=hme
  Adapter property:                                device_instance=1
  Adapter property:                                dlpi_heartbeat_timeout=10000
...
Cluster transport junctions:                       hub0 hub1 hub2

Cluster transport junction:                        hub0
  Junction enabled:                                yes
  Junction type:                                   switch
  Junction port names:                             1 2 3 4
...
Junction port:                                     1
  Port enabled:                                    yes

Junction port:                                     2
  Port enabled:                                    yes
...
Cluster transport cables
                    Endpoint               Endpoint   State
                    --------               --------   -----
  Transport cable:  phys-schost-1:hme1@0   hub0@1     Enabled
  Transport cable:  phys-schost-1:qfe1@0   hub1@1     Enabled
  Transport cable:  phys-schost-1:qfe2@0   hub2@1     Enabled
  Transport cable:  phys-schost-2:hme1@0   hub0@2     Enabled
...
Quorum devices:                                    d2 d8

Quorum device name:                                d2
  Quorum device votes:                             1
  Quorum device enabled:                           yes
  Quorum device name:                              /dev/did/rdsk/d2s2
  Quorum device hosts (enabled):                   phys-schost-1 phys-schost-2
  Quorum device hosts (disabled):
...
Device group name:                                 schost-3
  Device group type:                               SVM
  Device group failback enabled:                   no
  Device group node list:                          phys-schost-3, phys-schost-4
  Diskset name:                                    schost-3
The sccheck(1M) command runs a set of checks to validate the basic configuration required for a cluster to function properly. If no checks fail, sccheck returns to the shell prompt. If a check fails, sccheck produces reports in either the specified or the default output directory. If you run sccheck against more than one node, sccheck will produce a report for each node and a report for multi-node checks.
The sccheck command runs in two steps: data collection and analysis. Data collection can be time consuming, depending on the system configuration. You can invoke sccheck in verbose mode with the -v1 flag to print progress messages, or you can use the -v2 flag to run sccheck in highly verbose mode, which prints more detailed progress messages, especially during data collection.
Run sccheck after performing an administration procedure that might result in changes to devices, volume management components, or the Sun Cluster configuration.
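For example, to collect data and run the checks in highly verbose mode against a single node, you might type the following command (the node name is hypothetical):

# sccheck -v2 -h phys-schost-1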
The following example shows sccheck being run in verbose mode against nodes phys-schost-1 and phys-schost-2 with all checks passing.
# sccheck -v1 -h phys-schost-1,phys-schost-2
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished
#
The following example shows the node phys-schost-2 in the cluster suncluster missing the mount point /global/phys-schost-1. Reports are created in the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE  : Global filesystem /etc/vfstab entries are not consistent across all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across all nodes in this cluster. Analysis indicates: FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the filesystem(s) in question.
...
#
The sccheck(1M) command includes checks which examine the /etc/vfstab file for configuration errors with the cluster file system and its global mount points.
Run sccheck after making cluster configuration changes that have affected devices or volume management components.
The following example shows the node phys-schost-2 in the cluster suncluster missing the mount point /global/phys-schost-1. Reports are created in the output directory /var/cluster/sccheck/myReports/.
# sccheck -v1 -h phys-schost-1,phys-schost-2 -o /var/cluster/sccheck/myReports
sccheck: Requesting explorer data and node report from phys-schost-1.
sccheck: Requesting explorer data and node report from phys-schost-2.
sccheck: phys-schost-1: Explorer finished.
sccheck: phys-schost-1: Starting single-node checks.
sccheck: phys-schost-1: Single-node checks finished.
sccheck: phys-schost-2: Explorer finished.
sccheck: phys-schost-2: Starting single-node checks.
sccheck: phys-schost-2: Single-node checks finished.
sccheck: Starting multi-node checks.
sccheck: Multi-node checks finished.
sccheck: One or more checks failed.
sccheck: The greatest severity of all check failures was 3 (HIGH).
sccheck: Reports are in /var/cluster/sccheck/myReports.
#
# cat /var/cluster/sccheck/myReports/sccheck-results.suncluster.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 3065
SEVERITY : HIGH
FAILURE  : Global filesystem /etc/vfstab entries are not consistent across all Sun Cluster 3.x nodes.
ANALYSIS : The global filesystem /etc/vfstab entries are not consistent across all nodes in this cluster. Analysis indicates: FileSystem '/global/phys-schost-1' is on 'phys-schost-1' but missing from 'phys-schost-2'.
RECOMMEND: Ensure each node has the correct /etc/vfstab entry for the filesystem(s) in question.
...
#
# cat /var/cluster/sccheck/myReports/sccheck-results.phys-schost-1.txt
...
===================================================
= ANALYSIS DETAILS =
===================================================
------------------------------------
CHECK ID : 1398
SEVERITY : HIGH
FAILURE  : An unsupported server is being used as a Sun Cluster 3.x node.
ANALYSIS : This server may not been qualified to be used as a Sun Cluster 3.x node. Only servers that have been qualified with Sun Cluster 3.x are supported as Sun Cluster 3.x nodes.
RECOMMEND: Because the list of supported servers is always being updated, check with your Sun Microsystems representative to get the latest information on what servers are currently supported and only use a server that is supported with Sun Cluster 3.x.
...
#