This document provides the following information for Sun™ Cluster 3.0 5/02 software.
The appendices to this document include installation planning worksheets and examples for planning the Sun Cluster 3.0 5/02 software and data services installation.
The following table lists new features and functionality that require updates to the Sun Cluster documentation. The second column identifies the documentation that was updated. Contact your Sun sales representative for the complete list of supported hardware and software.
Table 1–1 New Features and Functionality

| Feature or Functionality | Documentation Updates |
| --- | --- |
| HAStoragePlus | The Sun Cluster 3.0 5/02 Supplement contains updates to the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide and the Sun Cluster 3.0 12/01 Data Services Developer's Guide to support the HAStoragePlus resource type. The HAStoragePlus resource type can be used to make a local file system highly available within a Sun Cluster environment. The Sun Cluster 3.0 5/02 Error Messages Guide documents new HAStoragePlus error messages. |
| Prioritized Service Management (RGOffload) | The Sun Cluster 3.0 5/02 Supplement contains new procedures and updates to the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide to support the RGOffload resource type. RGOffload allows your cluster to automatically free a node's resources for critical data services by off-loading resource groups containing non-critical data services. The Sun Cluster 3.0 5/02 Error Messages Guide documents new RGOffload error messages. |
| Sun Cluster Security Hardening support for additional data services | The Sun Cluster Security Hardening documentation is available at http://www.sun.com/security/blueprints. From this URL, scroll down to the Architecture heading to locate the article on Sun Cluster Security Hardening. See Sun Cluster Security Hardening for more information. |
| SunPlex Agent Builder enhancements | The Sun Cluster 3.0 5/02 Supplement contains updates to the Sun Cluster 3.0 12/01 Data Services Developer's Guide to support creation of a generic data service (GDS), a single pre-compiled data service, by using SunPlex Agent Builder. |
| Uninstalling Sun Cluster software | The Sun Cluster 3.0 5/02 Supplement contains new cluster-software uninstallation procedures and updates to related procedures in the Sun Cluster 3.0 12/01 Software Installation Guide and the Sun Cluster 3.0 12/01 System Administration Guide. The new -r option to scinstall(1M) removes Sun Cluster software from a node. |
| Upgrade to Sun Cluster 3.0 5/02 software from any previous release of Sun Cluster 3.0 software | Follow procedures in “Upgrading to a Sun Cluster 3.0 Software Update Release” in the Sun Cluster 3.0 12/01 Software Installation Guide to upgrade from any previous release of Sun Cluster 3.0 software. See Upgrading to a Sun Cluster 3.0 Software Update Release for corrections to the Solaris 8 upgrade instructions. |
This section includes additional information on new features and functionality.
Sun Cluster Security Hardening uses the Solaris Operating Environment hardening techniques recommended by the Sun BluePrints™ program to achieve basic security hardening for clusters. The Solaris Security Toolkit automates the implementation of Sun Cluster Security Hardening.
The Sun Cluster Security Hardening documentation is available at http://www.sun.com/security/blueprints. From this URL, scroll down to the Architecture heading to locate the article “Securing the Sun Cluster 3.0 Software.” This document describes how to secure Sun Cluster 3.0 deployments in a Solaris 8 environment. This description includes the use of the Solaris Security Toolkit and other best-practice security techniques recommended by Sun security experts.
Sun Cluster Security Hardening supports all Sun Cluster 3.0 5/02 data services listed in the table below, in a Solaris 8 environment only. Security hardening is not available for Sun Cluster 3.0 5/02 on Solaris 9.
| Data Service Agent | Application Version: Failover | Application Version: Scalable |
| --- | --- | --- |
| Sun Cluster HA for iPlanet Messaging Server | 6.0 | 4.1 |
| Sun Cluster HA for iPlanet Web Server | 6.0 | 4.1 |
| Sun Cluster HA for Apache | 1.3.9 | 1.3.9 |
| Sun Cluster HA for SAP | 4.6D (32 and 64 bit) | 4.6D (32 and 64 bit) |
| Sun Cluster HA for iPlanet Directory Server | 4.12 | N/A |
| Sun Cluster HA for NetBackup | 3.4 | N/A |
| Sun Cluster HA for Oracle | 8.1.7 and 9i (32 and 64 bit) | N/A |
| Sun Cluster HA for Sybase ASE | 12.0 (32 bit) | N/A |
| Sun Cluster Support for Oracle Parallel Server/Real Application Clusters | 8.1.7 and 9i (32 and 64 bit) | N/A |
| Sun Cluster HA for DNS | with OS | N/A |
| Sun Cluster HA for NFS | with OS | N/A |
This section describes the supported software and memory requirements for Sun Cluster 3.0 5/02 software.
Operating environment and patches – Supported Solaris versions and patches are available at the following URL.
For more details, see Patches and Required Firmware Levels.
Volume managers –
On Solaris 8 – Solstice DiskSuite™ 4.2.1 and VERITAS Volume Manager 3.1.1 and 3.2.
On Solaris 9 – Solaris Volume Manager and VERITAS Volume Manager 3.2.
File systems –
On Solaris 8 – Solaris UFS and VERITAS File System 3.4.
On Solaris 9 – Solaris UFS and VERITAS File System 3.4.
Data services (agents) – Contact your Sun sales representative for the complete list of supported data services and application versions. Specify the resource type names when you install the data services with the scinstall(1M) utility and when you register the resource types associated with the data service with the scrgadm(1M) utility.
Table 1–3 Supported Data Services for Sun Cluster 3.0 5/02 Software

| Data Service | Sun Cluster Resource Type |
| --- | --- |
| Sun Cluster HA for Apache | apache |
| Sun Cluster HA for BroadVision One-To-One Enterprise | bv |
| Sun Cluster HA for DNS | dns |
| Sun Cluster HA for iPlanet Web Server | iws |
| Sun Cluster HA for NetBackup | netbackup |
| Sun Cluster HA for NFS | nfs |
| Sun Cluster HA for iPlanet Directory Server | nsldap |
| Sun Cluster HA for Oracle | oracle |
| Sun Cluster HA for SAP | sap |
| Sun Cluster HA for Sybase ASE | sybase |
| Sun Cluster Support for Oracle Parallel Server/Real Application Clusters | N/A |
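As a hedged illustration of the registration step described above, the following commands register the Sun Cluster HA for NFS resource type and then verify the registration. The vendor-prefixed form SUNW.nfs is assumed here to correspond to the nfs entry in Table 1–3; check the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide for the exact resource type name for your data service.

# scrgadm -a -t SUNW.nfs
# scrgadm -pv | grep SUNW.nfs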
Memory Requirements – Sun Cluster 3.0 software requires extra memory above what is configured for a node under a normal workload. The extra memory equals 128 Mbytes plus ten percent. For example, if a standalone node normally requires 1 Gbyte of memory, you need an extra 256 Mbytes to meet memory requirements.
RSMAPI – Sun Cluster 3.0 software supports the Remote Shared Memory Application Programming Interface (RSMAPI) on RSM-capable interconnects, such as PCI‐SCI.
Public Network Management (PNM) will not be supported in the next Sun Cluster feature release. Network adapter monitoring and failover for Sun Cluster will instead be performed by Solaris IP Multipathing.
Use PNM to configure and administer network interface card monitoring and failover. However, the user interfaces to the PNM daemon and the PNM administration commands are obsolete and will be removed in the next Sun Cluster feature release. Do not develop tools that rely on these interfaces. The following interfaces are officially supported in the current release, but are expected to be removed or changed in the next Sun Cluster feature release.
pnmd(1M)
pnmconfig(4)
pnmstat(1M)
pnmset(1M)
pnmrtop(1M)
pnmptor(1M)
To prepare for the transition to IP Multipathing in the next Sun Cluster feature release, consider the following issues.
With the integration of IP Multipathing in the next feature release, the Solaris IP Multipathing administration model and interfaces should be used for network availability management. See the Solaris IP Multipathing Administration Guide for more details.
For IP Multipathing groups, Solaris IP Multipathing requires that each interface have its own IP address, used strictly by the IP Multipathing daemon to monitor the health of that interface. Therefore, before you transition to IP Multipathing, prepare an additional IP address for each interface (active and backup) in a NAFO group.
For example, assume nafo0 is on the 75 subnet and consists of qfe0 and qfe4. Suppose /etc/hostname.qfe0 contains the hostname schostname-1, which maps to an IP address on the 75 subnet. To transition to IP Multipathing in the future, you must allocate two more IP addresses on the 75 subnet, one for qfe0 and one for qfe4. See the Solaris IP Multipathing Administration Guide for more details.
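For illustration only, the following minimal sketch shows what the qfe0 and qfe4 configuration files might look like after the future transition to IP Multipathing, assuming the two additional addresses have been assigned the hypothetical hostnames schostname-1-qfe0-test and schostname-1-qfe4-test and that the group is named ipmp0 (these names are assumptions, not part of this release).

/etc/hostname.qfe0:
schostname-1 netmask + broadcast + group ipmp0 up
addif schostname-1-qfe0-test deprecated -failover netmask + broadcast + up

/etc/hostname.qfe4:
schostname-1-qfe4-test netmask + broadcast + deprecated group ipmp0 -failover standby up

The point of the sketch is only that each physical interface in the group carries its own dedicated test address; see the Solaris IP Multipathing Administration Guide for the authoritative syntax.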
IP Multipathing requires that all interfaces in an IP Multipathing group have distinct hardware (MAC) addresses; that is, setting the eeprom property local-mac-address? to true is required. Interface cards that do not support this include:
X1018A SunSwift™ SBus Adapter
X1059A SunFastEthernet™ Adapter 2.0 SBus Card
You must be prepared to replace these cards during the upgrade to the next feature release. Single-adapter IP Multipathing groups may function despite the lack of support for local-mac-address?=true, but such configurations are not optimal for high availability. See the Solaris IP Multipathing Administration Guide for more details.
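To check how an adapter's MAC-address behavior is currently configured, you can query the OpenBoot PROM variable from Solaris, as in the following minimal example. Do not change the value to true while running Sun Cluster 3.0 software, which does not support that setting (see the guidelines later in these notes).

# eeprom "local-mac-address?"
local-mac-address?=false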
The Sun Cluster 3.0 5/02 user documentation is available online in AnswerBook2™ format for use with AnswerBook2 documentation servers. The Sun Cluster 3.0 5/02 AnswerBook2 documentation set consists of the following collections.
The Sun Cluster 3.0 5/02 Collection, which includes the following manuals.
Sun Cluster 3.0 5/02 Supplement
Sun Cluster 3.0 5/02 Error Messages Guide
Sun Cluster 3.0 12/01 Software Installation Guide
Sun Cluster 3.0 12/01 System Administration Guide
Sun Cluster 3.0 12/01 Hardware Guide
Sun Cluster 3.0 12/01 Data Services Developer's Guide
Sun Cluster 3.0 12/01 Concepts
The Sun Cluster 3.0 12/01 Data Services Collection, which includes the following manual.
Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide
The Sun Cluster 3.0 5/02 Supplement contains additions and changes to the Sun Cluster 3.0 12/01 documentation set. Use this supplement in conjunction with the Sun Cluster 3.0 12/01 manuals that are also provided in the Sun Cluster 3.0 5/02 Collection and with the Sun Cluster 3.0 12/01 Data Services Collection.
In addition, the docs.sun.com℠ Web site enables you to access Sun Cluster documentation on the Web. You can browse the docs.sun.com archive or search for a specific book title or subject at http://docs.sun.com.
AnswerBook2 documentation server software is not provided on the Solaris 9 documentation CD‐ROM. If you are using the Solaris 9 version of Sun Cluster 3.0 5/02 software and do not already have the AnswerBook2 server software, see http://www.sun.com/software/ab2 to download the AnswerBook2 software, installation instructions, and release notes. Alternately, use the PDF versions of the documentation, which are also provided on the Sun Cluster 3.0 5/02 CD‐ROMs. See PDF Files for more information.
The Solaris 8 operating environment release includes AnswerBook2 documentation server software. The Solaris 8 documentation CD‐ROM, which is separate from the Solaris operating environment CD‐ROM, includes the documentation server software. You need the Solaris 8 documentation CD‐ROM to install an AnswerBook2 documentation server.
If you have installed an AnswerBook2 documentation server at your site, you can use the same server for the Sun Cluster AnswerBooks. Otherwise, install a documentation server on a machine at your site. Consider using the administrative console for your cluster as the documentation server. Do not use a cluster node as your AnswerBook2 documentation server.
For information on installing an AnswerBook2 documentation server, load the Solaris 8 documentation CD‐ROM on a server and view the README files.
Install the Sun Cluster AnswerBook2 documents on a file system on the same server on which you install the documentation server. The Sun Cluster AnswerBooks include a post‐install script that automatically adds the documents to your existing AnswerBook library.
Note the following requirements to set up your AnswerBook2 servers.
Your server system must have approximately 34 Megabytes of disk space available for the server software and roughly 600 Megabytes of disk space available for installing document collections.
You must have root (superuser) access to the documentation server.
The AnswerBook2 server must have a CD-ROM drive.
If you do not have an installed AnswerBook2 documentation server, you need the Solaris 8 operating environment documentation CD-ROM, which contains the software instructions for installing an AnswerBook2 documentation server.
The Sun Cluster 3.0 5/02 CD-ROM includes the Sun Cluster 3.0 5/02 Collection. The Sun Cluster 3.0 Agents 5/02 CD-ROM contains the Sun Cluster 3.0 12/01 Data Services Collection.
Use this procedure to install the Sun Cluster AnswerBook packages for the Sun Cluster 3.0 5/02 Collection and Sun Cluster 3.0 12/01 Data Services Collection.
Become superuser on the server that has an AnswerBook2 documentation server.
If you have previously installed the Sun Cluster AnswerBooks, remove the old packages.
If you have never installed Sun Cluster AnswerBooks, skip this step.
# pkgrm SUNWscfab SUNWscdab
Insert the Sun Cluster 3.0 5/02 CD-ROM or Sun Cluster 3.0 Agents 5/02 CD-ROM into a CD‐ROM drive attached to your documentation server.
The Volume Management daemon, vold(1M), mounts the CD‐ROM automatically.
Change directory to the CD-ROM location that contains the Sun Cluster AnswerBook package.
The AnswerBook packages reside at the following locations.
Sun Cluster 3.0 5/02 CD-ROM
/cdrom/suncluster_3_0_u3/SunCluster_3.0/Packages
Sun Cluster 3.0 Agents 5/02 CD-ROM
/cdrom/scdataservices_3_0_u3/components/SunCluster_Data_Service_Answer_Book_3.0/Packages
Use the pkgadd(1) command to install the package.
# pkgadd -d .
Select the Sun Cluster 3.0 5/02 Collection (SUNWscfab) and the Sun Cluster 3.0 12/01 Data Services Collection (SUNWscdab) packages to install.
From the pkgadd installation options menu, choose heavy to add the complete package to the system and update the AnswerBook2 catalog.
Select either the Sun Cluster 3.0 5/02 Collection (SUNWscfab) or the Sun Cluster 3.0 12/01 Data Services Collection (SUNWscdab).
The document collection package on each CD‐ROM includes a post‐installation script that adds the collection to the documentation server's database and restarts the server. You can now view the Sun Cluster AnswerBooks from your documentation server.
The Sun Cluster CD‐ROMs include a PDF file for each book in the Sun Cluster documentation set.
Similar to the Sun Cluster AnswerBooks, seven PDF files reside on the Sun Cluster CD‐ROM and one PDF file resides on the Sun Cluster Agents CD‐ROM. The PDF file names are abbreviations of the books (see Table 1–4).
The PDF files reside at the following locations.
Sun Cluster 3.0 5/02 CD-ROM
/cdrom/suncluster_3_0_u3/SunCluster_3.0/Docs/locale/C/PDF
Sun Cluster 3.0 Agents 5/02 CD-ROM
/cdrom/scdataservices_3_0_u3/components/SunCluster_Data_Service_Answer_Book_3.0/Docs/locale/C/PDF
Table 1–4 PDF Abbreviations and Corresponding Book Titles

| CD-ROM | PDF Abbreviation | Book Title |
| --- | --- | --- |
| Sun Cluster 3.0 5/02 CD-ROM | CLUSTSUPP | Sun Cluster 3.0 5/02 Supplement |
| | CLUSTINSTALL | Sun Cluster 3.0 12/01 Software Installation Guide |
| | CLUSTNETHW | Sun Cluster 3.0 12/01 Hardware Guide |
| | CLUSTAPIPG | Sun Cluster 3.0 12/01 Data Services Developer's Guide |
| | CLUSTSYSADMIN | Sun Cluster 3.0 12/01 System Administration Guide |
| | CLUSTCONCEPTS | Sun Cluster 3.0 12/01 Concepts |
| | CLUSTERRMSG | Sun Cluster 3.0 5/02 Error Messages Guide |
| Sun Cluster 3.0 Agents 5/02 CD-ROM | CLUSTDATASVC | Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide |
The following restrictions apply to the Sun Cluster 3.0 5/02 release:
Remote Shared Memory (RSM) transport types – These transport types are mentioned in the documentation, but they are not supported. If you use the RSMAPI, specify dlpi as the transport type.
Scalable Coherent Interface (SCI) – The SBus SCI interface is not supported as a cluster interconnect. However, the PCI-SCI interface is supported.
Logical network interfaces – These interfaces are reserved for use by Sun Cluster 3.0 software.
Disk path monitoring – Only active disk paths (from the current primary node) are monitored for failures by Sun Cluster software. You must manually monitor disk paths to avoid double failures or loss of path to a quorum device.
SunVTS™ – This is not supported.
Multihost tape, CD-ROM, and DVD-ROM – These are not supported.
Loopback File System – Sun Cluster 3.0 software does not support the use of the loopback file system (LOFS) on cluster nodes.
Running client applications on the cluster nodes – Client applications that run on cluster nodes should not map to logical IP addresses that are part of an HA data service. During failover, these logical IP addresses might go away, leaving the client without a connection.
Running high‐priority process scheduling classes on cluster nodes – This is not supported. Do not run, on any cluster node, any processes that run in the time‐sharing scheduling class with a higher‐than‐normal priority or any processes that run in the real‐time scheduling class. Sun Cluster 3.0 relies on kernel threads that do not run in the real‐time scheduling class. Other time‐sharing processes that run at higher‐than‐normal priority or real‐time processes can prevent the Sun Cluster kernel threads from acquiring needed CPU cycles.
File system quotas – Quotas are not supported in Sun Cluster 3.0 configurations.
Sun Cluster 3.0 software can only provide service for those data services that are either supplied with the Sun Cluster product or set up with the Sun Cluster data services API.
Do not use cluster nodes as mail servers because the Sun Cluster environment does not support the sendmail(1M) subsystem. Mail directories must reside on non-Sun Cluster nodes.
Do not configure cluster nodes as routers (gateways). If the system goes down, the clients cannot find an alternate router and cannot recover.
Do not configure cluster nodes as NIS or NIS+ servers. However, cluster nodes can be NIS or NIS+ clients.
Do not use a Sun Cluster configuration to provide a highly available boot or install service on client systems.
Do not use a Sun Cluster 3.0 configuration to provide an rarpd service.
RAID level 5 is supported on only the following hardware platforms at this time:
Sun StorEdge A5x00/A3500FC arrays.
Sun StorEdge T3 and T3+ arrays. However, note that if you are using these arrays in a single-controller configuration, an additional mechanism for data redundancy, such as host-based mirroring, must also be used. If these arrays are used in a partner-group configuration, the controllers are redundant and you can use RAID 5 without host-based mirroring.
Alternate Pathing (AP) is not supported.
If you are using a Sun Enterprise™ 420R server with a PCI card in slot J4701, the motherboard must be at dash-level 15 or higher (501-5168-15 or higher). To find the motherboard part number and revision level, look at the edge of the board closest to PCI slot 1.
System panics have been observed in clusters when UDWIS I/O cards are used in slot 0 of a board in a Sun Enterprise 10000 server. Do not install UDWIS I/O cards in slot 0 of a board in this server (see BugId 4490386).
In Solstice DiskSuite configurations that use mediators, the number of mediator hosts configured for a diskset must be exactly two.
DiskSuite Tool (metatool) is not compatible with Sun Cluster 3.0 software.
Use of VxVM Dynamic Multipathing (DMP) with Sun Cluster 3.0 software to manage multiple paths from the same node is not supported.
Simple root disk groups (rootdg created on a single slice of the root disk) are not supported as disk types with VxVM on Sun Cluster software.
Software RAID 5 is not supported.
The command umount -f behaves in the same manner as the umount command without the -f option. It does not support forced unmounts.
The unlink(1M) command is not supported on non-empty directories.
The command lockfs -d is not supported. Use lockfs -n as a workaround.
The cluster file system does not support any of the file-system features of Solaris software by which one would put a communication end-point in the file-system name space. Therefore, although you can create a UNIX domain socket whose name is a path name into the cluster file system, the socket would not survive a node failover. In addition, any fifos or named pipes you create on a cluster file system would not be globally accessible, nor should you attempt to use fattach from any node other than the local node.
Executing binaries from file systems that are mounted with the forcedirectio mount option is not supported.
The following VxFS features are not supported in a Sun Cluster 3.0 configuration.
Quick I/O
Snapshots
Storage checkpoints
Cache advisories (these can be used, but the effect will be observed on the given node only)
VERITAS CFS (requires VERITAS cluster feature & VCS)
All other VxFS features and options that are supported in a cluster configuration are supported by Sun Cluster 3.0 software. See VxFS documentation and man pages for details about VxFS options that are or are not supported in a cluster configuration.
The following VxFS-specific mount options are not supported in a Sun Cluster 3.0 configuration.
convosync (Convert O_SYNC)
mincache
qlog, delaylog, tmplog
For a VxFS cluster file system, you must globally mount and unmount the cluster file system from the primary node (the node that masters the disk on which the VxFS file system resides) to ensure that the operation succeeds. A VxFS cluster file system mount or unmount operation that is performed from a secondary node might fail.
For a VxFS cluster file system, you must issue ioctls only from the primary node. If you do not know whether an administration command involves ioctls, issue the command from the primary node.
To administer a VxFS cluster file system, you must perform all VxFS administration commands from the primary node of the VxFS cluster file system.
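A minimal sketch of identifying the primary node before you run VxFS administration commands: the scstat -D output lists each device group with its current primary and secondary nodes.

# scstat -D

Run the global mount, unmount, or other VxFS administration command from the node that is reported as the primary for the device group that hosts the file system.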
All public networking adapters must be in NAFO groups.
Only one NAFO group exists per IP subnet for each node. Sun Cluster 3.0 software does not support even the weak form of IP striping, in which multiple IP addresses exist on the same subnet.
Only one adapter in a NAFO group can be active at any time.
Sun Cluster 3.0 software does not support setting local-mac-address?=true in the OpenBoot™ PROM.
This section describes restrictions for specific data services. There are no restrictions that apply to all data services.
Future Sun Cluster Release Notes will not include data service restrictions that apply to specific data services. However, Sun Cluster Release Notes will document any data service restrictions that apply to all data services.
For additional data service restrictions that apply to specific data services, see the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide.
Accommodate the hostname and node name requirements of the Oracle Parallel Fail Safe/Real Application Clusters Guard option of Oracle Parallel Server/Real Application Clusters before you install Sun Cluster software, because you cannot change hostnames after Sun Cluster software is installed.
For more information on this hostnames/node names restriction, see the Oracle Parallel Fail Safe/Real Application Clusters Guard documentation.
If the VERITAS NetBackup client is a cluster, only one logical host can be configured as the client because there is only one bp.conf file.
If the NetBackup client is a cluster and if one of the logical hosts on the cluster is configured as the NetBackup client, NetBackup cannot back up the physical hosts.
On the cluster running the master server, the master server is the only logical host that can be backed up.
Backup media cannot be attached to the master server, so one or more media servers are required.
No Sun Cluster node may be an NFS client of a Sun Cluster HA for NFS‐exported file system being mastered on a node in the same cluster. Such cross-mounting of Sun Cluster HA for NFS is prohibited. Use the cluster file system to share files among cluster nodes.
Applications running locally on the cluster must not lock files on a file system exported through NFS. Otherwise, local locking (for example, flock(3UCB) or fcntl(2)) might interfere with the ability to restart the lock manager (lockd). During restart, a blocked local process might be granted a lock that a remote client intends to reclaim, which would cause unpredictable behavior.
Sun Cluster HA for NFS requires that all NFS client mounts be “hard” mounts.
For Sun Cluster HA for NFS, do not use hostname aliases for network resources. NFS clients mounting cluster file systems using hostname aliases might experience statd lock recovery problems.
Sun Cluster 3.0 software does not support Secure NFS or the use of Kerberos with NFS, in particular, the secure and kerberos options to the share_nfs(1M) subsystem. However, Sun Cluster 3.0 software does support the use of secure ports for NFS by adding the entry set nfssrv:nfs_portmon=1 to the /etc/system file on cluster nodes.
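As an illustration of the hard-mount and network-resource guidelines above, an NFS client might mount a Sun Cluster HA for NFS file system as follows. The logical hostname schost-nfs-lh and the paths are assumptions used only for this sketch.

# mount -o hard,intr schost-nfs-lh:/global/nfs/export /mnt

Note that the mount names the network resource (logical hostname) itself rather than an alias for it.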
The following guidelines apply to the Sun Cluster 3.0 5/02 release.
The following guideline addresses the problem reported in Bug 4499573. It was determined that the related functionality works as expected. As such, the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide needs to reflect the following guideline.
When using data services that are I/O intensive and that have a large number of disks configured in the cluster, the application may experience delays due to retries within the I/O subsystem during disk failures. An I/O subsystem may take several minutes to retry and recover from a disk failure. This delay can result in Sun Cluster failing over the application to another node, even though the disk may have eventually recovered on its own. To avoid failover during these instances, consider increasing the default probe timeout of the data service. If you need more information or help with increasing data service timeouts, contact your local support engineer.
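As a hedged sketch only, a probe timeout is typically raised by changing the relevant resource property with scrgadm. The resource name oracle-server-rs, the extension property Probe_timeout, and the value of 300 seconds are assumptions; the property that controls probe timeouts differs between data services, so check your data service's documentation before changing anything.

# scrgadm -c -j oracle-server-rs -x Probe_timeout=300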
Identify requirements for all data services before you begin Solaris and Sun Cluster installation. If you do not inform yourself of these requirements, you might perform the installation process incorrectly and thereby need to completely reinstall the Solaris and Sun Cluster software.
For example, the Oracle Parallel Fail Safe/Real Application Clusters Guard option of Oracle Parallel Server/Real Application Clusters has special requirements for the hostnames/node names that you use in the cluster. You must accommodate these requirements before you install Sun Cluster software because you cannot change hostnames after you install Sun Cluster software. For more information on the special requirements for the hostnames/node names, see the Oracle Parallel Fail Safe/Real Application Clusters Guard documentation.
This section provides information about patches for Sun Cluster configurations.
Sun Cluster software is an early adopter of PatchPro, a patch-management solution from Sun. This new tool is intended to ease the selection and download of patches required for installation or maintenance of Sun Cluster software. PatchPro provides a Sun Cluster-specific Interactive Mode tool to make the installation of patches easier and an Expert Mode tool to maintain your configuration with the latest set of patches. Expert Mode is especially useful for those who want to get all of the latest patches, not just the high availability and security patches.
You must have a SunSolve℠ account registered to view and download the required patches for the Sun Cluster product. If you don't have an account registered, contact your Sun service representative or sales engineer, or register through the SunSolve Online Web site.
To access the PatchPro tool for Sun Cluster software, go to http://www.sun.com/PatchPro/, click on “Sun Cluster,” then choose either Interactive Mode or Expert Mode. Follow the instructions in the PatchPro tool to describe your cluster configuration and download the patches.
The SunSolve Online℠ Web site provides 24-hour access to the most up-to-date information regarding patches, software, and firmware for Sun products. Access the SunSolve Online site at http://sunsolve.sun.com for the most current matrixes of supported software, firmware, and patch revisions.
You must have a SunSolve account registered to view and download the required patches for the Sun Cluster product. If you don't have an account registered, contact your Sun service representative or sales engineer, or register through the SunSolve Online Web site.
You can find Sun Cluster 3.0 patch information by using the SunSolve EarlyNotifier℠ Service. To view the EarlyNotifier information, log into SunSolve and access the Simple search selection from the top of the main page. From the Simple Search page, click on the EarlyNotifier box and type Sun Cluster 3.0 in the search criteria box. This will bring up the EarlyNotifier page for Sun Cluster 3.0 software.
Before you install Sun Cluster 3.0 software and apply patches to a cluster component (Solaris operating system, Sun Cluster software, volume manager or data services software, or disk hardware), review the EarlyNotifier information and any README files that accompany the patches. All cluster nodes must have the same patch level for proper cluster operation.
For specific patch procedures and tips on administering patches, see the Sun Cluster 3.0 12/01 System Administration Guide.
To view license terms, attribution, and copyright statements for mod_ssl, refer to the Sun Cluster 3.0 README file on the Sun Cluster 3.0 5/02 CD-ROM.
This section describes how to upgrade from Sun Management Center 2.1.1 to Sun Management Center 3.0 software on a Sun Cluster 3.0 5/02 configuration.
Perform this procedure to upgrade from Sun Management Center 2.1.1 to Sun Management Center 3.0 software on a Sun Cluster 3.0 5/02 configuration.
Have available the following items.
Sun Cluster 3.0 5/02 CD-ROM or the path to the CD‐ROM image. You will use the CD‐ROM to reinstall the Sun Cluster module packages after you upgrade Sun Management Center software.
Sun Management Center 3.0 documentation.
Sun Management Center 3.0 patches and Sun Cluster module patches, if any. See Patches and Required Firmware Levels for the location of patches and installation instructions.
Stop any Sun Management Center processes.
If the Sun Management Center console is running, exit the console.
In the console window, select File>Exit from the menu bar.
On each Sun Management Center agent machine (cluster node), stop the Sun Management Center agent process.
# /opt/SUNWsymon/sbin/es-stop -a
On the Sun Management Center server machine, stop the Sun Management Center server process.
# /opt/SUNWsymon/sbin/es-stop -S
As superuser, remove Sun Cluster module packages from the locations listed in Table 1–5.
You must remove all Sun Cluster module packages from all locations. Otherwise, the Sun Management Center software upgrade might fail because of package dependency problems. After you upgrade Sun Management Center software, you will reinstall these packages in Step 5.
# pkgrm module-package
Table 1–5 Sun Cluster Module Packages to Remove

| Location | Package to Remove |
| --- | --- |
| Each cluster node | SUNWscsam, SUNWscsal |
| Sun Management Center console machine | SUNWscscn |
| Sun Management Center server machine | SUNWscssv |
| Sun Management Center help server machine | SUNWscshl |
Upgrade to Sun Management Center 3.0 software.
Follow the upgrade procedures in your Sun Management Center 3.0 documentation.
As superuser, reinstall Sun Cluster module packages to the locations listed in Table 1–6.
For Sun Management Center 3.0 software, you install the help server package SUNWscshl on the console machine as well as on the help server machine.
# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Packages
# pkgadd module-package
Table 1–6 Sun Cluster Module Packages to Install

| Location | Package to Install |
| --- | --- |
| Each cluster node | SUNWscsam, SUNWscsal |
| Sun Management Center console machine | SUNWscscn, SUNWscshl |
| Sun Management Center server machine | SUNWscssv |
| Sun Management Center help server machine | SUNWscshl |
Apply any Sun Management Center patches and any Sun Cluster module patches to each node of the cluster.
Restart Sun Management Center agent, server, and console processes on all involved machines.
Follow procedures in “How to Start Sun Management Center” in the Sun Cluster 3.0 12/01 Software Installation Guide.
Load the Sun Cluster module.
Follow procedures in “How to Load the Sun Cluster Module” in the Sun Cluster 3.0 12/01 Software Installation Guide.
If the Sun Cluster module was previously loaded, unload the module and then reload it to clear all cached alarm definitions on the server. To unload the module, from the console's Details window select Module>Unload Module.
This section describes undocumented information about the Sun Cluster 3.0 module to Sun Management Center 3.0. For information about upgrade to Sun Management Center 3.0, see Sun Management Center Software Upgrade.
From the Sun Cluster module console you can create, change the state of, or delete resources and resource groups. You can access them by opening the Sun Cluster Details window and choosing the options from the hierarchy (tree) or topology views.
You can access the resource and resource group creation wizards from the Resource Group subtree in the hierarchy (tree) view.
You can create, change the state of, or delete resources and resource groups from a pop-up menu available at each table shown below.
Pop-Up Menu Items and Associated Tables
Access from the Resource Group Status table and Resource Group Properties table:
Bring Online
Take Offline
Delete Selected Resource Group
Create New Resource Group
Create New Resource
Access from the Resource Status table and Resource Configuration table:
Enable
Disable
Delete Resource
Create New Resource Group
Create New Resource
Perform the following steps to access the wizards to create a resource or resource group.
In either the hierarchy (tree) or topology view, double-click Operating System>Sun Cluster.
Click the right mouse button on the Resource Groups item, or on any item within the Resource Groups subtree.
Choose “Create New Resource Group” or “Create New Resource” from the pop-up menu.
Perform the following procedure to use the creation wizard on the pop-up menus, accessible from the resource and resource group tables.
Display either the resource table or the resource group table.
Point to any cell entry in the table, excluding the header row.
Click the right mouse button.
Choose the action you want from the pop-up menu.
Perform the following steps to alter the state of a resource or to delete a resource or resource group. Use the pop-up menus from the resource and resource group tables to enable or disable a resource, or to bring a resource group online or take it offline.
Display either the resource or resource group table.
Select the item that you want to modify.
To delete an entry, select the resource or resource group to delete.
To change the state of an entry, select the state cell in the row of the resource or resource group to change.
Click the right mouse button.
Choose from the pop-up menu one of the following tasks to perform.
Bring Online
Take Offline
Enable
Disable
Delete Selected Resource Group
Delete Resource
When you delete a resource or resource group, or change its state, the Sun Cluster module launches a Probe Viewer window. If the Sun Cluster module successfully performs the task that you choose, the Probe Viewer window displays the message Probe command returned no data. If the task is not completed successfully, the window displays an error message.
See Sun Management Center documentation and online help for more information about Sun Management Center.
The following known problems affect the operation of the Sun Cluster 3.0 5/02 release. For the most current information, see the online Sun Cluster 3.0 5/02 Release Notes Supplement at http://docs.sun.com.
Problem Summary: When using Sun Enterprise 10000 servers in a cluster, panics have been observed in these servers when a certain configuration of I/O cards is used.
Workaround: Do not install UDWIS I/O cards in slot 0 of an SBus I/O board in Sun Enterprise 10000 servers in a cluster.
Problem Summary: Record locking does not work from other nodes when the device being locked is a global device, for example, /dev/global/rdsk/d4s0.
Record locking appears to work well when the program is run multiple times in the background on any particular node. The expected behavior is that after the first copy of the program locks a portion of the device, other copies of the program block waiting for the device to be unlocked. However, when the program is run from a different node, the program succeeds in locking the device again when in fact it should block waiting for the device to be unlocked.
Workaround: There is no workaround.
Problem Summary: When a Sun Cluster configuration is upgraded to Solaris 8 10/01 software (required for Sun Cluster 3.0 12/01 upgrade), the Apache application start and stop scripts are restored. If an Apache data service (Sun Cluster HA for Apache) is already present on the cluster and configured in its default configuration (the /etc/apache/httpd.conf file exists and the /etc/rc3.d/S50apache file does not exist), the Apache application starts on its own, independent of the Sun Cluster HA for Apache data service. This prevents the data service from starting because the Apache application is already running.
Workaround: Do the following for each node.
Before you shut down a node to upgrade it, determine whether the following links already exist, and if so, whether the file names contain an uppercase K or S.
/etc/rc0.d/K16apache
/etc/rc1.d/K16apache
/etc/rc2.d/K16apache
/etc/rc3.d/S50apache
/etc/rcS.d/K16apache
If these links already exist and contain an uppercase K or S in the file name, no further action is necessary. Otherwise, perform the action in the next step after you upgrade the node to Solaris 8 10/01 software.
After the node is upgraded to Solaris 8 10/01 software, but before you reboot the node, move aside the restored Apache links by renaming the files with a lowercase k or s.
# mv /a/etc/rc0.d/K16apache /a/etc/rc0.d/k16apache
# mv /a/etc/rc1.d/K16apache /a/etc/rc1.d/k16apache
# mv /a/etc/rc2.d/K16apache /a/etc/rc2.d/k16apache
# mv /a/etc/rc3.d/S50apache /a/etc/rc3.d/s50apache
# mv /a/etc/rcS.d/K16apache /a/etc/rcS.d/k16apache
Problem Summary: Sun Cluster HA for NFS requires files [SUCCESS=return] for the hosts lookup entry in the /etc/nsswitch.conf file, and requires that all cluster private IP addresses be present in the /etc/inet/hosts file on all cluster nodes.
Otherwise, Sun Cluster HA for NFS will not be able to fail over correctly in the presence of public network failures.
Workaround: Perform the following steps on each node of the cluster.
Modify the hosts entry in the /etc/nsswitch.conf file so that, upon success in resolving a name locally, it returns success immediately and does not contact NIS or DNS.
hosts: cluster files [SUCCESS=return] nis dns
Add entries for all cluster private IP addresses to the /etc/inet/hosts file.
You only need to list the IP addresses plumbed on the physical private interfaces in the /etc/nsswitch.conf and /etc/inet/hosts files. The logical IP addresses are already resolvable through the cluster nsswitch library.
To list the physical private IP addresses, run the following command on any cluster node.
% grep ip_address /etc/cluster/ccr/infrastructure
Each IP address in this list must be assigned a unique hostname that does not conflict with any other hostname in the domain.
Sun Cluster software already requires that any HA IP addresses (LogicalHostname/SharedAddresses) be present in /etc/inet/hosts on all cluster nodes and that files is listed before nis or dns. The additional requirements mandated by this bug are to list [SUCCESS=return] after files and to list all cluster private IP addresses in the /etc/inet/hosts file.
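For illustration only, the resulting /etc/inet/hosts additions might look like the following. The addresses and hostnames shown are assumptions; use the addresses that the grep command above reports for your cluster, each paired with a unique hostname.

# Cluster private (physical) interconnect addresses - example values only
172.16.0.129   phys-schost-1-priv-physical1
172.16.0.130   phys-schost-2-priv-physical1
172.16.1.1     phys-schost-1-priv-physical2
172.16.1.2     phys-schost-2-priv-physical2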
Problem Summary: On rare occasions, private interconnect transport paths ending at a qfe adapter fail to come up.
Workaround: Perform the following steps.
Identify the adapter that is at fault.
scstat -W output should show all transport paths with that adapter as one of the path endpoints in the “faulted” or the “waiting” states.
Use scsetup(1M) to remove all cables connected to that adapter from the cluster configuration.
Use scsetup again to remove that adapter from the cluster configuration.
Add the adapter and the cables back to the cluster configuration.
Verify whether these steps fixed the problem and whether the paths are able to come back up.
If removing the cables and the adapter and then adding them back does not work, repeat the procedure a few times. If that does not help, reboot the node that has the problem adapter. There is a good chance that the problem will be gone when the node boots up. Before you reboot the node, ensure that the remaining cluster has enough quorum votes to survive the node reboot.
Problem Summary: If the rpc.pmfd daemon monitors a process that forks a new process as the result of handling a signal, then using pmfadm -k tag signal might result in an infinite loop. This might occur because pmfadm(1M) attempts to kill all processes in the tag's process tree while the newly forked processes are being added to the tree (each one being added as a result of killing a previous one).
This bug should not occur with pmfadm -s tag signal.
Workaround: Use pmfadm -s tag signal instead of pmfadm -k. The -s option to pmfadm does not suffer from the same race condition as the -k option.
Problem Summary: Using the forcedirectio mount option and the mmap(2) function concurrently might cause data corruption and system hangs or panics.
Workaround: Observe the following restrictions.
Do not remount a file system with the directio mount option added at remount time.
Do not set the directio mount option on a single file by using the directio ioctl.
If there is a need to use directio, mount the whole file system with directio options.
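A minimal sketch of the whole-file-system approach, assuming a global UFS file system; the device path and mount point are assumptions.

# mount -F ufs -o global,forcedirectio /dev/global/dsk/d4s0 /global/oradata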
Problem Summary: If an attempt is made to mount the same device on different mount points, the system will catch this error most of the time and cause the second mount to fail. However, under certain rare conditions, the system might not be able to catch this error and could allow both mounts to succeed. This happens only when all four of the following conditions hold true.
The two mounts are performed concurrently
The same device is mounted
The device is mounted on two different mount points
One mount is global while the other mount is local.
Workaround: The system administrator should exercise caution when mounting file systems on the cluster.
Problem Summary: The scconf(1M) command might not reminor the VxVM disk groups in some cases, and instead reports an error that the device is already in use in another device group.
Workaround: Perform the following steps to assign a new minor number to the disk group.
Find the minor numbers already in use.
Observe the minor numbers in use along with the major number listed in the following output.
% ls -l /dev/vx/rdsk/*/*
crw-------   1 root     root   210,107000 Mar 11 18:18 /dev/vx/rdsk/fix/vol-01
crw-------   1 root     root   210,88000  Mar 15 16:31 /dev/vx/rdsk/iidg/vol-01
crw-------   1 root     root   210,88001  Mar 15 16:32 /dev/vx/rdsk/iidg/vol-02
crw-------   1 root     root   210,88002  Mar 15 16:33 /dev/vx/rdsk/iidg/vol-03
crw-------   1 root     root   210,88003  Mar 15 16:49 /dev/vx/rdsk/iidg/vol-04
crw-------   1 root     root   210,13000  Mar 18 16:09 /dev/vx/rdsk/sndrdg/vol-01
crw-------   1 root     root   210,13001  Mar 18 16:08 /dev/vx/rdsk/sndrdg/vol-02
Choose any other multiple of 1000 that is not in use as the base minor number for the new disk group.
Assign the unused minor number to the disk group in error.
Use the vxdg command's reminor option.
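For example, the following hedged sketch uses the sample listing above, in which base minor numbers 107000, 88000, and 13000 are in use and the multiple 5000 is not. The disk group name and base number are assumptions for your configuration.

# vxdg reminor sndrdg 5000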
Retry the failed scconf command.
Problem Summary: On Solaris 9, the Sun Cluster HA for Oracle data service's stop method can time out, in case of public network failure, if external name services are not available. The Sun Cluster HA for Oracle data service uses the su(1M) user command to start and stop the database.
Workaround: On each node that can be a primary for the oracle_server or oracle_listener resource, modify the /etc/nsswitch.conf file to include the following entries for passwd, group, publickey and project databases.
passwd:    files
group:     files
publickey: files
project:   files
These modifications ensure that the su(1M) command does not refer to the NIS/NIS+ name services and that the data service starts and stops correctly if the network fails.
Problem Summary: Use of sendfile(3EXT) will panic the node.
Workaround: There is no workaround for this problem except not to use sendfile.
Problem Summary: On Solaris 9, a cluster node that is being shut down might panic with the following message on its way down.
CMM: Shutdown timer expired. Halting
Workaround: There is no workaround for this problem. The node panic has no other side effects and can be treated as relatively harmless.
Problem Summary: Creation of an HAStoragePlus resource fails if the order of the file-system mount points specified in the FilesystemMountPoints extension property is not the same as the order specified in the /etc/vfstab file.
Workaround: Ensure that the mount point list specified in the FilesystemMountPoints extension property matches the sequence specified in the /etc/vfstab file. For example, if the /etc/vfstab file specifies file system entries in the sequence /a, /b, and /c, the FilesystemMountPoints sequence can be “/a,/b,/c” or “/a,/b” or “/a,/c” but not “/a,/c,/b.”
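A hedged sketch of a configuration that satisfies this ordering rule; the resource group rg1, the resource name hasp-rs, and the mount points are assumptions. If the /etc/vfstab file lists the file systems in the order /a, /b, /c, the resource can be created as follows.

# scrgadm -a -j hasp-rs -g rg1 -t SUNW.HAStoragePlus \
-x FilesystemMountPoints=/a,/b,/c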
Problem Summary: Setting the Failover_enabled extension property to FALSE is supposed to prevent the resource monitor from initiating a resource group failover.
However, if the monitor is attempting a resource restart and the START or STOP method fails or times out, the monitor attempts a giveover regardless of the setting of Failover_enabled.
Workaround: There is no workaround for this bug.
Problem Summary: Solstice DiskSuite soft partition-based device groups on locally mounted VxFS can trigger errors if device group switchover commands (scswitch -D device-group) are issued.
Solstice DiskSuite internally performs mirror resync operations, which can take a significant amount of time and which degrade redundancy. At that point VxFS reports errors, causing fault monitor or application I/O failures that result in application restarts.
Workaround: For any Solstice DiskSuite device group configured with HAStoragePlus, do not switch over the device group manually. Instead, switch over the resource group, which in turn will cause error-free device switchovers.
Alternately, configure locally mounted VxFS file systems on VxVM disk groups.
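For illustration of the first workaround, switch the resource group rather than the device group directly; the resource group and node names below are assumptions.

# scswitch -z -g oracle-rg -h phys-schost-2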
Problem Summary: Some error messages were not included on the Sun Cluster 3.0 5/02 CD‐ROM.
Workaround: These error messages are documented in New Error Messages.
Problem Summary: fsck(1M) of a file system resident on a Sun Cluster global Solstice DiskSuite/VxVM device group fails if executed from a non-primary (secondary) node. This has been observed on Solaris 9, although it is possible that earlier Solaris releases could exhibit this behavior.
Workaround: Invoke the fsck command only on the primary node.
Problem Summary: A Sun Cluster HA for Oracle listener resource does not behave correctly if multiple listener resources are configured to start listeners with the same listener name.
Workaround: Do not use the same listener name for multiple listeners running on a cluster.
Problem Summary: Dissociating or detaching a plex in a VxVM disk group under Sun Cluster 3.0 might panic the cluster node with the following panic string.
panic[cpu2]/thread=30002901460: BAD TRAP: type=31 rp=2a101b1d200 addr=40 mmu_fsr=0 occurred in module "vxfs" due to a NULL pointer dereference
Workaround: Before you dissociate/detach a plex, unmount the corresponding file system.
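A minimal sketch of the workaround; the mount point, disk group, and plex names are assumptions for your configuration.

# umount /global/oradata
# vxplex -g oradg dis vol01-02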
Problem Summary: Failover does not occur when the resource group property auto_start_on_new_cluster is set to false.
Workaround: Each time the whole cluster reboots, for each resource group that has the auto_start_on_new_cluster property set to false, set the property to true and then reset it to false.
# scrgadm -c -g rgname -y auto_start_on_new_cluster=true
# scrgadm -c -g rgname -y auto_start_on_new_cluster=false
Problem Summary: For globally mounted VxFS file systems, the /etc/mnttab file might not display the global option.
Workaround: If an /etc/mnttab entry is found on all the nodes of the cluster for the given file system, this shows that the file system is globally mounted.
Problem Summary: On remounting a globally mounted file system, /etc/mnttab is not updated.
Workaround: There is no workaround.
Problem Summary: When using Sun Cluster HA for NFS with HAStoragePlus, blocking locks are not recovered during failovers and switchovers. As a result, lockd cannot be restarted by Sun Cluster HA for NFS, which leads to failure of the nfs_postnet_stop method, causing the cluster node to crash.
Workaround: Do not use Sun Cluster HA for NFS on HAStoragePlus. Cluster file systems do not suffer from this problem, therefore configuring Sun Cluster HA for NFS on a cluster file system can be used as a workaround.
Problem Summary: When an HTTP server is killed on a node, it leaves a PID file on that node. Next time the HTTP server is started, it checks if the PID file exists and checks if any process with the PID is already running (kill -0). Since PIDs are recycled, there could be some other process with the same PID as the last HTTP server PID. This will cause the HTTP server startup to fail.
Workaround: If the HTTP server fails to start with an error like the following, manually remove the PID file for the HTTP server to restart correctly.
Mar 27 17:47:58 ppups4 uxwdog[939]: could not log PID to PidLog /app/iws/https-schost-5.example.com/logs/pid, server already running (No such file or directory)
Problem Summary: To avoid panics when using VERITAS products such as VxFS with Sun Cluster software, the default thread stack size needs to be increased.
Workaround: Increase the stack size by putting the following lines in the /etc/system file.
set lwp_default_stksize=0x6000
set svc_default_stksize=0x8000
The svc_default_stksize entry is needed for NFS operations.
After you install the VERITAS packages, verify that VERITAS has not added similar statements to the /etc/system file. If it has, resolve them into a single statement that uses the higher value.
Problem Summary: In a greater-than-two-node device group with an ordered node list, if the node being removed is not the last in the ordered list, then the scconf output will show partial information about the node list.
Workaround:
To prevent the user from hitting this bug, remove the nodes one by one, starting from the last node seen in the node list until the selected node is removed. Then add the other nodes back to the device group.
To repair the cluster state once you've hit this bug, do the following:
Stop any services (file systems, data services) using the device group.
Unregister (remove) the device group from the cluster configuration by using scsetup(1M). Do not remove or alter the underlying volume-manager disk group itself.
Reregister the device group (as though it were a new device group) by using scsetup(1M), adding the correct list of nodes.
When re-adding the removed node back into the device group, explicitly give an ordered list with all valid nodes included.
Problem Summary: After powering off one of the Sun StorEdge T3 Arrays then running scshutdown, rebooting both nodes puts the cluster in a non-working state.
Workaround: If half the replicas are lost, perform the following steps:
Ensure that the cluster is in cluster mode.
Forcibly import the diskset.
# metaset -s set-name -f -C take
Delete the broken replicas.
# metadb -s set-name -fd /dev/did/dsk/dNsX
Release the diskset.
# metaset -s set-name -C release
Now the file system can be mounted and used. However, the redundancy in the replicas has not been restored. If the other half of replicas is lost, then there will be no way to restore the mirror to a sane state.
Re-create the state database replicas after the above repair procedure is applied.
This section discusses known errors or omissions for documentation, online help, or man pages and steps to correct these problems.
A note in SunPlex Manager's online help is inaccurate. The note appears in the Oracle data service installation procedure. The correction is as follows.
Incorrect:
Note: If no entries exist for the shmsys and semsys variables in the /etc/system file when SunPlex Manager packages are installed, default values for these variables are automatically put in the /etc/system file. The system must then be rebooted. Check Oracle installation documentation to verify that these values are appropriate for your database.
Correct:
Note: If no entries exist for the shmsys and semsys variables in the /etc/system file when you install the Oracle data service, default values for these variables can be automatically put in the /etc/system file. The system must then be rebooted. Check Oracle installation documentation to verify that these values are appropriate for your database.
The introductory paragraph to “Installing Sun Cluster HA for Oracle Packages” in the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide does not discuss the additional package needed for users with clusters running Sun Cluster HA for Oracle with 64‐bit Oracle. The following section corrects the introductory paragraph to “Installing Sun Cluster HA for Oracle Packages” in the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide.
Depending on your configuration, use the scinstall(1M) utility to install one or both of the following packages on your cluster. Do not use the -s option to non‐interactive scinstall to install all of the data service packages.
SUNWscor: Cluster running Sun Cluster HA for Oracle with 32 bit Oracle or 64‐bit Oracle
SUNWscorx: Cluster running Sun Cluster HA for Oracle with 64‐bit Oracle
SUNWscor is the prerequisite package for SUNWscorx.
If you installed the SUNWscor data service package as part of your initial Sun Cluster installation, proceed to “Registering and Configuring Sun Cluster HA for Oracle” on page 30. Otherwise, use the following procedure to install the SUNWscor and SUNWscorx packages.
Simple root disk groups are not supported as disk types with VERITAS Volume Manager on Sun Cluster software. As a result, if you perform the procedure “How to Restore a Non-Encapsulated root (/) File System (VERITAS Volume Manager)” in the Sun Cluster 3.0 12/01 System Administration Guide, you should eliminate Step 9, which tells you to determine if the root disk group (rootdg) is on a single slice on the root disk. You would complete Step 1 through Step 8, skip Step 9, and proceed with Step 10 to the end of the procedure.
The following is a correction to Step 8 of “How to Upgrade to a Sun Cluster 3.0 Software Update Release” in the Sun Cluster 3.0 12/01 Software Installation Guide.
(Optional) Upgrade Solaris 8 software.
Temporarily comment out all global device entries in the /etc/vfstab file.
Do this to prevent the Solaris upgrade from attempting to mount the global devices.
Shut down the node to upgrade.
# shutdown -y -g0
ok
Follow instructions in the installation guide for the Solaris 8 Maintenance Update version you want to upgrade to.
Do not reboot the node when prompted to reboot.
Uncomment all global device entries that you commented out in Step a in the /a/etc/vfstab file.
Install any Solaris software patches and hardware-related patches, and download any needed firmware contained in the hardware patches.
If any patches require rebooting, reboot the node in non-cluster mode as described in Step f.
Reboot the node in non-cluster mode.
Include the double dashes (--) and two quotation marks (") in the command.
# reboot -- "-x"
The following upgrade procedures contain changes and corrections to the procedures since release of the Sun Cluster 3.0 12/01 Software Installation Guide.
To upgrade from Sun Cluster 2.2 to Sun Cluster 3.0 5/02 software, perform the following procedures instead of the versions documented in the Sun Cluster 3.0 12/01 Software Installation Guide.
Become superuser on a cluster node.
If you are installing from the CD‐ROM, insert the Sun Cluster 3.0 5/02 CD-ROM into the CD‐ROM drive on a node.
If the volume daemon vold(1M) is running and configured to manage CD‐ROM devices, it automatically mounts the CD‐ROM on the /cdrom/suncluster_3_0_u3 directory.
Change to the /cdrom/suncluster_3_0_u3/SunCluster_3.0/Packages directory.
# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Packages
If your volume manager is Solstice DiskSuite, install the latest Solstice DiskSuite mediator package (SUNWmdm) on each node.
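As shown in the example at the end of this procedure, you can add the package from the Packages directory with the pkgadd command, for example:

# pkgadd -d . SUNWmdm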
Reconfigure mediators.
Determine which node has ownership of the diskset to which you will add the mediator hosts.
# metaset -s setname
-s setname: Specifies the diskset name
If no node has ownership, take ownership of the diskset.
# metaset -s setname -t
-t: Takes ownership of the diskset
Recreate the mediators.
# metaset -s setname -a -m mediator-host-list
-a: Adds to the diskset
-m mediator-host-list: Specifies the names of the nodes to add as mediator hosts for the diskset
Repeat for each diskset.
On each node, shut down the rpc.pmfd daemon.
# /etc/init.d/initpmf stop
Upgrade the first node to Sun Cluster 3.0 5/02 software.
These procedures will refer to this node as the first-installed node.
On the first node to upgrade, change to the /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools directory.
# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools
Upgrade the cluster software framework.
# ./scinstall -u begin -F
-F: Specifies that this is the first-installed node in the cluster
See the scinstall(1M) man page for more information.
Install any Sun Cluster patches on the first node.
See the Sun Cluster 3.0 5/02 Release Notes for the location of patches and installation instructions.
Reboot the node.
# shutdown -g0 -y -i6
When the first node reboots into cluster mode, it establishes the cluster.
Upgrade the second node to Sun Cluster 3.0 5/02 software.
On the second node, change to the /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools directory.
# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools
Upgrade the cluster software framework.
# ./scinstall -u begin -N node1
-N node1: Specifies the name of the first-installed node in the cluster, not the name of the second node to be installed
See the scinstall(1M) man page for more information.
Install any Sun Cluster patches on the second node.
See the Sun Cluster 3.0 5/02 Release Notes for the location of patches and installation instructions.
Reboot the node.
# shutdown -g0 -y -i6
After both nodes are rebooted, verify from either node that both nodes are cluster members.
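As in the example at the end of this procedure, you can run the scstat command to check membership. The Cluster Nodes portion of the status report looks similar to the following output.

# scstat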
-- Cluster Nodes --
                    Node name      Status
                    ---------      ------
  Cluster node:     phys-schost-1  Online
  Cluster node:     phys-schost-2  Online
See the scstat(1M) man page for more information about displaying cluster status.
Choose a shared disk to be the quorum device.
You can use any disk shared by both nodes as a quorum device. From either node, use the scdidadm(1M) command to determine the shared disk's device ID (DID) name. You specify this device name in Step 5, in the -q globaldev=DIDname option to scinstall.
# scdidadm -L
Configure the shared quorum device.
Start the scsetup(1M) utility.
# scsetup
The Initial Cluster Setup screen is displayed.
If the quorum setup process is interrupted or fails to complete successfully, rerun scsetup.
At the prompt Do you want to add any quorum disks?, configure a shared quorum device.
A two-node cluster remains in install mode until a shared quorum device is configured. After the scsetup utility configures the quorum device, the message Command completed successfully is displayed.
At the prompt Is it okay to reset "installmode"?, answer Yes.
After the scsetup utility sets quorum configurations and vote counts for the cluster, the message Cluster initialization is complete is displayed and the utility returns you to the Main Menu.
Exit from the scsetup utility.
From any node, verify the device and node quorum configurations.
You do not need to be superuser to run this command.
% scstat -q
From any node, verify that cluster install mode is disabled.
You do not need to be superuser to run this command.
% scconf -p | grep "Cluster install mode:"
Cluster install mode:                                  disabled
Update the directory paths.
Go to “How to Update the Root Environment” in the Sun Cluster 3.0 12/01 Software Installation Guide.
The following example shows the beginning process of upgrading a two-node cluster from Sun Cluster 2.2 to Sun Cluster 3.0 5/02 software. The cluster node names are phys-schost-1, the first-installed node, and phys-schost-2, which joins the cluster that phys-schost-1 established. The volume manager is Solstice DiskSuite and both nodes are used as mediator hosts for the diskset schost-1.
(Install the latest Solstice DiskSuite mediator package on each node)
# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Packages
# pkgadd -d . SUNWmdm

(Restore the mediator)
# metaset -s schost-1 -t
# metaset -s schost-1 -a -m phys-schost-1 phys-schost-2

(Shut down the rpc.pmfd daemon)
# /etc/init.d/initpmf stop

(Begin upgrade on the first node and reboot it)
phys-schost-1# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools
phys-schost-1# ./scinstall -u begin -F
phys-schost-1# shutdown -g0 -y -i6

(Begin upgrade on the second node and reboot it)
phys-schost-2# cd /cdrom/suncluster_3_0_u3/SunCluster_3.0/Tools
phys-schost-2# ./scinstall -u begin -N phys-schost-1
phys-schost-2# shutdown -g0 -y -i6

(Verify cluster membership)
# scstat

(Choose a shared disk and configure it as the quorum device)
# scdidadm -L
# scsetup
Select Quorum>Add a quorum disk

(Verify that the quorum device is configured)
# scstat -q

(Verify that the cluster is no longer in install mode)
% scconf -p | grep "Cluster install mode:"
Cluster install mode:                                  disabled
This procedure finishes the scinstall(1M) upgrade process begun in How to Upgrade Cluster Software Packages. Perform these steps on each node of the cluster.
Become superuser on each node of the cluster.
Is your volume manager VxVM?
If no, go to Step 3.
If yes, install VxVM and any VxVM patches and create the root disk group (rootdg) as you would for a new installation.
To install VxVM and encapsulate the root disk, perform the procedures in “How to Install VERITAS Volume Manager Software and Encapsulate the Root Disk” in the Sun Cluster 3.0 12/01 Software Installation Guide. To mirror the root disk, perform the procedures in “How to Mirror the Encapsulated Root Disk” in the Sun Cluster 3.0 12/01 Software Installation Guide.
To install VxVM and create rootdg on local, non-root disks, perform the procedures in “How to Install VERITAS Volume Manager Software Only” and in “How to Create a rootdg Disk Group on a Non-Root Disk” in the Sun Cluster 3.0 12/01 Software Installation Guide.
Are you upgrading Sun Cluster HA for NFS?
If yes, go to Step 4.
If no, go to Step 5.
Finish Sun Cluster 3.0 software upgrade and convert Sun Cluster HA for NFS configuration.
If you are not upgrading Sun Cluster HA for NFS, perform Step 5 instead.
Insert the Sun Cluster 3.0 Agents 5/02 CD-ROM into the CD‐ROM drive on a node.
This step assumes that the volume daemon vold(1M) is running and configured to manage CD‐ROM devices.
Finish the cluster software upgrade on that node.
# scinstall -u finish -q globaldev=DIDname \
-d /cdrom/scdataservices_3_0_u3 -s nfs

-q globaldev=DIDname: Specifies the device ID (DID) name of the quorum device
-d /cdrom/scdataservices_3_0_u3: Specifies the directory location of the CD-ROM image
-s nfs: Specifies the Sun Cluster HA for NFS data service to configure
An error message similar to the following might be generated. You can safely ignore it.
** Installing Sun Cluster - Highly Available NFS Server **
Skipping "SUNWscnfs" - already installed
Eject the CD‐ROM.
Repeat Step a through Step c on the other node.
When completed on both nodes, cluster install mode is disabled and all quorum votes are assigned.
Skip to Step 6.
Finish Sun Cluster 3.0 software upgrade on each node.
If you are upgrading Sun Cluster HA for NFS, perform Step 4 instead.
# scinstall -u finish -q globaldev=DIDname
-q globaldev=DIDname: Specifies the device ID (DID) name of the quorum device
If you are upgrading any data services other than Sun Cluster HA for NFS, configure resources for those data services as you would for a new installation.
See the Sun Cluster 3.0 12/01 Data Services Installation and Configuration Guide for procedures.
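For orientation, a minimal sketch of the kinds of commands involved follows. The resource group name, logical hostname, and resource type shown are placeholders, the exact set of resources depends on the data service, and the guide referenced above remains the authoritative procedure.

(Create a failover resource group, add a logical hostname resource, register the
resource type, add the data service resource, then bring the group online)
# scrgadm -a -g my-rg -h phys-schost-1,phys-schost-2
# scrgadm -a -L -g my-rg -l my-logical-host
# scrgadm -a -t SUNW.resource-type
# scrgadm -a -j my-resource -g my-rg -t SUNW.resource-type
# scswitch -Z -g my-rg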
If your volume manager is Solstice DiskSuite, from either node bring pre-existing disk device groups online.
# scswitch -z -D disk-device-group -h node
-z: Performs the switch
-D disk-device-group: Specifies the name of the disk device group, which for Solstice DiskSuite software is the same as the diskset name
-h node: Specifies the name of the cluster node that serves as the primary of the disk device group
From either node, bring pre-existing data service resource groups online.
At this point, Sun Cluster 2.2 logical hosts are converted to Sun Cluster 3.0 5/02 resource groups, and the suffix -lh is appended to each logical host name. For example, a logical host named lhost-1 is upgraded to a resource group named lhost-1-lh. Use these converted resource group names in the following command.
# scswitch -z -g resource-group -h node
-g resource-group: Specifies the name of the resource group to bring online
You can use the scrgadm -p command to display a list of all resource types and resource groups in the cluster. The scrgadm -pv command displays this list with more detail.
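For example, from any node:

% scrgadm -p     (summary listing)
% scrgadm -pv    (more detailed listing)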
If you are using Sun Management Center to monitor your Sun Cluster configuration, install the Sun Cluster module for Sun Management Center.
Ensure that you are using the most recent version of Sun Management Center.
See your Sun Management Center documentation for installation or upgrade procedures.
Follow guidelines and procedures in “Installation Requirements for Sun Cluster Monitoring” in the Sun Cluster 3.0 12/01 Software Installation Guide to install the Sun Cluster module packages.
Verify that all nodes have joined the cluster.
Go to “How to Verify Cluster Membership” in the Sun Cluster 3.0 12/01 Software Installation Guide.
The following example shows the finish process for a two-node cluster that was upgraded from Sun Cluster 2.2 to Sun Cluster 3.0 5/02 software. The cluster node names are phys-schost-1 and phys-schost-2, the device group names are dg-schost-1 and dg-schost-2, and the data service resource group names are lh-schost-1 and lh-schost-2. The scinstall command automatically converts the Sun Cluster HA for NFS configuration.
(Finish upgrade on each node)
phys-schost-1# scinstall -u finish -q globaldev=d1 \
-d /cdrom/scdataservices_3_0_u3 -s nfs
phys-schost-2# scinstall -u finish -q globaldev=d1 \
-d /cdrom/scdataservices_3_0_u3 -s nfs

(Bring device groups and data service resource groups on each node online)
phys-schost-1# scswitch -z -D dg-schost-1 -h phys-schost-1
phys-schost-1# scswitch -z -g lh-schost-1 -h phys-schost-1
phys-schost-1# scswitch -z -D dg-schost-2 -h phys-schost-2
phys-schost-1# scswitch -z -g lh-schost-2 -h phys-schost-2
The procedure “How to Bring a Node Out of Maintenance State” in the Sun Cluster 3.0 12/01 System Administration Guide does not apply to a two-node cluster. A procedure appropriate for a two-node cluster will be evaluated for the next release.
The following paragraph clarifies behavior of the scgdevs command. This information is not currently included in the scgdevs(1M) man page.
New Information:
When scgdevs(1M) is called from the local node, it performs its work on remote nodes asynchronously. Therefore, completion of the command on the local node does not necessarily mean that it has completed its work clusterwide.
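As an illustrative check, not from the man page, you can confirm on the remote nodes that the device namespace has been updated before you depend on it. For example:

(On the local node)
# scgdevs

(Afterward, on each remote node, confirm that the new devices appear)
# scdidadm -L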
There is an error in the Name section. The Name section should read as follows:
sap_ci, SUNW.sap_ci and SUNW.sap_ci_v2 - Resource type implementations for Sun Cluster HA for SAP central instance.
There is an error in the Description section. The Description section should read as follows:
The Resource Group Manager (RGM) manages the SAP data service for Sun Cluster software. Configure the Sun Cluster HA for SAP central instance as a logical-hostname resource and an SAP central instance resource.
There is an error in the Name section. The Name section should read as follows:
sap_as, SUNW.sap_as - Resource type implementation for Sun Cluster HA for SAP as a failover data service.
sap_as, SUNW.sap_as_v2 - Resource type implementation for Sun Cluster HA for SAP as a failover data service or a scalable data service.
There is an error in the Description section. The Description section should read as follows:
The Resource Group Manager (RGM) manages the SAP data service for Sun Cluster software. If you are setting up the Sun Cluster HA for SAP application server as a failover data service, configure it as a logical-hostname resource and an SAP application-server resource. If you are setting up the Sun Cluster HA for SAP application server as a scalable data service, configure it as a scalable SAP application-server resource.
The following new resource group property should be added to the rg_properties(5) man page.
Auto_start_on_new_cluster
This property controls whether the Resource Group Manager starts the resource group automatically when a new cluster is forming.
The default is TRUE. If set to TRUE, the Resource Group Manager attempts to start the resource group automatically to achieve Desired_primaries when all nodes of the cluster are rebooted simultaneously. If set to FALSE, the resource group does not start automatically when the cluster is rebooted.
Category: Optional
Default: True
Tunable: Any time
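As a minimal sketch, assuming the standard scrgadm syntax for changing resource group properties and a placeholder resource group name, the property could be set as follows:

# scrgadm -c -g my-rg -y Auto_start_on_new_cluster=False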
The following error messages were not included on the Sun Cluster 3.0 5/02 CD-ROM.
360600:Oracle UDLM package wrong instruction set architecture.
The Oracle UDLM package that is currently installed has the wrong instruction set architecture for the mode in which the node is currently booted (for example, the Oracle UDLM is 64-bit (sparc9) and the node is currently booted in 32-bit mode (sparc)).
Solution: Obtain and install the proper Oracle UDLM package from Oracle for the instruction set architecture of the system, or boot the node in an instruction set architecture that is compatible with the current version of the Oracle UDLM.
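As an illustrative check, you can confirm the instruction set architecture the node is booted under and the installed Oracle UDLM package before choosing a corrective action. ORCLudlm is the package name Oracle commonly uses for this package; verify the name on your system.

# isainfo -kv          (reports whether the kernel is 32-bit or 64-bit)
# pkginfo -l ORCLudlm  (reports the installed Oracle UDLM package, if any)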
800320:Fencing %s from shared disk devices.
A reservation has been performed to fence off nonmember nodes from disks that are shared between the cluster nodes.
Solution: None.
558777:Enabling failfast on all shared disk devices.
A reservation failfast will be set so nodes which share these disk groups will be brought down if they are fenced off by other nodes.
Solution: None.
309875:Error encountered enabling failfast.
An error occurred while attempting to enable the reservation failfast on the disks that are shared by other nodes.
Solution: This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log, and /var/cluster/ucmm/dlm*/logs/* from all the nodes and contact your Sun service representative.