Perform this procedure to prepare the cluster for a dual-partition upgrade. These procedures will refer to the two groups of nodes as the first partition and the second partition. The nodes that you assign to the second partition will continue cluster services while you upgrade the nodes in the first partition. After all nodes in the first partition are upgraded, you switch cluster services to the first partition and upgrade the second partition. After all nodes in the second partition are upgraded, you boot the nodes into cluster mode to rejoin the nodes from the first partition.
If you are upgrading a single-node cluster, do not use this upgrade method. Instead, go to How to Prepare the Cluster for Upgrade (Standard) or How to Prepare the Cluster for Upgrade (Live Upgrade).
On the Solaris 10 OS, perform all steps from the global zone only.
Perform the following tasks:
Ensure that the configuration meets the requirements for upgrade. See Upgrade Requirements and Software Support Guidelines.
Have available the installation media, documentation, and patches for all software products that you are upgrading, including the following software:
Sun Cluster 3.2 framework
Sun Cluster 3.2 data services (agents)
Applications that are managed by Sun Cluster 3.2 data services
VERITAS Volume Manager, if applicable
See Patches and Required Firmware Levels in Sun Cluster 3.2 Release Notes for Solaris OS for the location of patches and installation instructions.
If you use role-based access control (RBAC) instead of superuser to access the cluster nodes, ensure that you can assume an RBAC role that provides authorization for all Sun Cluster commands. This series of upgrade procedures requires the following Sun Cluster RBAC authorizations if the user is not superuser:
See Role-Based Access Control (Overview) in System Administration Guide: Security Services for more information about using RBAC roles. See the Sun Cluster man pages for the RBAC authorization that each Sun Cluster subcommand requires.
Ensure that the cluster is functioning normally.
View the current status of the cluster by running the following command from any node.
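For example, you can use the scstat command; the verbose -v option shown here is optional.
phys-schost# scstat -v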
See the scstat(1M) man page for more information.
Search the /var/adm/messages log on the same node for unresolved error messages or warning messages.
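For example, one way to scan the log is shown below; the search pattern is only an illustration.
phys-schost# egrep -i "error|warning" /var/adm/messages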
Check the volume-manager status.
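For example, for Solaris Volume Manager you might run the metastat command, and for VERITAS Volume Manager the vxprint command; the set name and disk-group name shown here are placeholders.
phys-schost# metastat -s setname
phys-schost# vxprint -g diskgroup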
If necessary, notify users that cluster services might be temporarily interrupted during the upgrade.
Service interruption will be approximately the amount of time that your cluster normally takes to switch services to another node.
Become superuser on a node of the cluster.
For a two-node cluster that uses Sun StorEdge Availability Suite software or Sun StorageTek Availability Suite software, ensure that the configuration data for availability services resides on the quorum disk.
The configuration data must reside on a quorum disk to ensure the proper functioning of Availability Suite after you upgrade the cluster software.
Become superuser on a node of the cluster that runs Availability Suite software.
Identify the device ID and the slice that is used by the Availability Suite configuration file.
phys-schost# /usr/opt/SUNWscm/sbin/dscfg
/dev/did/rdsk/dNsS
In this example output, N is the device ID and S is the slice of device N.
Identify the existing quorum device.
phys-schost# scstat -q

-- Quorum Votes by Device --

                    Device Name         Present Possible Status
                    -----------         ------- -------- ------
  Device votes:     /dev/did/rdsk/dQsS  1       1        Online
In this example output, dQsS is the existing quorum device.
If the quorum device is not the same as the Availability Suite configuration-data device, move the configuration data to an available slice on the quorum device.
phys-schost# dd if=`/usr/opt/SUNWesm/sbin/dscfg` of=/dev/did/rdsk/dQsS
You must use the name of the raw DID device, /dev/did/rdsk/, not the block DID device, /dev/did/dsk/.
If you moved the configuration data, configure Availability Suite software to use the new location.
As superuser, issue the following command on each node that runs Availability Suite software.
phys-schost# /usr/opt/SUNWesm/sbin/dscfg -s /dev/did/rdsk/dQsS
If you will upgrade the Solaris OS and your cluster uses dual-string mediators for Solaris Volume Manager software, unconfigure your mediators.
See Configuring Dual-String Mediators for more information about mediators.
Run the following command to verify that no mediator data problems exist.
phys-schost# medstat -s setname
-s setname   Specifies the disk set name.
If the value in the Status field is Bad, repair the affected mediator host. Follow the procedure How to Fix Bad Mediator Data.
List all mediators.
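For example, the metaset command with only the -s option displays the configuration of a disk set, including its mediator hosts; setname is a placeholder for your disk set name.
phys-schost# metaset -s setname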
Save this information for when you restore the mediators during the procedure How to Finish Upgrade to Sun Cluster 3.2 Software.
For a disk set that uses mediators, take ownership of the disk set if no node already has ownership.
phys-schost# scswitch -z -D setname -h node
-D setname   Specifies the name of the disk set.
-h node   Specifies the name of the node to become primary of the disk set.
Unconfigure all mediators for the disk set.
phys-schost# metaset -s setname -d -m mediator-host-list
-s setname   Specifies the disk set name.
-d   Deletes from the disk set.
-m mediator-host-list   Specifies the name of the node to remove as a mediator host for the disk set.
See the mediator(7D) man page for further information about mediator-specific options to the metaset command.
Repeat Step c through Step d for each remaining disk set that uses mediators.
If you are running the Sun Cluster HA for Sun Java System Application Server EE (HADB) data service with Sun Java System Application Server EE (HADB) software as of version 4.4, disable the HADB resource and shut down the HADB database.
If you are running a version of Sun Java System Application Server EE (HADB) software before 4.4, you can skip this step.
When one cluster partition is out of service during upgrade, there are not enough nodes in the active partition to meet HADB membership requirements. Therefore, you must stop the HADB database and disable the HADB resource before you begin to partition the cluster.
phys-schost# hadbm stop database-name
phys-schost# scswitch -n -j hadb-resource
For more information, see the hadbm(1m) man page.
If you are upgrading a two-node cluster, skip to Step 16.
Otherwise, proceed to Step 8 to determine the partitioning scheme to use. You will determine which nodes each partition will contain, but you will interrupt the partitioning process before it begins. You will then compare the node lists of all resource groups against the node members of each partition in the scheme that you will use. If any resource group does not contain a member of each partition, you must change the node list.
Load the Sun Java Availability Suite DVD-ROM into the DVD-ROM drive.
If the volume management daemon vold(1M) is running and is configured to manage CD-ROM or DVD devices, the daemon automatically mounts the media on the /cdrom/cdrom0/ directory.
Change to the Solaris_arch/Product/sun_cluster/Solaris_ver/Tools/ directory, where arch is sparc or x86 (Solaris 10 only) and where ver is 9 for Solaris 9 or 10 for Solaris 10.
phys-schost# cd /cdrom/cdrom0/Solaris_arch/Product/sun_cluster/Solaris_ver/Tools
Start the scinstall utility in interactive mode.
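For example, from the Tools directory on the DVD-ROM that you changed to in the previous step:
phys-schost# ./scinstall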
Do not use the /usr/cluster/bin/scinstall command that is already installed on the node. You must use the scinstall command on the Sun Java Availability Suite DVD-ROM.
The scinstall Main Menu is displayed.
Type the number that corresponds to the option for Manage a dual-partition upgrade and press the Return key.
  *** Main Menu ***

    Please select from one of the following (*) options:

        1) Create a new cluster or add a cluster node
        2) Configure a cluster to be JumpStarted from this install server
      * 3) Manage a dual-partition upgrade
      * 4) Upgrade this cluster node
      * 5) Print release information for this cluster node

      * ?) Help with menu options
      * q) Quit

    Option:  3
The Manage a Dual-Partition Upgrade Menu is displayed.
Type the number that corresponds to the option for Display and select possible partitioning schemes and press the Return key.
Follow the prompts to perform the following tasks:
Display the possible partitioning schemes for your cluster.
Choose a partitioning scheme.
Choose which partition to upgrade first.
Stop when you are prompted Do you want to begin the dual-partition upgrade?. Do not respond yet, and do not exit the scinstall utility. You will respond to this prompt in Step 18 of this procedure.
Make note of which nodes belong to each partition in the partition scheme.
On another node of the cluster, become superuser.
Ensure that any critical data services can switch over between partitions.
For a two-node cluster, each node will be the only node in its partition.
When the nodes of a partition are shut down in preparation for dual-partition upgrade, the resource groups that are hosted on those nodes switch over to a node in the other partition. If a resource group does not contain a node from each partition in its node list, the resource group cannot switch over. To ensure successful switchover of all critical data services, verify that the node list of the related resource groups contains a member of each upgrade partition.
Display the node list of each resource group that you require to remain in service during the entire upgrade.
phys-schost# scrgadm -pv -g resourcegroup | grep "Res Group Nodelist"
-p   Displays configuration information.
-v   Displays in verbose mode.
-g resourcegroup   Specifies the name of the resource group.
If the node list of a resource group does not contain at least one member of each partition, redefine the node list to include a member of each partition as a potential primary node.
phys-schost# scrgadm -a -g resourcegroup -h nodelist
-a   Adds a new configuration.
-h nodelist   Specifies a comma-separated list of node names.
Determine your next step.
When you reach the prompt Do you want to begin the dual-partition upgrade?, skip to Step 18.
If you are upgrading a cluster with three or more nodes, return to the node that is running the interactive scinstall utility.
Proceed to Step 18.
At the interactive scinstall prompt Do you want to begin the dual-partition upgrade?, type Yes.
The command verifies that a remote installation method is available.
When prompted, press Enter to continue each stage of preparation for dual-partition upgrade.
The command switches resource groups to nodes in the second partition, and then shuts down each node in the first partition.
After all nodes in the first partition are shut down, boot each node in that partition into noncluster mode.
On SPARC based systems, perform the following command:
ok boot -x
On x86 based systems running the Solaris 9 OS, perform either of the following commands:
phys-schost# reboot -- -xs

or

...
                      <<< Current Boot Parameters >>>
Boot path: /pci@0,0/pci-ide@7,1/ata@1/cmdk@0,0:b
Boot args:

Type    b [file-name] [boot-flags] <ENTER>   to boot with options
or      i <ENTER>                            to enter boot interpreter
or      <ENTER>                              to boot with defaults

                   <<< timeout in 5 seconds >>>
Select (b)oot or (i)nterpreter: b -xs
On x86 based systems running the Solaris 10 OS, perform the following commands:
In the GRUB menu, use the arrow keys to select the appropriate Solaris entry and type e to edit its commands.
The GRUB menu appears similar to the following:
GNU GRUB version 0.95 (631K lower / 2095488K upper memory)
+-------------------------------------------------------------------------+
| Solaris 10 /sol_10_x86                                                   |
| Solaris failsafe                                                         |
|                                                                          |
+-------------------------------------------------------------------------+
Use the ^ and v keys to select which entry is highlighted.
Press enter to boot the selected OS, 'e' to edit the
commands before booting, or 'c' for a command-line.
For more information about GRUB based booting, see Chapter 11, GRUB Based Booting (Tasks), in System Administration Guide: Basic Administration.
In the boot parameters screen, use the arrow keys to select the kernel entry and type e to edit the entry.
The GRUB boot parameters screen appears similar to the following:
GNU GRUB version 0.95 (615K lower / 2095552K upper memory)
+----------------------------------------------------------------------+
| root (hd0,0,a)                                                        |
| kernel /platform/i86pc/multiboot                                      |
| module /platform/i86pc/boot_archive                                   |
+----------------------------------------------------------------------+
Use the ^ and v keys to select which entry is highlighted.
Press 'b' to boot, 'e' to edit the selected command in the
boot sequence, 'c' for a command-line, 'o' to open a new line
after ('O' for before) the selected line, 'd' to remove the
selected line, or escape to go back to the main menu.
Add -x to the command to specify that the system boot into noncluster mode.
[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ESC at any time exits. ]

grub edit> kernel /platform/i86pc/multiboot -x
Press Enter to accept the change and return to the boot parameters screen.
The screen displays the edited command.
GNU GRUB version 0.95 (615K lower / 2095552K upper memory)
+----------------------------------------------------------------------+
| root (hd0,0,a)                                                        |
| kernel /platform/i86pc/multiboot -x                                   |
| module /platform/i86pc/boot_archive                                   |
+----------------------------------------------------------------------+
Use the ^ and v keys to select which entry is highlighted.
Press 'b' to boot, 'e' to edit the selected command in the
boot sequence, 'c' for a command-line, 'o' to open a new line
after ('O' for before) the selected line, 'd' to remove the
selected line, or escape to go back to the main menu.
Type b to boot the node into noncluster mode.
This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps again to add the -x option to the kernel boot parameter command.
If any applications that are running in the second partition are not under control of the Resource Group Manager (RGM), create scripts to halt the applications before you begin to upgrade those nodes.
During dual-partition upgrade processing, these scripts would be called to stop applications such as Oracle RAC before the nodes in the second partition are halted.
Create the scripts that you need to stop applications that are not under RGM control. A hypothetical sketch of such a script follows this list.
Create separate scripts for those applications that you want stopped before applications under RGM control are stopped and for those applications that you want stopped afterwards.
To stop applications that are running on more than one node in the partition, write the scripts accordingly.
Use any name and directory path for your scripts that you prefer.
Ensure that each node in the cluster has its own copy of your scripts.
On each node, modify the following Sun Cluster scripts to call the scripts that you placed on that node.
/etc/cluster/ql/cluster_pre_halt_apps - Use this file to call those scripts that you want to run before applications that are under RGM control are shut down.
/etc/cluster/ql/cluster_post_halt_apps - Use this file to call those scripts that you want to run after applications that are under RGM control are shut down.
The Sun Cluster scripts are issued from one arbitrary node in the partition during post-upgrade processing of the partition. Therefore, ensure that the scripts on any node of the partition will perform the necessary actions for all nodes in the partition.
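The following is only a minimal sketch of this approach. The script path /var/cluster/stop_myapp, the application name myapp, and its shutdown command are hypothetical placeholders, not part of Sun Cluster.

phys-schost# cat /var/cluster/stop_myapp
#!/bin/sh
# Hypothetical example: stop an application that is not under RGM control
# on this node. Replace the command below with the shutdown command for
# your application.
/opt/myapp/bin/myapp shutdown

In this sketch, you would then add a line such as /var/cluster/stop_myapp to the /etc/cluster/ql/cluster_pre_halt_apps or /etc/cluster/ql/cluster_post_halt_apps file on each node, depending on whether the application must stop before or after applications that are under RGM control.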
Upgrade software on each node in the first partition.
To upgrade Solaris software before you perform Sun Cluster software upgrade, go to How to Upgrade the Solaris OS and Volume Manager Software (Dual-Partition).
If Sun Cluster 3.2 software does not support the release of the Solaris OS that you currently run on your cluster, you must upgrade the Solaris software to a supported release. See “Supported Products” in Sun Cluster 3.2 Release Notes for Solaris OS for more information.
If Sun Cluster 3.2 software supports the release of the Solaris OS that you currently run on your cluster, further Solaris software upgrade is optional.
Otherwise, upgrade to Sun Cluster 3.2 software. Go to How to Upgrade Sun Cluster 3.2 Software (Dual-Partition).