Sun Cluster Upgrade Guide for Solaris OS

Completing a Cluster Upgrade

Procedure: How to Commit the Upgraded Cluster to Sun Cluster 3.2 1/09 Software

Before You Begin

Ensure that all upgrade procedures are completed for all cluster nodes that you are upgrading.

  1. From one node, check the upgrade status of the cluster.


    phys-schost# scversions
    
  2. From the following table, perform the action that is listed for the output message from Step 1.

    Output Message: Upgrade commit is needed.
    Action: Proceed to Step 3.

    Output Message: Upgrade commit is NOT needed. All versions match.
    Action: Go to How to Verify Upgrade of Sun Cluster 3.2 1/09 Software.

    Output Message: Upgrade commit cannot be performed until all cluster nodes are upgraded. Please run scinstall(1m) on cluster nodes to identify older versions.
    Action: Return to the Sun Cluster upgrade procedures that you used and upgrade the remaining cluster nodes.

    Output Message: Check upgrade cannot be performed until all cluster nodes are upgraded. Please run scinstall(1m) on cluster nodes to identify older versions.
    Action: Return to the Sun Cluster upgrade procedures that you used and upgrade the remaining cluster nodes.

  3. After all nodes have rejoined the cluster, from one node commit the cluster to the upgrade.


    phys-schost# scversions -c
    

    Committing the upgrade enables the cluster to utilize all features in the newer software. New features are available only after you perform the upgrade commitment.

  4. From one node, verify that the cluster upgrade commitment has succeeded.


    phys-schost# scversions
    Upgrade commit is NOT needed. All versions match.
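
The decision table in Step 2 can be sketched as a small shell function. This is only an illustrative sketch, not part of the Sun Cluster software: the function name commit_action and the keywords it prints are hypothetical, and the matching assumes that scversions prints the messages exactly as they are listed in the table.

```shell
#!/bin/sh
# Hypothetical helper: map a scversions output message to the action
# listed in the Step 2 table. The printed keywords are illustrative only.
commit_action() {
    case "$1" in
        "Upgrade commit is needed."*)
            echo "commit" ;;                   # proceed to Step 3 (scversions -c)
        "Upgrade commit is NOT needed."*)
            echo "verify" ;;                   # go to the verification procedure
        *"cannot be performed until all cluster nodes are upgraded"*)
            echo "upgrade-remaining-nodes" ;;  # finish upgrading the other nodes first
        *)
            echo "unknown" ;;                  # unrecognized message; inspect manually
    esac
}
```

On a cluster node, the function would be called with the command output as its argument, for example commit_action "$(scversions)".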
Next Steps

Go to How to Verify Upgrade of Sun Cluster 3.2 1/09 Software.

Procedure: How to Verify Upgrade of Sun Cluster 3.2 1/09 Software

Perform this procedure to verify that the cluster is successfully upgraded to Sun Cluster 3.2 1/09 software. On the Solaris 10 OS, perform all steps from the global zone only.

Before You Begin
  1. On each node, become superuser.

  2. On each upgraded node, view the installed levels of Sun Cluster software.


    phys-schost# clnode show-rev -v
    

    The first line of output states which version of Sun Cluster software the node is running. This version should match the version that you just upgraded to.

  3. From any node, verify that all upgraded cluster nodes are running in cluster mode (Online).


    phys-schost# clnode status
    

    See the clnode(1CL) man page for more information about displaying cluster status.

  4. SPARC: If you upgraded from Solaris 8 to Solaris 9 software, verify the consistency of the storage configuration.

    1. On each node, run the following command to verify the consistency of the storage configuration.


      phys-schost# cldevice check
      

      Caution –

      Do not proceed to Step b until your configuration passes this consistency check. Failure to pass this check might result in errors in device identification and cause data corruption.


      The following table lists the possible output from the cldevice check command and the action you must take, if any.

      Example Message: device id for 'phys-schost-1:/dev/rdsk/c1t3d0' does not match physical device's id, device may have been replaced
      Action: Go to Chapter 7, Recovering From an Incomplete Upgrade and perform the appropriate repair procedure.

      Example Message: device id for 'phys-schost-1:/dev/rdsk/c0t0d0' needs to be updated, run cldevice repair to update
      Action: None. You update this device ID in Step b.

      Example Message: No output message
      Action: None.

      See the cldevice(1CL) man page for more information.

    2. On each node, migrate the Sun Cluster storage database to Solaris 9 device IDs.


      phys-schost# cldevice repair
      
    3. On each node, run the following command to verify that storage database migration to Solaris 9 device IDs is successful.


      phys-schost# cldevice check
      
      • If the cldevice command displays a message, return to Step a to make further corrections to the storage configuration or the storage database.

      • If the cldevice command displays no messages, the device-ID migration is successful. When device-ID migration is verified on all cluster nodes, proceed to How to Finish Upgrade to Sun Cluster 3.2 1/09 Software.
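
The message table for the cldevice check command in Step 4 can be sketched as a shell function that classifies one line of output. This is a hedged illustration only: the function name classify_cldevice_msg and the keywords it prints are hypothetical, and the patterns assume that the messages are worded as in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: classify one line of cldevice check output
# according to the table in Step 4. Keywords are illustrative only.
classify_cldevice_msg() {
    case "$1" in
        "")
            echo "ok" ;;        # no output message: nothing to do
        *"does not match physical device's id"*)
            echo "recover" ;;   # follow the recovery chapter before continuing
        *"needs to be updated"*)
            echo "repair" ;;    # fixed later by cldevice repair
        *)
            echo "unknown" ;;   # unrecognized message; inspect manually
    esac
}
```

A result of recover means that the consistency check has not passed, so the cldevice repair step must not be run yet.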


Example 6–1 Verifying Upgrade to Sun Cluster 3.2 1/09 Software

The following example shows the commands used to verify upgrade of a two-node cluster to Sun Cluster 3.2 1/09 software. The cluster node names are phys-schost-1 and phys-schost-2.


phys-schost# clnode show-rev -v
3.2
…
phys-schost# clnode status
=== Cluster Nodes ===

--- Node Status ---

Node Name                                          Status
---------                                          ------
phys-schost-1                                      Online
phys-schost-2                                      Online
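
The status check in the example above can also be sketched as a filter that lists any node that is not Online. This is a minimal sketch under stated assumptions: the function name nodes_not_online is hypothetical, and the parsing assumes that clnode status prints the node table in the layout shown in this example.

```shell
#!/bin/sh
# Hypothetical helper: read clnode status output on stdin and print the
# name of every node whose status is not Online.
nodes_not_online() {
    awk '
        /^-+ +-+ *$/         { in_table = 1; next }  # dashed rule under the column headers
        in_table && NF == 0  { in_table = 0; next }  # a blank line ends the table
        in_table && $2 != "Online" { print $1 }      # report nodes in any other state
    '
}
```

On a cluster node, the output of clnode status would be piped into the function. Empty output means that every listed node is Online.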

Next Steps

Go to How to Finish Upgrade to Sun Cluster 3.2 1/09 Software.

Procedure: How to Finish Upgrade to Sun Cluster 3.2 1/09 Software

Perform this procedure to finish Sun Cluster upgrade. On the Solaris 10 OS, perform all steps from the global zone only.

Before You Begin

Ensure that all steps in How to Verify Upgrade of Sun Cluster 3.2 1/09 Software are completed.

  1. Copy the security files for the common agent container to all cluster nodes.

    This step ensures that security files for the common agent container are identical on all cluster nodes and that the copied files retain the correct file permissions.

    1. On each node, stop the Sun Java Web Console agent.


      phys-schost# /usr/sbin/smcwebserver stop
      
    2. On each node, stop the security file agent.


      phys-schost# /usr/sbin/cacaoadm stop
      
    3. On one node, change to the /etc/cacao/instances/default/ directory.


      phys-schost-1# cd /etc/cacao/instances/default/
      
    4. Create a tar file of the security subdirectory of the /etc/cacao/instances/default/ directory.


      phys-schost-1# tar cf /tmp/SECURITY.tar security
      
    5. Copy the /tmp/SECURITY.tar file to each of the other cluster nodes.

    6. On each node to which you copied the /tmp/SECURITY.tar file, extract the security files.

      Any security files that already exist in the /etc/cacao/instances/default/ directory are overwritten.


      phys-schost-2# cd /etc/cacao/instances/default/
      phys-schost-2# tar xf /tmp/SECURITY.tar
      
    7. Delete the /tmp/SECURITY.tar file from each node in the cluster.

      You must delete each copy of the tar file to avoid security risks.


      phys-schost-1# rm /tmp/SECURITY.tar
      phys-schost-2# rm /tmp/SECURITY.tar
      
    8. On each node, start the security file agent.


      phys-schost# /usr/sbin/cacaoadm start
      
    9. On each node, start the Sun Java Web Console agent.


      phys-schost# /usr/sbin/smcwebserver start
      
  2. If you upgraded any data services that are not supplied on the product media, register the new resource types for those data services.

    Follow the documentation that accompanies the data services.

  3. If you upgraded Sun Cluster HA for SAP liveCache from the Sun Cluster 3.0 or 3.1 version to the Sun Cluster 3.2 version, modify the /opt/SUNWsclc/livecache/bin/lccluster configuration file.

    1. Become superuser on a node that will host the liveCache resource.

    2. Copy the new /opt/SUNWsclc/livecache/bin/lccluster file to the /sapdb/LC_NAME/db/sap/ directory.

      Overwrite the lccluster file that already exists from the previous configuration of the data service.

    3. Configure this /sapdb/LC_NAME/db/sap/lccluster file as documented in How to Register and Configure Sun Cluster HA for SAP liveCache in Sun Cluster Data Service for SAP liveCache Guide for Solaris OS.

  4. If you upgraded the Solaris OS and your configuration uses dual-string mediators for Solaris Volume Manager software, restore the mediator configurations.

    1. Determine which node has ownership of a disk set to which you will add the mediator hosts.


      phys-schost# metaset -s setname
      
      -s setname

      Specifies the disk set name.

    2. On the node that masters or will master the disk set, become superuser.

    3. If no node has ownership, take ownership of the disk set.


      phys-schost# cldevicegroup switch -n node devicegroup
      
      node

      Specifies the name of the node to become primary of the disk set.

      devicegroup

      Specifies the name of the disk set.

    4. Re-create the mediators.


      phys-schost# metaset -s setname -a -m mediator-host-list
      
      -a

      Adds to the disk set.

      -m mediator-host-list

      Specifies the names of the nodes to add as mediator hosts for the disk set.

    5. Repeat these steps for each disk set in the cluster that uses mediators.

  5. If you upgraded VxVM, upgrade all disk groups.

    1. Bring online and take ownership of a disk group to upgrade.


      phys-schost# cldevicegroup switch -n node devicegroup
      
    2. Run the following command to upgrade a disk group to the highest version supported by the VxVM release you installed.


      phys-schost# vxdg upgrade dgname
      

      See your VxVM administration documentation for more information about upgrading disk groups.

    3. On each node that is directly connected to the disk group, bring online and take ownership of the upgraded disk group.


      phys-schost# cldevicegroup switch -n node dgname
      

      This step is necessary to update the major number of the VxVM device files with the latest vxio number that might have been assigned during the upgrade.

    4. Repeat for each remaining VxVM disk group in the cluster.

  6. Migrate resources to new resource type versions.

    You must migrate all resources to the Sun Cluster 3.2 resource-type version.


    Note –

    For Sun Cluster HA for SAP Web Application Server, if you are using a J2EE engine resource or a web application server component resource or both, you must delete the resource and re-create it with the new web application server component resource. Changes in the new web application server component resource include integration of the J2EE functionality. For more information, see Sun Cluster Data Service for SAP Web Application Server Guide for Solaris OS.


    See Upgrading a Resource Type in Sun Cluster Data Services Planning and Administration Guide for Solaris OS, which contains procedures that use the command line. Alternatively, you can perform the same tasks by using the Resource Group menu of the clsetup utility. The process involves performing the following tasks:

    • Registering the new resource type.

    • Migrating the eligible resource to the new version of its resource type.

    • Modifying the extension properties of the resource type as specified in Sun Cluster Release Notes.


      Note –

      The Sun Cluster 3.2 1/09 release might introduce new default values for some extension properties. These changes affect the behavior of any existing resource that uses the default values of such properties. If you require the previous default value for a resource, modify the migrated resource to set the property to the previous default value.


  7. If your cluster runs the Sun Cluster HA for Sun Java System Application Server EE (HADB) data service and you shut down the HADB database before you began a dual-partition upgrade, re-enable the resource and start the database.


    phys-schost# clresource enable hadb-resource
    phys-schost# hadbm start database-name
    

    For more information, see the hadbm(1m) man page.

  8. If you upgraded to the Solaris 10 OS and the Apache httpd.conf file is located on a cluster file system, ensure that the HTTPD entry in the Apache control script still points to that location.

    1. View the HTTPD entry in the /usr/apache/bin/apachectl file.

      The following example shows the httpd.conf file located on the /global cluster file system.


      phys-schost# cat /usr/apache/bin/apachectl | grep HTTPD=
      HTTPD="/usr/apache/bin/httpd -f /global/web/conf/httpd.conf"
    2. If the file does not show the correct HTTPD entry, update the file.


      phys-schost# vi /usr/apache/bin/apachectl
      #HTTPD=/usr/apache/bin/httpd
      HTTPD="/usr/apache/bin/httpd -f /global/web/conf/httpd.conf"
      
  9. If the cluster runs on the Solaris 10 OS and you intend to configure zone clusters, change the private-network IP address range.

    Specify the number of zone clusters that you expect to configure in the cluster.


    phys-schost# cluster set-netprops -p num_zoneclusters=N
    

    The command calculates the number of additional private-network IP addresses that are needed and automatically modifies the IP address range.

  10. From any node, start the clsetup utility.


    phys-schost# clsetup
    

    The clsetup Main Menu is displayed.

  11. Re-enable all disabled resources.

    1. Type the option number for Resource Groups and press the Return key.

      The Resource Group Menu is displayed.

    2. Type the option number for Enable/Disable a Resource and press the Return key.

    3. Choose a resource to enable and follow the prompts.

    4. Repeat Step c for each disabled resource.

    5. When all resources are re-enabled, type q to return to the Resource Group Menu.

  12. Bring each resource group back online.

    This step includes bringing online the resource groups in non-global zones.

    1. Type the option number for Online/Offline or Switchover a Resource Group and press the Return key.

    2. Follow the prompts to put each resource group into the managed state and then bring the resource group online.

  13. When all resource groups are back online, exit the clsetup utility.

    Type q to back out of each submenu, or press Ctrl-C.

  14. If, before upgrade, you enabled automatic node reboot if all monitored disk paths fail, ensure that the feature is still enabled.

    Also perform this task if you want to configure automatic reboot for the first time.

    1. Determine whether the automatic reboot feature is enabled or disabled.


      phys-schost# clnode show
      
      • If the reboot_on_path_failure property is set to enabled, no further action is necessary.

      • If the reboot_on_path_failure property is set to disabled, proceed to the next step to re-enable the property.

    2. Enable the automatic reboot feature.


      phys-schost# clnode set -p reboot_on_path_failure=enabled
      
      -p

      Specifies the property to set.

      reboot_on_path_failure=enabled

      Specifies that the node will reboot if all monitored disk paths fail, provided that at least one of the disks is accessible from a different node in the cluster.

    3. Verify that automatic reboot on disk-path failure is enabled.


      phys-schost# clnode show
      === Cluster Nodes ===                          
      
      Node Name:                                      node
      …
        reboot_on_path_failure:                          enabled
      …
  15. (Optional) Capture the disk partitioning information for future reference.


    phys-schost# prtvtoc /dev/rdsk/cNtXdYsZ > filename
    

    Store the file in a location outside the cluster. If you make any disk configuration changes, run this command again to capture the changed configuration. If a disk fails and needs replacement, you can use this information to restore the disk partition configuration. For more information, see the prtvtoc(1M) man page.

  16. (Optional) Make a backup of your cluster configuration.

    An archived backup of your cluster configuration facilitates easier recovery of your cluster configuration.

    For more information, see How to Back Up the Cluster Configuration in Sun Cluster System Administration Guide for Solaris OS.

  17. (Optional) Install or complete upgrade of Sun Cluster Geographic Edition 3.2 11/07 software.

    See Sun Cluster Geographic Edition Installation Guide.
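
The verification in Step 14 can be sketched as a filter over clnode show output that lists nodes on which automatic reboot is not enabled. This is an illustrative sketch only: the function name nodes_without_path_reboot is hypothetical, and the parsing assumes the "Node Name:" and "reboot_on_path_failure:" lines appear as in the sample output in Step 14.

```shell
#!/bin/sh
# Hypothetical helper: read clnode show output on stdin and print each node
# whose reboot_on_path_failure property is not set to enabled.
nodes_without_path_reboot() {
    awk '
        $1 == "Node" && $2 == "Name:" { node = $3 }   # remember the current node
        $1 == "reboot_on_path_failure:" && $2 != "enabled" { print node }
    '
}
```

On a cluster node, the output of clnode show would be piped into the function. Any node name that is printed still needs the clnode set command from Step 14.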

Troubleshooting

Resource-type migration failure - Normally, you migrate resources to a new resource type while the resource is offline. However, some resources need to be online for a resource-type migration to succeed. If resource-type migration fails for this reason, error messages similar to the following are displayed:

phys-schost - Resource depends on a SUNW.HAStoragePlus type resource that is not online anywhere.
(C189917) VALIDATE on resource nfsrs, resource group rg, exited with non-zero exit status.
(C720144) Validation of resource nfsrs in resource group rg on node phys-schost failed.

If resource-type migration fails because the resource is offline, use the clsetup utility to re-enable the resource and then bring its related resource group online. Then repeat migration procedures for the resource.

Java binaries location change - If the location of the Java binaries changed during the upgrade of shared components, you might see error messages similar to the following when you attempt to run the cacaoadm start or smcwebserver start commands:

phys-schost# /opt/SUNWcacao/bin/cacaoadm start
No suitable Java runtime found. Java 1.4.2_03 or higher is required.
Jan 3 17:10:26 ppups3 cacao: No suitable Java runtime found. Java 1.4.2_03 or higher is required.
Cannot locate all the dependencies

phys-schost# smcwebserver start
/usr/sbin/smcwebserver: /usr/jdk/jdk1.5.0_04/bin/java: not found

These errors are generated because the start commands cannot locate the current location of the Java binaries. The JAVA_HOME property still points to the directory where the previous version of Java was located, but that previous version was removed during upgrade.

To correct this problem, change the setting of JAVA_HOME in the following configuration files to use the current Java directory:

/etc/webconsole/console/config.properties
/etc/opt/SUNWcacao/cacao.properties
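
Before restarting the agents, the JAVA_HOME setting in each file can be checked with a sketch like the following. The function names java_home_of and java_home_valid are hypothetical, and the sketch assumes that each file sets the property on a line of the form JAVA_HOME=/path; the actual syntax of these files might differ.

```shell
#!/bin/sh
# Hypothetical helper: print the JAVA_HOME value set in a properties file.
# Assumes a line of the form JAVA_HOME=/some/path; if the property is set
# more than once, the last setting is reported.
java_home_of() {
    sed -n 's/^[ ]*JAVA_HOME[ ]*=[ ]*//p' "$1" | tail -1
}

# Succeeds only if the configured directory contains an executable bin/java.
java_home_valid() {
    dir=$(java_home_of "$1")
    [ -n "$dir" ] && [ -x "$dir/bin/java" ]
}
```

A nonzero exit status from java_home_valid for either file suggests that JAVA_HOME still points to a Java directory that was removed during the upgrade.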

Next Steps

If you have a SPARC based system and use Sun Management Center to monitor the cluster, go to SPARC: How to Upgrade Sun Cluster Module Software for Sun Management Center.

Otherwise, the cluster upgrade is complete.