Sun Cluster Geographic Edition 3.1 8/05 Release Notes

This document provides the following information for Sun™ Cluster Geographic Edition 3.1 8/05 software.

Supported Products

This section describes the supported software for Sun Cluster Geographic Edition software.

Table 1 Supported Products

Solaris Operating System: 8, 9, and 10 (SPARC® edition)

Sun Cluster: 3.1 8/05

Volume management software:

  • For use with Sun StorEdge Availability Suite 3.2.1: Solaris Volume Manager software on Solaris 9, Solstice DiskSuite software on Solaris 8, or VERITAS Volume Manager software

  • For use with Hitachi TrueCopy: VERITAS Volume Manager software

Data Replication:

  • Sun StorEdge Availability Suite 3.2.1


    Note –

    Sun StorEdge Availability Suite 3.2.1 is not supported on Solaris OS 10.


  • Hitachi TrueCopy RAID Manager/Solaris Version 01–10–03/02

Sun Cluster Geographic Edition: 3.1 8/05

Limitations on Running Sun Cluster Geographic Edition Software on Solaris OS 10

Code for pthread_atfork() Might Cause a Child Process to Hang (6276483)

Problem Summary: Authentication subprocesses might hang in the atfork() handler for the pkcs11 library.

Workaround: The security.provider.1 line in the /usr/jdk/instances/jdk1.5.0/jre/lib/security/java.security file contains the following entry:


security.provider.1=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/sunpkcs11-solaris.cfg

Change the preceding entry on all nodes of both partner clusters to the following:


security.provider.1=sun.security.provider.Sun

The Sun Cluster Geographic Edition Installer Does Not Work on Solaris OS 10 (6350105)

Problem Summary: You cannot use the Sun Cluster Geographic Edition installer to install the Sun Cluster Geographic Edition software on Solaris OS 10.

Workaround: You must install all the packages on the Sun Cluster Geographic Edition software CD on every node of both clusters by using the pkgadd(1M) command.

How to Install the Sun Cluster Geographic Edition Software Packages

You must perform this task on all nodes of all the clusters in your geographically separated cluster configuration.

Before You Begin

Before you begin to install software, be sure to read Chapter 1, Planning the Sun Cluster Geographic Edition Installation, in Sun Cluster Geographic Edition Installation Guide.

  1. Change to the directory that contains the Sun Cluster Geographic Edition software packages for Solaris OS 10.


    # cd cd-root/suncluster_geographic_1_0/Solaris_sparc/Product/\
    sun_cluster_geo/Solaris_9/Packages
    
  2. Install the following Sun Cluster Geographic Edition software packages in the following order by using the pkgadd -G -d . package_name command:


    Note –

    The Sun Cluster Geographic Edition software packages must be installed in the global zone only.


    • SUNWscmautil: Sun Cluster Management Agent utilities

    • SUNWscmautilr: Sun Cluster Management Agent utilities for root

    • SUNWscghb: Sun Cluster Geographic Edition heartbeats

    • SUNWscghbr: Sun Cluster heartbeats for root

    • SUNWscgctl: Control management agents

    • SUNWscgctlr: Control management agents for root

    • If you are using Hitachi TrueCopy data replication:

      • SUNWscgreptc: Hitachi TrueCopy data replication

      • SUNWscgreptcu: Hitachi TrueCopy data replication for usr

    • SUNWscgspm: SunPlex Manager Extensions

    • SUNWscgman: Sun Cluster Geographic Edition man pages


    Note –

    Sun StorEdge Availability Suite 3.2.1 is not supported on Solaris OS 10. If you are running Solaris OS 10, do not install the Sun Cluster Geographic Edition packages for Sun StorEdge Availability Suite 3.2.1 support.


    You can also install the following localization packages:

    • SUNWcscgctl: Simplified Chinese Control Agents

    • If you are using Sun StorEdge Availability Suite 3.2 software data replication, SUNWcscgrepavsu: Simplified Chinese Sun StorEdge Availability Suite data replication for usr

    • If you are using Hitachi TrueCopy data replication, SUNWcscgreptcu: Simplified Chinese Hitachi TrueCopy data replication for usr

    • SUNWcscgspm: Simplified Chinese SunPlex Manager Extensions

    • SUNWjscgctl: Japanese Sun Cluster Geographic Edition Control Agents

    • SUNWjscgman: Japanese Sun Cluster Geographic Edition man pages

    • If you are using Sun StorEdge Availability Suite 3.2 software data replication, SUNWjscgrepavsu: Japanese Sun StorEdge Availability Suite data replication for usr

    • If you are using Hitachi TrueCopy data replication, SUNWjscgreptcu: Japanese Hitachi TrueCopy data replication for usr

    • SUNWjscgspm: Japanese SunPlex Manager Extensions

    • SUNWkscgctl: Korean Sun Cluster Geographic Edition Control Agents

    • If you are using Sun StorEdge Availability Suite 3.2 software data replication, SUNWkscgrepavsu: Korean Sun StorEdge Availability Suite data replication for usr

    • If you are using Hitachi TrueCopy data replication, SUNWkscgreptcu: Korean Hitachi TrueCopy data replication for usr

    • SUNWkscgspm: Korean SunPlex Manager Extensions


    # pkgadd -G -d . SUNWscmautil
    # pkgadd -G -d . SUNWscmautilr
    # pkgadd -G -d . SUNWscghb
    # pkgadd -G -d . SUNWscghbr
    # pkgadd -G -d . SUNWscgctl
    # pkgadd -G -d . SUNWscgctlr
    # pkgadd -G -d . SUNWscgreptc
    # pkgadd -G -d . SUNWscgreptcu
    # pkgadd -G -d . SUNWscgspm
    # pkgadd -G -d . SUNWscgman
    

How to Uninstall the Sun Cluster Geographic Edition Software

You must perform this task on all nodes of all the clusters in your geographically separated cluster configuration.

  1. Become superuser on the node or cluster where you intend to uninstall the Sun Cluster Geographic Edition software.


    % su
  2. Stop the Sun Cluster Geographic Edition infrastructure on the local cluster.


    # geoadm stop
    

    For more information on disabling the Sun Cluster Geographic Edition software on a cluster, see Disabling the Sun Cluster Geographic Edition Software in Sun Cluster Geographic Edition System Administration Guide.

  3. Uninstall Sun Cluster Geographic Edition software packages from the cluster using the pkgrm(1M) command.

    Ensure that you also uninstall any localization packages you have installed.


    Note –

    You must uninstall the packages in the reverse order of the installation.



    # pkgrm SUNWscgman
    # pkgrm SUNWscgspm
    # pkgrm SUNWscgreptcu
    # pkgrm SUNWscgreptc
    # pkgrm SUNWscgctlr
    # pkgrm SUNWscgctl
    # pkgrm SUNWscghbr
    # pkgrm SUNWscghb
    # pkgrm SUNWscmautilr
    # pkgrm SUNWscmautil
    
  4. Verify that Sun Cluster Geographic Edition software has been removed.


    # pkginfo | grep -i geo
    

The SUNWscmasa Package Replaced by SUNWscmasau and SUNWscmasar Packages (6354491)

Problem Summary: The Sun Cluster Geographic Edition software has a dependency on the SUNWscmasa package. During the Sun Cluster Geographic Edition installation process, you will get a warning message that states that the SUNWscmasa package is missing.

Workaround: In Solaris OS 10, the SUNWscmasa package has been replaced by two packages, SUNWscmasau and SUNWscmasar.

Ignore the warning message for the missing package SUNWscmasa and continue installing the Sun Cluster Geographic Edition packages with the pkgadd command.

Update Sun Cluster Geographic Edition Packages to Work with Solaris Zones (6364022)

Problem Summary: When a local zone is created, the Sun Cluster Geographic Edition packages are copied into the local zone, and package processing might fail because Sun Cluster components are not available in the local zone.

Workaround: Install from the global zone using the -G option of the pkgadd command. For more information on the pkgadd command, see pkgadd(1M).
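
For example, to add one of the packages listed in the installation procedure from the current directory into the global zone only, run a command of the following form. The SUNWscgctl package is used here only as an illustration:


# pkgadd -G -d . SUNWscgctl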


Note –

The Sun Cluster Geographic Edition software packages must be installed in the global zone only.


Known Issues and Bugs

The following known issues and bugs affect the operation of the Sun Cluster Geographic Edition 3.1 8/05 release.

Extended SunPlex Manager GUI Restrictions

Problem Summary: You cannot delete a protection group that contains device groups.

Workaround: To delete a protection group that contains device groups by using the GUI, delete the device groups individually first. Then, delete the protection group.

Writing to java.util.logging.ErrorManager Results in Common Agent Container Logging Error (5081674)

Problem Summary: The java.io.InterruptedIOException error message appears in the common agent container log file when messages are written to java.util.logging.ErrorManager.

Workaround: This exception is harmless and can safely be ignored.

Sun Cluster Geographic Edition Infrastructure Might Remain Offline After a Cluster Reboot (6218200)

Problem Summary: The Sun Cluster Geographic Edition infrastructure might remain offline after a cluster is rebooted.

Workaround:

If the Sun Cluster Geographic Edition infrastructure is offline after a cluster reboot, restart the Sun Cluster Geographic Edition infrastructure by using the geoadm start command.
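
For example, run the following command from one node of the affected cluster:


# geoadm start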

No RBAC Support for GUI (6226493)

Problem Summary: The GUI does not support RBAC.

Workaround: Invoke the GUI as root on the local cluster.

GUI Requires Same Root Password on Partner Clusters (6260505)

Problem Summary: To use the root password to access the SunPlex Manager GUI, the root password must be the same on all nodes of both clusters.

Workaround: Ensure that the root password is the same on every node of both clusters.

Partner Clusters in Different Domains Cannot Include a Domain Name With the Cluster Name (6260506)

Problem Summary: Partner clusters that are in different domains cannot include the domain name with the cluster name.

Workaround: Specify the partner cluster name with the IP address of the logical hostname for the partner cluster in the /etc/hosts file of each node of the local cluster. See also bug 6252467.
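
The following is a minimal sketch of such an entry. The IP address is hypothetical, and cluster-newyork stands for the partner cluster name:


192.168.200.45   cluster-newyork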


Note –

Manually updating the /etc/hosts file might result in conflicts with local domain machines of the same name.


A Custom Heartbeat Must Exist on Both the Remote and Local Cluster Before the Heartbeat Can Join a Partnership (6263692)

Problem Summary: If a partnership is created on a remote cluster by using a custom heartbeat, then a heartbeat by the same name must exist on the local cluster before it can join the partnership. You cannot create a heartbeat by using the GUI, so the appropriate heartbeat will not be available to choose on the Join Partnership page.

Workaround: Use the CLI to create the custom heartbeat, and then use either the CLI or the GUI to join the partnership.

Communication Loss Between Node and Storage Device Might Result in Error State (6269186)

Problem Summary: When the sysevent daemon crashes, the cluster status goes to Error and the heartbeat status goes to No Response.

Workaround: Restart the sysevent daemon, and then restart the Sun Cluster Geographic Edition infrastructure by using the following procedure.
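
On Solaris OS 10, one way to restart the sysevent daemon is through the Service Management Facility. This sketch assumes the default sysevent service instance name:


# svcadm restart svc:/system/sysevent:default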

How to Restart the Sun Cluster Geographic Edition Infrastructure

  1. Disable the Sun Cluster Geographic Edition software.


    phys-paris-1# geoadm stop
  2. On one node of the cluster, enable the Sun Cluster Geographic Edition infrastructure.


    phys-paris-1# geoadm start
See Also

For more information about the geoadm command, see the geoadm(1M) man page.

Cluster Status is Error When the sysevent Daemon Crashes (6276483)

Problem Summary: When the sysevent daemon crashes, the cluster status goes to Error and the heartbeat status goes to No Response.

Workaround: Restart the sysevent daemon, and then restart the Sun Cluster Geographic Edition infrastructure by using the following procedure.

How to Restart the Sun Cluster Geographic Edition Infrastructure

  1. Disable the Sun Cluster Geographic Edition software.


    phys-paris-1# geoadm stop
  2. On one node of the cluster, enable the Sun Cluster Geographic Edition infrastructure.


    phys-paris-1# geoadm start
See Also

For more information about the geoadm command, see the geoadm(1M) man page.

Unclear Error Message When Protection Group Start Times Out (6284278)

Problem Summary: When the geopg start command times out, the following message appears: “Waiting response timeout: 100000.” This message does not clearly state that the operation has timed out. Also, the timeout period is stated in milliseconds instead of seconds.

Workaround: None.

When geo-failovercontrol Resource Goes to STOP_FAILED State the Resource Times Out (6288257)

Problem Summary: When the common agent container hangs or is very slow to respond, for example, because of high system loads, the geo-failovercontrol stop method times out. This timeout results in the geo-failovercontrol resource going into the STOP_FAILED state.

Workaround: This problem should be rare because the stop_timeout period is relatively large, 10 minutes. However, if the geo-failovercontrol resource is in the STOP_FAILED state, use the following procedure to recover and enable the Sun Cluster Geographic Edition infrastructure.

Activated Protection Groups Deactivated and Resource Groups in an Error State After a Cluster Reboot (6289463)

Problem Summary: A protection group is activated on the primary cluster with the resource group in an OK state. If the primary cluster is rebooted, when the cluster comes back up the protection group is in a deactivated state and the resource group is in an Error state.

Workaround: During a failback-switchover, before synchronizing the partnership as described in step 1a of the procedure, the protection group must be deactivated:


# geopg stop -e Local protection-group-name
-e Local

    Specifies the scope of the command.

    By specifying a local scope, the command operates on the local cluster only.

protection-group-name

    Specifies the name of the protection group.

If the protection group is already deactivated, the state of the resource group in the protection group is probably Error. The state is Error because the application resource groups are managed and offline.

Deactivating the protection group will result in the application resource groups no longer being managed, clearing the Error state.

For the complete procedure, see How to Perform a Failback-Switchover on a System That Uses Sun StorEdge Availability Suite 3.2.1 Replication in Sun Cluster Geographic Edition System Administration Guide.

Incorrect Message When Adding Resource Group to Protection Group (6290256)

Problem Summary: When application resource groups are added to a protection group, you might see a message that states that the application resource group and lightweight resource group must be in the same protection group. This message indicates that the application resource group must be in the same protection group as the device group that is controlled by the lightweight resource group.

Regardless of the message, do not add the lightweight resource group to the protection group because the lightweight resource group is managed by the Sun Cluster Geographic Edition software.

Workaround: None.

Pulling Public Network From a Node Mastering Device Groups Controlled by Sun StorEdge Availability Suite 3.2.1 and Sun Cluster Geographic Edition Infrastructure Resource Groups Results in the Node Being Aborted (6291382)

Problem Summary: Pulling the public network from a node that masters both device groups controlled by Sun StorEdge Availability Suite 3.2.1 and the Sun Cluster Geographic Edition infrastructure resource groups and resources results in that node losing the public network and being aborted.

Workaround: None.

A Failed Switchover for Hitachi TrueCopy Leaves Pairs Within dev_group With Unmatched Volume Status (6295537)

Problem Summary: The switchover procedures currently documented in the Hitachi TrueCopy CCI guide are correct. However, when a switchover fails because of an SVOL-SSUS takeover, the dev_group might be left with unmatched volume status, which causes the pairvolchk and pairsplit commands to fail.

Workaround: To bring the dev_group to matched volume status, bring the pairs within the dev_group to matched volume status. The commands that you use to bring the pairs to matched volume status depend on the current pair state and on which cluster's volumes you want to make primary (that is, where you want to bring the application up). Refer to the Hitachi TrueCopy CCI guide for the Hitachi TrueCopy command set. Then, complete the procedure in Recovering From a Switchover Failure on a System That Uses Hitachi TrueCopy Replication in Sun Cluster Geographic Edition System Administration Guide.

Hitachi TrueCopy CCI Commands and Hitachi TrueCopy Resources Report that Remote horcmd Is Not Alive Even When It Is Alive and Responding (6297384)

Problem Summary: When a cluster node has two or more network addresses on different subnets for communication, IP_address in the /etc/horcm.conf file must be set to NONE. You must set the IP_address field to NONE even if the network addresses belong to the same subnet.

If the IP_address field is not set to NONE, Hitachi TrueCopy commands could respond unpredictably with the timeout error ENORMT, even though the remote process horcmd is alive and responding.
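
The following is a sketch of the corresponding HORCM_MON entry in the /etc/horcm.conf file. The service name and polling value shown are hypothetical; the point is the NONE value in the ip_address column and the default timeout of 3000(10ms):


HORCM_MON
#ip_address    service    poll(10ms)    timeout(10ms)
NONE           horcm0     1000          3000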

Workaround: Update the SUNW.GeoCtlTC resource timeout values if the default Hitachi TrueCopy timeout value has been changed in the /etc/horcm.conf file. The default Hitachi TrueCopy timeout value in /etc/horcm.conf is 3000(10ms), which is 30 seconds.

The SUNW.GeoCtlTC resources that are created by the Sun Cluster Geographic Edition environment also have the default timeout set to 3000(10ms).

If the default Hitachi TrueCopy timeout value has been changed in /etc/horcm.conf, the resource timeout values must be updated according to the algorithm discussed below. You should not change the default timeout values for /etc/horcm.conf and the Hitachi TrueCopy resources unless the situation demands otherwise.

The following equation establishes an upper limit on the time that it takes for a Hitachi TrueCopy command to time out, based on the timeout value configured in /etc/horcm.conf (horctimeout), the number of remote hosts (numhosts), and the number of retries (numretries):


Note –

Units appear in seconds in the following equation.


Upper-limit-on-timeout = horctimeout * numhosts * numretries

For example, if horctimeout is set to 30, numhosts is set to 2, and numretries is set to 2, then Upper-limit-on-timeout is 120.

Based on the value of Upper-limit-on-timeout, the following resource timeout values should be set. A minimum of 60 seconds should be added as a buffer, to allow for processing of other commands.


Validate_timeout = Upper-limit-on-timeout + 60
Update_timeout = Upper-limit-on-timeout + 60
Monitor_Check_timeout = Upper-limit-on-timeout + 60
Probe_timeout = Upper-limit-on-timeout + 60
Retry_Interval = (Probe_timeout + Thorough_probe_interval) + 60

The other timeout parameters in the resource should retain their default values.

To change the timeout values, complete the following steps (a command sketch follows the list):

  1. Bring the resource group offline by using the scswitch command.

  2. Update the required timeout properties by using the scrgadm command.

  3. Bring the resource group online by using the scswitch command.
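
A sketch of these steps follows. The resource group name tc-rg, the resource name tc-rs, and the timeout value of 180 seconds are hypothetical. The first command takes the resource group offline, the second updates a timeout property on the resource, and the third brings the resource group back online:


# scswitch -F -g tc-rg
# scrgadm -c -j tc-rs -y Probe_timeout=180
# scswitch -Z -g tc-rg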

Traversing Dependencies Consumes System Resources (6297751)

Problem Summary: Traversing dependencies consumes a lot of system resources.

Workaround: None.

Protection Group Switchover Fails Without Apparent Reason and Does Not Report Reason for Failure (6299103)

Problem Summary: Sometimes the geopg switchover command fails and does not state the reason for failure.

Workaround: Follow the procedure in Recovering From a Switchover Failure on a System That Uses Hitachi TrueCopy Replication in Sun Cluster Geographic Edition System Administration Guide.

The GUI Does Not Always Return the Result of Creating or Adding a Device Group to a Protection Group (6300168)

Problem Summary: If creating or adding a device group to a protection group takes longer than the timeout period allowed within the browser, the GUI might not refresh when the operation does complete.

Workaround: You can either navigate to the partnership page in the GUI or use the geopg list command to see the result of the operation.
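
For example, run the following command on one node of the local cluster. This sketch assumes that the geopg list subcommand, when run without operands, reports all protection groups that are configured on the cluster:


# geopg list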

CLI Command Hangs If the Node Where the Geocontrol Module Is Active Reboots While the Command Is Running (6300616)

Problem Summary: The cacaocsc process sometimes hangs when the server-side socket is partially closed or broken. See also bug 6304065.

Workaround: Exit the command by pressing Ctrl+C or by using the kill command.

Restarting the Common Agent Container While a Switchover Is in Progress Results in CRITICAL INTERNAL ERROR Error (6302009)

Problem Summary: When a cluster encounters a failure during the switchover process, such as the node that masters the infrastructure resource group losing power, an unclear message is returned.

Workaround: None.

GUI Does Not Refresh Protection Group Status Change (6302217)

Problem Summary: Configuration and state changes of entities on a page displayed in the GUI should cause the page to be refreshed automatically. Sometimes the refresh does not take place.

Workaround: Use the navigation tree to navigate to a different page, and then return to the original page. The page is refreshed when it is reloaded.

Performing Two or More Operations that Update the Sun StorEdge Availability Suite 3.2.1 Configuration Database Simultaneously Might Corrupt the Configuration Database (6303883)

Problem Summary: You must not perform two or more operations that update the Sun StorEdge Availability Suite 3.2.1 configuration database simultaneously in the Sun Cluster environment.

When the Sun Cluster Geographic Edition software is running, you must not perform two or more of the following commands simultaneously on different protection groups with data replicated by Sun StorEdge Availability Suite 3.2.1:

For example, running the geopg start pg1 command and geopg switchover pg2 command simultaneously might corrupt the Sun StorEdge Availability Suite 3.2.1 configuration database.


Note –

Sun StorEdge Availability Suite 3.2.1 is not supported on Solaris OS 10. If you are running Solaris OS 10, do not install the Sun Cluster Geographic Edition packages for Sun StorEdge Availability Suite 3.2.1 support.


Workaround: For Sun Cluster configurations consisting of two or more nodes, you must enable the Sun StorEdge Availability Suite 3.2.1 dscfglockd daemon process on all of the nodes of both partner clusters. You do not need to enable this daemon for Sun Cluster configurations consisting of only a single node.

To enable the dscfglockd daemon process, complete the following procedure on all nodes of both partner clusters.

How to Enable the Sun StorEdge Availability Suite 3.2.1 dscfglockd Daemon Process

  1. Ensure that the Sun StorEdge Availability Suite 3.2.1 product has been installed as instructed in the Sun StorEdge Availability Suite 3.2.1 product documentation.

  2. Ensure that the Sun StorEdge Availability Suite 3.2.1 product has been patched with the latest patches available on SunSolve at http://sunsolve.sun.com.

  3. Create a copy of the /etc/init.d/scm file.


    # cp /etc/init.d/scm /etc/init.d/scm.original
  4. Edit the /etc/init.d/scm file.

    Delete the comment character (#) and the comment “(turned off for 3.2)” from the following lines.


    # do_stopdscfglockd (turned off for 3.2)
    # do_dscfglockd (turned off for 3.2)
  5. Save the edited file.

  6. If you do not need to reboot all the Sun Cluster nodes, then a system administrator with superuser privileges must run the following command on each node.


    # /usr/opt/SUNWscm/lib/dscfglockd \
    -f /var/opt/SUNWesm/dscfglockd.cf
Next Steps

If you require further assistance, contact your Sun service representative.

Protection Group Takeover and Switchover on Active Primary Cluster Causes Application Resource Groups to Be Recycled (6304781)

Problem Summary: Running the geopg takeover or geopg switchover command on a primary cluster where the protection group has been activated results in the application resource groups in the protection group being taken offline and unmanaged, and then brought online again on the same cluster.

Workaround: None.

Unable to Start Sun Cluster Geographic Edition Infrastructure After a Node Is Brought Down While a geops create or geops join Operation Is Running (6305780)

Problem Summary: If you bring down a node while the geops create or geops join command is running, you will not be able to restart the Sun Cluster Geographic Edition infrastructure.

Workaround: Contact your Sun service representative.

Protection Group Role and Data Replication Role Do Not Match When Protection Group Switchover Times Out (6306759)

Problem Summary: When the geopg switchover command times out, the protection group role might not match the data replication role. Despite this mismatch, the geoadm status command indicates that the configuration is in the OK state rather than the Error state.

Workaround: Validate the protection group again by using the geopg validate command on both clusters after a switchover or takeover times out.

Synchronization Status Should Be ERROR After a Failed Protection Group Takeover (6307131)

Problem Summary: When a takeover operation cannot change the role of the original primary cluster, the synchronization status should be ERROR.

Workaround: Resynchronize the protection group by using the geopg update command, and then validate the protection group on the original primary cluster by using the geopg validate command.
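
For example, on the original primary cluster, where examplepg is a hypothetical protection group name and both subcommands are assumed to take the protection group name as their only operand:


# geopg update examplepg
# geopg validate examplepg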

No Error Message When a Takeover Operation Fails to Change Old Primary to Secondary (6309228)

Problem Summary: The geopg takeover command returns success, but the protection group is left as the primary on both clusters.

Workaround: None.

Common Agent Container Might Hang After It Has Been Running for a While (6383202)

Problem Summary: The Common Agent Container can hang after it has been running for a prolonged period.

Workaround: None.

Patches and Required Firmware Levels

This section provides information about patches for Sun Cluster Geographic Edition configurations.


Note –

You must be a registered SunSolve™ user to view and download the required patches for the Sun Cluster Geographic Edition product. If you do not have a SunSolve account, contact your Sun service representative or sales engineer, or register online at http://sunsolve.sun.com.


You must have the following patches installed:

To use scalable resource groups in Sun Cluster Geographic Edition 3.1 8/05, you must also install the following patches:

Check with a Sun service representative for the availability of these patches.

Installing Patches

You must run the same patch levels for Sun Cluster and Common Agent Container on all nodes of both clusters.

The patch level for each node on which you have installed the Sun Cluster Geographic Edition software must meet the Sun Cluster patch-level requirements.

All nodes in one cluster must have the same version of the Sun Cluster Geographic Edition software and the same patch level.

To ensure that the patches have been installed properly, install the patches on your secondary cluster before you install the patches on the primary cluster.

How to Prepare the Cluster for Patch Installation

You must complete this procedure on only one node of each cluster.

  1. Ensure that the cluster is functioning properly.

    Ensure that Sun Cluster Geographic Edition is running properly.


    # geoadm status
    

    To view the current status of the cluster, run the following command from any node:


    # scstat
    

    See the scstat(1M) man page for more information.

    Search the /var/adm/messages log on the same node for unresolved error messages or warning messages.

    Check the volume manager status.

  2. On a node of the cluster, become root.


    % su
    
  3. Remove all application resource groups from protection groups.

    Highly available applications do not have downtime during the Sun Cluster Geographic Edition software patch installation.


    # geopg remove-resource-group resourcegroup protectiongroupname
    

    See the geopg(1M) man page for more information.

  4. Stop all protection groups that are active on the cluster.


    # geopg stop protectiongroupname -e local | global
    

    See the geopg(1M) man page for more information.

  5. Stop the Sun Cluster Geographic Edition infrastructure.


    # geoadm stop
    

    See the geoadm(1M) man page for more information.

Next Steps

Install the required patches for the Sun Cluster Geographic Edition software. Go to How to Install Patches.

How to Install Patches

Perform this procedure on all nodes of the cluster.

Patch the secondary cluster before you patch the primary cluster to permit testing.

Before You Begin

Perform the following tasks:

  1. Ensure that all the nodes are online and part of the cluster.

    To view the current status of the cluster, run the following command from any node:


    % scstat
    

    See the scstat(1M) man page for more information.

    Search the /var/adm/messages log on the same node for unresolved error messages or warning messages.

  2. Stop the Common Agent Container.


    # cacaoadm stop
    

    See the cacaoadm(1M) man page for more information.

  3. Install any necessary patches to support Sun Cluster Geographic Edition software by using the patchadd command.

  4. Start the Common Agent Container.


    # cacaoadm start
    
  5. After you have installed all required patches on all the nodes of all clusters, enable the Sun Cluster Geographic Edition software.


    # geoadm start
  6. Add back to the protection group all the application resource groups that you removed while you were preparing the cluster for patch installation.


    # geopg add-resource-group resourcegroup protectiongroupname
    

    See the geopg(1M) man page for more information.

  7. Start all the protection groups that you have added.


    # geopg start protectiongroupname -e local | global [-n]
    

    See the geopg(1M) man page for more information.

  8. Verify that the Sun Cluster Geographic Edition software, protection groups, device groups, and application resource groups are all in the OK state.


    # geoadm status
    # scstat
    

Sun Cluster Geographic Edition 3.1 8/05 Documentation

The Sun Cluster Geographic Edition 3.1 8/05 user documentation set consists of the following collections:

Sun Cluster Geographic Edition Release Notes Collection

Sun Cluster Geographic Edition Software Collection

Sun Cluster Geographic Edition Reference Collection

For the latest documentation, go to the docs.sun.com℠ web site. The docs.sun.com web site enables you to access Sun Cluster Geographic Edition documentation on the Web. You can browse the docs.sun.com archive or search for a specific book title or subject at the following Web site:

http://docs.sun.com

Sun Cluster Geographic Edition 3.1 8/05 Software Collection

Table 2 Sun Cluster Geographic Edition Software Collection

Part Number    Book Title
817–7499       Sun Cluster Geographic Edition Overview
817–7500       Sun Cluster Geographic Edition Installation Guide
817–7501       Sun Cluster Geographic Edition System Administration Guide
817–7503       Sun Cluster Geographic Edition Reference Manual

Localization Issues

This section discusses known errors or omissions for localization and steps to correct these problems.

Common Agent Container Command-Stream Adaptor Unable to Support Encoding If LANG is in the lang.variant Format (6262974)

Problem Summary: The Sun Cluster Geographic Edition command line fails to get the string stream when it calls cacaocsc if the LANG is set to lang.variant.

Workaround: Use the locale_region.variant format.
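
For example, the following setting uses a locale value that includes the region. The locale name is only illustrative; choose the equivalent full locale for your environment:


# LANG=ja_JP.PCK; export LANG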

English Messages Might Be Displayed in Some Fields (6292942)

Problem Summary: Due to late changes, some Sun Cluster Geographic Edition CLI and GUI labels and messages, as well as some Sun StorEdge Availability Suite 3.2.1 error messages, are in English in the localized environment.

Workaround: None.

Use Only Japanese, Korean, and Chinese Locales for Localized Versions of Sun Cluster Geographic Edition

Problem Summary: If you are using a localized version of Sun Cluster Geographic Edition 3.1 8/05 software, you must use only Japanese (ja), Korean (ko), or Chinese (zh) locales.

Workaround: None.

Documentation Issues

This section discusses known errors or omissions for man pages, documentation, or online help and steps to correct these problems.

Sun Cluster Geographic Edition Man Pages

This section discusses errors and omissions from the Sun Cluster Geographic Edition man pages.

Fence_level Parameter Described in the geopg Man Page Must Be Set to never or async (6265011)

Problem Summary: If the fence_level parameter is not set to never or async, data replication might not function properly when the secondary site goes down.

Workaround: To avoid application failure on the primary cluster, specify a Fence_level of never or async.

If you have special requirements to use a Fence_level of data or status, consult your Sun representative.
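
The following is a minimal sketch of specifying the Fence_level property, assuming that the property can be supplied with the -p option when a Hitachi TrueCopy device group is added to a protection group. The device group name devgroup1 and the protection group name tcpg are hypothetical:


# geopg add-device-group -p Fence_level=async devgroup1 tcpg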

geopg Man Page Does Not Document the Maximum Timeout Limit (6284337)

Problem Summary: The geopg man page does not document the upper limit of the timeout property.

Workaround: The timeout property now has a maximum of 1000000 seconds.

geopg Man Page Does Not Adequately Describe the Timeout Property (6287531)

Problem Summary: The geopg man page does not adequately describe the purpose of the timeout property.

Workaround: The timeout period is the longest time Sun Cluster Geographic Edition waits for a response after a geopg command is executed, such as start, stop, switchover, and takeover. If the command does not respond within the timeout period, Sun Cluster Geographic Edition reports the operation as timed out, even if the underlying command that was executed eventually completes successfully.

The timeout period applies to operations on a per-cluster basis. An operation with a local scope times out if the operation does not complete after the specified timeout period.

An operation with a global scope consists of an action on the local cluster and an action on the remote cluster. The local and remote actions are timed separately. So, an operation with a global scope times out if the local operation does not complete after the specified timeout period or if the remote operation does not complete after the specified timeout period.

For example, the following operation is launched with a local scope:


# geopg start -e Local

If the timeout property is set to 200 seconds, then the geopg start operation times out if the operation does not complete after 200 seconds.

The same operation is launched with a global scope:


# geopg start -e Global

If the timeout property is set to 200 seconds, then the geopg start operation times out if the operation does not complete on the local cluster after 200 seconds or if the operation does not complete on the remote cluster after 200 seconds. If the local action takes 150 seconds and the remote action takes 150 seconds, the operation does not time out.

The protection group timeout value is an estimate. Not every operation on a protection group is timed against the timeout period. For example, the time that is taken to initialize the data structure and to check the preconditions of the operation is not counted in the timeout period.

geops Man Page Incorrectly Documents the Notification_EmailAddrs Property (6289105)

Problem Summary: The geops man page incorrectly gives the property for setting the notification email address as Notification_EmailAddrss.

Workaround: The correct property name for the notification email address is Notification_EmailAddrs.

Default Heartbeat Port Number Incorrect in the geohb Man Page (6289264)

Problem Summary: The default TCP/UDP heartbeat port number is given as 8765 in the man page. However, this port number has been assigned to another application by the Internet Assigned Numbers Authority (IANA).

Workaround: To avoid any potential conflicts, the heartbeat now uses TCP/UDP port number 2084 by default.

Wrong Syntax for Adding Resource Groups and Device Groups in geopg Man Page (6284809)

Problem Summary: The syntax for adding resource group and device groups is not correct in the geopg man page.

Workaround: The correct syntax for adding a resource group is geopg add-resource-group resource-group protection-group-name. The correct syntax for adding a device group is geopg add-device-group device-group protection-group-name.
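
For example, with hypothetical resource group, device group, and protection group names:


# geopg add-resource-group apprg1 examplepg
# geopg add-device-group devgroup1 examplepg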

Incorrect Example in the geohb Man Page (6290885)

Problem Summary: The geohb man page contains the following incorrect example in the DESCRIPTION section:

To create a heartbeat plug-in that is named command1, use the following:


# geohb add paris-to-newyork -g command1 -p Query_cmd=/usr/bin/hb/

Workaround: The example should read as follows:

To add a custom heartbeat plug-in that is named command1 to the paris-to-newyork heartbeat, use the following:


# geohb add-plugin -p Query_cmd=/usr/bin/hb/ command1 paris-to-newyork

Wrong Description of the geops update Command (6297733)

Problem Summary: The geops update command synchronizes information with the partner cluster. The command cannot update a partnership while the cluster is disconnected from the partner cluster.

Workaround: Do not use the geops update command to update a partnership while the cluster is disconnected from the partner cluster.

Extra Characters In the Man Pages (6302385)

Problem Summary: The characters “ and 6 appear randomly in the man pages.

Workaround: Disregard these extra characters.

Man Pages Contain Incorrect Subcommands (6304746)

Problem Summary: The geohb(1M) and the geopg(1M) man pages contain some references to incorrect subcommands in the examples.

Workaround: Always refer to the correct usage information by running the command with the --help option.
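
For example, assuming that the --help option is supported as the workaround states:


# geopg --help
# geohb --help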

Sun Cluster Geographic Edition Installation Guide

This section discusses errors and omissions from the Sun Cluster Geographic Edition Installation Guide.

Missing Information About Installation Requirements (6293058)

Problem Summary: The Sun Cluster Geographic Edition Installation Guide is missing the following requirements:

Workaround: Before you begin the installation, ensure that all the nodes of the cluster are running with the same default locale, and that the cluster has been configured for secure cluster communication by using security certificates.

Wrong Command for Restarting Common Agent Container (6302712)

Problem Summary: The How to Install Certificates on Partner Clusters in Sun Cluster Geographic Edition Installation Guide states that you must restart the Common Agent Container on each node of each cluster by using the cacaoadm start command.

Workaround: Use the cacaoadm restart command to restart the Common Agent Container.
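
For example, on each node of each cluster:


# cacaoadm restart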

Sun Cluster Geographic Edition System Administration Guide

This section discusses errors and omissions from the Sun Cluster Geographic Edition System Administration Guide.

No Troubleshooting Documentation (6265968)

Problem Summary: The Sun Cluster Geographic Edition System Administration Guide does not include instructions on troubleshooting.

Workaround: Contact your Sun service representative.

Requirements for Creating a Protection Group That Uses Oracle Real Application Clusters (6426014)

Problem Summary: The Sun Cluster Geographic Edition software supports using Oracle Real Application Clusters with hardware RAID. The documentation does not explain the requirements for creating a protection group that uses Oracle Real Application Clusters.

Workaround: Before you create a protection group for Oracle Real Application Clusters, ensure that the following conditions are met:

Missing Documentation on When the RoleChange_ActionCmd Command is Run (6426007)

Problem Summary: The Sun Cluster Geographic Edition System Administration Guide does not accurately describe when the executable command specified by the RoleChange_ActionCmd property is run.

Workaround: The executable command you specify in the RoleChange_ActionCmd property runs on the new primary cluster when the primary cluster for the protection group changes and the protection group is started.

Documentation on the Sun Cluster Geographic Edition CD

This section discusses errors and omissions from the Sun Cluster Geographic Edition documentation on the product CD.

Broken Links to the Documentation (6309323)

Problem Summary: The links on the main page of the CD are broken.

Workaround: To view the Sun Cluster Geographic Edition documentation, go to the following pages: