Sun Cluster 3.1 4/04 Release Notes for Solaris OS

Known Issues and Bugs

The following known issues and bugs affect the operation of the Sun Cluster 3.1 4/04 release.

Data Services: Installation Guidelines

Identify requirements for all data services before you begin Solaris and Sun Cluster installation. If you do not determine these requirements, you might perform the installation process incorrectly and thereby need to completely reinstall the Solaris and Sun Cluster software.

For example, the Oracle Parallel Fail Safe/Real Application Clusters Guard option of Oracle Parallel Server/Real Application Clusters has special requirements for the hostnames/node names that you use in the cluster. You must accommodate these requirements before you install Sun Cluster software because you cannot change hostnames after you install Sun Cluster software. For more information on the special requirements for the hostnames/node names, see the Oracle Parallel Fail Safe/Real Application Clusters Guard documentation.

Nodes Unable to Bring Up qfe Paths (4526883)

Problem Summary: Sometimes, private interconnect transport paths ending at a qfe adapter fail to come online.

Workaround: Follow the steps shown below:

  1. Using scstat -W, identify the adapter that is at fault. The output will show all transport paths with that adapter as one of the path endpoints in the faulted or the waiting states.

  2. Use scsetup to remove from the cluster configuration all the cables connected to that adapter.

  3. Use scsetup again to remove that adapter from the cluster configuration.

  4. Add back the adapter and the cables.

  5. Verify that the paths appear. If the problem persists, repeat steps 1–4 a few times.

  6. If the problem still persists, reboot the node with the at-fault adapter. Before you reboot the node, make sure that the remaining cluster has enough quorum votes to survive the node reboot.
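
As a rough aid for step 1, the sketch below filters scstat -W output for paths that are faulted or waiting. The function name list_bad_paths is hypothetical, and the sketch assumes that the status words appear literally as "faulted" or "waiting" on each path line and that a POSIX-compatible shell (for example ksh) is in use.

```shell
# Hypothetical helper: print only the input lines that contain
# "faulted" or "waiting". The exact wording of the scstat -W status
# column is an assumption; adjust the patterns to match your output.
list_bad_paths() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *[Ff]aulted*|*[Ww]aiting*) printf '%s\n' "$line" ;;
        esac
    done
}

# Intended use on a cluster node (not run here):
#   scstat -W | list_bad_paths
```

An adapter that appears as an endpoint on every line printed is the likely candidate for steps 2 through 4.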

The remove Script Fails to Unregister SUNW.gds Resource Type (4727699)

Problem Summary: The remove script fails to unregister the SUNW.gds resource type and displays the following message:

Resource type has been un-registered already.

Workaround: After using the remove script, manually unregister SUNW.gds. Alternatively, use the scsetup command or the SunPlex Manager.

Path Timeouts When Using ce Adapters on the Private Interconnect (4746175)

Problem Summary: Clusters using ce adapters on the private interconnect may notice path timeouts and subsequent node panics if one or more cluster nodes have more than four processors.

Workaround: Set the ce_taskq_disable parameter in the ce driver by adding set ce:ce_taskq_disable=1 to the /etc/system file on all cluster nodes and then rebooting the cluster nodes. This ensures that heartbeats (and other packets) are always delivered in the interrupt context, eliminating path timeouts and the subsequent node panics. Observe quorum considerations when you reboot cluster nodes.
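
One possible way to apply the setting without duplicating it on repeated runs is sketched below. The helper name add_line_once is hypothetical, and the sketch assumes a POSIX-compatible shell and a file that ends with a newline.

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
add_line_once() {
    file=$1
    wanted=$2
    if [ -f "$file" ]; then
        while IFS= read -r current || [ -n "$current" ]; do
            if [ "$current" = "$wanted" ]; then
                return 0
            fi
        done < "$file"
    fi
    printf '%s\n' "$wanted" >> "$file"
}

# Intended use as root on each cluster node, followed by a reboot
# (not run here; keep a backup copy of /etc/system first):
#   add_line_once /etc/system 'set ce:ce_taskq_disable=1'
```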

Node Hangs After Rebooting When Switchover Is in Progress (4806621)

Problem Summary: If a device group switchover is in progress when a node joins the cluster, the joining node and the switchover operation may hang. Any attempts to access any device service will also hang. This is more likely to happen on a cluster with more than two nodes and if the file system mounted on the device is a VxFS file system.

Workaround: To avoid this situation, do not initiate device group switchovers while a node is joining the cluster. If this situation occurs, then all the cluster nodes must be rebooted to restore access to device groups.

DNS Wizard Fails if an Existing DNS Configuration is not Supplied (4839993)

Problem Summary: SunPlex Manager includes a data service installation wizard that sets up a highly available DNS service on the cluster. If the user does not supply an existing DNS configuration, such as a named.conf file, the wizard attempts to generate a valid DNS configuration by autodetecting the existing network and nameservice configuration. However, it fails in some network environments, causing the wizard to fail without issuing an error message.

Workaround: When prompted, supply the SunPlex Manager DNS data service install wizard with an existing, valid named.conf file. Otherwise, follow the documented DNS data service procedures to manually configure highly available DNS on the cluster.

Using SunPlex Manager to Install an Oracle Service (4843605)

Problem Summary: SunPlex Manager includes a data service installation wizard that sets up a highly available Oracle service on the cluster by installing and configuring the Oracle binaries as well as creating the cluster configuration. However, this installation wizard does not currently work and produces a variety of errors, depending on the user's software configuration.

Workaround: Manually install and configure the Oracle data service on the cluster, using the procedures provided in the Sun Cluster documentation.

Unable to Add Adapter to IPMP Group After Removal (4884060)

Problem Summary: If SunPlex Manager is used to remove an adapter from a multi-adapter IPMP group, it may not always be possible to immediately add the adapter back to the same group again.

Workaround: Remove /etc/hostname.adapter before attempting to add the adapter back to the same IPMP group.

Shell Version of scds_syslog Does Not Use Facility LOG_DAEMON (4897239)

Problem Summary: Due to an internal error, most Sun-supplied cluster agents are writing messages to the system log (see syslog(3C)) using the LOG_USER facility instead of using LOG_DAEMON. On a cluster that is configured with the default syslog settings (see syslog.conf(4)), messages with a severity of LOG_WARNING or LOG_NOTICE, which would ordinarily be written to the system log, are not being output.

Workaround: Add the following line near the front of the /etc/syslog.conf file on all cluster nodes:

user.warning			/var/adm/messages

This will cause user.warning messages to be logged. A similar line can be added for user.notice messages, but this is not necessary and might cause the logs to fill up too quickly, depending on the mix of applications that are running.
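
The following sketch shows one way to place the rule at the front of the file. The helper name is hypothetical. The sketch relies on two points from syslog.conf(4) and syslogd(1M): the selector and action fields are separated by tabs, and syslogd rereads its configuration when it receives a HUP signal.

```shell
# Hypothetical helper: write the new rule, followed by the existing
# contents of the file named in $1, to standard output. The \t
# sequences produce the TAB separators that syslog.conf expects.
with_user_warning_first() {
    printf 'user.warning\t\t\t/var/adm/messages\n'
    cat "$1"
}

# Intended use as root on each cluster node (not run here):
#   cp /etc/syslog.conf /etc/syslog.conf.orig
#   with_user_warning_first /etc/syslog.conf.orig > /etc/syslog.conf
#   pkill -HUP syslogd
```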

nsswitch.conf Requirements for passwd Make nis Unusable (4904975)

Problem Summary: The requirements for the nsswitch.conf file in “Preparing the Nodes and Disks” in Sun Cluster Data Service for SAP liveCache Guide for Solaris OS do not apply to the entry for the passwd database. If these requirements are met, the su command might hang on each node that can master the liveCache resource when the public network is down.

Workaround: On each node that can master the liveCache resource, ensure that the entry in the /etc/nsswitch.conf file for the passwd database is as follows:

passwd: files nis [TRYAGAIN=0]

Data Service Installation Wizards for Oracle and Apache do not Support Solaris 9 and Above (4906470)

Problem Summary: The SunPlex Manager data service installation wizards for Apache and Oracle do not support Solaris 9 and above.

Workaround: Manually install Oracle on the cluster by using the Sun Cluster documentation. If installing Apache on Solaris 9 (or higher), manually add the Solaris Apache packages SUNWapchr and SUNWapchu before running the installation wizard.

Node Panic After One Node is Rebooted as Part of scvxinstall Encapsulation (4931910)

Problem Summary: Improper timing of cluster-node reboots during rootdisk encapsulation can cause node panics.

Workaround: Run scvxinstall on one node at a time, waiting until one node has completed all of its reboots before starting scvxinstall on another node.

SunPlex Agent Builder's Default Window Size for Non-English Locales is Too Small (4937877)

Problem Summary: When running SunPlex Agent Builder in a non-English locale, the default window size is too small and some controls may not appear in the window. This problem has been observed in the German and Spanish locales.

Workaround: Manually resize the SunPlex Agent Builder window as needed.

sccheck Hangs When Simultaneously run on Multiple Nodes (4944192)

Problem Summary: sccheck may hang if launched simultaneously from multiple nodes.

Workaround: Do not launch sccheck from any multi-console that passes commands to multiple nodes. sccheck runs may overlap but should not be launched simultaneously.

scinstall -r Does Not Remove Data Service Locale Packages (4955294)

Problem Summary: scinstall -r does not remove locale-specific data service packages.

Workaround: Once the node comes up, run pkginfo | grep -i cluster to make sure all data service packages have been removed. To remove the listed packages, run pkgrm on each package.
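
To turn the pkginfo listing into a list of package names for pkgrm, a filter along the following lines might help. The function name cluster_pkgs is hypothetical. The sketch assumes the usual three-column pkginfo output (category, package instance, description).

```shell
# Hypothetical helper: read pkginfo output on standard input and
# print the package instance name (second column) of every line that
# mentions "cluster", ignoring case.
cluster_pkgs() {
    grep -i cluster | awk '{ print $2 }'
}

# Intended use after the node comes up (not run here). Review the
# list before you remove anything:
#   pkginfo | cluster_pkgs
#   pkgrm package-name      # repeat for each listed package
```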

Incorrect Language Displayed in the Traditional Chinese Locale (4955538)

Problem Summary: Certain SunPlex Agent Builder messages in the Traditional Chinese locale are displayed in Simplified Chinese.

Workaround: Run SunPlex Agent Builder in the zh_TW locale to correctly display the messages in Traditional Chinese.

Java Binaries Linked to Incorrect Java Version Cause HADB Agent to Malfunction (4968899)

Problem Summary: When hadbm is invoked from the HADB agent, it takes the java binaries from /usr/bin. The HADB agent fails to work properly unless the java binaries in /usr/bin are linked to the appropriate version of Java 1.4 (or above).

Workaround: Set the JAVA_HOME environment variable to the location of the appropriate version of Java 1.4 (or above) in the script /opt/SUNWappserver7/SUNWhadb/4/bin/hadbm.
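
A minimal sketch of the assignment follows. The directory /usr/j2se is only an assumed example of where Java 1.4 might be installed, and placing the lines before the first use of java in the script is likewise an assumption.

```shell
# Assumed location of a Java 1.4 (or later) installation; substitute
# the directory that is correct for your nodes.
JAVA_HOME=/usr/j2se
export JAVA_HOME
```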

scsetup is Not Able to Add the First Adapter to a Single-Node Cluster (4983095)

Problem Summary: If scsetup is used in an attempt to add the first adapter to a single-node cluster, the following error message results: Unable to determine transport type.

Workaround: Configure at least the first adapter manually:


# scconf -a -A trtype=type,name=adaptername,node=nodename

After the first adapter is configured, further use of scsetup to configure the interconnects works as expected.
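
For illustration, the sketch below assembles the scconf command line from its three values and prints it for review instead of running it. The helper name and the sample values (transport type dlpi, adapter qfe1, node phys-schost-1) are made up. The sketch assumes that the name value is the adapter name, as described in scconf(1M).

```shell
# Hypothetical helper: print (do not run) the scconf command for the
# given transport type, adapter name, and node name.
first_adapter_cmd() {
    printf 'scconf -a -A trtype=%s,name=%s,node=%s\n' "$1" "$2" "$3"
}

# Example with made-up values.
first_adapter_cmd dlpi qfe1 phys-schost-1
```

Run the printed command as superuser on the single-node cluster once the values are confirmed.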

Some Data Services Cannot be Upgraded by Using the scinstall Utility

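Problem Summary: The data services for the following applications cannot be upgraded by using the scinstall utility: Apache Tomcat, DHCP, mySQL, Oracle E-Business Suite, Samba, SWIFTAlliance Access, WebLogic Server, WebSphere MQ, and WebSphere MQ Integrator.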

Workaround: If you plan to upgrade a data service for an application in the preceding list, replace the step for upgrading data services in “Upgrading to Sun Cluster 3.1 4/04 Software (Rolling)” in Sun Cluster Software Installation Guide for Solaris OS with the steps that follow. Perform these steps for each node where the data service is installed.

  1. Remove the software package for the data service that you are upgrading.


    # pkgrm pkg-inst
    

    pkg-inst specifies the software package name for the data service that you are upgrading as listed in the following table.

    Application                          Data Service Software Package

    Apache Tomcat                        SUNWsctomcat
    DHCP                                 SUNWscdhc
    mySQL                                SUNWscmys
    Oracle E-Business Suite              SUNWscebs
    Samba                                SUNWscsmb
    SWIFTAlliance Access                 SUNWscsaa
    WebLogic Server (English locale)     SUNWscwls
    WebLogic Server (French locale)      SUNWfscwls
    WebLogic Server (Japanese locale)    SUNWjscwls
    WebSphere MQ                         SUNWscmqs
    WebSphere MQ Integrator              SUNWscmqi

  2. Install the software package for the version of the data service to which you are upgrading.

    To install the software package, follow the instructions in the Sun Cluster documentation for the data service that you are upgrading. This documentation is available at http://docs.sun.com.
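
The package lookup in step 1 can be sketched as a small shell function. The function name pkg_for_app is hypothetical. The application and package names are the ones listed for step 1.

```shell
# Hypothetical helper: map an application name from step 1 to its
# data service package name. Returns non-zero for unknown names.
pkg_for_app() {
    case "$1" in
        "Apache Tomcat")                     echo SUNWsctomcat ;;
        "DHCP")                              echo SUNWscdhc ;;
        "mySQL")                             echo SUNWscmys ;;
        "Oracle E-Business Suite")           echo SUNWscebs ;;
        "Samba")                             echo SUNWscsmb ;;
        "SWIFTAlliance Access")              echo SUNWscsaa ;;
        "WebLogic Server (English locale)")  echo SUNWscwls ;;
        "WebLogic Server (French locale)")   echo SUNWfscwls ;;
        "WebLogic Server (Japanese locale)") echo SUNWjscwls ;;
        "WebSphere MQ")                      echo SUNWscmqs ;;
        "WebSphere MQ Integrator")           echo SUNWscmqi ;;
        *) echo "unknown application: $1" >&2; return 1 ;;
    esac
}

# Intended use as root on each node where the data service is
# installed (not run here):
#   pkgrm "$(pkg_for_app 'Samba')"
```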

HA Oracle Stop Method Times Out (4644289)

Problem Summary: The Sun Cluster HA for Oracle data service uses the super user command, su(1M), to start and stop the database. If you are running Solaris 8 or Solaris 9, the network service might become unavailable when a cluster node's public network fails.

Workaround: Include the following entries in the /etc/nsswitch.conf configuration file on each node that can be the primary for an oracle_server or oracle_listener resource:

passwd: files
group: files
publickey: files
project:  files

These entries ensure that the su command does not refer to the NIS/NIS+ name services, so that the data service starts and stops correctly during a network failure.
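
A quick check of a node's configuration might look like the following sketch. Both function names are hypothetical. The sketch assumes an awk that supports the -v option (nawk or a POSIX awk), and it uses group, the database name that nsswitch.conf(4) defines for group lookups.

```shell
# Hypothetical helper: print the sources configured for one database
# ($2) in an nsswitch.conf-style file ($1).
nss_sources() {
    awk -v db="$2:" '$1 == db { $1 = ""; sub(/^ +/, ""); print }' "$1"
}

# Hypothetical helper: succeed only if passwd, group, publickey, and
# project all resolve through "files" alone in the file named in $1.
check_files_only() {
    rc=0
    for db in passwd group publickey project; do
        if [ "$(nss_sources "$1" "$db")" != "files" ]; then
            echo "$db: expected 'files'" >&2
            rc=1
        fi
    done
    return $rc
}

# Intended use on each node that can be the primary (not run here):
#   check_files_only /etc/nsswitch.conf
```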

SAP liveCache Stop Method Times Out (4836272)

Problem Summary: The Sun Cluster HA for SAP liveCache data service uses the dbmcli command to start and stop liveCache. If you are running Solaris 9, the network service might become unavailable when a cluster node's public network fails.

Workaround: Include one of the following entries for the publickey database in the /etc/nsswitch.conf configuration files on each node that can be the primary for liveCache resources:

publickey: 
publickey:  files
publickey:  files [NOTFOUND=return] nis 
publickey:  files [NOTFOUND=return] nisplus

Adding one of the above entries, in addition to the updates documented in Sun Cluster Data Service for SAP liveCache Guide for Solaris OS, ensures that the su command and the dbmcli command do not refer to the NIS/NIS+ name services. Bypassing the NIS/NIS+ name services ensures that the data service starts and stops correctly during a network failure.

HA-Siebel Does Not Automatically Restart Failed Siebel Components (4722288)

Problem Summary: Sun Cluster HA for Siebel does not monitor individual Siebel components. If the failure of a Siebel component is detected, only a warning message is logged in syslog.

Workaround: Restart the Siebel server resource group in which components are offline by using the command scswitch -R -h node -g resource_group.