Sun Cluster 3.1 9/04 Release Notes for Solaris OS

Known Issues and Bugs

The following known issues and bugs affect the operation of the Sun Cluster 3.1 9/04 release.

scvxinstall Creates Incorrect vfstab Entries When Boot Device is Multi-Pathed (4639243)

Problem Summary: scvxinstall creates incorrect vfstab entries when the boot device is multipathed.

Workaround: Run scvxinstall and choose to encapsulate. When the following message appears, type Ctrl-C to abort the reboot:


This node will be re-booted in 20 seconds. Type Ctrl-C to abort.

Edit the vfstab entry so /global/.devices uses the /dev/{r}dsk/cXtXdX name instead of the /dev/did/{r}dsk name. This revised entry enables VxVM to recognize it as the rootdisk. Rerun scvxinstall and choose to encapsulate. The vfstab file has the necessary updates. Allow the reboot to occur. The encapsulation proceeds as normal.
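
Before rerunning scvxinstall, it can help to confirm which /global/.devices entries still use a DID device name. The following is a minimal sketch, not part of the Sun Cluster software: the function name did_global_devices is hypothetical, and it assumes the standard vfstab field order (device to mount, device to fsck, mount point).

```shell
#!/bin/sh
# Print the "device to mount" field of every /global/.devices entry in a
# vfstab-format file that still uses a /dev/did/ device name.
# Usage: did_global_devices [vfstab-file]   (defaults to /etc/vfstab)
did_global_devices() {
    awk '
        $1 ~ /^#/ { next }
        $3 ~ /^\/global\/\.devices\// && $1 ~ /^\/dev\/did\// { print $1 }
    ' "${1:-/etc/vfstab}"
}
```

If the function prints nothing, no /global/.devices entry in the file refers to a DID device. Any device that is printed is a candidate for the edit described above.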

HA Oracle Stop Method Times Out (4644289)

Problem Summary: The Sun Cluster HA for Oracle data service uses the su command to start and stop the database. If you are running Solaris 8 or Solaris 9, the network service might become unavailable when a cluster node's public network fails.

Workaround: Include the following entries in the /etc/nsswitch.conf file on each node that can be the primary for an oracle_server resource or an oracle_listener resource:

passwd: files
group: files
publickey: files
project: files

These entries ensure that the su command does not refer to the NIS/NIS+ name services, so that the data service starts and stops correctly during a network failure.
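
One way to confirm the entries on a node is to read the sources back out of the file. The sketch below is illustrative only: the function names nss_sources and nss_files_only are hypothetical, and the parsing assumes each entry has whitespace after the colon, as in the lines shown above.

```shell
#!/bin/sh
# Print the sources configured for one database in an nsswitch.conf-format file.
# Usage: nss_sources DATABASE [FILE]   (FILE defaults to /etc/nsswitch.conf)
nss_sources() {
    awk -v db="$1:" '
        $1 == db { $1 = ""; sub(/^ +/, ""); print; exit }
    ' "${2:-/etc/nsswitch.conf}"
}

# Succeed only if every listed database uses "files" as its sole source.
# Usage: nss_files_only FILE DATABASE...
nss_files_only() {
    file=$1; shift
    for db in "$@"; do
        [ "$(nss_sources "$db" "$file")" = "files" ] || {
            echo "$db: not restricted to files" >&2
            return 1
        }
    done
}
```

For example, `nss_files_only /etc/nsswitch.conf passwd publickey project` returns a nonzero status and names the first database whose entry is anything other than `files`.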

ce Adapters on Private Interconnect Observe Timeouts and Cause Node Panics (4746175)

Problem Summary: Clusters that use ce adapters on the private interconnect observe path timeouts and subsequent node panics if one or more cluster nodes have more than 4 CPUs.

Workaround: Set the ce_taskq_disable parameter in the ce driver by adding the following line to the /etc/system file on all cluster nodes:

set ce:ce_taskq_disable=1

Then, reboot the cluster nodes. Consider quorum when you reboot cluster nodes. Setting this parameter ensures that heartbeats (and other packets) are always delivered in the interrupt context, thereby eliminating the path timeouts and the subsequent panics.
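
When the same line must be added on several nodes, an idempotent append avoids duplicate entries if the step is repeated. This is a minimal sketch under stated assumptions: ensure_line is a hypothetical helper, and it relies on a POSIX grep that supports the -q, -x, and -F options.

```shell
#!/bin/sh
# Append LINE to FILE unless an identical line is already present.
# Usage: ensure_line FILE LINE
ensure_line() {
    grep -qxF -e "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

A call such as `ensure_line /etc/system 'set ce:ce_taskq_disable=1'` would add the setting once. Keeping a backup copy of /etc/system before any edit is a sensible precaution.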

SAP liveCache Stop Method Times Out (4836272)

Problem Summary: The Sun Cluster HA for SAP liveCache data service uses the dbmcli command to start and stop liveCache. If you are running Solaris 9, the network service might become unavailable when a cluster node's public network fails.

Workaround: Include one of the following entries for the publickey database in the /etc/nsswitch.conf file on each node that can be the primary for liveCache resources:

publickey: 
publickey:  files
publickey:  files [NOTFOUND=return] nis 
publickey:  files [NOTFOUND=return] nisplus

Adding one of the above entries, in addition to updates documented in Sun Cluster Data Service for SAP liveCache Guide for Solaris OS, ensures that the su command and the dbmcli command do not refer to the NIS/NIS+ name services. Bypassing the NIS/NIS+ name services ensures that the data service starts and stops correctly during a network failure.
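
The four accepted forms can be checked mechanically. The following sketch is an illustration rather than a supported tool: publickey_entry_ok is a hypothetical name, and a file with no publickey line at all is treated as failing the check, which is an assumption on our part.

```shell
#!/bin/sh
# Succeed if the publickey entry in an nsswitch.conf-format file is one of
# the four forms listed in the workaround. A missing entry fails the check.
# Usage: publickey_entry_ok [FILE]   (FILE defaults to /etc/nsswitch.conf)
publickey_entry_ok() {
    entry=$(awk '
        $1 == "publickey:" { $1 = ""; sub(/^ +/, ""); print "entry=" $0; exit }
    ' "${1:-/etc/nsswitch.conf}")
    case "$entry" in
        "entry=" | \
        "entry=files" | \
        "entry=files [NOTFOUND=return] nis" | \
        "entry=files [NOTFOUND=return] nisplus")
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

Extra spaces between the fields are ignored, so `publickey:  files` and `publickey: files` are treated alike.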

Some Agents Do Not Use Facility LOG_DAEMON (4897239)

Problem Summary: Due to an internal error, some Sun-supplied cluster agents write messages to the system log (see syslog(3C)) using the LOG_USER facility instead of using LOG_DAEMON. On a cluster that is configured with the default syslog settings (see syslog.conf(4)), messages with a severity of LOG_WARNING or LOG_NOTICE, which would normally be written to the system log, are not being output. This problem occurs only for agent code written as shell scripts.

Workaround:

nsswitch.conf Requirement Should Not Apply To passwd Database (4904975)

Problem Summary: The requirement for the nsswitch.conf file in "Preparing the Nodes and Disks" in Sun Cluster Data Service for SAP liveCache Guide for Solaris OS does not apply to the entry for the passwd database. If this requirement is applied to the passwd entry, the su command might hang on each node that can master the liveCache resource when the public network is down.

Workaround: On each node that can master the liveCache resource, ensure that the entry in the /etc/nsswitch.conf file for the passwd database is as follows:

passwd: files nis [TRYAGAIN=0]

sccheck Hangs (4944192)

Problem Summary: sccheck might hang if launched simultaneously from multiple nodes.

Workaround: Do not launch sccheck from any multi-console that passes commands to multiple nodes. sccheck runs can overlap, but should not be launched simultaneously.

Java Binaries Linked to Incorrect Java Version Cause HA-DB Agent to Malfunction (4968899)

Problem Summary: Currently, the HA-DB data service does not use the JAVA_HOME environment variable. Therefore, HA-DB, when invoked from the HA-DB data service, takes Java binaries from /usr/bin/. For the HA-DB data service to work properly, the Java binaries in /usr/bin/ must be linked to an appropriate version of Java (1.4 or later).

Workaround: If you do not object to changing the default Java version, perform the following procedure. As an example, this workaround assumes that the /usr/j2se directory contains the latest version of Java (1.4 or later).

  1. Do you currently have a directory called java/ in the /usr/ directory? If so, move it to a temporary location.

  2. From the /usr/ directory, link /usr/bin/java and all other Java-related binaries to the appropriate version of Java.


    # ln -s j2se java
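
After the link is created, it may be worth confirming that it resolves where you expect. The sketch below assumes a POSIX shell with `pwd -P`; links_to is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if LINK is a symbolic link that resolves to the same directory
# as TARGET.
# Usage: links_to LINK TARGET
links_to() {
    [ -h "$1" ] || return 1      # LINK must be a symbolic link
    [ -d "$2" ] || return 1      # TARGET must be an existing directory
    [ "$(cd "$1" 2>/dev/null && pwd -P)" = "$(cd "$2" && pwd -P)" ]
}
```

With the example layout above, `links_to /usr/java /usr/j2se` would be expected to succeed.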
    

If you do not want to change the default version available, set the JAVA_HOME environment variable to the appropriate version of Java (J2SE 1.4 or later) in the /opt/SUNWappserver7/SUNWhadb/4/bin/hadbm script.

HA-DB Reinitializes Without Spares (4973982)

Problem Summary: Due to bug 4974875, whenever autorecovery is performed, the database reinitializes itself without any spares. That bug is fixed in HA-DB release 4.3. For HA-DB releases 4.2 and earlier, follow one of the procedures below to change the roles of the HA-DB nodes.

Workaround:

  1. Identify the HA-DB nodes that have their roles changed after autorecovery is successful.

  2. On all the nodes that you identified in Step 1, and one node at a time, disable the fault monitor for the HA-DB resource in question.


    # cladm noderole -db dbname -node nodeno -setrole role-before-auto_recovery
    
  3. Enable the fault monitor for the HA-DB resource in question.

    or

  1. Identify the HA-DB nodes that have their roles changed after autorecovery is successful.

  2. On all nodes that host the database, disable the fault monitor for the HA-DB resource in question.

  3. On any one of the nodes, execute the command for each HA-DB node that needs its role changed.


    # cladm noderole -db dbname -node nodeno -setrole role-before-auto_recovery
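
When several HA-DB nodes need their roles restored, it can be convenient to generate the command lines first and review them before running anything. The following sketch only prints commands in the form shown above and executes nothing. The function name print_noderole_cmds, and the database name, node numbers, and role names in the usage example, are placeholders.

```shell
#!/bin/sh
# Print, without executing, one cladm command per NODE:ROLE pair.
# Usage: print_noderole_cmds DBNAME NODE:ROLE...
print_noderole_cmds() {
    db=$1; shift
    for pair in "$@"; do
        node=${pair%%:*}    # text before the first colon
        role=${pair#*:}     # text after the first colon
        echo "cladm noderole -db $db -node $node -setrole $role"
    done
}
```

For example, `print_noderole_cmds mydb 2:spare 3:spare` prints two command lines, one for node 2 and one for node 3, that can be reviewed and then run as superuser.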
    

pnmd Not Accessible by Other Node During Rolling Upgrade (4997693)

Problem Summary: During a rolling upgrade, if the scstat -i command is run on a cluster node that has not yet been upgraded, the scstat output does not show the status of the IPMP groups hosted on the nodes that have already been upgraded.

Workaround: Use the scstat -i output from the upgraded nodes.

LogicalHostname Resource Cannot be Added (5004611)

Problem Summary: A LogicalHostname resource cannot be added to the cluster if it needs to use an IPMP group with a failed adapter.

Workaround: Either remove the failed adapter from the IPMP group, or correct the failure before attempting to use the IPMP group in a LogicalHostname resource.

SunPlex Manager Improperly Stores Encoding Information for the Status (5012328)

Problem Summary: The two fields, Status and Type, in the resource group status page display values in the first locale that was used to view the page.

Workaround: To see values in a different locale, restart the web server.

uservol is Used for /global/.devices/node@2 After Re-encapsulating Root Disk (5028284)

Problem Summary: After encapsulating the root disk, if you unencapsulate and then reencapsulate the root disk, you might see that a volume called uservol is used for the /global/.devices/node@nodeID file system. This might cause problems, because the volume name for each node's global devices file system should be unique.

Workaround: After following the documented steps for unencapsulation, kill the vxconfigd daemon before you run scvxinstall again to reencapsulate the root disk.

Multiple Submissions of Login Pages to Sun Web Console Cause Various Login Failures (5039143)

Problem Summary: When logging in to Sun Web Console, if the Login or Enter button is pressed repeatedly, the multiple login requests can result in various failures, thereby preventing access to SunPlex Manager.

Workaround: Become superuser on the cluster node and restart Sun Web Console.


# /usr/sbin/smcwebserver restart

Resource_dependencies_restart Not Working As Expected (5041013)

Problem Summary: The Resource_dependencies_restart resource property does not behave as expected when a resource declares an any node inter-resource–group restart dependency upon a scalable mode resource. Most data services are unaffected.

Workaround: The current behavior of restart dependencies will change when this bug is fixed. Do not develop code or administrative procedures that depend upon the current incorrect behavior.

sccheck Missing Support for Sun Enterprise 15000 (5056534)

Problem Summary: If you have a Sun Enterprise 15000 server and you run the sccheck command, the check fails and reports an error that indicates that the Sun Enterprise 15000 server is not supported. This statement is not correct.

Workaround: No workaround is necessary. Sun Cluster software supports your Sun Enterprise 15000 server. The error that the sccheck command reports states that the check might be out of date. In this case, sccheck is out of date.

French Unavailable for non-JES Data Service Agents (5059963)

Problem Summary: French (fr) is not available as a language selection for data-service agents that are not part of the Sun Java Enterprise System. However, the GUI installer for those packages suggests otherwise.

Workaround: Ignore the inaccuracy of the GUI installer. French (fr) is not available.

scinstall -u update Does Not Preserve SUNWcacao Security Keys (5068616)

Problem Summary: During upgrade to Sun Cluster 3.1 9/04 software, the scinstall command installs the new common agent container packages, SUNWcacao and SUNWcacaocfg, but does not distribute identical security keys to all cluster nodes.

Workaround: Perform the following steps to ensure that the common agent container security files are identical on all cluster nodes and that the copied files retain the correct file permissions. These files are required by Sun Cluster software.

  1. On one cluster node, change to the /etc/opt/SUNWcacao/ directory.


    phys-schost-1# cd /etc/opt/SUNWcacao/
    
  2. Create a tar file of the /etc/opt/SUNWcacao/security/ directory.


    phys-schost-1# tar cf /tmp/SECURITY.tar security
    
  3. Copy the /tmp/SECURITY.tar file to each of the other cluster nodes.

  4. On each node to which you copied the /tmp/SECURITY.tar file, extract the security files.

    Any security files that already exist in the /etc/opt/SUNWcacao/ directory are overwritten.


    phys-schost-2# cd /etc/opt/SUNWcacao/
    phys-schost-2# tar xf /tmp/SECURITY.tar
    
  5. Delete the /tmp/SECURITY.tar file from each node in the cluster.

    You must delete each copy of the tar file to avoid security risks.


    phys-schost-1# rm /tmp/SECURITY.tar
    phys-schost-2# rm /tmp/SECURITY.tar
    
  6. On each node, restart the security file agent.


    # /opt/SUNWcacao/bin/cacaoadm start
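
The pack, extract, and compare parts of this procedure can be sketched as small shell functions. This is an illustration under stated assumptions, not a supported tool: the function names are hypothetical, the tar file path must be absolute because each function changes directory first, and the copy between nodes (step 3) is left to whatever transfer method your site uses.

```shell
#!/bin/sh
# Create TARFILE from the security/ subdirectory of BASEDIR (steps 1 and 2).
# Usage: pack_security BASEDIR TARFILE
pack_security() {
    ( cd "$1" && tar cf "$2" security )
}

# Extract TARFILE into BASEDIR, overwriting existing files (step 4).
# Usage: unpack_security BASEDIR TARFILE
unpack_security() {
    ( cd "$1" && tar xf "$2" )
}

# Print a checksum line for every file under BASEDIR/security, sorted by
# file name, so that the listings from two nodes can be compared.
# Usage: security_checksums BASEDIR
security_checksums() {
    ( cd "$1" && find security -type f -exec cksum {} \; | sort -k 3 )
}
```

Running `security_checksums /etc/opt/SUNWcacao` on each node and comparing the output is one way to confirm that the security files are identical everywhere. File permissions are not part of the checksum, so check them separately, for example with `ls -l`.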
    

Incorrect Date Format for Advanced Filter Panel of SunPlex Manager (5075018)

Problem Summary: The date field on the Advanced Filter panel of SunPlex Manager accepts only mm/dd/yyyy format. However, in non-English locale environments, the date format is different from mm/dd/yyyy, and the return date format from Calendar panel is other than mm/dd/yyyy format.

Workaround: Type the date range in the Advanced Filter panel in mm/dd/yyyy format. Do not use the Set button to display the calendar and choose the date.
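
If dates are prepared outside the GUI, for example in a script that builds a filter range, a quick shape check can catch locale-formatted values before they are typed in. The sketch below assumes that the panel expects two-digit months and days. The function name is_mmddyyyy is hypothetical, and month lengths and leap years are deliberately not checked.

```shell
#!/bin/sh
# Succeed if the argument has the shape mm/dd/yyyy with a month of 01-12
# and a day of 01-31.
# Usage: is_mmddyyyy DATE
is_mmddyyyy() {
    case "$1" in
        [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    mm=${1%%/*}          # month: text before the first slash
    rest=${1#*/}
    dd=${rest%%/*}       # day: text between the two slashes
    # Strip one leading zero so that 08 and 09 are compared as decimal.
    [ "${mm#0}" -ge 1 ] && [ "${mm#0}" -le 12 ] &&
    [ "${dd#0}" -ge 1 ] && [ "${dd#0}" -le 31 ]
}
```

For instance, `is_mmddyyyy 09/30/2004` succeeds, whereas a day-first value such as `30/09/2004` is rejected.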

Unreadable Error Messages in SunPlex Manager When Removing Resource Group (5083147)

Problem Summary: When you remove a resource group by using SunPlex Manager on Solaris 8, you might receive error messages that are not readable. This problem occurs in Japanese, Korean, Traditional Chinese, and Simplified Chinese.

Workaround: Set the system locale to English so that the error messages are displayed in English.

Incorrect Extension Property Descriptions in SUNW.sapscs (5083259)

Problem Summary: In the resource type registration (RTR) file SUNW.sapscs, descriptions for two extension properties are incorrect.

Workaround: The description for Scs_Startup_Script should be "Startup script for the SCS. Defaults to /usr/sap/SAP_SID/SYS/exe/run/startsap." The description for Scs_Shutdown_Script should be "Shutdown script for the SCS. Defaults to /usr/sap/SAP_SID/SYS/exe/run/stopsap."

After JumpStart Completes for Sun Cluster 3.1 9/04, User Cannot Access SunPlex Manager (5095638)

Problem Summary: After installing Sun Cluster software by using the JumpStart method, Sun Web Console cannot launch SunPlex Manager. JumpStart postinstallation processing fails to register SunPlex Manager with Sun Web Console.

Workaround: Run the following script on each cluster node, after JumpStart installation of Sun Cluster software is finished on all nodes.


# /var/sadm/pkg/SUNWscspmu/install/postinstall  

This script registers SunPlex Manager with Sun Web Console.

Installing Sun Cluster Data Service for HA Oracle From CD-ROM Fails (5098622)

Problem Summary: The installer program on the Sun Cluster 3.1 9/04 data services CD-ROM for x86 cannot be used to install HA Oracle. The following message is issued by the installer:

Could not find child archive ....

Workaround: Use scinstall to install Sun Cluster Data Service for HA Oracle.

Some Data Services Cannot be Upgraded by Using the scinstall Utility

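Problem Summary: The data services for the following applications cannot be upgraded by using the scinstall utility: Apache Tomcat, DHCP, mySQL, Oracle E-Business Suite, Samba, SWIFTAlliance Access, WebLogic Server, WebSphere MQ, and WebSphere MQ Integrator.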

Workaround: If you plan to upgrade a data service for an application in the preceding list, replace the step for upgrading data services in Upgrading to Sun Cluster 3.1 9/04 Software (Rolling) in Sun Cluster Software Installation Guide for Solaris OS with the steps that follow. Perform these steps for each node where the data service is installed.

Procedure: How to Upgrade Data Services That Cannot be Upgraded by Using scinstall

Steps
  1. Remove the software package for the data service that you are upgrading.


    # pkgrm pkg-inst
    

    pkg-inst specifies the software package name for the data service that you are upgrading as listed in the following table.

    Application                          Data Service Software Package
    Apache Tomcat                        SUNWsctomcat
    DHCP                                 SUNWscdhc
    mySQL                                SUNWscmys
    Oracle E-Business Suite              SUNWscebs
    Samba                                SUNWscsmb
    SWIFTAlliance Access                 SUNWscsaa
    WebLogic Server (English locale)     SUNWscwls
    WebLogic Server (French locale)      SUNWfscwls
    WebLogic Server (Japanese locale)    SUNWjscwls
    WebSphere MQ                         SUNWscmqs
    WebSphere MQ Integrator              SUNWscmqi

  2. Install the software package for the version of the data service to which you are upgrading.

    To install the software package, follow the instructions in the Sun Cluster documentation for the data service that you are upgrading. This documentation is available at http://docs.sun.com/.
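
When the removal step is scripted across several nodes, a small lookup keeps the package names in one place. The sketch below encodes the application-to-package table from step 1. The function name pkg_for_app is hypothetical, and the application names must be spelled exactly as in the table.

```shell
#!/bin/sh
# Print the data service package name for an application, as listed in the
# table in step 1. Unknown names produce an error message and status 1.
# Usage: pkg_for_app APPLICATION
pkg_for_app() {
    case "$1" in
        "Apache Tomcat")                     echo SUNWsctomcat ;;
        "DHCP")                              echo SUNWscdhc ;;
        "mySQL")                             echo SUNWscmys ;;
        "Oracle E-Business Suite")           echo SUNWscebs ;;
        "Samba")                             echo SUNWscsmb ;;
        "SWIFTAlliance Access")              echo SUNWscsaa ;;
        "WebLogic Server (English locale)")  echo SUNWscwls ;;
        "WebLogic Server (French locale)")   echo SUNWfscwls ;;
        "WebLogic Server (Japanese locale)") echo SUNWjscwls ;;
        "WebSphere MQ")                      echo SUNWscmqs ;;
        "WebSphere MQ Integrator")           echo SUNWscmqi ;;
        *)  echo "unknown application: $1" >&2
            return 1 ;;
    esac
}
```

The result can then be passed to the removal command from step 1, for example `pkgrm "$(pkg_for_app Samba)"`, run as superuser on each node where the data service is installed.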