The following issues apply to Sun Cluster 2.2.
If you choose to use the HotJava browser shipped with your Solaris 2.6 or Solaris 7 operating environment to run SCM, there may be problems such as:
Using the menus - After you make a menu selection, the menu can remain displayed on the browser instead of closing.
Use of swap space - If you use the HotJava browser with SCM, you should have at least 40 MBytes of free swap space. If swap space runs low, restarting the HotJava browser can help. (A command for checking free swap is shown after this list.)
Cannot access online help - When this version of HotJava is run on a cluster node and displayed remotely, the browser may freeze when you attempt to access online help.
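On Solaris, you can check current swap usage with the swap(1M) command; the "available" figure in its output shows the remaining free swap space. For example:

$ swap -s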
After configuring each logical host with the scinstall(1M) or scconf(1M) commands, you might need to use the scconf clustername -l command to set the timeout values for the logical host. The timeout value is site-dependent; it is tied to the number of logical hosts, spindles, and file systems.
Refer to the scconf(1M) man page for details. For procedures for setting timeout values, refer to Section 3.14, "Configuring Timeouts for Cluster Transition Steps," in the Sun Cluster 2.2 System Administration Guide.
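For example, the following command would set the logical host timeout; both the cluster name sc-cluster and the value of 300 seconds are illustrative, and the actual value must be derived from your site's configuration:

# scconf sc-cluster -l 300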
If you are running SSVM with an encapsulated root disk, you must unencapsulate the root disk before installing Sun Cluster 2.2. After you install Sun Cluster 2.2, encapsulate the disk again. You also must unencapsulate the root disk before changing the major numbers.
Refer to your SSVM documentation for the procedures to encapsulate and unencapsulate the root disk.
As part of the client software installation, the SUNWcsnmp package is installed to provide simple network management protocol (SNMP) support for Sun Cluster. The default port used by Sun Cluster SNMP is the same as the default port used by Solaris SNMP; both use port 161. Once the SUNWcsnmp package is installed, you must change the Sun Cluster SNMP port number using the procedure described in Section D.6, "Configuring the Cluster SNMP Agent Port," in the Sun Cluster 2.2 System Administration Guide.
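To see whether an agent is already bound to port 161 on a node, you can inspect the netstat(1M) output; the grep pattern below is illustrative:

# netstat -an | grep '\.161'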
The INFORMIX_ESQL Embedded Language Runtime Facility product must be installed in the /var/opt/informix directory on Sun Cluster servers. This is required even if Informix server binaries are installed on the physical host.
You can set up Lotus Domino servers as HTTP, POP3, IMAP, NNTP, or LDAP servers. Lotus Domino will start server tasks for all of these types. However, do not set up instances of any Netscape message servers on a logical host that is potentially mastered by the node on which Lotus Domino is installed.
Within a cluster, do not configure Netscape services with the same port number as the one used by the Lotus Domino server. The following port numbers are used by default by the Lotus Domino server:
HTTP    Port 80
POP3    Port 110
IMAP    Port 143
LDAP    Port 389
NNTP    Port 119
If a failover or switchover occurs while a logical host's file system is busy, the logical host fails over only partially; part of the disk group remains on the original physical host. Do not attempt a switchover while a logical host's file system is busy (a way to check for activity is shown below). Also, do not access any logical host's file system locally, because file locking does not work correctly when both NFS locks and local locks are present.
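One way to check whether processes are still using a logical host's file system before attempting a switchover is fuser(1M); the mount point shown here is hypothetical. If any process IDs are reported, the file system is busy:

# fuser -c /logicalhost1/fs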
If an incorrect password is used for the System Service Processor (SSP) on an Ultra Enterprise 10000, the system will behave unpredictably and might crash.
When you stop a node, the following error message might be displayed:
in.rdiscd[517]: setsockopt (IP_DROP_MEMBERSHIP): Cannot assign requested address
The error is caused by a timing issue between the in.rdiscd daemon and the IP module. It is harmless and can be ignored safely.
For Sun Cluster HA for NFS running on Solaris 7, if the lockd daemon is killed before the statd daemon is fully running, the following error message is displayed:
WARNING: lockd: cannot contact statd (error 4), continuing.
This error message can be ignored safely.
If the Sun Cluster HA for Oracle fault monitor displays errors like those shown below, make sure that the $ORACLE_HOME directory permissions are set to 755 and that the directory is owned by the Oracle administrative user with group ID dba.
Feb 16 17:13:13 ID[SUNWcluster.ha.haoracle_fmon.2520]: hahost1:HA1: DBMS Error: connecting to database: ORA-12546: TNS:permission denied
Feb 16 17:12:13 ID[SUNWcluster.ha.haoracle_fmon.2050]: hahost1:HA1: RDBMS error, but HA-RDBMS Oracle will take no action for this error code
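For example, assuming the Oracle administrative user is named oracle (an assumption; substitute your site's account name) and $ORACLE_HOME is set, the ownership and permissions can be corrected with:

# chown oracle:dba $ORACLE_HOME
# chmod 755 $ORACLE_HOME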
The Sun Cluster HA for SAP parameter LOG_DB_WARNING determines whether warning messages should be displayed if the Sun Cluster HA for SAP probe cannot connect to the database. When LOG_DB_WARNING is set to -y and the probe cannot connect to the database, a message is logged at the warning level in the local0 facility. By default, the syslogd(1M) daemon does not display these messages to /dev/console or to /var/adm/messages. To see these warnings, you must modify the /etc/syslog.conf file to display messages of local0.warning priority. For example:
...
*.err;kern.notice;auth.notice;local0.warning            /dev/console
*.err;kern.debug;daemon.notice;mail.crit;local0.warning /var/adm/messages
...
After modifying the file, you must restart syslogd(1M). See the syslog.conf(4) and syslogd(1M) man pages for more information.
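Note that the fields in /etc/syslog.conf must be separated by tab characters, not spaces. One way to restart syslogd on Solaris is to send it a hangup signal; syslogd records its process ID in /etc/syslog.pid:

# kill -HUP `cat /etc/syslog.pid`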
In a cluster with more than two nodes and with direct-attached storage, a problem occurs if the last node in the cluster panics or leaves the cluster abnormally (without performing the stopnode transition). In such a case, all nodes have been removed from the cluster and the cluster no longer exists, but because the last node left the cluster abnormally, it still holds the nodelock. A subsequent invocation of the scadmin startcluster command will fail to acquire the nodelock.
To work around this problem, manually clear the nodelock before restarting the cluster.
Use the following procedure to manually clear the nodelock and restart the cluster, after the cluster has aborted completely.
As root, display the cluster configuration.
# scconf clustername -p
Look for this line in the output, where A.B.C.D is the IP address of the Terminal Concentrator (TC) or System Service Processor (SSP) holding the lock, and E is the port number:

clustername Locking TC/SSP, port : A.B.C.D, E
For a nodelock on a Terminal Concentrator (TC), perform the following steps (otherwise, proceed to Step 3).
Start a telnet connection to Terminal Concentrator tc-name.
$ telnet tc-name
Trying 192.9.75.51...
Connected to tc-name.
Escape character is '^]'.
Press Return to continue.
At the port prompt, specify cli (the command line interface).

Enter Annex port name or number: cli
Log in as root.
Run the admin command.
annex# admin
Reset Port E.
admin : reset E
Close the telnet connection.
annex# hangup
Proceed to Step 4.
For a nodelock on a System Service Processor (SSP), perform the following steps.
Connect to the SSP.
$ telnet ssp-name
Log in as user ssp.
Display information about the clustername.lock file by using the following command (this file is a symbolic link to /proc/csh.pid, where csh.pid is the process ID of the csh process holding the lock).
$ ls -l /var/tmp/clustername.lock
Search for the process csh.pid.
$ ps -ef | grep csh.pid
If the csh.pid process exists in the ps -ef output, kill the process by using the following command.
$ kill -9 csh.pid
Delete the clustername.lock file.
$ rm -f /var/tmp/clustername.lock
Log out of the SSP.
Restart the cluster.
$ scadmin startcluster
The following applies to configurations using Sun Cluster HA for Oracle, Sun Cluster HA for Informix, or Sun Cluster HA for Sybase.
The Sun Cluster 2.2 Software Installation Guide contains erroneous information about how to set up the /etc/nsswitch.conf files for these DBMS data services. In order for the data services to start and stop correctly in case of switchovers or failovers, the /etc/nsswitch.conf files must be set up as follows.
On each node that can master the logical host running the DBMS data service, the /etc/nsswitch.conf file must have one of the following entries for group.
group: files
group: files [NOTFOUND=return] nis
group: files [NOTFOUND=return] nisplus
The DBMS data services use the su(1M) command when starting and stopping the database. The above settings ensure that the su command does not refer to NIS or NIS+ when the network information name service is unavailable due to a failure of the public network on the cluster node.
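To verify the setting on each node, you can display the group entry directly:

# grep '^group:' /etc/nsswitch.conf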