Sun Cluster 2.2 Release Notes Addendum

Documentation Errata

4220504 - Page 4-3 in the Sun Cluster 2.2 System Administration Guide includes instructions to run the scadmin startnode command simultaneously on all nodes. Instead, the scadmin startnode command should be run on only one node at a time.
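
For example, when bringing two nodes into a running cluster (the host names phys-node1 and phys-node2 are placeholders), run the command on the first node, wait for the cluster reconfiguration to complete, and only then run it on the second node:


phys-node1# scadmin startnode
(wait for the cluster reconfiguration to complete)
phys-node2# scadmin startnode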

4222817 - Page 8-20 in the Sun Cluster 2.2 Software Installation Guide includes instructions to install Sun Cluster HA for Netscape LDAP by adding the SUNWhadns package. The correct package name is SUNWscnsl.
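
For example, the corrected package can be added with pkgadd(1M); this sketch assumes the package directory is the current working directory:


# pkgadd -d . SUNWscnsl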

4224989 - Page 1-25 in the Sun Cluster 2.2 Software Installation Guide includes the statement:

"When Solstice DiskSuite is specified as the volume manager, you cannot configure direct-attach devices, that is, devices that directly attach to more than 2 nodes. Disks can only be connected to pairs of nodes."

This statement is incorrect. Direct-attach devices are supported with Solstice DiskSuite and Sun Cluster 2.2.

4258156 - Page 1-10 in the Sun Cluster 2.2 Software Installation Guide includes the statement that in parallel database configurations, any server failure is recognized by the cluster software, and subsequent user queries are re-routed through one of the remaining servers. This statement is untrue. In the case of a server failure, a cluster reconfiguration occurs automatically and the user queries are dropped. The user must initiate a new query through an active server, or through the original server after it has been restored to service.

You can configure Oracle Parallel Server so that restarting the application reconnects the clients to an active server. Configure this by modifying the tnsnames.ora file on all clients, using the procedure described in Section 14.1.4.2, "Configuring Oracle SQL*Net," in the Sun Cluster 2.2 Software Installation Guide.
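
For reference, a minimal tnsnames.ora entry of the kind that procedure produces might look like the following. The host names phys-node1 and phys-node2 and the port number are placeholders, and PRO is used as the example instance name; with both addresses listed, a restarted client connects through whichever server is active:


PRO =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = phys-node1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = phys-node2)(PORT = 1521))
    )
    (CONNECT_DATA = (SID = PRO))
  )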

Impact of quorum device failure - Page 1-18 in the Sun Cluster 2.2 Software Installation Guide includes this note:

"The failure of a quorum device is similar to the failure of a node in a two-node cluster."

This note is misleading. Although the failure of a quorum device does not cause a failover of services, it does reduce the high availability of a two-node cluster, because no further node failure can be tolerated. A failed quorum device can be reconfigured or replaced while the cluster is running. The cluster can remain running as long as no other component fails while the quorum repair or replacement is in progress.

Using scconf(1M) to remove a cluster node - In Chapter 3 of the Sun Cluster 2.2 System Administration Guide, the procedure "How to Remove a Cluster Node" includes a step to use scconf clustername -A n to remove a cluster node. Note that in this command, the number n does not represent a node number; it represents the total number of cluster nodes that will be active after the scconf operation. The scconf operation always removes the node with the highest node number from the cluster. For example, in a four-node cluster, the following command removes nodes 3 and 4, resulting in a two-node cluster:


# scconf sc-cluster -A 2

Undocumented Error Messages

The following error messages were omitted from the Sun Cluster 2.2 Error Messages Manual.


SUNWcluster.ha.sap.stop_net.2076: proha:SUNWscsap_PRO: Found 2 leftover IPC objects for SAP instance, removing via cleanipc

This message indicates that during shutdown of the SAP central instance by the stop_net method, two leftover IPC objects from the central instance were found. The stop_net code uses the SAP-supplied cleanipc utility to remove all IPC objects of the central instance during shutdown (and also before startup), ensuring both a thorough shutdown and a clean startup. The message is informational and expected. No user action is required.
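
For reference, cleanipc takes the SAP system number and the remove keyword; the system number 00 in this sketch is a placeholder:


# cleanipc 00 remove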


Graceful shutdown failed for oracle instance PRO, starting abort

This message indicates that the HA-Oracle oracle_db_shutdown script did not complete a graceful shutdown of the database within the timeout limit (30 seconds, by default). If the normal shutdown does not complete during the allowed time, then a shutdown abort is issued. This is an informational message and no user action is required.
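
The following sketch illustrates the graceful-then-abort sequence only; it is not the actual oracle_db_shutdown script, and it assumes the Oracle 7/8-era Server Manager interface (svrmgrl with connect internal):


svrmgrl <<EOF
connect internal
shutdown immediate
EOF

(if the shutdown does not complete within the timeout)

svrmgrl <<EOF
connect internal
shutdown abort
EOF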


SUNWcluster.ccd.ccdctl.4403: (error) checkpoint, ccdd, ticlts: RPC: Program not registered

This message indicates that the ccdadm command could not contact the ccdd daemon for the requested operation; the RPC call clnt_create() failed. Verify that the cluster has been started on the current node and that the ccdd daemon is running.
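
For example, to confirm that the ccdd daemon is running on the current node:


# ps -ef | grep ccdd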


SUNWcluster.clustd.transition.4010: cluster aborted on this node nodename

This message indicates that the current node is being aborted. Other error messages should indicate why this is occurring; check the scadmin.log file in /var/opt/SUNWcluster.
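
For example:


# tail /var/opt/SUNWcluster/scadmin.log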


reconf.pnm.3009: pnminit faced problems

This message is generated by the script /opt/SUNWcluster/bin/pnm. This script is called during step 1 of cluster reconfiguration, when PNM is initialized with pnminit. The error message appears when pnminit exits with a non-zero status.

Check for any error messages logged to /var/opt/SUNWcluster/ccd/ccd.log, then restart the cluster reconfiguration.


SUNWcluster.reconfig.4018: Aborting--received abort request from nodename

This message indicates a request from a remote node to abort the current node. Use a checksum to verify that the /etc/opt/SUNWcluster/conf/clustername.cdb files are identical on all nodes. If necessary, manually copy the most recent clustername.cdb file to all nodes, and then restart the cluster.
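
For example, run a checksum utility such as sum(1) on each node and compare the output; sc-cluster here is the cluster name used in the earlier scconf example:


# sum /etc/opt/SUNWcluster/conf/sc-cluster.cdb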