Appendix A

Error Messages

This appendix lists the error messages that are generated by the Netra HA Suite and briefly summarizes the possible cause of each error. The error messages are grouped into those produced during installation, those produced during run time, and those produced by the command-line tools. Each group is listed in alphabetical order of error message.

For information about the error messages that are produced by the Netra HA Suite, see the following sections:

• Introduction to Error Messages

• Error Messages Written During Installation

• Error Messages Written During Run Time

• Error Messages Written by Command-Line Tools


Introduction to Error Messages

All of the Netra HA Suite services log error and information messages to the system log files. These messages can be processed by client programs or by the Node Management Agent. For information about how to configure system log files, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide.
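
The sample messages shown later in this appendix are logged through the local0 syslog facility. As an illustrative sketch only (the target log file name is an assumption; see the Cluster Administration Guide for the supported procedure), an entry such as the following in the /etc/syslog.conf file routes those messages to a dedicated file:

    local0.err	/var/adm/nhas.log

After editing the /etc/syslog.conf file, refresh or restart the syslogd daemon so that the change takes effect.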

This appendix lists some of the error messages that are produced by the error scenarios described in this book. This appendix does not contain an exhaustive list of error messages produced by the Netra HA Suite.

All error messages have the following format:


<date><nodeid><PID><stack><failed service or daemon><message>

The following output shows some sample error messages:


Nov  8 10:05:20 b14-netra-4 CMM[547]: [ID 191875 local0.error] 
	S-CMM [Membership] (no_master_role) Cluster becomes stale
Oct 24 15:47:20 b14-netra-4 statd[3162]: [ID 514559 daemon.error] 
	svc_tp_create: Could not register prog 100024 vers 1 on udp
Oct 24 15:47:20 b14-netra-4 nhcrfsd[591]: [ID 191875 local0.error] 
	S-CORE 'nhcrfsd' failed to stay up
Oct 24 15:47:25 b14-netra-4 nfs: [ID 609386 kern.warning] WARNING: 
	lockd: cannot contact statd (error 4), continuing
Oct 24 15:53:43 b14-netra-4 CMM[546]: [ID 191875 local0.error] 
	S-CMM [PROBE] Node 173 becomes DOWN
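
Client programs can process these messages by filtering the system log. For example, the following minimal sketch extracts the Cluster Membership Manager messages, assuming that the messages are logged to the /var/adm/messages file:

    # grep "S-CMM" /var/adm/messages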


Error Messages Written During Installation

Error Messages Written During Manual Installation


[nhcmm] /etc/opt/SUNWcgha/not_configured present => no action taken

  The installation or configuration process has not completed. While the not_configured file is present, the daemon does not start.
[nhwdt] /etc/opt/SUNWcgha/not_configured present => no action taken

  The installation or configuration process has not completed. While the not_configured file is present, the daemon does not start.

Error Messages Written During Installation Using the nhinstall Tool

The nhinstall tool is a script that runs a series of commands. If the nhinstall tool encounters an error during the installation, it issues a message to the console and stops. The error message indicates which command has failed, including the specific Solaris command if a Solaris command is at fault. The nhinstall tool does not produce its own error messages.

When the nhinstall tool encounters an error, it stops; it does not continue to search for other errors. You must fix the error and then relaunch the nhinstall tool, which resumes from the point at which it failed. If the tool encounters another error, it stops again.
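
For example, after correcting the reported error, relaunch the tool (a minimal sketch that assumes nhinstall is in your PATH; your installation might require options):

    # nhinstall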


Error Messages Written During Run Time

Error Messages Written by the Cluster Membership Manager


[%s] port undefined in /etc/services

  Confirm that the nhfs.conf and cluster_nodes_table files are correctly configured. See the man pages in the Netra High Availability Suite 3.0 1/08 Foundation Services Reference Manual.

[%s/%s] port undefined in /etc/services

  Confirm that the nhfs.conf and cluster_nodes_table files are correctly configured. See the man pages in the Netra High Availability Suite 3.0 1/08 Foundation Services Reference Manual.

Another CMM is running => exit

  A second instance of the nhcmmd daemon attempted to start on the node.

[Config] %s does not exist, could not recreate %s

  A cluster configuration file or its backup is missing. The file cannot be created. Confirm that the directory in which the file should exist has the correct access permissions.
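
  For example, you can check the permissions on the configuration directory (the path shown assumes the default /etc/opt/SUNWcgha location):

    # ls -ld /etc/opt/SUNWcgha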

[Config] Could not initialize Cluster nodes table (local)

  Confirm that the cluster_nodes_table file is correctly configured. See the cluster_nodes_table(4) man page.

[Config] Could not initialize Cluster nodes table (NFS)

  Confirm that the cluster_nodes_table file is correctly configured. See the cluster_nodes_table(4) man page.

[Config] Could not initialize minimal config

  Confirm that the target.conf file is correctly configured. See the target.conf(4) man page.

[Config] Could not load backup configuration file

  A cluster configuration file or its backup is missing. The file cannot be created. Confirm that the directory in which the file should exist has the correct access permissions.

[Config] Could not read Cluster nodes table

  Investigate whether the cluster_nodes_table file is missing or inaccessible.

[Config] Could not read minimal configuration file

  Investigate whether the target.conf file is missing or inaccessible.

[Config] Invalid domain id in file %s

  Confirm that the nhfs.conf and cluster_nodes_table files are correctly configured. See the man pages in the Netra High Availability Suite 3.0 1/08 Foundation Services Reference Manual.

[Config] Invalid node id in file %s

  Confirm that the nhfs.conf and cluster_nodes_table files are correctly configured. See the man pages in the Netra High Availability Suite 3.0 1/08 Foundation Services Reference Manual.

[Config] Invalid node name in file %s

  Confirm that the cluster_nodes_table file is correctly configured. See the cluster_nodes_table(4) man page.

[Config] Minimal configuration files %s and %s cannot be accessed

  Investigate whether the target.conf file is missing or inaccessible.

[Candidates] No master-eligible node

  See Chapter 3.

[Config] Nodes number exceed cluster capacities

  For information about the supported cluster configuration, see the Netra High Availability Suite 3.0 1/08 Foundation Services Getting Started Guide.

CURRENT NODE HAS LEFT CLUSTER

  The current node does not belong to the master node domain. Confirm that the following are true:
  • The node domainid is correct in the nhfs.conf file and the cluster_nodes_table file.

  • The node name is correct in the cluster_nodes_table file.

  • The node domainid is the same in the nhfs.conf file and the cluster_nodes_table file.

  • The master node domainid is correct.
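
  To compare these settings, you can search both files for their domain entries, as in the following sketch (the paths assume that both files are in the default /etc/opt/SUNWcgha configuration directory):

    # grep -i domain /etc/opt/SUNWcgha/nhfs.conf
    # grep -i domain /etc/opt/SUNWcgha/cluster_nodes_table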

[Election] Unable to extract the best choice

  No node is eligible to become the master node. Verify that the vice-master node is qualified, and that it is synchronized with the master node. See The Vice-Master Node Remains Unsynchronized After Startup.

[Membership] Could not access Cluster nodes table

  Investigate whether the cluster_nodes_table file is missing or inaccessible.

[PROBE] a probe is already executed

  Confirm that the following are true:
  • The node is present in the cluster_nodes_table file.

  • The node name is unique in the cluster_nodes_table file.

  • The nodeid is unique in the cluster_nodes_table file.


Error Messages Written by Reliable NFS


Already mounted. In order to ensure data integrity, please proceed to a 'full sync' using 'nhcrfsadm -f %s' command

  This error occurs if a partition is already mounted at boot time.
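
  For example (the partition argument is a placeholder for the partition reported in the message):

    # nhcrfsadm -f <partition>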

Could not disable previous SNDR configuration

  Reliable NFS could not disable the SNDR boot time configuration. This can happen if the replication configuration is broken. Flush the replication configuration manually:
  1. Boot the master-eligible nodes in single-user mode.

  2. Reset the replication configuration on both nodes.

  3. Recreate an empty replication configuration file by typing Y at the prompt on both nodes.

  4. Reboot the nodes.

  5. If the problem persists:

  • Boot both master-eligible nodes in single user mode.

  • On each master-eligible node, edit the /etc/opt/SUNWcgha/target.conf file by setting the attributes field to “” and the role field to “”.

    For information about the target.conf file, see the target.conf(4) man page on the Solaris OS or the target.conf(5) man page on Linux.

  • Repeat Steps 2 to 4 on each master-eligible node.

Could not export some directories

  Reliable NFS could not share some directories. Verify that the directories to be shared exist and that /usr/bin/share exists.

Could not get port number for server <port number>

  Reliable NFS could not bind to the specified service. Check whether the service is defined in the /etc/services file. If it is not defined, add an entry to the /etc/services file to define the service.
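
  An entry in the /etc/services file uses the standard name, port/protocol format. For example (the service name and port number shown are placeholders, not documented defaults):

    nhas-service	9000/tcp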

Could not put SNDR into logging mode

  SNDR could not be stopped, that is, put into logging mode. Examine the disk configuration.

Could not reverse SNDR configuration

  SNDR could not be reversed during a switchover. A switchover exchanges the primary and secondary SNDR roles. Examine the disk configuration.

Could not set master dynamic address(es)

  Reliable NFS could not set the master node floating address triplet. Verify that the interfaces exist.
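
  For example, you can list the interfaces that are currently plumbed on the node:

    # ifconfig -a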

Could not start <command name>

  Reliable NFS could not execute the specified command. Verify that the command is available on the cluster and that its execution rights are correct.

Could not start SNDR

  The SNDR service failed to start. Verify that the node has a valid disk partition configuration. For information about disk partitions, see the Netra High Availability Suite 3.0 1/08 Foundation Services Manual Installation Guide for the Solaris OS.

Could not stop <command name>

  Reliable NFS could not execute the specified command. Verify that the command is available on the cluster, and that its execution rights are correct.

Could not unexport some directories

  Reliable NFS could not unshare some directories. Verify that the directories listed exist and that /usr/bin/unshare exists.

Could not unset master dynamic address(es)

  Reliable NFS could not unset the master node floating address triplet. The specified interfaces might be unknown or unplumbed.

Emergency reboot of the node

  Reliable NFS rebooted the node because the service did not restart correctly. This can occur if the nhcrfsd daemon dies during a switchover or a failover.

Error in configuration

  The Reliable NFS configuration is incorrect. The text following the message should indicate the type of configuration error. Verify that the configuration of the nhfs.conf file for the failing node is consistent with the nhfs.conf(4) man page.

Illegal startup case: we are 'master' but were 'vice-master unsynchronized'. Please restart the node.

  The vice-master node was rebooted and became the master node. This scenario is not allowed. Nodes must be restarted with the same role as they had before shutdown.

Mount of local filesystems failed

  Reliable NFS could not mount or unmount local file systems and aborted. Verify that the mount points and file systems are consistent. For the device listed in the error, check:
  • The associated mount point in the /etc/vfstab file

  • The access permission of this mount point
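
  For example (the device and mount point are placeholders for the values reported in the error):

    # grep <device> /etc/vfstab
    # ls -ld <mount point>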

No canonical name found for address <IP address>

  No canonical name was found to correspond to the specified address. Canonical names are required for every address. Specify the canonical name for this address in /etc/hosts.
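
  In the /etc/hosts file, the canonical name is the first name after the address, followed by any aliases. For example (the address and names shown are illustrative):

    192.168.12.1	node1	node1-nic0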

Node's CMM mastership and RNFS one is not coherent

  Reliable NFS believes that the current node is the master, but the nhcmmd daemon does not consider the current node to be the master.

Number of SNDR slices is greater than configuration file one

  SNDR slices are configured but not managed through Reliable NFS.

Unable to read kstat data

  A partition managed by Reliable NFS disappeared while the cluster was running. This might happen if you change the SNDR configuration while the cluster is running. This scenario is not allowed. Reboot the node.

Unmount of local filesystems failed

  See Mount of local filesystems failed.

Vice master has <number> slices, we have <number>: refusing vice master to follow

  The master node disk and the vice-master node disk do not have the same disk partition configuration. This is not allowed. Stop the vice-master node and change its disk partition configuration to be the same as that on the master node. See Modifying and Adding Disk Partitions in the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide.

Vice master has a wrong configuration: refusing vice master to follow

  The master's view of the disk configuration on the vice-master disk is not current. Check that the nhfs.conf file is consistent between the two nodes.

Wrong slice configured in SNDR

  This problem occurs if SNDR slices are configured but not managed through Reliable NFS. Do not configure SNDR independently of Reliable NFS.

Error Messages Written by the Reliable Boot Service


cmm_connect() failed (#)

  Examine log files for messages from nhpmd saying that the nhcmmd daemon was stopped. If necessary, reboot the node to restart the nhcmmd daemon.
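
  For example, you can check whether the nhcmmd daemon is running on the node:

    # pgrep -l nhcmmd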

Error Messages Written by the Node Management Agent


CMM statistics (JNI) Failed to get stats from CMM: [CMM status]

  A call to the CMM succeeded from an RPC point of view. However, the CMM internals were unable to return valid statistics. Check the status of the nhcmmd daemon and its processes.

CMM statistics (JNI) Failed to get stats from CMM: [rpc return code]

  An RPC error occurred during an access to the CMM statistics. Use the RPC return code to diagnose and correct the problem.

CMM statistics (JNI). Unable to access CMM statistics (can't access cmm-api service port number)

  The CMM is incorrectly configured. Confirm that /etc/services contains an entry for cmm-api.
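
  For example, you can check for the entry as follows (the port number in an actual entry is installation specific):

    # grep cmm-api /etc/services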

CMM statistics (JNI). Unable to access CMM statistics (can't access tcp netconfig).

  The netconfig database is incorrectly configured for TCP. Correct the /etc/netconfig configuration.
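
  For example, you can check that a tcp entry is present in the netconfig database:

    # grep "^tcp" /etc/netconfig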

CMM statistics (JNI) rpc call failed

  RPC failed while attempting to access Cluster Membership Manager statistics. Correct the RPC configuration.

KSTAT (JNI). Unable to launch CGTP

  CGTP statistics are not available. Confirm that the redundant network is available and that the network configuration is correct.


Error Messages Written by Command-Line Tools

This section contains the error messages written by the Netra HA Suite command-line tools. For information about these tools, see their man pages in section 1M (Solaris OS) or section 8 (Linux) of the Netra High Availability Suite 3.0 1/08 Foundation Services Reference Manual.