Sun Cluster 3.1 Error Messages Guide

Message IDs 300000–399999


300397 resource %s property changed.

Description:

This is a notification from the rgmd that a resource's property has been edited by the cluster administrator. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


300777 reservation warning(%s) - Unable to open device %s, errno %d, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


301092 file specified in ENVIRONMENT_FILE parameter %s does not exist.

Description:

The 'Environment_File' property was set when configuring the resource. The file specified by the 'Environment_File' property may not exist. The file should be readable and specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating a resource.
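The three conditions named in the description (the file exists, is readable, and is given as a fully qualified path) can be checked before the resource is created. The following is a minimal sketch in Python. It is not part of Sun Cluster, and the function name check_environment_file is hypothetical.

```python
import os


def check_environment_file(path):
    """Return a list of problems found with an Environment_File value.

    Hypothetical helper for illustration only; not a Sun Cluster API.
    """
    problems = []
    # A fully qualified path starts at the root directory.
    if not os.path.isabs(path):
        problems.append("path is not fully qualified")
    # The file must exist as a regular file and be readable.
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable")
    return problems
```

An empty list means that none of the three problems was detected for the given path.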


301573 clcomm: error in copyin for cl_change_flow_settings

Description:

The system failed a copy operation supporting a flow control state change.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


301603 fatal: cannot create any threads to handle switchback

Description:

The rgmd was unable to create a sufficient number of threads upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


301635 clexecd: close returned %d. Exiting.

Description:

clexecd program has encountered a failed thr_create(3THR) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


301884 This node is running software incompatible with the rest of the cluster and will shut down.

Description:

The cluster version manager exchanges version information between nodes running in the cluster and has detected an incompatibility. This is usually the result of performing a rolling upgrade where one or more nodes has been installed with a software version that the other cluster nodes do not support. This error may also be due to attempting to boot a cluster node in 64-bit address mode when other nodes are booted in 32-bit address mode, or vice versa.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Save the /var/adm/messages file. Check the messages file for earlier messages related to the version manager which may indicate which software component is failing to find a compatible version.


302670 udlm_setup_port: fcntl: %d

Description:

A server was not able to execute fcntl(). udlm exits and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


303231 mount_client_impl::remove_client() failed attempted RM change_repl_prov_status() to remove client, spec %s, name %s

Description:

The system was unable to remove a PXFS replica on the node where this message was seen.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


303434 Warning: could not validate the settings in <%s>. It is recommended that the settings for host lookup consult "files" before a name server


303805 Cannot change the IPMP group on the local host.

Description:

A different IPMP group for property NetIfList is specified in the scrgadm command. The IPMP group on the local node is set at resource creation time. Users may update the value of property NetIfList only to add an IPMP group on a new node.

Solution:

Rerun the scrgadm command with the proper value of property NetIfList.


303879 INTERNAL ERROR: Unable to lock %s: %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


303941 Unsuccessful probe of %s port %d for non-secure resource %s. (%s)

Description:

An error occurred while the fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing a restart or failover. For a more detailed error description, look at the syslog messages.


304365 clcomm: Could not create any threads for pool %d

Description:

The system creates server threads to support requests from other nodes in the cluster. The system could not create any server threads during system startup. This is caused by a lack of memory.

Solution:

There are two solutions. Install more memory. Alternatively, take steps to reduce memory usage. Since the creation of server threads takes place during system startup, application memory usage is normally not a factor.


305298 cm_callback_impl abort_trans: exiting

Description:

ucmm callback for abort transition failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


306407 Failed to stop Adaptive server.

Description:

Sun Cluster HA for Sybase failed to stop the Adaptive server using the KILL signal.

Solution:

Examine whether any Sybase server processes are running on the server. Manually shut down the server.


307195 clcomm: error in copyin for cl_read_flow_settings

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


308800 ERROR: rebalance: <%s> is pending_methods on node <%d>

Description:

An internal error has occurred in the locking logic of the rgmd. This error should not occur. It may prevent the rgmd from bringing the indicated resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


309875 Error encountered enabling failfast.

Description:

An error occurred while attempting to enable the reservation failfast on the disks that are shared by other nodes.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log, and /var/cluster/ucmm/dlm*/logs/* from all the nodes and contact your Sun service representative.


310953 clnt_control of program %s failed %s.

Description:

HA-NFS fault monitor failed to reset the retry timeout for retransmitting the rpc request.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


311463 Failover attempt failed.

Description:

The failover attempt of the resource is rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.


311808 Can not open /etc/mnttab: %s

Description:

An error occurred while opening /etc/mnttab. The error message follows.

Solution:

Check with the system administrator and make sure /etc/mnttab is properly defined.


312004 Value of Start_timeout property may be small for %d max_offload_retry for %d resource groups to offload.

Description:

This is a warning message indicating that you may have set max_offload_retry to a high value, which may cause the start method of the RGOffload resource to time out before an attempt can be made to offload all specified resource groups.

Solution:

Please calculate the max_offload_retry so that the Start_timeout is not exceeded if every resource group that has to be offloaded requires maximum retries. There is a 10 second interval between successive retries.
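As a rough worked example of that calculation: if every resource group to be offloaded needs the maximum number of retries, and only the 10-second interval between retries is counted, the waiting time alone is about (number of resource groups) x max_offload_retry x 10 seconds. The sketch below rests on that simplifying assumption; it ignores the time each offload attempt itself takes, and the function names are hypothetical.

```python
RETRY_INTERVAL_SECONDS = 10  # interval between successive retries, per the text above


def worst_case_offload_wait(num_resource_groups, max_offload_retry):
    """Seconds spent waiting between retries if every group needs all retries.

    Assumption: only the retry interval is counted. The duration of each
    offload attempt is ignored, so the real elapsed time is longer.
    """
    return num_resource_groups * max_offload_retry * RETRY_INTERVAL_SECONDS


def largest_safe_retry(start_timeout, num_resource_groups):
    """Largest max_offload_retry whose waiting time fits in Start_timeout."""
    if num_resource_groups <= 0:
        raise ValueError("num_resource_groups must be positive")
    return start_timeout // (num_resource_groups * RETRY_INTERVAL_SECONDS)
```

For instance, with a Start_timeout of 300 seconds and 3 resource groups, a max_offload_retry of 15 gives a waiting time of 450 seconds, which exceeds the timeout; under the same assumption the largest value that fits is 10.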


312053 Cannot execute %s: %s.

Description:

Failure in executing the command.

Solution:

Check the syslog message for the command description. Check whether the system is low in memory or the process table is full and take appropriate action. Make sure that the executable exists.


313510 SAP xserver is not available.

Description:

SAP xserver is not running currently.

Solution:

Informative message, no action is required.


313806 pm_tick delay of %lld ms exceeds %lld ms


313867 Unknown step: %s

Description:

Request to run an unknown udlm step.

Solution:

None.


314314 prog <%s> step <%s> terminated due to receipt of signal <%d>

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


314341 Invalid probe values. Retry_interval (currently set to %d) must be greater than or equal to the product of Thorough_probe_interval (currently set to % d), and Retry_count (currently set to %d).

Description:

Validation of the probe related parameters failed because invalid values were specified.

Solution:

Retry_interval must be greater than or equal to the product of Thorough_probe_interval, and Retry_count. Use scrgadm(1M) to modify the values of these parameters so that they will hold the above relationship.
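The relationship can be written as a one-line check. This is only an illustrative sketch in Python; the function name is hypothetical and it is not how the validation is implemented inside the data service.

```python
def probe_values_valid(retry_interval, thorough_probe_interval, retry_count):
    """True when Retry_interval >= Thorough_probe_interval * Retry_count."""
    return retry_interval >= thorough_probe_interval * retry_count
```

For example, a Thorough_probe_interval of 60 and a Retry_count of 2 require a Retry_interval of at least 120.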


314356 resource %s enabled.

Description:

This is a notification from the rgmd that the operator has enabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


314358 Command %s failed to complete. Return code is %d.

Description:

The listed command failed to complete with the listed return code. The return code is from the script db_clear.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


315446 node id <%d> is out of range

Description:

The low-level cluster machinery has encountered an error.

Solution:

Look for other syslog messages occurring just before or after this one on the same node; they may provide a description of the error.


316215 Process sapsocol is already running outside of Sun Cluster. Will terminate it now, and restart it under Sun Cluster.

Description:

The SAP OS collector process is running outside of the control of Sun Cluster. HA-SAP will terminate it and restart it under the control of Sun Cluster.

Solution:

Informational message. No user action needed.


317263 Unable to retrieve the cluster handle: %s.


318636 No executable $BV1TO1/bin/xbvconf

Description:

The specified executable is not found.

Solution:

Check if the Broadvision software was installed properly. Make sure the specified executable is available at the right location.


319047 CMM: Issuing a SCSI2 Tkown failed for quorum device %s with error %d.

Description:

This node encountered the specified error while issuing a SCSI2 Tkown operation on the indicated quorum device. This will cause the node to conclude that it has been unsuccessful in preempting keys from the quorum device, and therefore the partition to which it belongs has been preempted. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

If the error encountered is EACCES, then the SCSI2 command could have failed due to the presence of SCSI3 keys on the quorum device. Scrub the SCSI3 keys off of it, and reboot the preempted nodes.


319048 CCR: Cluster has lost quorum while updating table %s, it is possibly in an inconsistent state - ABORTING.

Description:

The cluster lost quorum while the indicated table was being changed, leading to potential inconsistent copies on the nodes.

Solution:

Check whether the copies of the indicated table are consistent on all the nodes in the cluster. If they are not, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.
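One way to compare copies of a table is to gather the file from each node and compare content digests. The sketch below is a Python illustration under two assumptions that are not stated in this guide: that the per-node copies have already been collected into local files, and that consistent copies are byte-for-byte identical. The function names are hypothetical.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_consistent(paths):
    """True when every file in paths has identical contents.

    Assumption: 'consistent' is taken to mean byte-identical copies.
    """
    return len({file_digest(path) for path in paths}) <= 1
```

If the check reports a mismatch, the restore procedure described in the solution still applies; the sketch only helps to detect the difference.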


319261 liveCache %s failed to start.

Description:

liveCache started up with error.

Solution:

Sun Cluster will fail over the liveCache resource to another available node. No user action is needed.


319375 clexecd: wait_for_signals got NULL.

Description:

The clexecd program encountered an error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


319413 Siebel server can be started only after Siebel database and Siebel gateway are running, and the scsblconfig file is correctly configured.

Description:

This is a warning message indicating a problem in determining the status of Siebel database and/or the Siebel gateway.

Solution:

Please verify that the scsblconfig file is correctly configured, and that the Siebel database and Siebel gateway are up before attempting to start the Siebel server.


320378 INTERNAL ERROR: usage: $0 <server_root> <siebel_enterprise>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


320833 INTERNAL ERROR usage:$0 BVUSER BV1TO1_VAR bv_local_host action IT_CONNECT_ATTEMPTS BV_ORB_CONNECT_TIMEOUT

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


321245 resource <%s> is disabled but not offline

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has found the indicated resource to have its Onoff_switch property set to DISABLED, yet the resource is not offline. This suggests corruption of the RGM's internal data and will cause the enable or disable action to fail.

Solution:

This may indicate an internal error or bug in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


321667 clcomm: cl_comm: not booted in cluster mode.

Description:

Attempted to load the cl_comm module when the node was not booted as part of a cluster.

Solution:

Users should not explicitly load this module.


321962 Command %s failed to complete. HA-SAP will continue to start SAP.

Description:

The command to cleanipc failed to complete.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


322373 Unable to unplumb %s%d, rc %d

Description:

Topology Manager is done using the interface and failed to unplumb the adapter.


322642 Error binding to %s port %d for non-secure resource %s: %s (%s)

Description:

An error occurred while the fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing a restart or failover. For a more detailed error description, look at the syslog messages.


322675 Some NFS system daemons are not running.

Description:

HA-NFS fault monitor checks the health of statd, lockd, mountd and nfsd daemons on the node. It detected that one or more of these are not currently running.

Solution:

No action. The monitor would restart these. If it doesn't, reboot the node.


322797 Error registering provider '%s' with the framework.

Description:

The device configuration system on this node has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322862 clcomm: error in copyin for cl_read_threads_min

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322879 clcomm: Invalid copyargs: node %d pid %d

Description:

The system does not support copy operations between the kernel and a user process when the specified node is not the local node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322908 CMM: Failed to join the cluster: error = %d.

Description:

The local node was unsuccessful in joining the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this failure. Resolve the problem and reboot the node.


323498 libsecurity: NULL RPC to program %ld failed will retry %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not execute a test rpc call, and the program will retry to establish the connection. The message shows the specific rpc error. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


324478 (%s): Error %d from read

Description:

An error was encountered in the clexecd program while reading the data from the worker process.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


325322 clcomm: error in copyin for state_resource_pool

Description:

The system failed a copy operation supporting statistics reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


326043 reservation fatal error(%s) - release_resv_lock() returned exception

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


326528 Nonexistent Broker_name (%s).

Description:

The broker name provided in the extension property Broker_Name does not exist.

Solution:

Check that a broker instance exists for the supplied broker name. It should match the broker name portion of the path in Confdir_list.


327057 SharedAddress stopped.

Description:

The stop method is completed and the resource is stopped.

Solution:

This is an informational message. No user action is required.


329286 (%s) instead of UDLM_ACK got a %d

Description:

Did not receive an acknowledgment from udlm as was expected.

Solution:

None.


329429 reservation fatal error(%s) - host_name not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


329496 unlatch_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


329616 svc_probe used entire timeout of %d seconds during connect operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe timed out connecting to the application.

Solution:

If the problem persists, investigate why the application is responding slowly or whether the Probe_timeout property needs to be increased.


329778 clconf: Data length is more than max supported length in clconf_ccr read

Description:

While reading configuration data through the CCR, the system found that the data length is more than the maximum supported length.

Solution:

Check the CCR configuration information.


329847 Warning: node %d has a weight of 0 assigned to it for property %s.

Description:

The named node has a weight of 0 assigned to it. A weight of 0 means that no new client connections will be distributed to that node.

Solution:

Consider assigning the named node a non-zero weight.


329957 odd table entry


330063 error in vop open %x

Description:

Opening a private interconnect interface failed.

Solution:

Reboot of the node might fix the problem.


330182 Internal error: default value missing for resource property

Description:

A non-fatal internal error has occurred in the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


330526 CMM: Number of steps specified in registering callback = %d; should be <= %d.

Description:

The number of steps specified during registering a CMM callback exceeds the allowable maximum. This is an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


331221 CMM: Max detection delay specified is %ld which is larger than the max allowed %ld.

Description:

The maximum of the node down detection delays is larger than the allowable maximum. The maximum allowed will be used as the actual maximum in this case.

Solution:

This is an informational message, no user action is needed.


331325 sigprocmask: %s

Description:

The rpc.fed server encountered an error with the sigprocmask function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


333069 Failed to retrieve nodeid for %s.

Description:

The nodeid for the given name could not be determined.

Solution:

Make sure that the name given is a valid node identifier or node name.


333455 Multi-IP group '%s' removed

Description:

The Multi-IP group by that name is removed.

Solution:

This is an informational message, no user action is needed.


333928 LogicalHostname offline.


334697 Failed to retrieve the cluster property %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


334992 clutil: Adding deferred task after threadpool shutdown id %s

Description:

During shutdown this operation is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


335206 Failed to get host names from the resource.

Description:

Retrieving the IP addresses from the network resources from this resource group has failed.

Solution:

An internal error or an API call failure might be the reason. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components. For the resource name and resource group name, check the syslog tag.


335591 Failed to retrieve the resource group property %s: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


335468 Time allocated to stop development system is too small (less than 5 seconds).

Description:

Time allocated to stop the development system is too small.

Solution:

The time for stopping the development system is a percentage of the total Start_timeout. Increase the value for property Start_timeout or the value for property Dev_stop_pct.
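To see which values trigger this message, the stop time can be estimated from the two properties. The Python sketch below assumes the time is a plain percentage of Start_timeout with no rounding, which this guide does not confirm; the 5-second threshold comes from the message text, and the function names are hypothetical.

```python
MIN_DEV_STOP_SECONDS = 5  # threshold quoted in the message text


def dev_stop_seconds(start_timeout, dev_stop_pct):
    """Seconds allotted to stopping the development system.

    Assumption: a plain percentage of Start_timeout, with no rounding.
    """
    return start_timeout * dev_stop_pct / 100.0


def dev_stop_time_ok(start_timeout, dev_stop_pct):
    """True when the allotted time is at least the 5-second minimum."""
    return dev_stop_seconds(start_timeout, dev_stop_pct) >= MIN_DEV_STOP_SECONDS
```

For example, a Start_timeout of 40 seconds with a Dev_stop_pct of 10 yields 4 seconds, which is below the minimum; raising either property brings the result above 5 seconds.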


336860 read %d for %snum_ports

Description:

Could not get information about the number of ports udlm uses from config file udlm.conf.

Solution:

Check to make sure the udlm.conf file exists and has an entry for udlm.num_ports. If everything looks normal and the problem persists, contact your Sun service representative.


337008 rgm_comm_impl::_unreferenced() called unexpectedly

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


337073 $BV1TO1_VAR is not a directory.

Description:

The specified environment variable does not point to the right directory.

Solution:

Set the specified environment variable correctly.


337166 Error setting environment variable %s.

Description:

An error occurred while setting the environment variable LD_LIBRARY_PATH. This is required by the fault monitor for the nsldap data service. The fault monitor appends the ldap server root path, including the lib directory, to the LD_LIBRARY_PATH environment variable.

Solution:

Check that there is a lib directory under the server root of the nsldap data service which pertains to this resource. If this directory has been removed, then it must be replaced by reinstalling Netscape Directory Server, or whatever other means are appropriate.


337212 resource type %s removed.

Description:

This is a notification from the rgmd that a resource type has been deleted. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


338067 This resource does not depend on any SUNW.HAStoragePlus resources. Proceeding with normal checks.

Description:

The resource does not depend on any HAStoragePlus filesystems. The validation will continue with its other checks.

Solution:

This message is informational; no user action is needed.


338839 clexecd: Could not create thread. Error: %d. Sleeping for %d seconds and retrying.

Description:

clexecd program has encountered a failed thr_create() system call. The error message indicates the error number for the failure. It will retry the system call after specified time.

Solution:

If the message is seen repeatedly, contact your authorized Sun service provider to determine whether a workaround or patch is available.


339424 Could not host device service %s because this node is being removed from the list of eligible nodes for this service.

Description:

A switchover/failover was attempted to a node that was being removed from the list of nodes that could host this device service.

Solution:

This is an informational message, no user action is needed.


339521 CCR: Lost quorum while starting to update table %s.

Description:

The cluster lost quorum when CCR started to update the indicated table.

Solution:

Reboot the cluster.


339590 Error (%s) when reading property %s.

Description:

Unable to read property value using API. Property name is indicated in message. Syslog messages may give more information on errors in other modules.

Solution:

Check syslog messages. Please report this problem.


339657 Issuing a restart request.

Description:

This is an informational message. We are about to call the API function to request a restart. In case of failure, follow the syslog messages after this message.

Solution:

No user action is needed.


339954 fatal: cannot create any threads to launch callback methods

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


340287 idl_set_timestamp(): IDL Exception

Description:

The rgmd has encountered an error that prevents the scha_control function from successfully setting a ping-pong time stamp, presumably because a node died. This does not prevent the attempted failover from succeeding, but in the worst case, might prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes.

Solution:

Examine syslog output on the node that rebooted, to determine the cause of node death. The syslog output might indicate further remedial actions.


340893 The stop command '%s' failed to stop %s. Using SIGKILL.

Description:

The specified stop command was unable to stop the specified resource. A SIGKILL signal will be sent to all the processes associated with the resource.

Solution:

No action required by the user. This is an informational message.


341502 Unable to plumb even after unplumbing. rc = %d

Description:

Topology Manager failed to plumb an adapter for private network. A possible reason for plumb to fail is that it is already plumbed. Solaris clustering has successfully unplumbed the adapter but failed while trying to plumb for private use.


341719 Restarting daemon %s.

Description:

HA-NFS is restarting the specified daemon.

Solution:

No action.


341754 INTERNAL ERROR: usage: $0 <logicalhost> <server_root> <siebel_enterprise> <siebel_servername> <timeout>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


341804 Failed to retrieve information for user %s.

Description:

Failed to retrieve information for the specified BV user.

Solution:

Check whether the proper Broadvision UNIX user ID is set, or check whether this user exists on all the nodes of the cluster.


342113 Invalid script name %s. It cannot contain any '/'.

Description:

The script name must be a bare file name. It cannot include a path.

Solution:

Specify just the script name without any path.
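The rule above can be sketched as a simple check. This is a minimal illustration in Python, not the data service's actual validation code; the function name is_valid_script_name is hypothetical.

```python
def is_valid_script_name(name: str) -> bool:
    """Return True if name is a bare script name with no path component."""
    # Reject empty names and anything that contains a path separator.
    return bool(name) and "/" not in name


if __name__ == "__main__":
    for candidate in ("probe.sh", "/opt/app/probe.sh", "bin/probe.sh"):
        print(candidate, is_valid_script_name(candidate))
```

A name such as probe.sh passes the check, while /opt/app/probe.sh and bin/probe.sh are rejected because they contain '/'.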


342336 clcomm: Pathend %p: path_down not allowed in state %d

Description:

The system maintains state information about a path. A path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


342597 sigaddset: %s

Description:

The rpc.fed server encountered an error with the sigaddset function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


342793 Successfully started the service %s.

Description:

Specified data service started successfully.

Solution:

None. This is only an informational message.


343307 Could not open file %s: %s.

Description:

System has failed to open the specified file.

Solution:

Check whether the permissions are valid. This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


345342 Failed to connect to %s port %d.

Description:

The data service fault monitor probe was trying to connect to the host and port specified and failed. There may be a prior message in syslog with further information.

Solution:

Make sure that the port configuration for the data service matches the port configuration for the underlying application.


346016 setproject: %s; continuing the process with system default project.

Description:

Either the given project name was invalid, or the caller of setproject() was not a valid user of the given project. The process was launched with project "default" instead of the specified project.

Solution:

Use the projects(1) command to check if the project name is valid and the caller is a valid user of the given project.
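As a rough illustration of what this check involves, the following Python sketch looks up a user in text that follows the project(4) layout (projname:projid:comment:user-list:group-list:attributes). It is a simplified assumption-laden example and not a replacement for projects(1): it ignores the group list, negated entries, and name service lookups. The function name and the sample project names are hypothetical.

```python
# Sample entries in project(4) format:
#   projname:projid:comment:user-list:group-list:attributes
# The project and user names below are made up for illustration.
SAMPLE_PROJECT_DB = """\
system:0::::
user.root:1::::
oracle_proj:100:Oracle DB:oracle,dba1::
open_proj:101:Open to all users:*::
"""


def user_in_project(project_db: str, project: str, user: str) -> bool:
    """Return True if the project exists and lists the user (or '*')."""
    for line in project_db.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split into at most six fields so the attributes field stays whole.
        fields = line.split(":", 5)
        if len(fields) < 6 or fields[0] != project:
            continue
        users = [u.strip() for u in fields[3].split(",") if u.strip()]
        return user in users or "*" in users
    return False  # no entry with that project name


if __name__ == "__main__":
    print(user_in_project(SAMPLE_PROJECT_DB, "oracle_proj", "oracle"))
    print(user_in_project(SAMPLE_PROJECT_DB, "oracle_proj", "guest"))
```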


346036 libsecurity: unexpected getnetconfigent error

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


347091 resource type %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


347344 Did not find a valid port number to match field <%s> in configuration file <%s>: %s.

Description:

A failure occurred extracting a port number for the field within the configuration file. The field exists and the file exists and is accessible. The value for the field may not exist or may not be an integer greater than zero. An error in environment may have occurred, indicated by a non-zero errno value at the end of the message.

Solution:

Check to see if the value for the field in the configuration file exists and is an integer greater than zero. If there is an error in the field value, fix the value and retry the operation.
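The condition described above (the value exists and is an integer greater than zero) can be sketched as follows. This is an illustrative Python example only; the function name parse_port is hypothetical and the code does not reflect how the data service reads its configuration file.

```python
from typing import Optional


def parse_port(value: Optional[str]) -> Optional[int]:
    """Return the port as an int, or None if it is missing or not > 0."""
    if value is None:
        return None
    text = value.strip()
    # Accept plain ASCII digits only: no sign, decimal point, or text.
    if not (text.isascii() and text.isdigit()):
        return None
    port = int(text)
    return port if port > 0 else None


if __name__ == "__main__":
    for raw in ("8080", "", "0", "-1", "http"):
        print(repr(raw), parse_port(raw))
```

Returning None for every invalid case keeps the caller's check to a single comparison; a real implementation might instead report which condition failed.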


348240 clexecd: putmsg returned %d.

Description:

clexecd program has encountered a failed putmsg(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


349049 CCR reported invalid table %s; halting node

Description:

The CCR reported to the rgmd that the CCR table specified is invalid or corrupted. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


349741 Command %s is not a regular file.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, does not refer to a regular file. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
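A check of this kind can be sketched in Python as shown below. This is an illustration under the assumption that symbolic links should be followed; it is not the test that libdsdev performs, and the function name is hypothetical.

```python
import os
import stat


def is_regular_executable(path: str) -> bool:
    """Return True if path is a regular file that the caller may execute."""
    try:
        mode = os.stat(path).st_mode  # follows symbolic links
    except OSError:
        return False  # missing path, dangling link, or no access
    return stat.S_ISREG(mode) and os.access(path, os.X_OK)


if __name__ == "__main__":
    for candidate in ("/bin/sh", "/", "/no/such/file"):
        print(candidate, is_regular_executable(candidate))
```

A directory or device node fails the S_ISREG test even when its execute bits are set, which corresponds to the condition this message reports.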


351777 Resource is online again.


351887 Bulk registration failed

Description:

Solution:


353557 Filesystem (%s) is locked and cannot be frozen

Description:

The file system has been locked with the _FIOLFS ioctl. It is necessary to perform an unlock _FIOLFS ioctl. The growfs(1M) or lockfs(1M) command may be responsible for this lock.

Solution:

An _FIOLFS LOCKFS_ULOCK ioctl is required to unlock the file system.


353753 invalid mask in hosts list: %s

Description:

Solution:


354821 Attempting to start the fault monitor under process monitor facility.

Description:

The function is going to request the PMF to start the fault monitor. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.


355950 HA: unknown invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


356795 CMM: Reconfiguration step %d was forced to return.

Description:

One of the CMM reconfiguration step transitions failed, probably due to a problem on a remote node. A reconfiguration is forced assuming that the CMM will resolve the problem.

Solution:

This is an informational message, no user action is needed.


356930 Property %s is empty. This property must be specified for scalable resources.

Description:

The value of the specified property must not be empty for scalable resources.

Solution:

Use scrgadm(1M) to specify a non-empty value for the property.


357263 munmap: %s

Description:

The rpc.pmfd server was not able to delete shared memory for a semaphore, possibly due to low memory, and the system error is shown. This is part of the cleanup after a client call, so the operation might have gone through. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


357558 %s: Unable to start in multi-threaded mode.

Description:

RPC service is unable to start the daemon in multithreaded mode.


357767 %d entries found in property %s. For a nonsecure %s instance %s should have exactly one entry.

Description:

Since a nonsecure Server instance only listens on a single port, the specified property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.


357915 Error: Unable to stat directory <%s> for scha_control timestamp file

Description:

The rgmd failed in a call to stat(2) on the local node. This may prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure of the stat call might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


358129 Either extension property <failover_enabled> is not defined, or an error occured while retrieving this property; using the default value of TRUE.

Description:

Property failover_enabled may not be defined in RTR file. Continue the process with the default value of TRUE.

Solution:

This is an informational message, no user action is needed.


358211 monitor_check: the failover requested by scha_control for resource <%s>, resource group <%s> was not completed because of error: %s

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to the error listed.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the error can be identified and repaired.


358533 Invalid protocol is specified in %s property.

Description:

The specified system property does not have a valid protocol.

Solution:

Using scrgadm(1M), change the value of the property to use a valid protocol. For example: TCP, UDP.


359648 Service has failed.

Description:

The probe has detected a failure in the data service. The data service cannot be restarted on the same node, because failures are occurring frequently. The probe is setting the resource status to failed.

Solution:

Wait for the fault monitor to fail over the data service. Check the syslog messages and the configuration of the data service.


360600 Oracle UDLM package wrong instruction set architecture.

Description:

The Oracle UDLM package that is currently installed has the wrong instruction set architecture for the mode in which the node is currently booted (for example, Oracle UDLM is 64-bit (sparcv9) and the node is currently booted in 32-bit mode (sparc)).

Solution:

Obtain and install the proper Oracle UDLM package from Oracle for the instruction set architecture of the system, or boot the node in an instruction set architecture that is compatible with the current version of the Oracle UDLM.


361048 ERROR: rgm_run_state() returned non-zero while running boot methods

Description:

The rgmd state machine has encountered an error on this node.

Solution:

Look for preceding syslog messages on the same node, which may provide a description of the error.


361289 iPlanet service with config file <%s> does not configure %s.

Description:

The magnus.conf configuration file for the iPlanet Web Server instance does not contain the specified directive.

Solution:

Edit the configuration file and set the specified directive.


361489 in libsecurity __rpc_negotiate_uid failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


361831 Initialization failed. Invalid command line %s %s.

Description:

Unable to process the parameters passed to the callback method. The parameters are indicated in the message. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.


362463 clcomm: Endpoint %p: path_down not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. The path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


362519 dl_attach: DL_ERROR_ACK bad PPA

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


362657 Error when sending response from child process: %m

Description:

Error occurred when sending message from fault monitor child process. Child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


363357 Failed to unregister callback for IPMP group %s with tag %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


363505 check_for_ccrdata failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


363972 reservation message(%s) - Waiting for reservation lock

Description:

Locking is used by the device fencing program to ensure correct behavior when different nodes see different cluster memberships. This node is waiting for an instance of the device fencing program on another node to complete.

Solution:

The lock should eventually be granted. If node failures are involved, the lock will not be granted until node deaths are assured, which may take a few minutes. If the lock is eventually granted, no user action is required. If the lock is not granted, your authorized Sun service provider should be contacted to help diagnose the problem.


364188 Validation failed. Listener_name not set

Description:

The 'Listener_name' property of the resource is not set. HA-Oracle cannot manage the Oracle listener if the listener name is not set.

Solution:

Specify the correct 'Listener_name' when creating the resource. If the resource has already been created, update the resource property.


364510 The specified Oracle dba group id (%s) does not exist

Description:

Group id of oracle dba does not exist.

Solution:

Make sure /etc/nswitch.conf and /etc/group files are valid and have correct information to get the group id of dba.


366225 Listener %s stopped successfully

Description:

Informational message. HA-Oracle successfully stopped Oracle listener.

Solution:

None


366769 The Hosts in the startup order are not up. Waiting for them to start....

Description:

The probe has detected that the BV processes are not running but cannot take any action because the BV hosts in the startup order are not running.

Solution:

If the resource groups that contain the Backend resources are not online, bring them online. If they are online, the BV processes are probably still coming up and no action is needed.


367270 INTERNAL ERROR: Failed to create the path to the %s file.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


367617 reservation fatal error(%s) - Invalid file format '%s'

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
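The transition-specific guidance above can be summarized as a lookup table. The following Python sketch restates the text of this Solution in condensed form; the dictionary and function names are hypothetical, and the sketch does not run any cluster command.

```python
# Condensed restatement of the guidance above, keyed by the transition
# named in the message. The wording is a summary, not official text.
RECOVERY_HINTS = {
    "node_join": (
        "This node may be unable to access shared devices. Try "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes."
    ),
    "release_shared_scsi2": (
        "A joining node may be unable to access shared devices. Try "
        "'/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes."
    ),
    "make_primary": (
        "A device group failed to start on this node. Use the scswitch "
        "command to switch or restart the device group."
    ),
    "primary_to_secondary": (
        "Shutdown or switchover of a device group failed. "
        "Retry the desired action."
    ),
}

DEFAULT_HINT = "Contact your authorized Sun service provider."


def recovery_hint(transition: str) -> str:
    """Return the condensed hint for a transition, or the default advice."""
    return RECOVERY_HINTS.get(transition, DEFAULT_HINT)


if __name__ == "__main__":
    print(recovery_hint("make_primary"))
```

In every case, copies of /var/adm/messages from all nodes should still be collected for diagnosis.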


367864 svc_init failed.

Description:

The rpc.pmfd server was not able to initialize server operation. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


368363 Failed to retrieve the current primary node.

Description:

Cannot retrieve the current primary node for the given resource group.

Solution:

Check the syslog messages that occurred just before this message to see whether an internal error has occurred. If so, contact your authorized Sun service provider. Otherwise, check whether the resource group is in the STOP_FAILED state. If it is, clear the state and bring the resource group online.


368819 t_rcvudata in recv_request: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


369460 udlm_send_reply %s: udp is null!

Description:

Cannot communicate with udlmctl because the address to send to is null.

Solution:

None. udlm will handle this error.


369570 Multiple ports specified in property %s.

Description:

Solution:


370604 This resource depends on a HAStoragePlus resouce that is in a different Resource Group. This configuration is not supported.

Description:

The resource depends on a HAStoragePlus resource that is configured in a different resource group. This configuration is not supported.

Solution:

Please add this resource and the HAStoragePlus resource in the same resource group.


370949 created %d threads to launch resource callback methods; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.


371297 %s: Invalid command line option. Use -S for secure mode

Description:

rpc.sccheckd should always be invoked in secure mode. If this message appears, someone has modified configuration files that affect server startup.

Solution:

Reinstall cluster packages or contact your service provider.


371369 CCR: CCR data server on node %s unreachable while updating table %s.

Description:

While the TM was updating the indicated table in the cluster, the specified node went down and has become unreachable.

Solution:

The specified node needs to be rebooted.


372880 CMM: Quorum device %ld (gdevname %s) can not be acquired by the current cluster members. This quorum device is held by node%s %s.

Description:

This node does not have its reservation key on the specified quorum device, which has been reserved by the specified node or nodes that the local node cannot communicate with. This indicates that in the last incarnation of the cluster, the other nodes were members whereas the local node was not, indicating that the CCR on the local node may be out-of-date. In order to ensure that this node has the latest cluster configuration information, it must be able to communicate with at least one other node that was a member of the previous cluster incarnation. The nodes holding the specified quorum device may either be down, or they may be up but the interconnect between them and this node may be broken.

Solution:

If the nodes holding the specified quorum devices are up, then fix the interconnect between them and this node so that communication between them is restored. If the nodes are indeed down, boot one of them.


372887 HA: repl_mgr: exception occurred while invoking RMA

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


373148 The port portion of %s at position %d in property %s is not a valid port.

Description:

The property named does not have a legal value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Assign the property a legal value.
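To illustrate how the zero-based position in the message relates to the property list, the following Python sketch scans a list of entries and reports the position of the first one whose port portion is invalid. The "port/protocol" entry format (for example, 80/tcp), the 1 to 65535 range, and the function name are assumptions made for this example, not a statement of how the property is actually parsed.

```python
from typing import List, Optional


def first_invalid_port_position(entries: List[str]) -> Optional[int]:
    """Return the zero-based position of the first bad port, or None."""
    for position, entry in enumerate(entries):  # enumerate starts at 0
        # Assumed entry format: "<port>/<protocol>", for example "80/tcp".
        port_text = entry.split("/", 1)[0].strip()
        if not (port_text.isascii() and port_text.isdigit()):
            return position
        if not 1 <= int(port_text) <= 65535:
            return position
    return None


if __name__ == "__main__":
    print(first_invalid_port_position(["80/tcp", "http/tcp", "443/tcp"]))
```

With the sample list shown, the second element is the invalid one, so the reported position is 1 rather than 2.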


373816 clcomm: copyinstr: max string length %d too long

Description:

The system attempted to copy a string from user space to the kernel. The maximum string length exceeds length limit.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


374006 prog <%s> failed on step <%s> retcode <%d>

Description:

The specified program failed during the specified ucmmd step. The message contains the return code.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


374738 dl_bind: DL_BIND_ACK bad sap %u

Description:

SAP in acknowledgment to bind request is different from the SAP in the request. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


376111 Unable to compose %s path. Sending SIGKILL now.

Description:

The STOP method was not able to construct the application's stop command. The STOP method will send SIGKILL to stop the application.

Solution:

Other messages will indicate what the underlying problem was such as no memory or a bad configuration.


376905 Failed to retrieve WLS extension properties.Will shutdown the WLS using sigkill

Description:

Failed to retrieve the WLS extension properties that are needed for a smooth shutdown. The WLS stop method will nevertheless go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


376974 Error in initialization; exiting.

Description:

Solution:


377210 Failed to retrieve BV extension properties.

Description:

Failed to retrieve the extension properties set by the user, or failed to retrieve a valid host for BV processes.

Solution:

Look for other error messages generated while retrieving the extension properties to identify the exact error. Take the action that is appropriate for that error message.


377347 CMM: Node %s (nodeid = %ld) is up; new incarnation number = %ld.

Description:

The specified node has come up and joined the cluster. A node is assigned a unique incarnation number each time it boots up.

Solution:

This is an informational message, no user action is needed.


377531 Stop saposcol under PMF times out.

Description:

Stopping the SAP OS collector process under the control of Process Monitor facility times out. This might happen under heavy system load.

Solution:

Consider increasing the stop timeout value.


378427 prog <%s> step <%s> terminated due to receipt of signal

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


377897 Successfully started the service

Description:

Informational message. SAP started up successfully.

Solution:

No action needed.


378220 Siebel gateway already running.

Description:

Siebel gateway was not expected to be running. This may be due to the gateway having started outside Sun Cluster control.

Solution:

Please shutdown the gateway instance manually, and retry the previous operation.


378807 clexecd: %s: sigfillset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigfillset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


378872 %s operation failed: %s.

Description:

Specified system operation could not complete successfully.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.


379450 reservation fatal error(%s) - fenced_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


379574 Unable to lock %s: %s.


380064 switchback attempt failed on resource group <%s> with error <%s>

Description:

The rgmd was unable to failback the specified resource group to a more preferred node. The additional error information in the message explains why.

Solution:

Examine other syslog messages occurring around the same time on the same node. These messages may indicate further action.


380317 Failed to verify that all IPMP groups are in a stable state. Assuming this node cannot respond to client requests.

Description:

The state of the IPMP groups on the node could not be determined.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for message from the network monitoring daemon (pnmd).


380365 (%s) t_rcvudata, res %d, flag %d: tli error: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


380897 rebalance: WARNING: resource group <%s> is <%s> on node <%d>, resetting to OFFLINE.

Description:

The resource group has been found to be in the indicated state and is being reset to OFFLINE. This message is a warning only and should not adversely affect the operation of the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


381244 in libsecurity mkdir of %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to create a directory to contain "cache" files for rpcbind information. The affected component should still be able to function by directly calling rpcbind.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


381386 Prog <%s> step <%s>: unkillable.

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The UCMM reboots the node to prevent the stuck node from causing unavailability of the services provided by UCMM.

Solution:

No action is required. This is normal behavior of the UCMM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.


381765 sema_post: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


382169 Share path name %s not absolute.

Description:

A path specified in the dfstab file does not begin with "/"

Solution:

Only absolute path names can be shared with HA-NFS.
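One way to look for offending entries is sketched below in Python. The sketch assumes that each dfstab entry is a share command whose last argument is the path name, and that a quoted description may contain spaces; the function name and sample paths are hypothetical, and this is not the check that HA-NFS itself performs.

```python
import shlex
from typing import List

# Sample dfstab content. The paths are made up for illustration.
SAMPLE_DFSTAB = """\
# HA-NFS shares
share -F nfs -o rw -d "home dirs" /global/nfs/home
share -F nfs -o ro export/data
"""


def non_absolute_share_paths(dfstab_text: str) -> List[str]:
    """Return share paths that do not begin with '/'."""
    bad_paths = []
    for line in dfstab_text.splitlines():
        # shlex handles quoted descriptions and strips '#' comments.
        tokens = shlex.split(line, comments=True)
        if len(tokens) < 2 or tokens[0] != "share":
            continue
        path = tokens[-1]  # assumed: the path name is the last argument
        if not path.startswith("/"):
            bad_paths.append(path)
    return bad_paths


if __name__ == "__main__":
    print(non_absolute_share_paths(SAMPLE_DFSTAB))
```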


382252 Share path %s: file system %s is not mounted.

Description:

The specified file system, which contains the share path specified, is not currently mounted.

Solution:

Correct the situation with the file system so that it gets mounted.


382295 Unable to register.


382995 ioctl in negotiate_uid failed

Description:

Call to ioctl() failed. The "ioctl" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


383706 NULL value returned for the resource property %s.

Description:

NULL value was specified for the extension property of the resource.

Solution:

For the property name check the syslog message. Any of the following situations might have occurred. Different user action is needed for these different scenarios. 1) If a new resource is created or updated, check whether the value of the extension property is empty. If it is, provide valid value using scrgadm(1M). 2) For all other cases, treat it as an Internal error. Contact your authorized Sun service provider.


384549 CCR: Could not backup the CCR table %s errno = %d.

Description:

The indicated error occurred while backing up indicated CCR table on this node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28(ENOSPC) indicates that the root file system on the indicated node is full. Other values of errno can be returned when the root disk has failed(EIO) or some of the CCR tables have been deleted outside the control of the cluster software(ENOENT).

Solution:

There may be other related messages on this node, which may help diagnose the problem, for example: If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated CCR table was accidentally deleted, then boot this node in -x mode to restore the indicated CCR table from other nodes in the cluster or backup. The CCR tables are located at /etc/cluster/ccr/.
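The errno values named in this entry can be mapped to the corresponding advice as in the following Python sketch. It uses the standard errno module for the symbolic names; the dictionary, the function name, and the condensed wording of the advice are illustrative assumptions.

```python
import errno

# Condensed advice for the errno values named in this entry.
CCR_BACKUP_ADVICE = {
    errno.ENOSPC: "Root file system is full; remove unnecessary files.",
    errno.EIO: "Root disk may have failed; it may need to be replaced.",
    errno.ENOENT: "A CCR table is missing; restore it from another node "
                  "or from backup.",
}


def ccr_backup_advice(err: int) -> str:
    """Return a one-line hint for the errno reported in the message."""
    name = errno.errorcode.get(err, "unknown")
    advice = CCR_BACKUP_ADVICE.get(
        err, "Check other related messages on this node."
    )
    return f"errno {err} ({name}): {advice}"


if __name__ == "__main__":
    print(ccr_backup_advice(errno.ENOSPC))
    print(ccr_backup_advice(errno.EIO))
```

Any errno value that is not listed falls back to the general advice to check other related messages on the node.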


384621 RDBMS probe successful

Description:

This message indicates that the fault monitor has successfully probed the RDBMS server.

Solution:

No action required. This is an informational message.


384820 libsecurity: rpc_createerror: %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was not able to initiate an rpc connection. The error message generated with a call to clnt_spcreateerror(3NSL) is appended.

Solution:

Save the /var/adm/messages file. Check the messages file for earlier errors related to the rpc.pmfd, rpc.fed, or rgmd server.


385407 t_alloc (open_cmd_port) failed with errno%d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


385550 Can't setup binding entries from node %d for GIF node %d

Description:

Failed to maintain client affinity for some sticky services running on the named server node due to a problem on the named GIF node. Connections from existing clients for those services might go to a different server node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, for example for data integrity reasons, switch over all global interfaces (GIFs) from the named GIF node to some other node.


385902 pmf_search_children: Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent to the process as a result of some event external to rpc.pmfd. rpc.pmfd "intercepted" the signal, and is trying to pass the signal on to the monitored process.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386024 ERROR: rebalance: duplicate nodeid <%d> in Nodelist of resource group <%s>; continuing

Description:

The same nodename appears twice in the Nodelist of the given resource group. Although non-fatal, this should not occur and may indicate an internal logic error in the rgmd.

Solution:

Use scrgadm -pv to check the Nodelist of the affected resource group. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


386072 chdir: %s

Description:

The rpc.pmfd server was not able to change directory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386908 Resource is already stopped.

Description:

An attempt was made to stop a resource that has already been stopped.

Solution:

Use the ps command to verify that all processes for the data service have been stopped. Check syslog for any errors that may have occurred just before this message. If everything appears to be correct, no action is required.


386995 Failed to parse xml: NULL attribute

Description:

Solution:


387003 CCR: CCR metadata not found.

Description:

The CCR is unable to locate its metadata.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


387150 scvxvmlg warning - found no matching volume for device node %s, removing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


387232 resource %s monitor enabled.

Description:

This is a notification from the rgmd that the operator has enabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


387288 clcomm: Path %s online

Description:

A communication link has been established with another node.

Solution:

No action required.


387572 File %s should be readable and writable by %s.

Description:

A program required the specified file to be readable and writable by the specified user.

Solution:

Set correct permissions for the specified file to allow the specified user to read it and write to it.
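
Before changing permissions, it can help to confirm what the mode bits currently allow. The following is a minimal sketch under stated assumptions: it looks only at the classic owner/group/other mode bits and ignores ACLs and supplementary groups. The function name can_read_write is hypothetical. The numeric uid and gid of the user named in the message can be looked up with pwd.getpwnam().

```python
import os
import stat


def can_read_write(path, uid, gid):
    """Approximate check using only owner/group/other mode bits.

    Assumption: ACLs and supplementary group membership are ignored,
    so this can differ from what the operating system actually permits.
    """
    if uid == 0:
        return True  # root is not restricted by mode bits for read/write
    st = os.stat(path)
    if st.st_uid == uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif st.st_gid == gid:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (st.st_mode & needed) == needed
```

If the check returns False, adjust the owner or mode of the file (for example with chown or chmod) so that the specified user has both read and write access.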


388330 Text server stopped.

Description:

The Text server has been stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


389221 could not open configuration file: %s

Description:

The specified configuration file could not be opened.

Solution:

Check if the configuration file exists and has correct permissions. If the problem persists, contact your Sun Service representative.


389231 clcomm: inbound_invo::cancel:_state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when a cancel message arrives during processing of the remote invocation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


389369 Validation failed. SYBASE ASE STOP_FILE %s is not executable.

Description:

The file specified in the STOP_FILE extension property is not an executable file.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


389516 NULL value returned for the extension property %s.

Description:

A NULL value was specified for the extension property of the resource.

Solution:

Check the syslog message for the property name. One of the following situations might have occurred, and each requires a different user action. 1) If a new resource is being created or updated, check whether the value of the extension property is empty. If it is, provide a valid value using scrgadm(1M). 2) For all other cases, treat it as an internal error. Contact your authorized Sun service provider.


389901 ext_props(): Out of memory

Description:

The system ran out of memory in the function ext_props().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


390130 Failed to allocate space for %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


390691 NFS daemon down

Description:

The HA-NFS fault monitor detected that an NFS daemon died and will automatically restart it later.

Solution:

No action required.


391177 Failed to open %s: %s.


391738 (%s) bad poll revent: %x (hex)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
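
When collecting the logs, it may be useful to note which poll(2) event flags the hexadecimal value in message 391738 contains. The sketch below decodes such a value with the constants from Python's select module. The function name decode_revents is hypothetical, and the constants reflect the platform where the script runs, which is assumed to match the node.

```python
import select

# Standard poll(2) event flags, checked in this fixed order.
_FLAGS = ["POLLIN", "POLLPRI", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]


def decode_revents(hex_text):
    """Return the names of the poll flags set in a hexadecimal revents value."""
    value = int(hex_text, 16)
    return [name for name in _FLAGS if value & getattr(select, name)]


if __name__ == "__main__":
    # Example input built from the constants rather than a real log line.
    print(decode_revents(format(select.POLLERR | select.POLLHUP, "x")))
```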


392782 Failed to retrieve the property %s for %s: %s.

Description:

An API operation failed while retrieving the cluster property.

Solution:

For property name, check the syslog message. For more details about API call failure, check the syslog messages from other components.


393385 Service daemon not running.

Description:

The process group has died and the data service's daemon is not running. The resource status is being updated.

Solution:

Wait for the fault monitor to restart or fail over the data service. Check the configuration of the data service.


393934 Stopping the adaptive server with wait option.

Description:

Sun Cluster HA for Sybase will attempt to shut down the Sybase adaptive server using the wait option.

Solution:

This is an informational message, no user action is needed.


393960 sigaction failed in set_signal_handler

Description:

The ucmmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


394325 Received notice that IPMP group %s has failed.

Description:

The status of the named IPMP group has become degraded. If the IPMP group stays in a degraded state, the scalable resources currently running on this node with monitoring enabled will, if possible, be relocated off this node.

Solution:

Check the status of the IPMP group on the node. Try to fix the adapters in the IPMP group.


395353 Failed to check whether the resource is a network address resource.

Description:

While retrieving the IP addresses from the network resources, the attempt to check whether the resource is a network resource failed.

Solution:

An internal error or an API call failure might be the cause. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components.


396134 Register callback with NAFO %s failed: Error %d.

Description:

The LogicalHostname resource was unable to register with IPMP for status updates.

Solution:

This is most likely the result of a lack of system resources. Check for memory availability on the node. Reboot the node if the problem persists.


396727 Attempting to check for existence of %s pid %d resulted in error: %s.

Description:

The HA-NFS fault monitor attempted to check the status of the specified process but failed. The specific cause of the error is logged.

Solution:

No action is required. The HA-NFS fault monitor ignores this error and attempts the operation again at a later time. If this error persists, check whether the system is lacking the required resources (memory and swap) and add or free resources if required. Reboot the node if the error persists.
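
To check by hand whether the process named in message 396727 still exists, one common POSIX technique is to send signal 0 with kill(2), which performs the existence and permission checks without delivering a signal. The sketch below assumes that technique. The source does not state how the HA-NFS fault monitor performs its own check, and the function name process_exists is hypothetical.

```python
import errno
import os


def process_exists(pid):
    """Return True if a process with this pid exists, False if it does not.

    Assumption: uses kill(pid, 0), a generic POSIX idiom, not necessarily
    the mechanism used by the fault monitor.
    """
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        if exc.errno == errno.EPERM:
            return True  # process exists but belongs to another user
        raise  # any other error is unexpected; let the caller see it
    return True
```

Any error other than ESRCH or EPERM is re-raised, which mirrors the situation the message describes: the check itself failed, so the state of the process is unknown.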


397020 unix DLM abort failed

Description:

Failed to abort unix dlm. This is an error that can be ignored.

Solution:

None.


397219 RGM: Could not allocate %d bytes; node is out of swap space; aborting node.

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

The problem is probably cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


397340 Monitor initialization error. Unable to open resource: %s Group: %s: error %d

Description:

An error occurred during monitor initialization. The monitor is unable to get resource information using API calls.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


397371 'dbmcli -d <LC_NAME> -n <logical host name> db_state' timed out.

Description:

The SAP utility listed timed out.

Solution:

Make sure the logical host name resource is online.


398345 Error %d setting policy %d %s

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. An internal error happened while trying to change the load balancing policy.

Solution:

This is an internal error that could happen if another RGM operation was happening at the same time. Try the operation again. If the error occurs when no other RGM update is in progress, contact your Sun service provider for help.


398878 reservation fatal error(%s) - dcs_get_service_parameters() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.
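
The per-transition guidance for message 398878 can be summarized as a lookup keyed on the transition name. The sketch below assumes that the text in parentheses after "reservation fatal error" is the transition name, which the message format suggests but does not state outright. The function name advice_for and the abbreviated advice strings are hypothetical paraphrases of the Solution text.

```python
import re

# Paraphrased from the Solution text for message 398878.
ADVICE = {
    "node_join": "this node may be unable to access shared devices; "
                 "consider running run_reserve -c node_join on all nodes",
    "release_shared_scsi2": "a joining node may be unable to access shared devices; "
                            "consider running run_reserve -c node_join on all nodes",
    "make_primary": "a device group failed to start on this node; "
                    "scswitch may be used to switch or retry it",
    "primary_to_secondary": "shutdown or switchover of a device group failed; "
                            "the desired action may be retried",
}

# Assumption: the parenthesized field holds the transition name.
_PATTERN = re.compile(r"reservation fatal error\(([^)]*)\)")


def advice_for(line):
    """Return advice for a log line, or None if the line is not this message."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return ADVICE.get(match.group(1), "transition not listed; contact your service provider")
```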


398973 The probe has requested an immediate failover. Attempting to failover this resource group subject to the setting of the Failover_enabled property.

Description:

An immediate failover will be performed.

Solution:

This is an informational message, no user action is needed.


399037 mc_closeconn failed to close connection

Description:

The system has run out of resources that are required to process connection terminations for a scalable service.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


399216 clexecd: Got an unexpected signal %d in process %s (pid=%d, ppid=%d)

Description:

The clexecd program received the signal indicated in the error message.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


399266 Cluster goes into pingpong booting because of failure of method <%s> on resource <%s>. RGM is not aborting this node.

Description:

A stop method has failed and Failover_mode is set to HARD, but the RGM has detected that this resource group is falling into pingpong behavior and will not abort the node on which the resource's stop method failed. This is most likely due to the failure of both the start and stop methods of the resource.

Solution:

Save a copy of /var/adm/messages, check for both failed start and stop methods of the failing resource, and make sure the failure is corrected. Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide to restart the resource group.


399753 CCR: CCR data server failed to register with CCR transaction manager.

Description:

The CCR data server on this node failed to join the cluster, and can only serve read-only requests.

Solution:

There may be other related CCR messages on this and other nodes in the cluster, which may help diagnose the problem. It may be necessary to reboot this node or the entire cluster.