Sun Cluster 3.0 5/02 Error Messages Guide

Chapter 4 Message IDs 300000 - 399999

Error Message List

The following list is ordered by the message ID.


300397 :resource %s property changed.

Description:

This is a notification from the rgmd that a resource's property has been edited by the cluster administrator. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


300777 :reservation warning(%s) - Unable to open device %s, errno %d, will retry in %d seconds.

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


301092 :file specified in ENVIRONMENT_FILE parameter %s does not exist.

Description:

The Environment_File property was set when configuring the resource. The file specified by the Environment_File property might not exist. The file should be readable and specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating a resource.
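
The three conditions named above (the file exists, is readable, and is given as a fully qualified path) can be checked before the resource is created. The following is a minimal sketch only; the function name `check_environment_file` is hypothetical and is not part of Sun Cluster.

```python
import os


def check_environment_file(path):
    """Return a list of problems found with an Environment_File value.

    Hypothetical helper for illustration; an empty list means the path
    passed all three checks described in the text.
    """
    problems = []
    # A fully qualified path is an absolute path.
    if not os.path.isabs(path):
        problems.append("path is not fully qualified")
    # The file must exist as a regular file and be readable.
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable")
    return problems


if __name__ == "__main__":
    for problem in check_environment_file("/no/such/dir/env_file"):
        print(problem)
```

An empty result suggests that the value is suitable for the Environment_File property.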


301573 :clcomm: error in copying for cl_change_flow_settings

Description:

The system failed a copy operation supporting a flow control state change.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


301603 :fatal: cannot create any threads to handle switchback

Description:

The rgmd was unable to create a sufficient number of threads upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


301635 :clexecd: close returned %d. Exiting.

Description:

The clexecd program has encountered a failed close(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


302670 :udlm_setup_port: fcntl: %d

Description:

A server was not able to execute fcntl(). udlm exits and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


303231 :mount_client_impl::remove_client() failed attempted RM change_repl_prov_status() to remove client, spec %s, name %s

Description:

The system was unable to remove a PXFS replica on the node where this message was seen.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


303879 :INTERNAL ERROR: Unable to lock %s: %s.

Description:

An internal error occurred.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.


304365 :clcomm: Could not create any threads for pool %d

Description:

The system creates server threads to support requests from other nodes in the cluster. The system could not create any server threads during system startup. This is caused by a lack of memory.

Solution:

There are two solutions. Install more memory. Alternatively, take steps to reduce memory usage. Since the creation of server threads takes place during system startup, application memory usage is normally not a factor.


305298 :cm_callback_impl abort_trans: exiting

Description:

ucmm callback for abort transition failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


305689 :Failed to stop the adaptive server using %s.

Description:

Sun Cluster HA for Sybase failed to stop ASE using the file specified in the STOP_FILE property. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


306407 :Failed to stop Adaptive server.

Description:

Sun Cluster HA for Sybase failed to stop the Adaptive Server using the KILL signal.

Solution:

Examine whether any Sybase server processes are running on the server. Manually shut down the server.


307195 :clcomm: error in copying for cl_read_flow_settings

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


307531 :open: %s

Description:

While starting up, the rgmd daemon was not able to open /dev/console. The message contains the system error. This will prevent the rgmd from starting on this node.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


308800 :ERROR: rebalance: <%s> is pending_methods on node <%d>

Description:

An internal error has occurred in the locking logic of the rgmd. This error should not occur. It may prevent the rgmd from bringing the indicated resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


310953 :clnt_control of program %s failed %s.

Description:

HA-NFS fault monitor failed to reset the retry timeout for retransmitting the rpc request.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


311463 :Failover attempt failed.

Description:

The failover attempt of the resource was rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval property has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.


311808 :Can not open /etc/mnttab: %s

Description:

An error occurred while opening /etc/mnttab. The error message follows.

Solution:

Check with the system administrator and make sure that /etc/mnttab is properly defined.


312004 :Value of Start_timeout property may be small for %d max_offload_retry for %d resource groups to offload.

Description:

This is a warning message indicating that you may have set max_offload_retry to a high value, which may cause the start method of the RGOffload resource to time out before an attempt can be made to offload all specified resource groups.

Solution:

Please calculate the max_offload_retry so that the Start_timeout is not exceeded if every resource group that has to be offloaded requires maximum retries. There is a 10 second interval between successive retries.
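
The calculation above can be sketched as follows. This is a rough lower bound under stated assumptions: resource groups are offloaded one after another, every group needs the maximum number of retries, and only the 10-second wait between retries is counted (the time taken by each offload attempt is ignored). The function names are hypothetical.

```python
# Interval between successive retries, as stated in the text above.
RETRY_INTERVAL_SECONDS = 10


def worst_case_retry_wait(max_offload_retry, num_resource_groups):
    """Seconds spent waiting between retries if every group needs maximum retries.

    Assumes sequential offloading and ignores the duration of each attempt,
    so the real worst case can only be longer than this figure.
    """
    return max_offload_retry * num_resource_groups * RETRY_INTERVAL_SECONDS


def start_timeout_may_be_small(start_timeout, max_offload_retry, num_resource_groups):
    """Return True if the retry waits alone would exceed Start_timeout."""
    return worst_case_retry_wait(max_offload_retry, num_resource_groups) > start_timeout


if __name__ == "__main__":
    # Example: 3 retries for each of 4 resource groups.
    print(worst_case_retry_wait(3, 4))
    print(start_timeout_may_be_small(100, 3, 4))
```

Because the estimate ignores the time each attempt takes, leave additional margin when choosing Start_timeout.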


312053 :Cannot execute %s: %s.

Description:

Failure in executing the command.

Solution:

Check the syslog message for the command description. Check whether the system is low in memory or the process table is full and take appropriate action. Make sure that the executable exists.


313867 :Unknown step: %s

Description:

Request to run an unknown udlm step.

Solution:

None.


314314 :prog <%s> step <%s> terminated due to receipt of signal <%d>

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


314341 :Invalid probe values. Retry_interval (currently set to %d) must be greater than or equal to the product of Thorough_probe_interval (currently set to %d), and Retry_count (currently set to %d).

Description:

Validation of the probe related parameters failed because invalid values were specified.

Solution:

Retry_interval must be greater than or equal to the product of Thorough_probe_interval, and Retry_count. Use scrgadm(1M) to modify the values of these parameters so that they will hold the above relationship.
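
The relationship above can be checked before the values are changed with scrgadm(1M). The following is a minimal sketch; the function name is hypothetical and the check only restates the inequality given in the text.

```python
def probe_values_valid(retry_interval, thorough_probe_interval, retry_count):
    """Return True if Retry_interval >= Thorough_probe_interval * Retry_count."""
    return retry_interval >= thorough_probe_interval * retry_count


if __name__ == "__main__":
    # With a 60-second probe interval and 2 retries, Retry_interval
    # must be at least 120 seconds.
    print(probe_values_valid(300, 60, 2))
    print(probe_values_valid(100, 60, 2))
```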


314356 :resource %s enabled.

Description:

This is a notification from the rgmd that the operator has enabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


341719 :Restarting daemon %s.

Description:

Sun Cluster HA for NFS is restarting the specified daemon.

Solution:

No user action required.


315446 :node id <%d> is out of range

Description:

The low-level cluster machinery has encountered an error.

Solution:

Look for other syslog messages occurring just before or after this one on the same node; they may provide a description of the error.


316215 :Process sapsocol is already running outside of Sun Cluster. Will terminate it now, and restart it under Sun Cluster.

Description:

The OS collector process is running outside of the control of the Sun Cluster software. Sun Cluster HA for SAP should terminate it and restart it under the control of the Sun Cluster software.

Solution:

No user action required.


316344 :Adapter %s is not a valid NAFO name on this node.

Description:

Validation of the adapter information has failed. The specified NAFO group does not exist on this node.

Solution:

Create the appropriate NAFO group on this node, or recreate the logical host with the correct NAFO group.


318636 :No executable $BV1TO1/bin/xbvconf.

Description:

The specified executable was not found.

Solution:

Verify that Sun Cluster HA for BroadVision One-To-One Enterprise was properly installed. Ensure that the specified executable is in the correct location.


318767 :Device switchovers cannot be performed since AffinityOn value is FALSE.

Description:

A message to the effect that device switchovers are not performed because AffinityOn is set to FALSE.

Solution:

An informational message only. No action is needed.


319047 :CMM: Issuing a SCSI2 Tkown failed for quorum device %s with error %d.

Description:

This node encountered the specified error while issuing a SCSI2 Tkown operation on the indicated quorum device. This will cause the node to conclude that it has been unsuccessful in preempting keys from the quorum device, and therefore the partition to which it belongs has been preempted. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

If the error encountered is EACCES, then the SCSI2 command could have failed due to the presence of SCSI3 keys on the quorum device. Scrub the SCSI3 keys off of it, and reboot the preempted nodes.
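
The majority rule in the description above can be illustrated with a short sketch. It assumes that "majority quorum" means strictly more than half of the total configured votes; the function names and the vote counts in the example are hypothetical.

```python
def majority_votes(total_votes):
    """Smallest vote count that is strictly more than half of total_votes."""
    return total_votes // 2 + 1


def partition_survives(partition_votes, total_votes):
    """Return True if a subcluster holds enough votes for majority quorum."""
    return partition_votes >= majority_votes(total_votes)


if __name__ == "__main__":
    # Assumed example: 3 configured votes in total. A subcluster holding
    # 2 of them has a majority; a subcluster holding 1 does not.
    print(partition_survives(2, 3))
    print(partition_survives(1, 3))
```

At most one subcluster can reach this threshold, which is consistent with the statement that exactly one subcluster survives.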


319048 :CCR: Cluster has lost quorum while updating table %s, it is possibly in an inconsistent state - ABORTING.

Description:

The cluster lost quorum while the indicated table was being changed, leading to potential inconsistent copies on the nodes.

Solution:

Check whether the indicated table is consistent on all the nodes in the cluster. If it is not, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.


319375 :clexecd: wait_for_signals got NULL.

Description:

The clexecd program encountered an error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


320833 :INTERNAL ERROR usage:$0 BVUSER BV1TO1_VAR bv_local_host action IT_CONNECT_ATTEMPTS BV_ORB_CONNECT_TIMEOUT.

Description:

An internal error occurred.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.


321245 :resource <%s> is disabled but not offline

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has found the indicated resource to have its Onoff_switch property set to DISABLED, yet the resource is not offline. This suggests corruption of the RGM's internal data and will cause the enable or disable action to fail.

Solution:

This may indicate an internal error or bug in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


321667 :clcomm: cl_comm: not booted in cluster mode.

Description:

Attempted to load the cl_comm module when the node was not booted as part of a cluster.

Solution:

Users should not explicitly load this module.


321962 :Command %s failed to complete. HA-SAP will continue to start SAP.

Description:

The command to clean ipc failed to complete. This is an internal error.

Solution:

Save the /var/adm/messages files from all nodes. Contact your authorized Sun service provider.


322373 :Unable to unplumb %s%d, rc %d.

Description:

Topology Manager is done using the interface and failed to unplumb the adapter.

Solution:

Need an user action for this message.


322675 :Some NFS system daemons are not running.

Description:

HA-NFS fault monitor checks the health of statd, lockd, mountd and nfsd daemons on the node. It detected that one or more of these are not currently running.

Solution:

No action is needed. The monitor will restart these daemons. If it does not, reboot the node.


322797 :Error registering provider '%s' with the framework.

Description:

The device configuration system on this node has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322862 :clcomm: error in copying for cl_read_threads_min

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322879 :clcomm: Invalid copyargs: node %d pid %d

Description:

The system does not support copy operations between the kernel and a user process when the specified node is not the local node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


322908 :CMM: Failed to join the cluster: error = %d.

Description:

The local node was unsuccessful in joining the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this failure. Resolve the problem and reboot the node.


323498 :libsecurity: NULL RPC to program %ld failed will retry %s

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection, because it could not execute a test rpc call. The program will retry establishing the connection. The message shows the specific rpc error and the program number. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


323966 :NAFO group %s has status %s so no action will be taken.

Description:

The status of the NAFO group has become stable.

Solution:

This is an informational message, no user action is needed.


324478 :(%s): Error %d from read

Description:

An error was encountered in the clexecd program while reading the data from the worker process.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


325322 :clcomm: error in copying for state_resource_pool

Description:

The system failed a copy operation supporting statistics reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


326043 :reservation fatal error(%s) - release_resv_lock() returned exception

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


327057 :SharedAddress stopped.

Description:

The stop method is completed and the resource is stopped.

Solution:

This is an informational message. No user action is required.


327353 :Global device path %s is not recognized as a device group or a device special file.

Description:

One or more entries specified in the GlobalDevicePath extension property are invalid DCS global device paths.

Solution:

Misconfigured DCS device groups are a possible cause. scconf may be used to reconfigure the device groups. Inspect the syslog for other errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


329286 :(%s) instead of UDLM_ACK got a %d

Description:

Did not receive an acknowledgement from udlm as was expected.

Solution:

None.


329429 :reservation fatal error(%s) - host_name not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


329496 :unlatch_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


329778 :clconf: Data length is more than max supported length in clconf_ccr read

Description:

While reading configuration data through the CCR, the system found that the data length exceeds the maximum supported length.

Solution:

Check the CCR configuration information.


329847 :Warning: node %d has a weight of 0 assigned to it for property %s.

Description:

The named node has a weight of 0 assigned to it. A weight of 0 means that no new client connections will be distributed to that node.

Solution:

Consider assigning the named node a non-zero weight.


330063 :error in vop open %x

Description:

Opening a private interconnect interface failed.

Solution:

Reboot of the node might fix the problem.


330526 :CMM: Number of steps specified in registering callback = %d; should be <= %d.

Description:

The number of steps specified during registering a CMM callback exceeds the allowable maximum. This is an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


331221 :CMM: Max detection delay specified is %ld which is larger than the max allowed %ld.

Description:

The maximum of the node down detection delays is larger than the allowable maximum. The maximum allowed will be used as the actual maximum in this case.

Solution:

This is an informational message, no user action is needed.


331626 :An error occured while obtaining replica information for global service %s associated with path %s: %s.

Description:

An error occurred while invoking a DCS API call to retrieve replica information for a global service.

Solution:

Inspect the syslog for errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


333069 :Failed to retrieve nodeid for %s.

Description:

The nodeid for the given name could not be determined.

Solution:

Make sure that the name given is a valid node identifier or node name.


333924 :Failed to initialize DCS.

Description:

An error occurred when initializing the DCS library.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


334697 :Failed to retrieve the cluster property %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


334992 :clutil: Adding deferred task after threadpool shutdown id %s

Description:

During shutdown this operation is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


335129 :Stopping the adaptive server.

Description:

Sun Cluster HA for Sybase is shutting down the Sybase Adaptive Server.

Solution:

No user action required.


335206 :Failed to get host names from the resource.

Description:

Retrieving the IP addresses from the network resources in this resource group has failed.

Solution:

An internal error or an API call failure might be the cause. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components. For the resource name and resource group name, check the syslog tag.


335468 :Time allocated to stop development system is too small (less than 5 seconds).

Description:

Time allocated to stop the development system is too small.

Solution:

Increase the value for property Start_timeout or the value for the Dev_stop_pct property. The time for stopping the development system is a percentage of the total Start_timeout.
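
The relationship between the two properties can be sketched as follows. This assumes that the stop time is simply Start_timeout multiplied by Dev_stop_pct and divided by 100, with no rounding; the data service's exact arithmetic is not stated in the text, and the function names are hypothetical.

```python
# Minimum stop time implied by the message text ("less than 5 seconds").
MIN_STOP_SECONDS = 5


def dev_stop_time(start_timeout, dev_stop_pct):
    """Seconds allocated to stopping the development system (assumed formula)."""
    return start_timeout * dev_stop_pct / 100


def dev_stop_time_sufficient(start_timeout, dev_stop_pct):
    """Return True if the allocated stop time is at least MIN_STOP_SECONDS."""
    return dev_stop_time(start_timeout, dev_stop_pct) >= MIN_STOP_SECONDS


if __name__ == "__main__":
    # 10 percent of a 40-second Start_timeout is 4 seconds, which is too small.
    print(dev_stop_time(40, 10), dev_stop_time_sufficient(40, 10))
    # 10 percent of a 300-second Start_timeout is 30 seconds.
    print(dev_stop_time(300, 10), dev_stop_time_sufficient(300, 10))
```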


335591 :Failed to retrieve the resource group property %s: %s.

Description:

An API operation has failed while retrieving the resource group property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource group name and the property name, check the current syslog message.


336860 :read %d for %snum_ports

Description:

Could not get information about the number of ports udlm uses from config file udlm.conf.

Solution:

Check that the udlm.conf file exists and has an entry for udlm.num_ports. If everything looks normal and the problem persists, contact your Sun service representative.


337008 :rgm_comm_impl::_unreferenced() called unexpectedly

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


337073 :$BV1TO1_VAR is not a directory.

Description:

The specified environment variable does not point to the correct directory.

Solution:

Correctly set the specified environment variable.


337166 :Error setting environment variable %s.

Description:

An error occurred while setting the environment variable LD_LIBRARY_PATH. This is required by the fault monitor for the nsldap data service. The fault monitor appends the ldap server root path, including the lib directory, to the LD_LIBRARY_PATH environment variable.

Solution:

Check that there is a lib directory under the server root of the nsldap data service which pertains to this resource. If this directory has been removed, then it must be replaced by reinstalling Netscape Directory Server, or whatever other means are appropriate.


337212 :resource type %s removed.

Description:

This is a notification from the rgmd that a resource type has been deleted. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


338067 :This resource does not depend on any SUNW.HAStoragePlus resources. Proceeding with normal checks.

Description:

This is an informational message only.

Solution:

None.


338839 :clexecd: Could not create thread. Error: %d. Sleeping for %d seconds and retrying.

Description:

clexecd program has encountered a failed thr_create() system call. The error message indicates the error number for the failure. It will retry the system call after specified time.

Solution:

If the message is seen repeatedly, contact your authorized Sun service provider to determine whether a workaround or patch is available.


339424 :Could not host device service %s because this node is being removed from the list of eligible nodes for this service.

Description:

A switchover/failover was attempted to a node that was being removed from the list of nodes that could host this device service.

Solution:

This is an informational message, no user action is needed.


339521 :CCR: Lost quorum while starting to update table %s.

Description:

The cluster lost quorum when CCR started to update the indicated table.

Solution:

Reboot the cluster.


339590 :Error (%s) when reading property %s.

Description:

Unable to read a property value using the API. The property name is indicated in the message. Other syslog messages might give more information on errors in other modules.

Solution:

Check syslog messages. Report this problem to your authorized Sun service provider.


339657 :Issuing a restart request.

Description:

This is an informational message. The API function is about to be called to request a restart. In case of failure, follow the syslog messages after this message.

Solution:

No user action is needed.


339954 :fatal: cannot create any threads to launch callback methods

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


340287 :idl_set_timestamp(): IDL Exception

Description:

The rgmd has encountered an error that prevents the scha_control function from successfully setting a pingpong time stamp, presumably because a node died. This does not prevent the attempted failover from succeeding, but in the worst case, might prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes.

Solution:

Examine syslog output on the node that rebooted, to determine the cause of node death. The syslog output might indicate further remedial actions.


341502 :Unable to plumb even after unplumbing. rc = %d.

Description:

Topology Manager failed to plumb an adapter for the private network. A possible reason for the plumb to fail is that the adapter is already plumbed. Solaris clustering has successfully unplumbed the adapter but failed while trying to plumb it for private use.

Solution:

Need an user for this message.


341804 :Failed to retrieve information for user %s.

Description:

Failed to retrieve information for the specified Sun Cluster HA for BroadVision One-To-One Enterprise user.

Solution:

Ensure that the proper Sun Cluster HA for BroadVision One-To-One Enterprise Unix userid is set and that this user exists on all the nodes of the cluster.


342113 :Invalid script name %s. It cannot contain any '/'.

Description:

The script name should be just the script name, no path is needed.

Solution:

Specify just the script name without any path.


342336 :clcomm: Pathend %p: path_down not allowed in state %d

Description:

The system maintains state information about a path. A path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


342793 :Successfully started the service %s.

Description:

Specified data service successfully started.

Solution:

No user action required.


343307 :Could not open file %s: %s.

Description:

System has failed to open the specified file.

Solution:

Check whether the permissions are valid. This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


343650 :Both %s and %s are empty.

Description:

The GlobalDevicePaths and FilesystemMountPoints extension properties are both empty. A warning message is logged. The HAStoragePlus resource is a no-op i.e. it does not play a role in this case.

Solution:

A warning message only. No action is needed.


344470 :Unable to get status for NAFO group %s.

Description:

The specified NAFO group is not in a functional state. The logical host resource cannot be started without a functional NAFO group.

Solution:

The LogicalHostname resource will not be brought online on this node. Check the messages (pnmd errors) that were logged just before this message for any NAFO or adapter problem. Correct the problem and rerun scrgadm.


345342 :Failed to connect to %s port %d.

Description:

The data service fault monitor probe was trying to connect to the host and port specified and failed. There may be a prior message in syslog with further information.

Solution:

Make sure that the port configuration for the data service matches the port configuration for the underlying application.


346036 :libsecurity: unexpected getnetconfigent error

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


347091 :resource type %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


347344 :Did not find a valid port number to match field <%s> in configuration file <%s>: %s.

Description:

A failure occurred while extracting a port number for the field within the configuration file. The field exists, and the file exists and is accessible. The value for the field may not exist or may not be an integer greater than zero. An environment error may also have occurred, indicated by a non-zero errno value at the end of the message.

Solution:

Check to see if the value for the field in the configuration file exists and is an integer greater than zero. If there is an error in the field value, fix the value and retry the operation.
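
The check described above (the value exists and is an integer greater than zero) can be sketched as follows. This is a minimal sketch, not the data service's parser; the function name parse_port is hypothetical, and the sketch deliberately applies no upper bound because the message text states none.

```python
def parse_port(value):
    """Return the port as an int, or None if the value is unusable.

    Hypothetical helper. A value is accepted only if it exists and is
    a plain decimal integer greater than zero.
    """
    if value is None:
        return None
    text = value.strip()
    # Reject empty strings, signs, decimals, and non-ASCII digits.
    if not (text.isascii() and text.isdigit()):
        return None
    port = int(text)
    return port if port > 0 else None
```

A caller would treat a None result as the condition reported by this message and correct the field in the configuration file before retrying.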


348240 :clexecd: putmsg returned %d.

Description:

The clexecd program has encountered a failed putmsg(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


349741 :Command %s is not a regular file.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, does not refer to a regular file. This could be the result of 1) incorrectly configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
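
A quick way to confirm that a pathname refers to a regular, executable file is sketched below. This is an illustrative check only, not the libdsdev implementation; the function name check_command_path and its return strings are hypothetical.

```python
import os
import stat


def check_command_path(path):
    """Return None if path is a regular, executable file.

    Otherwise return a short description of the problem.
    Hypothetical helper for illustration.
    """
    try:
        mode = os.stat(path).st_mode  # follows symbolic links
    except OSError as err:
        return f"cannot stat: {err.strerror}"
    if not stat.S_ISREG(mode):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return None
```

Passing a directory, for instance, yields "not a regular file", which corresponds to the condition that this message reports.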


353557 :Filesystem (%s) is locked and cannot be frozen

Description:

The file system has been locked with the _FIOLFS ioctl. It is necessary to perform an unlock _FIOLFS ioctl. The growfs(1M) or lockfs(1M) command may be responsible for this lock.

Solution:

An _FIOLFS LOCKFS_ULOCK ioctl is required to unlock the file system.


354821 :Attempting to start the fault monitor under process monitor facility.

Description:

The function is going to request the PMF to start the fault monitor. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.


355950 :HA: unknown invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


356795 :CMM: Reconfiguration step %d was forced to return.

Description:

One of the CMM reconfiguration step transitions failed, probably due to a problem on a remote node. A reconfiguration is forced assuming that the CMM will resolve the problem.

Solution:

This is an informational message, no user action is needed.


356930:Property %s is empty. This property must be specified for scalable resources.

Description:

The value of the specified property must not be empty for scalable resources.

Solution:

Use scrgadm(1M) to specify a non-empty value for the property.


357263 :munmap: %s

Description:

The rpc.pmfd server was not able to delete shared memory for a semaphore, possibly due to low memory, and the system error is shown. This is part of the cleanup after a client call, so the operation might have gone through. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


357915 :Error: Unable to stat directory <%s> for scha_control timestamp file

Description:

The rgmd failed in a call to stat(2) on the local node. This may prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure of the stat call might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


358129 :Either extension property <failover_enabled> is not defined, or an error occured while retrieving this property; using the default value of TRUE.

Description:

The failover_enabled property may not be defined in the RTR file. Processing continues with the default value of TRUE.

Solution:

This is an informational message, no user action is needed.


358211 :monitor_check: the failover requested by scha_control for resource <%s>, resource group <%s> was not completed because of error: %s

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to the error listed.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the error can be identified and repaired.


358533 :Invalid protocol is specified in %s property.

Description:

The specified system property does not have a valid protocol.

Solution:

Using scrgadm(1M), change the value of the property to use a valid protocol. For example: TCP or UDP.


359648 :Service has failed.

Description:

The probe has detected a failure in the data service. The data service cannot be restarted on the same node because failures have been too frequent. The probe is setting the resource status to failed.

Solution:

Wait for the fault monitor to fail over the data service. Check the syslog messages and the configuration of the data service.


361048 :ERROR: rgm_run_state() returned non-zero while running boot methods

Description:

The rgmd state machine has encountered an error on this node.

Solution:

Look for preceding syslog messages on the same node, which may provide a description of the error.


361489 :in libsecurity __rpc_negotiate_uid failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish an RPC connection for the network. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


361678:Successfully stopped HA-NetBackup master service.

Description:

Successfully stopped Sun Cluster HA for NetBackup processes.

Solution:

No user action required.


361831:Initialization failed. Invalid command line %s %s.

Description:

Unable to process parameters passed to the callback method. The parameters are indicated in the message. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.


362463 :clcomm: Endpoint %p: path_down not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. The path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


362519 :dl_attach: DL_ERROR_ACK bad PPA

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


362657 :Error when sending response from child process: %m

Description:

An error occurred when sending a message from the fault monitor child process. The child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


363505 :check_for_ccrdata failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


363972 :reservation message(%s) - Waiting for reservation lock

Description:

Locking is used by the device fencing program to ensure correct behavior when different nodes see different cluster memberships. This node is waiting for an instance of the device fencing program on another node to complete.

Solution:

The lock should eventually be granted. If node failures are involved, the lock will not be granted until node deaths are assured, which may take a few minutes. If the lock is eventually granted, no user action is required. If the lock is not granted, your authorized Sun service provider should be contacted to help diagnose the problem.


364188 :Validation failed. Listener_name not set

Description:

The 'Listener_name' property of the resource is not set. HA-Oracle cannot manage the Oracle listener if Listener_name is not set.

Solution:

Specify the correct 'Listener_name' when creating the resource. If the resource has already been created, update the resource property.


364510 :The specified Oracle dba group id (%s) does not exist

Description:

The group ID of the Oracle dba group does not exist.

Solution:

Make sure that the /etc/nsswitch.conf and /etc/group files are valid and contain the correct information to look up the group ID of dba.


366225 :Listener %s stopped successfully

Description:

Informational message. HA-Oracle successfully stopped Oracle listener.

Solution:

None


366769:The Hosts in the startup order are not up. Waiting for them to start.....

Description:

The Sun Cluster HA for BroadVision One-To-One Enterprise Probe detected that the Sun Cluster HA for BroadVision One-To-One Enterprise processes are not running, but it cannot take any action because the Sun Cluster HA for BroadVision One-To-One Enterprise hosts in the startup order are not running.

Solution:

If the resource groups that contain the backend resources are not online, bring them online. If the backend resources are online, no user action is required, because the Sun Cluster HA for BroadVision One-To-One Enterprise processes might be in the process of coming up.


367270:INTERNAL ERROR: Failed to create the path to the %s file.

Description:

An internal error occurred.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.


367617 :reservation fatal error(%s) - Invalid file format '%s'

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to acquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


367864 :svc_init failed.

Description:

The rpc.pmfd server was not able to initialize server operation. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


368363 :Failed to retrieve the current primary node.

Description:

Cannot retrieve the current primary node for the given resource group.

Solution:

Check the syslog messages that occurred just before this message to see whether an internal error has occurred. If so, contact your authorized Sun service provider. Otherwise, check whether the resource group is in the STOP_FAILED state. If it is, clear the state and bring the resource group online.


368819 :t_rcvudata in recv_request: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


369460 :udlm_send_reply %s: udp is null!

Description:

Cannot communicate with udlmctl because the address to send to is null.

Solution:

None. udlm will handle this error.


370604 :This resource depends on a HAStoragePlus resource that is in a different Resource Group. This configuration is not supported.

Description:

The resource depends on a HAStoragePlus resource that is configured in a different resource group. This configuration is not supported.

Solution:

Add this resource and the HAStoragePlus resource to the same resource group.


370694 :Error while retrieving resource group mode.

Description:

An error occurred during the invocation of a DSDL API to obtain the mode of the resource group.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


370949 :created %d threads to launch resource callback methods; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.


371297 :%s: Invalid command line option. Use -S for secure mode

Description:

rpc.sccheckd should always be invoked in secure mode. If this message appears, someone has modified configuration files that affect server startup.

Solution:

Reinstall cluster packages or contact your service provider.


371369 :CCR: CCR data server on node %s unreachable while updating table %s.

Description:

While the TM was updating the indicated table in the cluster, the specified node went down and has become unreachable.

Solution:

The specified node needs to be rebooted.


372880 :CMM: Quorum device %ld (gdevname %s) can not be acquired by the current cluster members. This quorum device is held by node%s %s.

Description:

This node does not have its reservation key on the specified quorum device, which has been reserved by the specified node or nodes that the local node cannot communicate with. This indicates that in the last incarnation of the cluster, the other nodes were members whereas the local node was not, so the CCR on the local node may be out of date. To ensure that this node has the latest cluster configuration information, it must be able to communicate with at least one other node that was a member of the previous cluster incarnation. The nodes holding the specified quorum device may be down, or they may be up but the interconnect between them and this node may be broken.

Solution:

If the nodes holding the specified quorum device are up, fix the interconnect between them and this node so that communication is restored. If the nodes are down, boot one of them.


372887 :HA: repl_mgr: exception occurred while invoking RMA

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


373148 :The port portion of %s at position %d in property %s is not a valid port.

Description:

The property named does not have a legal value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Assign the property a legal value.
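
The zero-based position reported in the message can be reproduced with a short sketch like the one below. It rests on assumptions that the message text does not confirm: each list element is assumed to have the shape "port/protocol", and a legal port is assumed to be an integer from 1 through 65535. The function name invalid_port_positions is hypothetical.

```python
def invalid_port_positions(entries):
    """Return (position, entry) pairs whose port portion is not valid.

    Hypothetical helper. Assumes entries shaped like "80/tcp" and a
    legal port range of 1..65535; neither comes from the message text.
    Positions start at 0 for the first element, as in the message.
    """
    bad = []
    for position, entry in enumerate(entries):  # enumerate starts at 0
        port_text = entry.split("/", 1)[0]
        is_number = port_text.isascii() and port_text.isdigit()
        if not (is_number and 1 <= int(port_text) <= 65535):
            bad.append((position, entry))
    return bad
```

With the list "80/tcp", "http/tcp", "0/udp", the sketch flags positions 1 and 2.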


373816 :clcomm: copyinstr: max string length %d too long

Description:

The system attempted to copy a string from user space to the kernel. The maximum string length exceeds the length limit.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


374006 :prog <%s> failed on step <%s> retcode <%d>

Description:

A ucmmd step failed.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


374377 :NAFO Failure.

Description:

The NAFO group hosting the LogicalHostname has failed.

Solution:

The LogicalHostname resource will be failed over to a different node. If that fails, check the system logs for other messages. Also, correct the networking problem on the node so that the NAFO group in question is healthy again.


374738 :dl_bind: DL_BIND_ACK bad sap %u

Description:

The SAP in the acknowledgment to the bind request is different from the SAP in the request. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


377210:Failed to retrieve BV extension properties.

Description:

Failed to retrieve the Extension properties set by the user or failed to retrieve a valid host for Sun Cluster HA for BroadVision One-To-One Enterprise processes.

Solution:

Look for other error messages generated while retrieving the extension properties to identify the exact error. Look for appropriate action for that error message.


377245 :request addr > max "%s & %s"

Description:

Error from udlm on an address request. udlm exits, and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


377347 :CMM: Node %s (nodeid = %ld) is up; new incarnation number = %ld.

Description:

The specified node has come up and joined the cluster. A node is assigned a unique incarnation number each time it boots up.

Solution:

This is an informational message, no user action is needed.


377531:Stop sapsocol under PMF times out.

Description:

The Sun Cluster HA for SAP stop method timed out while stopping the OS collector process under the control of the Process Monitor Facility (PMF). This might happen under heavy system load.

Solution:

Increase the stop timeout value.


377897:Successfully started the service.

Description:

Sun Cluster HA for SAP successfully started.

Solution:

No user action required.


378427 :prog <%s> step <%s> terminated due to receipt of signal

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


378807 :clexecd: %s: sigfillset returned %d. Exiting.

Description:

The clexecd program has encountered a failed sigfillset(3C) call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


378872 :%s operation failed: %s.

Description:

The specified system operation did not complete successfully.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information: 1) A saved copy of the /var/adm/messages file. 2) The output of the "ifconfig -a" command.


379450 :reservation fatal error(%s) - fenced_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


380064 :switchback attempt failed on resource group <%s> with error <%s>

Description:

The rgmd was unable to fail back the specified resource group to a more preferred node. The additional error information in the message explains why.

Solution:

Examine other syslog messages occurring around the same time on the same node. These messages may indicate further action.


380365 :(%s) t_rcvudata, res %d, flag %d: tli error: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


380897 :rebalance: WARNING: resource group <%s> is <%s> on node <%d>, resetting to OFFLINE.

Description:

The resource group has been found to be in the indicated state and is being reset to OFFLINE. This message is a warning only and should not adversely affect the operation of the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


381244 :in libsecurity mkdir of %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was unable to create a directory to contain "cache" files for rpcbind information. The affected component should still be able to function by directly calling rpcbind.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


381765 :sema_post: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


382169 :Share path name %s not absolute.

Description:

A path specified in the dfstab file does not begin with "/".

Solution:

Only absolute path names can be shared with HA-NFS.
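
To locate the offending entry, a dfstab file can be scanned for share paths that do not begin with "/". The sketch below is illustrative only and is not the HA-NFS validation code. It assumes that each entry is a share command whose last argument is the pathname; the function name relative_share_paths is hypothetical.

```python
import shlex


def relative_share_paths(dfstab_text):
    """Return share paths in dfstab_text that do not begin with '/'.

    Hypothetical helper. Assumes each entry is a 'share' command line
    whose final argument is the pathname to share.
    """
    bad = []
    for line in dfstab_text.splitlines():
        # comments=True drops everything after a '#'.
        tokens = shlex.split(line, comments=True)
        if len(tokens) < 2 or tokens[0] != "share":
            continue
        path = tokens[-1]
        if not path.startswith("/"):
            bad.append(path)
    return bad
```

Any path that the sketch returns would need to be rewritten as an absolute path name in the dfstab file.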


382252 :Share path %s: file system %s is not mounted.

Description:

The specified file system, which contains the share path specified, is not currently mounted.

Solution:

Correct the problem with the file system so that it can be mounted.


382995 :ioctl in negotiate_uid failed

Description:

Call to ioctl() failed. The "ioctl" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


383706 :NULL value returned for the resource property %s.

Description:

A NULL value was specified for the extension property of the resource.

Solution:

Check the syslog message for the property name. The required user action depends on the situation: 1) If a resource was newly created or updated, check whether the value of the extension property is empty. If it is, provide a valid value by using scrgadm(1M). 2) In all other cases, treat this as an internal error and contact your authorized Sun service provider.


384549 :CCR: Could not backup the CCR table %s errno = %d.

Description:

The indicated error occurred while backing up the indicated CCR table on this node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the indicated node is full. Other errno values can be returned when the root disk has failed (EIO) or when some of the CCR tables have been deleted outside the control of the cluster software (ENOENT).

Solution:

There may be other related messages on this node that help diagnose the problem. If the root file system is full on the node, free up space by removing unnecessary files. If the root disk on the affected node has failed, replace it. If the indicated CCR table was accidentally deleted, boot this node in -x mode to restore the indicated CCR table from other nodes in the cluster or from backup. The CCR tables are located at /etc/cluster/ccr/.
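
The mapping from errno value to likely cause can be summarized in a small lookup, sketched below. The function name ccr_backup_hint and the hint wording are hypothetical; only the three errno cases named in the description are covered, and any other value falls back to a generic hint.

```python
import errno

# Likely causes taken from the description of this message.
HINTS = {
    errno.ENOSPC: "root file system is full; free up space",
    errno.EIO: "root disk may have failed; replace it",
    errno.ENOENT: "CCR table may have been deleted; restore it",
}


def ccr_backup_hint(err):
    """Return a one-line hint for the errno value in the message.

    Hypothetical helper for illustration.
    """
    name = errno.errorcode.get(err, "unknown")
    hint = HINTS.get(err, "see related syslog messages on this node")
    return f"errno {err} ({name}): {hint}"
```

For an errno value of 28, the sketch returns a line that begins with "errno 28 (ENOSPC)".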


384621 :RDBMS probe successful

Description:

This message indicates that the fault monitor has successfully probed the RDBMS server.

Solution:

No action required. This is an informational message.


385407 :t_alloc (open_cmd_port) failed with errno%d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


385902 :pmf_search_children: Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent to the process as a result of some event external to rpc.pmfd. rpc.pmfd "intercepted" the signal, and is trying to pass the signal on to the monitored process.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386024 :ERROR: rebalance: duplicate nodeid <%d> in Nodelist of resource group <%s>; continuing

Description:

The same nodename appears twice in the Nodelist of the given resource group. Although non-fatal, this should not occur and may indicate an internal logic error in the rgmd.

Solution:

Use scrgadm -pv to check the Nodelist of the affected resource group. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


386072 :chdir: %s

Description:

The rpc.pmfd server was not able to change directory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


386908 :Resource is already stopped.

Description:

An attempt was made to stop a resource that has already been stopped.

Solution:

Use the ps command to make sure that all processes for the data service have been stopped. Check syslog for any errors that may have occurred just before this message. If everything appears to be correct, no action is required.


387003 :CCR: CCR metadata not found.

Description:

The CCR is unable to locate its metadata.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


387150 :scvxvmlg warning - found no matching volume for device node %s, removing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.


387232 :resource %s monitor enabled.

Description:

This is a notification from the rgmd that the operator has enabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


387288 :clcomm: Path %s online

Description:

A communication link has been established with another node.

Solution:

No action required.


388330:Text server stopped.

Description:

Sun Cluster HA for Sybase stopped the Text Server.

Solution:

No user action required.


389221 :could not open configuration file: %s

Description:

The specified configuration file could not be opened.

Solution:

Check if the configuration file exists and has correct permissions. If the problem persists, contact your Sun Service representative.


389231 :clcomm: inbound_invo::cancel:_state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when a cancel message arrives during processing of the remote invocation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


389369:Validation failed. SYBASE ASE STOP_FILE %s is not executable.

Description:

The file specified in the STOP_FILE extension property is not an executable file.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


389516 :NULL value returned for the extension property %s.

Description:

A NULL value was specified for the extension property of the resource.

Solution:

Check the syslog message for the property name. One of the following situations might have occurred, and each calls for a different user action. 1) If a resource was being created or updated, check whether the value of the extension property is empty. If it is, provide a valid value using scrgadm(1M). 2) In all other cases, treat it as an internal error and contact your authorized Sun service provider.


389901 :ext_props(): Out of memory

Description:

The system ran out of memory in the function ext_props().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.


390130:Failed to allocate space for %s.

Description:

An internal error occurred.

Solution:

Save a copy of the /var/adm/messages files from all nodes. Contact your authorized Sun service provider for assistance.


390691 :NFS daemon down

Description:

The HA-NFS fault monitor detected that an NFS daemon died. The fault monitor will automatically restart the daemon later.

Solution:

No action required.


391738 :(%s) bad poll revent: %x (hex)

Description:

A call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
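
The revent value in this message is printed in hexadecimal. As an aid to reading it, the following minimal Python sketch maps such a value to poll(2) flag names. It is an illustration only and not part of Sun Cluster. The function name decode_revents is hypothetical, and the sketch assumes that the logged value is a standard poll(2) revents bit mask and that it is decoded on the same platform that logged it, because flag bit values are platform-specific.

```python
import select

# Flag names exposed by Python's select module on platforms that provide poll().
_POLL_FLAGS = ["POLLIN", "POLLPRI", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]


def decode_revents(revents):
    """Return the names of the poll flags set in `revents`.

    Bits that match none of the known flags are appended as one hex string.
    """
    names = []
    known = 0
    for name in _POLL_FLAGS:
        bit = getattr(select, name)
        known |= bit
        if revents & bit:
            names.append(name)
    leftover = revents & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # Hypothetical value; substitute the hex number from the logged message.
    print(decode_revents(int("18", 16)))
```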


392782 :Failed to retrieve the property %s for %s: %s.

Description:

An API operation failed while retrieving the cluster property.

Solution:

Check the syslog message for the property name. For more details about the API call failure, check the syslog messages from other components.


393385 :Service daemon not running.

Description:

The process group has died and the data service's daemon is not running. The resource status is being updated.

Solution:

Wait for the fault monitor to restart or fail over the data service. Check the configuration of the data service.


393960 :sigaction failed in set_signal_handler

Description:

The ucmmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


394584 :PNM: could not start due to configuration error in %s

Description:

An error was encountered while processing the named configuration file. As a result, the PNM daemon is not fully functional, and adapter monitoring and failover are not enabled.

Solution:

Use the pnmset(1M) command to reset the PNM configuration on the node. The pnmd daemon will be reinitialized automatically.


395353 :Failed to check whether the resource is a network address resource.

Description:

While retrieving the IP addresses from the network resources, the attempt to check whether the resource is a network resource or not has failed.

Solution:

An internal error or an API call failure might be the cause. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components.


396134 :Register callback with NAFO %s failed: Error %d.

Description:

The LogicalHostname resource was unable to register with NAFO for status updates.

Solution:

This is most likely the result of a lack of system resources. Check for memory availability on the node. Reboot the node if the problem persists.


396727:Attempting to check for existence of %s pid %d resulted in error: %s.

Description:

The Sun Cluster HA for NFS fault monitor attempted to check the status of the specified process but failed. The specific cause of the error is logged. This failure might occur if there is a lack of system resources.

Solution:

Attempt to free memory by terminating any programs that are using large amounts of memory and swap. If this error persists, reboot the node.
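
To confirm by hand whether the process named in the message still exists, a common technique is to send signal 0 with kill(2), which performs error checking without delivering a signal. The Python sketch below illustrates that technique. It is an assumption that this resembles the check the fault monitor performs, since this guide does not state the mechanism, and the function name check_pid is hypothetical.

```python
import errno
import os


def check_pid(pid):
    """Report whether a process with the given pid exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check only
    except OSError as err:
        if err.errno == errno.ESRCH:
            return "no such process"
        if err.errno == errno.EPERM:
            return "exists (no permission to signal)"
        return "error: " + os.strerror(err.errno)
    return "exists"


if __name__ == "__main__":
    # The current process is used here only so that the sketch runs as is;
    # substitute the pid from the logged message.
    print(check_pid(os.getpid()))
```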


397020 :unix DLM abort failed

Description:

Failed to abort unix dlm. This is an error that can be ignored.

Solution:

None.


397219 :RGM: Could not allocate %d bytes; node is out of swap space; aborting node

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. This failure causes the rgmd to produce a core file and to force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Reboot. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


397340 :Monitor initialization error. Unable to open resource: %s Group: %s: error %d

Description:

An error occurred during monitor initialization. The monitor is unable to get resource information using API calls.

Solution:

Check the syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


398345 :Error %d setting policy %d %s

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. An internal error happened while trying to change the load balancing policy.

Solution:

This is an internal error that could happen if another RGM operation was in progress at the same time. Try the operation again. If the error occurs when no other RGM update is in progress, contact your Sun service provider for help.


398878 :reservation fatal error(%s) - dcs_get_service_parameters() error, returned %d

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


398973 :The probe has requested an immediate failover. Attempting to failover this resource group subject to the setting of the Failover_enabled property.

Description:

An immediate failover will be performed.

Solution:

This is an informational message, no user action is needed.


399216 :clexecd: Got an unexpected signal %d in process %s (pid=%d, ppid=%d)

Description:

The clexecd program received the signal indicated in the error message.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


399266 :Cluster goes into pingpong booting because of failure of method <%s> on resource <%s>. RGM is not aborting this node.

Description:

A stop method has failed and Failover_mode is set to HARD, but the RGM has detected that this resource group is falling into pingpong behavior and will not abort the node on which the resource's stop method failed. This is most likely due to the failure of both the start and stop methods of the resource.

Solution:

Save a copy of /var/adm/messages, check for failed start and stop methods of the failing resource, and correct the cause of the failure. To restart the resource group, refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide.


399753 :CCR: CCR data server failed to register with CCR transaction manager.

Description:

The CCR data server on this node failed to join the cluster, and can only serve read-only requests.

Solution:

There may be other related CCR messages on this and other nodes in the cluster, which may help diagnose the problem. It may be necessary to reboot this node or the entire cluster.