Sun Cluster Error Messages Guide for Solaris OS

Message IDs 300000–399999

This section contains message IDs 300000–399999.

300278 Validate - CON_LIMIT=%s is incorrect, default CON_LIMIT=70 is being used

Description:

The value specified for CON_LIMIT is invalid.

Solution:

Either accept the default value of CON_LIMIT=70 or change the value of CON_LIMIT to be less than or equal to 100 when registering the resource.
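The fallback described above can be sketched as follows. This is only an illustration, not the agent's actual validation code: the function name is hypothetical, and the lower bound of 1 is an assumption, since the text only states the upper bound of 100 and the default of 70.

```python
DEFAULT_CON_LIMIT = 70   # default named in the message text
MAX_CON_LIMIT = 100      # upper bound named in the Solution


def effective_con_limit(raw):
    """Return the CON_LIMIT to use, falling back to the default when raw is invalid."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Non-numeric input is treated as invalid.
        return DEFAULT_CON_LIMIT
    # Assumption: values below 1 are also invalid (the guide does not say).
    if 1 <= value <= MAX_CON_LIMIT:
        return value
    return DEFAULT_CON_LIMIT
```

For example, effective_con_limit("85") keeps the supplied value, while effective_con_limit("150") falls back to 70.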

300397 resource %s property changed.

Description:

This is a notification from the rgmd that a resource's property has been edited by the cluster administrator. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

300598 Validate - Winbind configuration directory %s does not exist

Description:

The Winbind configuration directory does not exist.

Solution:

Check that the correct Winbind configuration directory was entered when registering the Winbind resource and that the directory exists.

300777 reservation warning(%s) - Unable to open device %s, errno %d, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.

300884 CL_EVENT Error: ${SERVER} is already running.

Description:

The cl_event init script found the cl_eventd already running. It will not start it again.

Solution:

No action required.

300956 stopped SAA rc<>

Description:

SAA stopped with the result code shown in the message.

Solution:

No user action is needed. This is a normal message when SAA gets stopped.

301092 file specified in ENVIRONMENT_FILE parameter %s does not exist.

Description:

The 'Environment_File' property was set when configuring the resource. The file specified by the 'Environment_File' property may not exist. The file should be readable and specified with a fully-qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating a resource.
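Before creating the resource, the three conditions named in the description (fully qualified path, existing file, readable file) could be checked with a small script. This is a minimal sketch; the function name and the wording of the returned problems are hypothetical and are not part of Sun Cluster.

```python
import os


def check_environment_file(path):
    """Return a list of problems found with the given Environment_File path."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not fully qualified")
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.R_OK):
        # Only meaningful when the file exists.
        problems.append("file is not readable")
    return problems
```

An empty list means the path passed all three checks for the user running the script, which may differ from the user under which the data service runs.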

301114 Command failed: /bin/rm -f %s/ppid/%s

Description:

An attempt to remove the directory and all the files in the directory failed. This command is being executed as the user that is specified by the extension property DB_User.

Solution:

No action is required.

301573 clcomm: error in copyin for cl_change_flow_settings

Description:

The system failed a copy operation supporting a flow control state change.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

301603 fatal: cannot create any threads to handle switchback

Description:

The rgmd was unable to create a sufficient number of threads upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

301635 clexecd: close returned %d. Exiting.

Description:

The clexecd program has encountered a failed close(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

302670 udlm_setup_port: fcntl: %d

Description:

A server was not able to execute fcntl(). udlm exits and the node aborts and panics.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

303009 validate: EnvScript $Filename does not exist but it is required

Description:

The environment script $Filename is set in the parameter file but does not exist.

Solution:

Set the variable EnvScript in the parameter file mentioned in option -N of the start, stop and probe commands to valid contents.

303231 mount_client_impl::remove_client() failed attempted RM change_repl_prov_status() to remove client, spec %s, name %s

Description:

The system was unable to remove a PXFS replica on the node where this message was seen.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

303805 Cannot change the IPMP group on the local host.

Description:

A different IPMP group for property NetIfList is specified in the scrgadm command. The IPMP group on the local node is set at resource creation time. Users may only update the value of property NetIfList to add an IPMP group on a new node.

Solution:

Rerun the scrgadm command with proper value of property NetIfList.

303879 INTERNAL ERROR: Unable to lock %s: %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

303941 Unsuccessful probe of %s port %d for non-secure resource %s. (%s)

Description:

An error occurred while the fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by performing a restart or failover. For a more detailed error description, look at the syslog messages.

304282 Failed to retrieve information for user %s: %s.

Description:

The attempt to retrieve information for the specified user failed. The reason is listed in the error message.

Solution:

Take corrective action according to the reason specified in the error message.

304365 clcomm: Could not create any threads for pool %d

Description:

The system creates server threads to support requests from other nodes in the cluster. The system could not create any server threads during system startup. This is caused by a lack of memory.

Solution:

There are two solutions. Install more memory. Alternatively, take steps to reduce memory usage. Since the creation of server threads takes place during system startup, application memory usage is normally not a factor.

305163 Start of HADB database completed successfully.

Description:

The resource was able to successfully start the HADB database.

Solution:

This is an informational message, no user action is needed.

305195 Validate - %s is non-existent or non-executable

Description:

The defined startsap or stopsap script does not exist or is not executable.

Solution:

Correct the defined SAP_START or SAP_STOP in /opt/SUNWscswa/util/ha_sap_j2ee_config and re-register the agent, or change the file modes to make the scripts executable.

305298 cm_callback_impl abort_trans: exiting

Description:

ucmm callback for abort transition failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

306407 Failed to stop Adaptive server.

Description:

Sun Cluster HA for Sybase failed to stop the Adaptive server using the KILL signal.

Solution:

Please examine whether any Sybase server processes are running on the server. Please manually shutdown the server.

306432 in libsecurity for program %s (%lu); NETPATH=%s

Description:

The specified server was not able to start because it could not establish an rpc connection for the network specified, because it could not find any transport. This happened because either there are no available transports at all, or there are transports but none is a loopback. The NETPATH environment variable is shown. This error message is informational, and appears together with other messages appropriate for this situation. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

306944 Validate - Only SUNWfiles or SUNWbinfiles are supported

Description:

The DHCP resource requires that the /etc/inet/dhcpsvc.conf file has RESOURCE=SUNWfiles or SUNWbinfiles.

Solution:

Ensure that /etc/inet/dhcpsvc.conf has RESOURCE=SUNWfiles or SUNWbinfiles by configuring DHCP appropriately, i.e. as defined within the Sun Cluster 3.0 Data Service for DHCP.

307195 clcomm: error in copyin for cl_read_flow_settings

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

307657 Oracle UDLM package is not installed. %s not found.

Description:

Oracle UNIX Distributed Lock Manager (UDLM) is not properly installed on this node. The UDLM reconfiguration program could not locate the lock manager binary at the location indicated in the message. Oracle OPS/RAC will not be able to function on this node.

Solution:

If you want to run OPS/RAC on this cluster node, verify installation of ORCLudlm package. Refer to Oracle's documentation for installation of Oracle UDLM.

309875 Error encountered enabling failfast

Description:

Error encountered when enabling failfast. Node will be rebooted.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log from all the nodes and contact your Sun service representative.

310640 runmqtrm - %s

Description:

The following output was generated from the runmqtrm command.

Solution:

No user action is required if the command was successful. Otherwise, examine the other syslog messages occurring at the same time on the same node to see if the cause of the problem can be identified.

310953 clnt_control of program %s failed %s.

Description:

HA-NFS fault monitor failed to reset the retry timeout for retransmitting the rpc request.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

311291 sema_post badchild: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

311463 Failover attempt failed.

Description:

The failover attempt of the resource is rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether the Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.

311639 check_dhcp - Active interface has changed from %s to %s

Description:

The DHCP resource's fault monitor has detected that the active interface has changed.

Solution:

No user action is needed. The fault monitor will restart the DHCP server.

311795 "start_sge_commd failed"

Description:

The process sge_commd failed to start for reasons other than it was already running.

Solution:

Check '/var/adm/messages' for any relevant cluster messages. Respond accordingly, then retry.

311808 Can not open /etc/mnttab: %s

Description:

An error occurred while opening /etc/mnttab. The error message follows.

Solution:

Check with system administrator and make sure /etc/mnttab is properly defined.

312004 Value of Start_timeout property may be small for %d max_offload_retry for %d resource groups to offload.

Description:

This is a warning message indicating that you may have set max_offload_retry to a high value, which may cause the start method of the RGOffload resource to time out before an attempt can be made to offload all specified resource groups.

Solution:

Please calculate the max_offload_retry so that the Start_timeout is not exceeded if every resource group that has to be offloaded requires maximum retries. There is a 10 second interval between successive retries.
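The calculation above can be sketched as follows. This is a rough estimate under stated assumptions, not the RGOffload implementation: it assumes resource groups are offloaded one after another and counts only the 10 second waits between retries, ignoring the time each offload attempt itself takes. The function names are hypothetical.

```python
RETRY_INTERVAL_SECONDS = 10  # interval between successive retries, per the text above


def worst_case_retry_wait(resource_groups, max_offload_retry):
    """Seconds spent waiting between retries if every group needs every retry."""
    return resource_groups * max_offload_retry * RETRY_INTERVAL_SECONDS


def largest_safe_max_offload_retry(start_timeout, resource_groups):
    """Largest max_offload_retry whose worst-case wait fits within start_timeout."""
    if resource_groups <= 0:
        raise ValueError("resource_groups must be positive")
    return start_timeout // (resource_groups * RETRY_INTERVAL_SECONDS)
```

With a Start_timeout of 300 seconds and 3 resource groups to offload, this estimate allows a max_offload_retry of at most 10. In practice a lower value leaves headroom for the offload attempts themselves.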

312053 Cannot execute %s: %s.

Description:

Failure in executing the command.

Solution:

Check the syslog message for the command description. Check whether the system is low in memory or the process table is full and take appropriate action. Make sure that the executable exists.

312124 Failed to establish socket's default destination: %s

Description:

Default destination for a raw icmp socket could not be established.

Solution:

If the error message string that is logged in this message does not offer any hint as to what the problem could be, contact your authorized Sun service provider.

313510 SAP xserver is not available.

Description:

SAP xserver is not running currently.

Solution:

Informative message, no action is required.

314314 prog <%s> step <%s> terminated due to receipt of signal <%d>

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

314341 Invalid probe values. Retry_interval (currently set to %d) must be greater than or equal to the product of Thorough_probe_interval (currently set to %d), and Retry_count (currently set to %d).

Description:

Validation of the probe related parameters failed because invalid values were specified.

Solution:

Retry_interval must be greater than or equal to the product of Thorough_probe_interval, and Retry_count. Use scrgadm(1M) to modify the values of these parameters so that they will hold the above relationship.
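The relationship above can be checked before running scrgadm(1M). The following is a minimal sketch of the stated rule only; the function names are hypothetical and do not correspond to any Sun Cluster API.

```python
def minimum_retry_interval(thorough_probe_interval, retry_count):
    """Smallest Retry_interval that satisfies the rule stated above."""
    return thorough_probe_interval * retry_count


def probe_values_valid(retry_interval, thorough_probe_interval, retry_count):
    """True when Retry_interval >= Thorough_probe_interval * Retry_count."""
    return retry_interval >= minimum_retry_interval(thorough_probe_interval, retry_count)
```

For example, with Thorough_probe_interval set to 60 and Retry_count set to 5, any Retry_interval below 300 fails the check.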

314345 Too few arguments specified on command line.

Description:

Either the resource name, resource group or resource type argument is missing on the command line. HA Storage Plus detected that it has been launched by someone other than the RGM.

Solution:

Only the RGM can execute HA Storage Plus binaries. However, if this is the result of an execution by the RGM, please contact your authorized Sun service provider.

314356 resource %s enabled.

Description:

This is a notification from the rgmd that the operator has enabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

314358 Command %s failed to complete. Return code is %d.

Description:

The listed command failed to complete with the listed return code. The return code is from the script db_clear.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

314907 reservation fatal error() - malloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist.

315446 node id <%d> is out of range

Description:

The low-level cluster machinery has encountered an error.

Solution:

Look for other syslog messages occurring just before or after this one on the same node; they may provide a description of the error.

315729 WARNING: Unable to get privatelink IP address for nodeid %d

Description:

Call to get privatelink IP address for node failed.

Solution:

Make sure entries in /etc/nsswitch.conf are correct to get information about this node.

316019 WebSphere MQ Channel Initiator %s started

Description:

The WebSphere MQ Channel Initiator has been started.

Solution:

No user action is needed.

316215 Process sapsocol is already running outside of Sun Cluster. Will terminate it now, and restart it under Sun Cluster.

Description:

The SAP OS collector process is running outside of the control of Sun Cluster. HA-SAP will terminate it and restart it under the control of Sun Cluster.

Solution:

Informational message. No user action needed.

317213 in libsecurity for program %s (%lu); setnetconfig failed: %s

Description:

The specified server was not able to initiate an rpc connection, because it could not get the network database handle. The server does not start. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

317928 dbmcli option (%s) is not 'db_warm' nor 'db_online'.

Description:

The extension property in the message is set to an invalid value.

Solution:

Change the value of the extension property to a valid value.

318011 in libsecurity for program %s (%lu); file registration failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to create the "cache" file to duplicate the rpcbind information. The server should still be able to receive requests from clients (using rpcbind). An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

318767 Device switchovers cannot be performed since AffinityOn value is FALSE.

Description:

Self explanatory.

Solution:

This is an informational message, no user action is needed.

319047 CMM: Issuing a SCSI2 Tkown failed for quorum device %s.

Description:

This node encountered an error while issuing a SCSI2 Tkown operation on the indicated quorum device. This will cause the node to conclude that it has been unsuccessful in preempting keys from the quorum device, and therefore the partition to which it belongs has been preempted. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

If the error encountered is EACCES, then the SCSI2 command could have failed due to the presence of SCSI3 keys on the quorum device. Scrub the SCSI3 keys off of it, and reboot the preempted nodes.
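The majority rule mentioned in the description can be illustrated with a short sketch. It assumes that "majority quorum" means holding strictly more than half of all configured votes; the function name and the vote counts in the example are hypothetical.

```python
def has_majority_quorum(votes_held, total_votes):
    """True when a partition holds strictly more than half of all votes."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return 2 * votes_held > total_votes
```

In a hypothetical two-node cluster with one quorum device (three votes in total), the partition that acquires the quorum device holds two votes and survives, while the other partition holds one vote and is preempted. An even split, such as two of four votes, is not a majority.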

319048 CCR: Cluster has lost quorum while updating table %s, it is possibly in an inconsistent state - ABORTING.

Description:

The cluster lost quorum while the indicated table was being changed, leading to potentially inconsistent copies on the nodes.

Solution:

Check whether the indicated table is consistent on all the nodes in the cluster. If not, boot the cluster in -x mode to restore the indicated table from backup. The CCR tables are located at /etc/cluster/ccr/.

319261 liveCache %s failed to start.

Description:

liveCache started up with error.

Solution:

Sun Cluster will fail over the liveCache resource to another available node. No user action is needed.

319375 clexecd: wait_for_signals got NULL.

Description:

The clexecd program encountered an error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

319413 Siebel server can be started only after Siebel database and Siebel gateway are running, and the scsblconfig file is correctly configured.

Description:

This is a warning message indicating a problem in determining the status of Siebel database and/or the Siebel gateway.

Solution:

Please verify that the scsblconfig file is correctly configured, and that the Siebel database and Siebel gateway are up before attempting to start the Siebel server.

319602 daemon started.

Description:

The scdpmd is started.

Solution:

No action required.

319873 ERROR: stop_mysql Option -R not set

Description:

The -R option is missing for stop_mysql command.

Solution:

Add the -R option for stop_mysql command.

320378 INTERNAL ERROR: usage: $0 <server_root> <siebel_enterprise>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

320970 Failed to retrieve the node id from the node name for node %s: %s.

Description:

Self explanatory.

Solution:

Check the cluster configuration. If the problem persists, contact your authorized Sun service provider.

321245 resource <%s> is disabled but not offline

Description:

While attempting to execute an operator-requested enable or disable of a resource, the rgmd has found the indicated resource to have its Onoff_switch property set to DISABLED, yet the resource is not offline. This suggests corruption of the RGM's internal data and will cause the enable or disable action to fail.

Solution:

This may indicate an internal error or bug in the rgmd. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

321667 clcomm: cl_comm: not booted in cluster mode.

Description:

Attempted to load the cl_comm module when the node was not booted as part of a cluster.

Solution:

Users should not explicitly load this module.

321962 Command %s failed to complete. HA-SAP will continue to start SAP.

Description:

The command to cleanipc failed to complete.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.

322252 %s failed to stop the application and returned with %d

Description:

A command was run to stop the application but it failed with an error. Both the command and the error code are specified in the message.

Solution:

Save the syslog and contact your authorized Sun service provider.

322323 INTERNAL ERROR in J2EE probe calling scds_fm_tcp_disconnect(): %s"

Description:

The data service detected an internal error from scds.

Solution:

Informational message. No user action is needed.

322642 Error binding to %s port %d for non-secure resource %s: %s (%s)

Description:

An error occurred while the fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by performing a restart or failover. For a more detailed error description, look at the syslog messages.

322675 Some NFS system daemons are not running.

Description:

HA-NFS fault monitor checks the health of statd, lockd, mountd and nfsd daemons on the node. It detected that one or more of these are not currently running.

Solution:

No action. The monitor would restart these. If it doesn't, reboot the node.

322797 Error registering provider '%s' with the framework.

Description:

The device configuration system on this node has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

322862 clcomm: error in copyin for cl_read_threads_min

Description:

The system failed a copy operation supporting flow control state reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

322879 clcomm: Invalid copyargs: node %d pid %d

Description:

The system does not support copy operations between the kernel and a user process when the specified node is not the local node.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

322908 CMM: Failed to join the cluster: error = %d.

Description:

The local node was unsuccessful in joining the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this failure. Resolve the problem and reboot the node.

324115 ERROR: probe_sap_j2ee Option -S not set

Description:

The -S option is missing for the probe_command.

Solution:

Add -S option to the probe-command.

324478 (%s): Error %d from read

Description:

An error was encountered in the clexecd program while reading the data from the worker process.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

324924 in libsecurity contacting program %s (%lu): could not copy host name

Description:

A client was not able to make an rpc connection to the specified server because the host name could not be saved, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

325322 clcomm: error in copyin for state_resource_pool

Description:

The system failed a copy operation supporting statistics reporting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

325482 File %s is not owned by group (GID) %d

Description:

The file is not owned by the gid which is listed in the message.

Solution:

Change the group ownership of the file so that it is owned by the gid which is listed in the message.
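Group ownership can be inspected and corrected from a script as well as with chgrp(1). The sketch below uses the standard os.stat and os.chown calls; the function names are hypothetical, and changing ownership normally requires appropriate privileges.

```python
import os


def owned_by_group(path, expected_gid):
    """True when the file's group ID matches expected_gid."""
    return os.stat(path).st_gid == expected_gid


def set_group(path, gid):
    """Change only the group of the file; -1 leaves the owning user unchanged."""
    os.chown(path, -1, gid)
```

Calling owned_by_group with the file and GID from the message shows whether the condition that triggered the message still holds.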

325750 Failed to analyze the device special file associated with file system mount point %s: %s.

Description:

HA Storage Plus was not able to open the underlying device of the specified mount point.

Solution:

Check that this device is valid.

326023 reservation warning(%s) - MHIOCGRP_PREEMPTANDABORT error in MHIOCGRP_INKEYS, error %d

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.

326528 Nonexistent Broker_name (%s).

Description:

The broker name provided in the extension property Broker_Name does not exist.

Solution:

Check that a broker instance exists for the supplied broker name. It should match the broker name portion of the path in Confdir_list.

327057 SharedAddress stopped.

Description:

The stop method is completed and the resource is stopped.

Solution:

This is an informational message. No user action is required.

327339 unable to create file %s errno %d

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

327353 Global device path %s is not recognized as a device group or a device special file.

Description:

Self explanatory.

Solution:

Check that the specified global device path is valid.

327437 runmqchi - %s

Description:

The following output was generated from the runmqchi command.

Solution:

No user action is required if the command was successful. Otherwise, examine the other syslog messages occurring at the same time on the same node to see if the cause of the problem can be identified.

327549 Stop of HADB node %d did not complete.

Description:

The resource was unable to successfully run the hadbm stopnode command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.

328779 Probe for J2EE engine timed out in scds_fm_tcp_connect().

Description:

The data service timed out while connecting to the J2EE engine port.

Solution:

Informational message. No user action is needed.

329429 reservation fatal error(%s) - host_name not specified

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

329496 unlatch_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.

329616 svc_probe used entire timeout of %d seconds during connect operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe timed out connecting to the application.

Solution:

If the problem persists investigate why the application is responding slowly or if the Probe_timeout property needs to be increased.

329778 clconf: Data length is more than max supported length in clconf_ccr read

Description:

While reading configuration data through the CCR, the data length was found to be more than the maximum supported length.

Solution:

Check the CCR configuration information.

329847 Warning: node %d has a weight of 0 assigned to it for property %s.

Description:

The named node has a weight of 0 assigned to it. A weight of 0 means that no new client connections will be distributed to that node.

Solution:

Consider assigning the named node a non-zero weight.
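The effect of a zero weight can be seen in a small calculation. This sketch assumes that new client connections are distributed in proportion to each node's weight, which is an illustrative simplification; the function name and the example weights are hypothetical.

```python
def connection_shares(weights):
    """Map each node ID to its fraction of new connections, given per-node weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one node needs a non-zero weight")
    return {node: weight / total for node, weight in weights.items()}
```

With weights {1: 3, 2: 1, 3: 0}, node 1 receives three quarters of new connections, node 2 receives one quarter, and node 3 receives none.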

330063 error in vop open %x

Description:

Opening a private interconnect interface failed.

Solution:

Rebooting the node might fix the problem.

330182 Internal error: default value missing for resource property

Description:

A non-fatal internal error has occurred in the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

330526 CMM: Number of steps specified in registering callback = %d; should be <= %d.

Description:

The number of steps specified during registering a CMM callback exceeds the allowable maximum. This is an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

330700 Cannot read template file (%s).

Description:

The template file name was passed to the function for synchronizing the configuration file. Unable to locate the file indicated in the message. Unable to store the property value of the resource to the configuration file.

Solution:

Verify installation of Sun Cluster support packages for Oracle RAC. If the problem persists, save the contents of /var/adm/messages and contact your Sun service representative.

330772 DB_password_file must be set when Auto_recovery is set to true.

Description:

When auto_recovery is set to true the db_password_file extension property must be set to a file that can be passed as the value to the hadbm command's --dbpasswordfile argument.

Solution:

Retry the resource creation and provide a value for the db_password_file extension property.

331221 CMM: Max detection delay specified is %ld which is larger than the max allowed %ld.

Description:

The maximum of the node down detection delays is larger than the allowable maximum. The maximum allowed will be used as the actual maximum in this case.

Solution:

This is an informational message, no user action is needed.

331325 sigprocmask: %s

Description:

The rpc.fed server encountered an error with the sigprocmask function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

333069 Failed to retrieve nodeid for %s.

Description:

The nodeid for the given name could not be determined.

Solution:

Make sure that the name given is a valid node identifier or node name.

333387 INTERNAL ERROR: starting_iter_deps_list: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

333622 Property %s can be changed only while ucmmd is not running on the node.

Description:

This property can be changed only while ucmmd is not running on the node. The volume manager processes on all the nodes must use an identical value of this property for the cluster to function properly.

Solution:

Change the RAC framework resource group to unmanaged state. Reboot all the nodes that can run RAC framework and modify the property.

333752 scha_control() returned error %d for %s in %s.

Description:

The data service detected an internal error from scds.

Solution:

Informational message. No user action is needed.

333890 ERROR: stop_mysql Option -D not set

Description:

The -D option is missing for the stop_mysql command.

Solution:

Add the -D option for the stop_mysql command.

334697 Failed to retrieve the cluster property %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

334992 clutil: Adding deferred task after threadpool shutdown id %s

Description:

During shutdown this operation is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

335206 Failed to get host names from the resource.

Description:

Retrieving the IP addresses from the network resources in this resource group has failed.

Solution:

An internal error or an API call failure might be the reason. Check the error messages that occurred just before this message. If there is an internal error, contact your authorized Sun service provider. For an API call failure, check the syslog messages from other components. For the resource name and resource group name, check the syslog tag.

335316 ERROR: start_mysql Option -H not set

Description:

The -H option is missing for the start_mysql command.

Solution:

Add the -H option for the start_mysql command.

335468 Time allocated to stop development system is too small (less than 5 seconds).

Description:

Time allocated to stop the development system is too small.

Solution:

The time for stopping the development system is a percentage of the total Start_timeout. Increase the value for property Start_timeout or the value for property Dev_stop_pct.
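
The relationship between the two properties can be sketched as simple arithmetic. The following is a minimal illustration only: it assumes the stop time is Start_timeout multiplied by Dev_stop_pct percent, and the function names are hypothetical, not part of the data service.

```python
# Hypothetical helpers. The formula is an assumption inferred from the
# message text, not taken from the data service implementation.
MIN_STOP_SECONDS = 5  # threshold quoted in the message text


def dev_stop_seconds(start_timeout: int, dev_stop_pct: int) -> float:
    """Seconds allotted to stopping the development system."""
    return start_timeout * dev_stop_pct / 100


def is_stop_time_sufficient(start_timeout: int, dev_stop_pct: int) -> bool:
    """True if the allotted stop time reaches the 5-second minimum."""
    return dev_stop_seconds(start_timeout, dev_stop_pct) >= MIN_STOP_SECONDS


print(dev_stop_seconds(300, 10))        # 30.0
print(is_stop_time_sufficient(40, 10))  # False: only 4 seconds are allotted
```

Under this assumption, raising either Start_timeout or Dev_stop_pct increases the allotted time, which matches the two remedies given above.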

335591 Failed to retrieve the resource group property %s: %s.

Description:

An attempt to retrieve a resource group property has failed.

Solution:

If the failure is caused by insufficient memory, reboot. If the problem recurs after rebooting, consider increasing swap space by configuring additional swap devices. If the failure is caused by an API call, check the syslog messages for the possible cause.

336128 start_winbind - Could not start winbind

Description:

The Winbind resource could not start winbind.

Solution:

Examine the other syslog messages occurring at the same time on the same node to see if the cause of the problem can be identified. If required turn on debug for the resource. Please refer to the data service documentation to determine how to do this.

336860 read %d for %snum_ports

Description:

Could not get information about the number of ports udlm uses from config file udlm.conf.

Solution:

Check to make sure the udlm.conf file exists and has an entry for udlm.num_ports. If everything looks normal and the problem persists, contact your Sun service representative.

337008 rgm_comm_impl::_unreferenced() called unexpectedly

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

337166 Error setting environment variable %s.

Description:

An error occurred while setting the environment variable LD_LIBRARY_PATH. This is required by the fault monitor for the nsldap data service. The fault monitor appends the ldap server root path, including the lib directory, to the LD_LIBRARY_PATH environment variable.

Solution:

Check that there is a lib directory under the server root of the nsldap data service which pertains to this resource. If this directory has been removed, then it must be replaced by reinstalling Netscape Directory Server, or whatever other means are appropriate.

337212 resource type %s removed.

Description:

This is a notification from the rgmd that a resource type has been deleted. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

338067 This resource does not depend on any SUNW.HAStoragePlus resources. Proceeding with normal checks.

Description:

The resource does not depend on any HAStoragePlus file systems. The validation will continue with its other checks.

Solution:

This message is informational; no user action is needed.

338839 clexecd: Could not create thread. Error: %d. Sleeping for %d seconds and retrying.

Description:

The clexecd program has encountered a failed thr_create() system call. The error message indicates the error number for the failure. It will retry the system call after the specified time.

Solution:

If the message is seen repeatedly, contact your authorized Sun service provider to determine whether a workaround or patch is available.

339206 validate: An invalid option entered or

Description:

There is an invalid variable set the parameter file mentioned in option -N to a of the start, stop and probe command or the first character of a

Solution:

Fix the parameter file mentioned in option -N to a of the start, stop and probe command to valid contents.

339424 Could not host device service %s because this node is being removed from the list of eligible nodes for this service.

Description:

A switchover/failover was attempted to a node that was being removed from the list of nodes that could host this device service.

Solution:

This is an informational message, no user action is needed.

339521 CCR: Lost quorum while starting to update table %s.

Description:

The cluster lost quorum when CCR started to update the indicated table.

Solution:

Reboot the cluster.

339590 Error (%s) when reading property %s.

Description:

Unable to read a property value using the API. The property name is indicated in the message. Other syslog messages may give more information on errors in other modules.

Solution:

Check syslog messages. Report this problem to your authorized Sun service provider.

339657 Issuing a restart request.

Description:

This is an informational message. The API function is about to be called to request a restart. If the request fails, refer to the syslog messages that follow this message.

Solution:

No user action is needed.

339954 fatal: cannot create any threads to launch callback methods

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

340287 idl_set_timestamp(): IDL Exception

Description:

The rgmd has encountered an error that prevents the scha_control function from successfully setting a ping-pong time stamp, presumably because a node died. This does not prevent the attempted failover from succeeding, but in the worst case, might prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes.

Solution:

Examine syslog output on the node that rebooted, to determine the cause of node death. The syslog output might indicate further remedial actions.

340441 Unexpected - test_rdbms_pid<%s>

Description:

The fault monitor found an unexpected error with testing the RDBMS.

Solution:

340893 The stop command \"%s\" failed to stop %s. Using SIGKILL.

Description:

The specified stop command was unable to stop the specified resource. A SIGKILL signal will be sent to all the processes associated with the resource.

Solution:

No action required by the user. This is an informational message.

341702 could not start swa_rpcd, aborting

Description:

swa_rpcd could not be started.

Solution:

Check configuration of SAA.

341754 INTERNAL ERROR: usage: $0 <logicalhost> <server_root> <siebel_enterprise> <siebel_servername> <timeout>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

341804 Failed to retrieve information for user %s.

Description:

Failed to retrieve information for the specified BV user.

Solution:

Check if the proper Broadvision Unix User ID is set or check if this user exists on all the nodes of the cluster.

342113 Invalid script name %s. It cannot contain any '/'.

Description:

The script name must be given without a path. Only the script name itself is needed.

Solution:

Specify just the script name without any path.

342336 clcomm: Pathend %p: path_down not allowed in state %d

Description:

The system maintains state information about a path. A path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

342597 sigaddset: %s

Description:

The rpc.fed server encountered an error with the sigaddset function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

342793 Successfully started the service %s.

Description:

Specified data service started successfully.

Solution:

None. This is only an informational message.

342922 User %s does not belong to the project %s: %s.

Description:

The specified user does not belong to the project that is specified in the error message.

Solution:

Use the projadd(1M) command to add the specified user to the specified project.

343307 Could not open file %s: %s.

Description:

System has failed to open the specified file.

Solution:

Check whether the permissions are valid. This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.

344059 RAC server %s probe successful.

Description:

This message indicates that the fault monitor has successfully probed the RAC server.

Solution:

No action required. This is an informational message.

344614 Started SAP processes successfully w/o PMF.

Description:

The data service started the processes w/o PMF.

Solution:

Informational message. No user action is needed.

345342 Failed to connect to %s port %d.

Description:

The data service fault monitor probe was trying to connect to the host and port specified and failed. There may be a prior message in syslog with further information.

Solution:

Make sure that the port configuration for the data service matches the port configuration for the underlying application.

345472 INITUCMM Validation failed. The ucmmd daemon will not be started on this node.

Description:

At least one of the modules of Sun Cluster support for Oracle OPS/RAC returned an error during validation. The ucmmd daemon will not be started on this node and this node will not be able to run Oracle OPS/RAC.

Solution:

This message can be ignored if this node is not configured to run Oracle OPS/RAC. Examine other syslog messages logged at about the same time to determine the configuration errors. Examine the ucmm reconfiguration log file /var/cluster/ucmm/ucmm_reconf.log. Correct the problem and reboot the node. If the problem persists, save a copy of the log files on this node and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

346016 setproject: %s; continuing the process with system default project.

Description:

Either the given project name was invalid, or the caller of setproject() was not a valid user of the given project. The process was launched with project "default" instead of the specified project.

Solution:

Use the projects(1) command to check if the project name is valid and the caller is a valid user of the given project.

347023 Could not run %s. User program did not execute cleanly.

Description:

There were problems making an upcall to run a user-level program.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

347091 resource type %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

347126 dl_info: DLPI error %u

Description:

DLPI protocol error. An info_ack could not be obtained from the physical device while trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.

347344 Did not find a valid port number to match field <%s> in configuration file <%s>: %s.

Description:

A failure occurred while extracting a port number for the field within the configuration file. The field exists and the file exists and is accessible. The value for the field may not exist or may not be an integer greater than zero. An error in the environment may have occurred, indicated by a non-zero errno value at the end of the message.

Solution:

Check to see if the value for the field in the configuration file exists and is an integer greater than zero. If there is an error in the field value, fix the value and retry the operation.
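
The check described above (the value exists and is an integer greater than zero) can be sketched as follows. This is an illustrative sketch, not the data service's actual parser; the function name parse_port is hypothetical.

```python
from typing import Optional


def parse_port(value: Optional[str]) -> Optional[int]:
    """Return the port as an int, or None if the value is missing or invalid.

    Hypothetical helper: it only mirrors the rule stated in the message
    description (value present, integer, greater than zero).
    """
    if value is None:
        return None
    text = value.strip()
    # Reject empty strings, signs, and anything that is not plain ASCII digits.
    if not (text.isascii() and text.isdigit()):
        return None
    port = int(text)
    return port if port > 0 else None
```

A configuration value such as "8080" passes this check, while an empty value, "0", or non-numeric text does not.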

348240 clexecd: putmsg returned %d.

Description:

The clexecd program has encountered a failed putmsg(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

348772 INITPMF Error: ${SERVER} is already running.

Description:

The initpmf init script found the rpc.pmfd already running. It will not start it again.

Solution:

No action required.

349049 CCR reported invalid table %s; halting node

Description:

The CCR reported to the rgmd that the CCR table specified is invalid or corrupted. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

349741 Command %s is not a regular file.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, does not refer to a regular file. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.

350574 Start of HADB database did not complete: %s.

Description:

The resource was unable to successfully run the hadbm start command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.

350843 INITPMF Error: ${SERVER} registered in rpcbind, but is not available.

Description:

The initpmf init script was unable to verify the availability of the rpc.pmfd server, even though it successfully registered with rpcbind. This error may prevent the rgmd from starting, which will prevent this node from participating as a full member of the cluster.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

351887 Bulk registration failed

Description:

The cl_apid was unable to perform the requested registration for the CRNP service.

Solution:

351983 "Validate - can't determine path to Grid Engine utility binaries"

Description:

The SGE binary 'gethostname' wasn't found in $SGE_ROOT/utilbin/<arch>. 'gethostname' is used only representatively; if it isn't found the other SGE utilities are presumed misplaced also.

Solution:

Find 'gethostname' in the SGE installation; make certain it is in a location conforming to $SGE_ROOT/utilbin/<arch>, where <arch> is the result of running $SGE_ROOT/util/arch.

352149 monitor_check: method <%s> failed on resource <%s> in resource group <%s> on node <%s>, exit code <%d>, time used: %d%% of timeout <%d seconds>

Description:

During scha_control, the monitor_check method of the resource failed on the specified node.

Solution:

No action is required. This is normal behavior for scha_control, which launches the corresponding monitor_check method of the resource on all candidate nodes and looks for a healthy node that passes the test. If a healthy node is found, scha_control lets that node take over the resource group. Otherwise, scha_control exits early.
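
The candidate search described here can be pictured with a short sketch. It is a simplified model under stated assumptions: nodes are tried one at a time in list order, an exit code of 0 means the node is healthy, and the names pick_failover_node and monitor_check are hypothetical stand-ins rather than the rgmd implementation.

```python
from typing import Callable, Iterable, Optional


def pick_failover_node(
    candidates: Iterable[str],
    monitor_check: Callable[[str], int],
) -> Optional[str]:
    """Return the first candidate whose monitor_check exits with 0, else None.

    Assumption: exit code 0 means the node passed the check.
    """
    for node in candidates:
        exit_code = monitor_check(node)
        if exit_code == 0:
            return node  # healthy node found; it would take over the group
        # A non-zero exit corresponds to this message being logged for the node.
        print(f"monitor_check failed on node {node}, exit code {exit_code}")
    return None  # no healthy node; the giveover is abandoned


# Example with made-up exit codes per node.
results = {"node1": 1, "node2": 0}
chosen = pick_failover_node(["node1", "node2"], lambda n: results[n])
```

In this model, a failure on one node is only a step in the search, which is why the message by itself does not call for action.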

352954 Resource (%s) not configured.

Description:

This is an internal error. Resource name indicated in the message was passed to the function for synchronizing the configuration files. Unable to locate the resource.

Solution:

If the problem persists, save the contents of /var/adm/messages and contact your Sun service representative.

353368 Unable to run %s: %s.

Description:

The specified command was unable to run because of the specified reason.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.

353467 %s: CCR exception for %s table

Description:

libscdpm could not perform an operation on the CCR table specified in the message.

Solution:

Check if the table specified in the message is present in the Cluster Configuration Repository (CCR). The CCR is located in /etc/cluster/ccr. This error could happen because the table is corrupted, or because there is no space left on the root file system to make any updates to the table. This message means that the Disk Path Monitoring daemon cannot access the persistent information it maintains about disks and their monitoring status. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

353557 Filesystem (%s) is locked and cannot be frozen

Description:

The file system has been locked with the _FIOLFS ioctl. It is necessary to perform an unlock _FIOLFS ioctl. The growfs(1M) or lockfs(1M) command may be responsible for this lock.

Solution:

An _FIOLFS LOCKFS_ULOCK ioctl is required to unlock the file system.

353566 Error %d from start_failfast_server

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

353753 invalid mask in hosts list: %s

Description:

The allow_hosts or deny_hosts for the CRNP service contained an invalid mask. This error may prevent the cl_apid from starting up.

Solution:

Remove the offending IP address from the allow_hosts or deny_hosts property, or fix the mask.

354821 Attempting to start the fault monitor under process monitor facility.

Description:

The function is going to request the PMF to start the fault monitor. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.

355663 Failed to post event %lld: %s

Description:

The cl_eventd was unable to post an event to the sysevent queue locally, and will not retry.

Solution:

355950 HA: unknown invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

356795 CMM: Reconfiguration step %d was forced to return.

Description:

One of the CMM reconfiguration step transitions failed, probably due to a problem on a remote node. A reconfiguration is forced assuming that the CMM will resolve the problem.

Solution:

This is an informational message, no user action is needed.

356930 Property %s is empty. This property must be specified for scalable resources.

Description:

The value of the specified property must not be empty for scalable resources.

Solution:

Use scrgadm(1M) to specify a non-empty value for the property.

357263 munmap: %s

Description:

The rpc.pmfd server was not able to delete shared memory for a semaphore, possibly due to low memory, and the system error is shown. This is part of the cleanup after a client call, so the operation might have gone through. An error message is output to syslog.

Solution:

357766 ping_retry out of bound. The number of retry must be between %d and %d: using the default value

Description:

scdpmd config file (/etc/cluster/scdpm/scdpmd.conf) has a bad ping_retry value. The default ping_retry value is used.

Solution:

Fix the value of ping_retry in the config file.
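
The fallback behavior described above can be sketched as a bounds check. The actual bounds and default are printed in the message and are not given here, so low, high, and default are hypothetical parameters; treating non-numeric text like an out-of-bound value is also an assumption.

```python
def effective_ping_retry(raw: str, low: int, high: int, default: int) -> int:
    """Return the configured ping_retry, or the default if it is unusable.

    Hypothetical sketch: low, high, and default stand in for the values
    that scdpmd reports in the message.
    """
    try:
        value = int(raw)
    except ValueError:
        return default  # assumption: unparsable text also falls back
    return value if low <= value <= high else default
```

With made-up bounds of 1 to 10 and a default of 3, a configured value of 50 would be ignored in favor of 3, while a value of 5 would be used as written.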

357767 %d entries found in property %s. For a nonsecure %s instance %s should have exactly one entry.

Description:

Since a nonsecure Server instance only listens on a single port, the specified property should only have a single entry. A different number of entries was found.

Solution:

Change the number of entries to be exactly one.

357915 Error: Unable to stat directory <%s> for scha_control timestamp file

Description:

The rgmd failed in a call to stat(2) on the local node. This may prevent the anti-"pingpong" feature from working, which may permit a resource group to fail over repeatedly between two or more nodes. The failure of the stat call might indicate a more serious problem on the node.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.

357988 set_status - rc<%s> type<%s>

Description:

If the call to scha_resource_setstatus returns an error, the resource's fault probe sets the appropriate return code from scha_resource_setstatus call.

Solution:

Examine the other syslog messages occurring at the same time on the same node to see if the cause of the problem can be identified.

358211 monitor_check: the failover requested by scha_control for resource <%s>, resource group <%s> was not completed because of error: %s

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to the error listed.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the error can be identified and repaired.

358404 Validation failed. PASSWORD missing in CONNECT_STRING

Description:

PASSWORD is missing in the specified CONNECT_STRING. The format could be either 'username/password' or '/' (if operating system authentication is used).

Solution:

Specify CONNECT_STRING in the specified format.
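
The two accepted forms can be checked with a few lines of code. This sketch only mirrors the format stated in the description; the function name is hypothetical and the real validation may apply further rules.

```python
def has_valid_connect_string(connect_string: str) -> bool:
    """Accept 'username/password' or a bare '/' (OS authentication).

    Hypothetical check based only on the format given in the description.
    """
    if connect_string == "/":
        return True
    # Split on the first '/' so the password part may itself contain '/'.
    username, separator, password = connect_string.partition("/")
    return bool(separator) and bool(username) and bool(password)
```

A value such as "scott/tiger" or "/" passes, while "scott" or "scott/" fails because the password part is missing.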

358533 Invalid protocol is specified in %s property.

Description:

The specified system property does not have a valid protocol.

Solution:

Using scrgadm(1M), change the value of the property to use a valid protocol. For example: TCP, UDP.

358861 validate: The Tomcat port is not set but it is required

Description:

The Tomcat port is not set in the parameter file

Solution:

Set the Tomcat Port in the parameter file.

359648 Service has failed.

Description:

The probe has detected a failure in the data service. The data service cannot be restarted on the same node, because failures have been too frequent. The probe is setting the resource status to failed.

Solution:

Wait for the fault monitor to fail over the data service. Check the syslog messages and the configuration of the data service.

360300 Unable to determine Sun Cluster node for HADB node %d.

Description:

The HADB database must be created using the hostnames of the Sun Cluster private interconnect. The hostname for the specified HADB node is not a private cluster hostname.

Solution:

Recreate the HADB and specify cluster private interconnect hostnames.

360432 Validation failed. Resource group property RG_AFFINITIES should specify a STRONG POSITIVE affinity (++) with the SCALABLE resource group containing the RAC framework resources

Description:

The resource being created or modified must belong to a group that has a strong positive affinity with the SCALABLE RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the RAC resource group (preceded with "++") for this resource's group RG_AFFINITIES property.

360600 Oracle UDLM package wrong instruction set architecture.

Description:

The proper Oracle UNIX Distributed Lock Manager (UDLM) is not installed on this node. A 64-bit UDLM package should not be installed in a 32-bit operating environment. Oracle OPS/RAC will not be able to function on this node.

Solution:

Install the ORCLudlm package that is appropriate for this operating environment. Refer to Oracle's documentation for installation of Oracle UDLM.

360990 Service is already running. Cannot use PMF.

Description:

The data service detected a running SAP instance. Webas_Use_Pmf is specified, thus the data service cannot start probing the instance.

Solution:

Either set Webas_Use_Pmf to False, or stop the service.

361048 ERROR: rgm_run_state() returned non-zero while running boot methods

Description:

The rgmd state machine has encountered an error on this node.

Solution:

Look for preceding syslog messages on the same node, which may provide a description of the error.

361289 iPlanet service with config file <%s> does not configure %s.

Description:

The magnus.conf configuration file for the iPlanet Web Server instance does not contain the specified directive.

Solution:

Edit the configuration file and set the specified directive.

361831 Initialization failed. Invalid command line %s %s.

Description:

Unable to process parameters passed to the call back method. The parameters are indicated in the message. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.

361880 None of the hostnames/IP addresses specified can be managed by this resource.

Description:

There was a failure in obtaining a list of IP addresses for the hostnames in the resource. Messages logged immediately before this message may indicate what the exact problem is.

Solution:

Check the settings in /etc/nsswitch.conf and verify that the resolver is able to resolve the hostnames.

362117 Config file: %s unknown variable name line %d

Description:

Error in scdpmd config file (/etc/cluster/scdpm/scdpmd.conf).

Solution:

Fix the config file.

362463 clcomm: Endpoint %p: path_down not allowed in state %d

Description:

The system maintains information about the state of an Endpoint. The path_down operation is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

362519 dl_attach: DL_ERROR_ACK bad PPA

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.

362657 Error when sending response from child process: %m

Description:

An error occurred while sending a message from the fault monitor child process. The child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.

363357 Failed to unregister callback for IPMP group %s with tag %s (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.

363491 INTERNAL ERROR CMM: device_type_registery_impl::initialize called already.

Description:

This is an internal error during node initialization, and the system cannot continue.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

363505 check_for_ccrdata failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.

363696 Error: unable to bring Resource Group <%s> ONLINE, because the Resource Groups <%s> for which it has a strong positive affinity are not online.

Description:

The rgmd is enforcing the strong positive affinities of the resource groups. This behavior is normal and expected.

Solution:

No action required. If desired, use scrgadm(1M) to change the resource group affinities.

364188 Validation failed. Listener_name not set

Description:

The 'Listener_name' property of the resource is not set. HA-Oracle will not be able to manage the Oracle listener if the listener name is not set.

Solution:

Specify the correct 'Listener_name' when creating the resource. If the resource is already created, update the resource property.

364510 The specified Oracle dba group id (%s) does not exist

Description:

Group id of oracle dba does not exist.

Solution:

Make sure the /etc/nsswitch.conf and /etc/group files are valid and have correct information to get the group id of dba.

365287 Error detected while parsing -> %s

Description:

This message shows the line that was being parsed when an error was detected.

Solution:

Please ensure that all entries in the custom monitor action file are valid and follow the correct syntax. After the file is corrected, validate it again to verify the syntax.

366483 SAPDB parent kernel process was terminated.

Description:

The SAPDB parent kernel process was not running on the system.

Solution:

During normal operation, this error should not occur unless the process was terminated manually, or the SAPDB kernel process was terminated due to a SAPDB kernel problem. Consult the SAPDB log file for additional information to determine whether the process was terminated abnormally.

366769 The Hosts in the startup order are not up. Waiting for them to start....

Description:

The probe has detected that the BV processes are not running but cannot take any action because the BV hosts in the startup order are not running.

Solution:

If the resource groups that contain the Backend resources are not online, bring them online. If they are online, the BV processes are probably still starting up and no action is needed.

367270 INTERNAL ERROR: Failed to create the path to the %s file.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

367864 svc_init failed.

Description:

The rpc.pmfd server was not able to initialize server operation. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

367880 Probe command '%s' timed out: %s.

Description:

Probing with the specified command timed out.

Solution:

Other syslog messages occurring just before this one might indicate the reason for the failure. You might consider increasing the timeout value for the method that generated the error.

368363 Failed to retrieve the current primary node.

Description:

Cannot retrieve the current primary node for the given resource group.

Solution:

Check the syslog messages that occurred just before this message to see whether an internal error has occurred. If so, contact your authorized Sun service provider. Otherwise, check whether the resource group is in STOP_FAILED state. If it is, clear the state and bring the resource group online.

368373 %s is locally mounted. Unmounting it.

Description:

HA Storage Plus found the specified mount point mounted as a local file system. It will unmount it, as asked for.

Solution:

This is an informational message, no user action is needed.

368596 libsecurity: program %s (%lu); unexpected getnetconfigent error

Description:

A client of the specified server was not able to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

368819 t_rcvudata in recv_request: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

368910 check_sap_j2ee - J2EE instance %s is not running

Description:

One of the defined SAP J2EE instances is not working and has been detected by the agent fault probe.

Solution:

If this message is repeated, look in the SAP J2EE log files for more information. Otherwise, no action is needed.

369460 udlm_send_reply %s: udp is null!

Description:

Cannot communicate with udlmctl because the address to send to is null.

Solution:

None. udlm will handle this error.

369570 Multiple ports specified in property %s.

Description:

The validate method for the SUNW.Event service found the specified problem. Thus, the service could not be started.

Solution:

Modify the specified property to fix the problem specified.

369728 Taking action specified in Custom_action_file.

Description:

Fault monitor detected an error specified by the user in a custom action file. This message indicates that the fault monitor is taking action as specified by the user for such an error.

Solution:

This is an informational message.

370060 "Validate - sge_commd file does not exist or is not executable at ${bin_dir}/sge_commd"

Description:

The binary file '$SGE_ROOT/bin/<arch>/sge_commd' was not found or is not executable.

Solution:

Make certain the binary file '$SGE_ROOT/bin/<arch>/sge_commd' is both in that location, and executable.

370251 ERROR: stop_sap_j2ee Option -J not set

Description:

The -J option is missing for the stop_command.

Solution:

Add the -J option to the stop command.

370604 This resource depends on a HAStoragePlus resource that is in a different Resource Group. This configuration is not supported.

Description:

The resource depends on a HAStoragePlus resource that is configured in a different resource group. This configuration is not supported.

Solution:

Please add this resource and the HAStoragePlus resource in the same resource group.

370949 created %d threads to launch resource callback methods; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.

371297 %s: Invalid command line option. Use -S for secure mode

Description:

rpc.sccheckd should always be invoked in secure mode. If this message shows up, someone has modified configuration files that affect server startup.

Solution:

Reinstall cluster packages or contact your service provider.

371369 CCR: CCR data server on node %s unreachable while updating table %s.

Description:

While the TM was updating the indicated table in the cluster, the specified node went down and has become unreachable.

Solution:

The specified node needs to be rebooted.

371615 In J2EE probe, failed to find Content-Length: in %s.

Description:

The reply from the J2EE engine did not contain a Content-Length: entry in the http header.

Solution:

Informational message. No user action is needed.
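
The check behind this message can be pictured as a scan of the HTTP reply header for a Content-Length: entry. The following Python fragment is only a minimal sketch of that idea, not the data service's probe code. The function name find_content_length and the two sample replies are assumptions made for illustration.

```python
def find_content_length(reply: str):
    """Return the Content-Length value from an HTTP reply header, or None.

    Hypothetical helper: it only illustrates the kind of check that
    leads to message 371615 when the entry is absent.
    """
    # The header ends at the first blank line; the body is ignored.
    header, _, _ = reply.partition("\r\n\r\n")
    for line in header.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            value = value.strip()
            return int(value) if value.isdecimal() else None
    return None


# Sample replies (assumed, not captured from a real J2EE engine).
ok_reply = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)
bad_reply = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\nhello"
```

With these samples, find_content_length(ok_reply) yields the integer length and find_content_length(bad_reply) yields None, which corresponds to the condition reported here.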

372880 CMM: Quorum device %ld (gdevname %s) can not be acquired by the current cluster members. This quorum device is held by node%s %s.

Description:

This node does not have its reservation key on the specified quorum device, which has been reserved by the specified node or nodes that the local node cannot communicate with. This indicates that in the last incarnation of the cluster, the other nodes were members whereas the local node was not, indicating that the CCR on the local node may be out-of-date. In order to ensure that this node has the latest cluster configuration information, it must be able to communicate with at least one other node that was a member of the previous cluster incarnation. The nodes holding the specified quorum device may either be down, or they may be up but the interconnect between them and this node may be broken.

Solution:

If the nodes holding the specified quorum devices are up, then fix the interconnect between them and this node so that communication between them is restored. If the nodes are indeed down, boot one of them.

372887 HA: repl_mgr: exception occurred while invoking RMA

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

373148 The port portion of %s at position %d in property %s is not a valid port.

Description:

The property named does not have a legal value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Assign the property a legal value.
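
To make the zero-based position index concrete, the Python sketch below walks a property list and reports the first element whose port portion is not valid. It is illustrative only: the "address:port" element format, the 1-65535 range, and the name first_invalid_port are assumptions, not the validate method's actual rules.

```python
def first_invalid_port(entries):
    """Return (position, entry) of the first entry with a bad port, else None.

    Positions start at 0 for the first element, as in message 373148.
    The "address:port" format and the 1-65535 range are assumptions
    made for this sketch.
    """
    for position, entry in enumerate(entries):
        _, sep, port = entry.rpartition(":")
        if not sep or not port.isdecimal() or not 1 <= int(port) <= 65535:
            return position, entry
    return None


# The second element (position 1) has a non-numeric port portion.
result = first_invalid_port(["10.0.0.1:8080", "10.0.0.2:http", "10.0.0.3:9000"])
```

Here result identifies position 1, the second element of the list, as the one that needs a legal value.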

373573 ERROR: resource group %s state change to %s on node %s is INVALID because we are not running at a high enough version. Aborting node.

Description:

The rgmd on this node is running at a lower version than the rgmd on the specified node.

Solution:

Run scversions -c to commit your latest upgrade. Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

373816 clcomm: copyinstr: max string length %d too long

Description:

The system attempted to copy a string from user space to the kernel. The maximum string length exceeds the length limit.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

374006 prog <%s> failed on step <%s> retcode <%d>

Description:

A ucmmd step failed.

Solution:

374738 dl_bind: DL_BIND_ACK bad sap %u

Description:

SAP in acknowledgment to bind request is different from the SAP in the request. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.

376111 Unable to compose %s path. Sending SIGKILL now.

Description:

The STOP method was not able to construct the application's stop command. The STOP method will send SIGKILL to stop the application.

Solution:

Other messages will indicate what the underlying problem was such as no memory or a bad configuration.

376905 Failed to retrieve WLS extension properties.Will shutdown the WLS using sigkill

Description:

Failed to retrieve the WLS extension properties that are needed to do a smooth shutdown. The WLS stop method however will go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

376974 Error in initialization; exiting.

Description:

The cl_apid was unable to start up. There should be other error messages with more detailed explanations of the specific problems.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

377035 The path name %s associated with the FilesystemCheckCommand extension property is not a regular file.

Description:

HA Storage Plus found that the specified file was not a plain file, but of different type (directory, symbolic link, etc.).

Solution:

Correct the value of the FilesystemCheckCommand extension property by specifying a regular executable.
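
Before setting the property, the candidate path can be checked for being a regular executable file. The Python sketch below shows one way to do that, under the assumption that a symbolic link should be rejected as the description suggests. The helper name is_regular_executable and the temporary file names are hypothetical.

```python
import os
import stat
import tempfile


def is_regular_executable(path: str) -> bool:
    """True only if path itself is a plain file with an execute bit set.

    os.lstat() does not follow symbolic links, so a link to an
    executable is rejected. Hypothetical helper for illustration.
    """
    try:
        mode = os.lstat(path).st_mode
    except OSError:
        return False
    return stat.S_ISREG(mode) and bool(mode & 0o111)


# Demonstration in a scratch directory (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    command = os.path.join(tmp, "fsck_wrapper")
    with open(command, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(command, 0o755)
    link = os.path.join(tmp, "fsck_link")
    os.symlink(command, link)
    checks = {
        "file": is_regular_executable(command),
        "directory": is_regular_executable(tmp),
        "symlink": is_regular_executable(link),
    }
```

Only the plain file passes the check. The directory and the symbolic link are the kinds of path that this message reports.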

377210 Failed to retrieve BV extension properties.

Description:

Failed to retrieve the Extension properties set by the user or failed to retrieve a valid host for BV processes.

Solution:

Look for other error messages generated while retrieving the extension properties to identify the exact error. Look for appropriate action for that error message.

377347 CMM: Node %s (nodeid = %ld) is up; new incarnation number = %ld.

Description:

The specified node has come up and joined the cluster. A node is assigned a unique incarnation number each time it boots up.

Solution:

This is an informational message, no user action is needed.

377531 Stop saposcol under PMF times out.

Description:

Stopping the SAP OS collector process under the control of Process Monitor facility times out. This might happen under heavy system load.

Solution:

Consider increasing the stop timeout value.

377897 Successfully started the service

Description:

Informational message. SAP started up successfully.

Solution:

No action needed.

378220 Siebel gateway already running.

Description:

Siebel gateway was not expected to be running. This may be due to the gateway having started outside Sun Cluster control.

Solution:

Please shut down the gateway instance manually, and retry the previous operation.

378427 prog <%s> step <%s> terminated due to receipt of signal

Description:

ucmmd step terminated due to receipt of a signal.

Solution:

378807 clexecd: %s: sigfillset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigfillset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

378872 %s operation failed: %s.

Description:

Specified system operation could not complete successfully.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.

379284 Archive log destination %s is > 90%% full and has < 20Mb available free space

Description:

The HA-Oracle fault monitor has detected that this archive log destination is greater than 90% full and has less than 20Mb of available free space.

Solution:

Whilst this might not cause immediate concern, more space should be made available in the file system to ensure it does not become full.
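
The two conditions in this message can be checked by hand for any file system. The Python sketch below is an approximation, not the HA-Oracle fault monitor's implementation: it treats "20Mb" as 20 * 1024 * 1024 bytes and derives the used fraction from total and free space, both of which are assumptions. The function names are hypothetical.

```python
import shutil

FULL_THRESHOLD = 0.90                # "> 90% full" from the message text
MIN_FREE_BYTES = 20 * 1024 * 1024    # "< 20Mb"; binary megabytes are an assumption


def needs_warning(total: int, free: int) -> bool:
    """True when both conditions named in message 379284 hold."""
    if total <= 0:
        return False
    used_fraction = (total - free) / total
    return used_fraction > FULL_THRESHOLD and free < MIN_FREE_BYTES


def check_destination(path: str) -> bool:
    """Apply the same test to a mounted file system via shutil.disk_usage()."""
    usage = shutil.disk_usage(path)
    return needs_warning(usage.total, usage.free)
```

For example, check_destination("/") applies the test to the root file system. Pass the archive log destination directory to test that file system instead.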

379450 reservation fatal error(%s) - fenced_node not specified

Description:

The device fencing program has suffered an internal error.

Solution:

379458 Successfully opened <%s>

Description:

The clapi_mod in the syseventd successfully opened the specified door.

Solution:

No action required.

379820 INTERNAL ERROR in J2EE probe calling scds_fm_tcp_write().

Description:

The data service could not write to the J2EE engine port.

Solution:

Informational message. No user action is needed.

380064 switchback attempt failed on resource group <%s> with error <%s>

Description:

The rgmd was unable to failback the specified resource group to a more preferred node. The additional error information in the message explains why.

Solution:

Examine other syslog messages occurring around the same time on the same node. These messages may indicate further action.

380317 Failed to verify that all IPMP groups are in a stable state. Assuming this node cannot respond to client requests.

Description:

The state of the IPMP groups on the node could not be determined.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for messages from the network monitoring daemon (pnmd).

380365 (%s) t_rcvudata, res %d, flag %d: tli error: %s

Description:

Call to t_rcvudata() failed. The "t_rcvudata" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

380445 scds_pmf_stop() failed with error %s.

Description:

Shutdown through PMF returned an error.

Solution:

No user action needed.

380897 rebalance: WARNING: resource group <%s> is <%s> on node <%d>, resetting to OFFLINE.

Description:

The resource group has been found to be in the indicated state and is being reset to OFFLINE. This message is a warning only and should not adversely affect the operation of the RGM.

Solution:

381244 in libsecurity mkdir of %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was not able to create a directory to contain "cache" files for rpcbind information. The affected component should still be able to function by directly calling rpcbind.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

381386 Prog <%s> step <%s>: unkillable.

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The UCMM reboots the node to prevent the stuck node from causing unavailability of the services provided by UCMM.

Solution:

No action is required. This is normal behavior of the UCMM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.

382114 start_dhcp - DHCP batch job failed rc<%s>

Description:

Whenever the DHCP server has switched over to another node, the DHCP network table is updated via a batch job running pntadm commands. This message is produced if the submission of that batch job fails.

Solution:

Please refer to the pntadm(1M) man page.

382169 Share path name %s not absolute.

Description:

A path specified in the dfstab file does not begin with "/".

Solution:

Only absolute path names can be shared with HA-NFS.

382252 Share path %s: file system %s is not mounted.

Description:

The specified file system, which contains the share path specified, is not currently mounted.

Solution:

Correct the situation with the file system so that it gets mounted.

382460 Failed to obtain SAM-FS constituent volumes from mount point %s: %s.

Description:

HA Storage Plus was not able to determine the volumes that are part of this SAM-FS file system.

Solution:

Check the SAM-FS file system configuration. If the problem persists, contact your authorized Sun service provider.

382661 WebSphere MQ Broker Queue Manager available

Description:

The WebSphere MQ Broker is dependent on the WebSphere MQ Broker Queue Manager. This message simply informs you that the WebSphere MQ Broker Queue Manager is available.

Solution:

No user action is needed.

382995 ioctl in negotiate_uid failed

Description:

Call to ioctl() failed. The "ioctl" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

383583 Instance number <%s> does not consist of 2 numeric characters.

Description:

The instance number specified is not a valid SAP instance number.

Solution:

Specify a valid instance number.
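
The rule stated in the message text, exactly two numeric characters, can be expressed as a small check. The Python sketch below assumes that "numeric" means the ASCII digits 0 through 9. The function name is hypothetical.

```python
import re

# Exactly two ASCII digits, for example "00" or "42".
_INSTANCE_RE = re.compile(r"[0-9]{2}")


def is_valid_instance_number(value: str) -> bool:
    """True if value consists of exactly two numeric characters."""
    return _INSTANCE_RE.fullmatch(value) is not None
```

Under this check, values such as "7", "123", and "0A" are rejected, while "00" and "42" are accepted.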

383706 NULL value returned for the resource property %s.

Description:

NULL value was specified for the extension property of the resource.

Solution:

For the property name check the syslog message. Any of the following situations might have occurred. Different user action is needed for these different scenarios. 1) If a new resource is created or updated, check whether the value of the extension property is empty. If it is, provide valid value using scrgadm(1M). 2) For all other cases, treat it as an Internal error. Contact your authorized Sun service provider.

384549 CCR: Could not backup the CCR table %s errno = %d.

Description:

The indicated error occurred while backing up indicated CCR table on this node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28(ENOSPC) indicates that the root file system on the indicated node is full. Other values of errno can be returned when the root disk has failed(EIO) or some of the CCR tables have been deleted outside the control of the cluster software(ENOENT).

Solution:

There may be other related messages on this node, which may help diagnose the problem, for example: If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the indicated CCR table was accidentally deleted, then boot this node in -x mode to restore the indicated CCR table from other nodes in the cluster or backup. The CCR tables are located at /etc/cluster/ccr/.
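
When reading the errno value from this message, it can help to translate the number into its symbolic name and the matching action. The Python sketch below does so with the standard errno module. The action strings paraphrase the solution text, and the function name describe_errno is hypothetical. Numeric values are taken from the platform where the script runs, so run it on the affected system or compare against /usr/include/sys/errno.h.

```python
import errno
import os

# Actions paraphrased from the solution text for message 384549.
SUGGESTED_ACTION = {
    errno.ENOSPC: "free up space in the root file system",
    errno.EIO: "check whether the root disk has failed",
    errno.ENOENT: "restore the missing CCR table",
}


def describe_errno(code: int) -> str:
    """Return a one-line summary for an errno value from the message."""
    name = errno.errorcode.get(code, "unknown")
    action = SUGGESTED_ACTION.get(code, "see other syslog messages")
    return f"errno {code} ({name}): {os.strerror(code)}; {action}"
```

Calling describe_errno(28) gives the name ENOSPC together with the suggestion to free space in the root file system.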

384621 RDBMS probe successful

Description:

This message indicates that the fault monitor has successfully probed the RDBMS server.

Solution:

No action required. This is an informational message.

384922 pthread_rwlock_rdlock err %d line %d\n

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

385407 t_alloc (open_cmd_port) failed with errno%d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

385550 Can't setup binding entries from node %d for GIF node %d

Description:

Failed to maintain client affinity for some sticky services running on the named server node due to a problem on the named GIF node. Connections from existing clients for those services might go to a different server node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, for example for data integrity reasons, switch over all global interfaces (GIFs) from the named GIF node to some other node.

385803 Nodelist must contain an even number of nodes.

Description:

The HADB resource must be configured to run on an even number of Sun Cluster nodes.

Solution:

Recreate the resource group and specify an even number of Sun Cluster nodes in the nodelist.

385902 pmf_search_children: Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent to the process as a result of some event external to rpc.pmfd. rpc.pmfd "intercepted" the signal, and is trying to pass the signal on to the monitored process.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

386072 chdir: %s

Description:

The rpc.pmfd server was not able to change directory. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

386282 ccr_initialize failure

Description:

An attempt to start the scdpmd failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

386908 Resource is already stopped.

Description:

An attempt was made to stop a resource that has already been stopped.

Solution:

Use the ps command to make sure that all processes for the data service have been stopped. Check syslog for any errors that may have occurred just before this message. If everything appears to be correct, then no action is required.

387003 CCR: CCR metadata not found.

Description:

The CCR is unable to locate its metadata.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.

387150 scvxvmlg warning - found no matching volume for device node %s, removing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.

387232 resource %s monitor enabled.

Description:

This is a notification from the rgmd that the operator has enabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

387288 clcomm: Path %s online

Description:

A communication link has been established with another node.

Solution:

No action required.

387572 File %s should be readable and writable by %s.

Description:

A program required the specified file to be readable and writable by the specified user.

Solution:

Set correct permissions for the specified file to allow the specified user to read it and write to it.

387689 Validate - The Sap user %s home-directory %s does not exist

Description:

The Sap user's home directory does not exist.

Solution:

Create the Sap user's home directory.

388330 Text server stopped.

Description:

The Text server has been stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.

389221 could not open configuration file: %s

Description:

The specified configuration file could not be opened.

Solution:

Check if the configuration file exists and has correct permissions. If the problem persists, contact your Sun Service representative.

389231 clcomm: inbound_invo::cancel:_state is 0x%x

Description:

The internal state describing the server side of a remote invocation is invalid when a cancel message arrives during processing of the remote invocation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

389369 Validation failed. SYBASE ASE STOP_FILE %s is not executable.

Description:

File specified in the STOP_FILE extension property is not an executable file.

Solution:

Please check the permissions of file specified in the STOP_FILE extension property. File should be executable by the Sybase owner and root user.

389516 NULL value returned for the extension property %s.

Description:

NULL value was specified for the extension property of the resource.

Solution:

389901 ext_props(): Out of memory

Description:

System runs out of memory in function ext_props().

Solution:

Install more memory, increase swap space, or reduce peak memory consumption.

390130 Failed to allocate space for %s.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

390212 Core files in ${SCCOREDIR}: ${CORES_FILES}

Description:

One of the daemons launched by the /etc/init.d/bootcluster script has dumped core.

Solution:

Provide core dumps to your authorized Sun service provider. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

390647 INITFED Error: rpc.pmfd is not running.

Description:

The initfed init script found that the rpc.pmfd is not running. The rpc.fed will not be started, which will prevent the rgmd from starting, and which will prevent this node from participating as a full member of the cluster.

Solution:

Examine other syslog messages occurring at about the same time to determine why the rpc.pmfd is not running. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

390691 NFS daemon down

Description:

HA-NFS fault monitor detected that an nfs daemon died and will automatically restart it later.

Solution:

No action required.

390782 lkcm_cfg: Unable to get new nodelist

Description:

Error when reading nodelist.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

391177 Failed to open %s: %s.

Description:

HA Storage Plus failed to open the specified file.

Solution:

Check the system configuration. If the problem persists, contact your authorized Sun service provider.

391240 RAC server %s is not running. Manual intervention is required.

Description:

The fault monitor for the RAC server instance has identified that the instance is no longer running. The fault monitor does NOT attempt to restart or failover the resource or resource group.

Solution:

The cause of the instance failure should be investigated, and once rectified, the RAC server instance should be restarted.

391502 Database is still running after %d seconds. Selecting new Sun Cluster node to run the stop command.

Description:

The database is still running but the resource on the Sun Cluster node that was selected to run the hadbm stop command is now offline, possibly because of an error while running the hadbm command. A different Sun Cluster node will be selected to run the hadbm stop command again.

Solution:

This is an informational message, no user action is needed.

391738 (%s) bad poll revent: %x (hex)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

392782 Failed to retrieve the property %s for %s: %s.

Description:

API operation has failed in retrieving the cluster property.

Solution:

For property name, check the syslog message. For more details about API call failure, check the syslog messages from other components.

392998 Fault monitor probe response time exceeded timeout (%d secs). The timeout for subsequent probes will be temporarily increased by 10%%

Description:

The time taken for the last fault monitor probe to complete was greater than the resource's configured probe timeout, so a timeout error occurred. The timeout for subsequent probes will be increased by 10% until the probe response time drops below 50% of the timeout, at which point the timeout will be reduced to its configured value.

Solution:

The database should be investigated for the cause of the slow response and the problem fixed, or the resource's probe timeout value increased accordingly.
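
The timeout adjustment described above can be sketched as a small state machine. The Python fragment below is only an interpretation of the description, not the fault monitor's code. It assumes that every timed-out probe raises the current timeout by 10% of its current value, and that the 50% comparison is made against the current (raised) timeout. The class name ProbeTimeout is hypothetical.

```python
class ProbeTimeout:
    """Track a probe timeout that grows on timeouts and later resets.

    Assumptions for this sketch: each timed-out probe multiplies the
    current timeout by 1.10, and the timeout returns to the configured
    value once a probe completes in under half of the current timeout.
    """

    def __init__(self, configured: float):
        self.configured = configured
        self.current = configured

    def record(self, response_time: float) -> float:
        """Update the timeout after one probe and return the new value."""
        if response_time > self.current:
            self.current *= 1.10              # probe timed out: add 10%
        elif response_time < 0.5 * self.current:
            self.current = self.configured    # fast again: restore
        return self.current


timeout = ProbeTimeout(30.0)
after_slow = timeout.record(35.0)      # exceeds 30 s, timeout is raised
after_medium = timeout.record(20.0)    # not below half, timeout is kept
after_fast = timeout.record(10.0)      # below half, timeout is restored
```

In this run the timeout rises after the 35-second probe, stays raised after the 20-second probe, and returns to the configured 30 seconds after the 10-second probe.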

393385 Service daemon not running.

Description:

Process group has died and the data service's daemon is not running. Updating the resource status.

Solution:

Wait for the fault monitor to restart or failover the data service. Check the configuration of the data service.

393934 Stopping the adaptive server with wait option.

Description:

Sun Cluster HA for Sybase will attempt to shut down the Sybase adaptive server using the wait option.

Solution:

This is an informational message, no user action is needed.

393960 sigaction failed in set_signal_handler

Description:

The ucmmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.

394325 Received notice that IPMP group %s has failed.

Description:

The status of the named IPMP group has become degraded. If possible, the scalable resources currently running on this node with monitoring enabled will be relocated off of this node, if the IPMP group stays in a degraded state.

Solution:

Check the status of the IPMP group on the node. Try to fix the adapters in the IPMP group.

395008 Error reading stopstate file.

Description:

The stopstate file could not be opened and read.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.

395353 Failed to check whether the resource is a network address resource.

Description:

While retrieving the IP addresses from the network resources, the attempt to check whether the resource is a network resource or not has failed.

Solution:

396727 Attempting to check for existence of %s pid %d resulted in error: %s.

Description:

HA-NFS fault monitor attempted to check the status of the specified process but failed. The specific cause of the error is logged.

Solution:

No action is required. The HA-NFS fault monitor ignores this error and attempts the operation again at a later time. If this error persists, check whether the system is lacking the required resources (memory and swap), and add or free resources if required. Reboot the node if the error still persists.
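As an illustration of the kind of check this message refers to, the following is a minimal sketch of one common way to test whether a process exists on a POSIX system: sending signal 0 with kill(2). This is an assumption for illustration only, not the HA-NFS fault monitor's actual implementation. The function name process_exists is hypothetical.

```python
import os


def process_exists(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX systems only)."""
    try:
        # Signal 0 performs only the existence and permission checks;
        # no signal is actually delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no process with this pid.
        return False
    except PermissionError:
        # EPERM: the process exists but belongs to another user.
        return True
    return True


if __name__ == "__main__":
    print(process_exists(os.getpid()))
```

Any other OSError propagates to the caller, which corresponds to the case where the check itself fails and the cause has to be logged.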

397219 RGM: Could not allocate %d bytes; node is out of swap space; aborting node.

Description:

The rgmd failed to allocate memory, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Rebooting will probably resolve the problem. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.

397340 Monitor initialization error. Unable to open resource: %s Group: %s: error %d

Description:

Error occurred in monitor initialization. Monitor is unable to get resource information using API calls.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.

397371 'dbmcli -d <LC_NAME> -n <logical host name> db_state' timed out.

Description:

The listed SAP utility timed out.

Solution:

Make sure the logical host name resource is online.

397698 Daemon cl_eventd not responding: Will retry in %d sec

Description:

The clapi_mod in the syseventd failed to deliver an event to the cl_eventd, but will retry.

Solution:

No action required.
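The retry behavior described above can be sketched as a fixed-delay retry loop. This is a generic illustration, not the clapi_mod code. The names deliver_with_retry, deliver, retry_sec, and max_attempts are hypothetical, and the attempt limit is an assumption (the message does not state whether retries are bounded).

```python
import time
from typing import Callable


def deliver_with_retry(deliver: Callable[[], bool],
                       retry_sec: float,
                       max_attempts: int) -> bool:
    """Call deliver() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        if deliver():
            return True
        if attempt < max_attempts:
            # Mirror the style of the logged message before waiting.
            print(f"Daemon not responding: Will retry in {retry_sec} sec")
            time.sleep(retry_sec)
    return False
```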

397940 INTERNAL ERROR in J2EE probe calling scds_fm_tcp_read(): %s.

Description:

The data service could not read the J2EE probe reply.

Solution:

Informational message. No user action is needed.

398345 Error %d setting policy %d %s

Description:

This message appears when the customer is initializing or changing a scalable services load balancer, by starting or updating a service. An internal error happened while trying to change the load balancing policy.

Solution:

This is an internal error that can occur if another RGM operation was in progress at the same time. Try the operation again. If the error occurs when no other RGM update is in progress, contact your Sun service provider for help.

398643 Validate - nmblookup %s non-existent executable

Description:

The Samba resource tries to validate that the nmblookup program exists and is executable.

Solution:

Check that the correct pathname for the Samba bin directory was entered when registering the resource and that the program exists and is executable.
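The check described above (the program exists and is executable) can be reproduced by hand. The following is a minimal sketch, assuming the Samba bin directory is passed in by the caller. The function name validate_executable is hypothetical and this is not the resource's actual validation code.

```python
import os


def validate_executable(bin_dir: str, program: str = "nmblookup") -> str:
    """Return the full path if bin_dir/program exists and is executable."""
    path = os.path.join(bin_dir, program)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} does not exist")
    if not os.access(path, os.X_OK):
        raise PermissionError(f"{path} is not executable")
    return path
```

Calling validate_executable with the directory that was entered when registering the resource shows which of the two conditions fails.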

398973 The probe has requested an immediate failover. Attempting to failover this resource group subject to the setting of the Failover_enabled property.

Description:

An immediate failover will be performed.

Solution:

This is an informational message, no user action is needed.

399037 mc_closeconn failed to close connection

Description:

The system has run out of resources that are required to process connection terminations for a scalable service.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.

399216 clexecd: Got an unexpected signal %d in process %s (pid=%d, ppid=%d)

Description:

The clexecd program received the signal indicated in the error message.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

399266 Cluster goes into pingpong booting because of failure of method <%s> on resource <%s>. RGM is not aborting this node.

Description:

A stop method has failed and Failover_mode is set to HARD, but the RGM has detected that this resource group is falling into pingpong behavior and will not abort the node on which the resource's stop method failed. This is most likely due to the failure of both the resource's start and stop methods.

Solution:

Save a copy of /var/adm/messages, check for both failed start and stop methods of the failing resource, and make sure that the failure is corrected. To restart the resource group, refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide.

399548 %s failed with exit code %d.

Description:

The specified command returned the specified exit code.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.
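To reproduce the failure outside the cluster framework, the command named in the message can be run by hand and its exit code inspected. The following is a minimal sketch of that pattern. The function name run_and_report is hypothetical, and the message it prints only imitates the format shown above.

```python
import subprocess
import sys
from typing import Sequence


def run_and_report(argv: Sequence[str]) -> int:
    """Run a command and report a nonzero exit code on stderr."""
    result = subprocess.run(list(argv))
    if result.returncode != 0:
        print(f"{argv[0]} failed with exit code {result.returncode}.",
              file=sys.stderr)
    return result.returncode
```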

399753 CCR: CCR data server failed to register with CCR transaction manager.

Description:

The CCR data server on this node failed to join the cluster, and can only serve readonly requests.

Solution:

There may be other related CCR messages on this and other nodes in the cluster, which may help diagnose the problem. It may be necessary to reboot this node or the entire cluster.