Sun Cluster Error Messages Guide for Solaris OS

Message IDs 500000–599999

This section contains message IDs 500000–599999.


500133 Device switchover of global service %s associated with path %s to this node failed: %s.

Description:

The DCS was not able to perform a switchover on the specified global service.

Solution:

Check the global service configuration.


500568 fatal: Aborting this node because method <%s> on resource <%s> is unkillable

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The RGM reboots the node to force the data service to fail over to a different node, and to avoid data corruption.

Solution:

No action is required. This is normal behavior of the RGM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.


501632 Incorrect syntax in the environment file %s. Ignoring %s

Description:

HA-Oracle reads the file specified in the USER_ENV property and exports the variables declared in the file. The syntax for declaring a variable is VARIABLE=VALUE. Lines starting with "#" are treated as comments. VARIABLE is expected to be a valid Korn shell variable name that starts with a letter or "_" and contains only alphanumeric characters and "_".

Solution:

Check the environment file and correct the syntax errors. Do not use the export statement in the environment file.
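
The syntax rule described above can be checked before the file is used. The following Python sketch is illustrative only and is not part of HA-Oracle. The helper names check_env_line and find_bad_lines are hypothetical, and skipping blank lines and lines that start with "#" is an assumption of the sketch.

```python
import re

# A Korn shell variable name: starts with a letter or "_",
# followed by letters, digits, or "_".
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def check_env_line(line):
    """Classify one line of an environment file (hypothetical helper).

    Returns "skip" for blank lines and "#" lines (an assumption of this
    sketch), "ok" for VARIABLE=VALUE, and "bad" for anything else.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return "skip"
    if stripped.split(None, 1)[0] == "export":
        return "bad"  # export statements are not allowed in the file
    name, sep, _value = stripped.partition("=")
    if sep and _NAME.match(name):
        return "ok"
    return "bad"


def find_bad_lines(text):
    """Return the 1-based numbers of the lines that fail the check."""
    return [n for n, line in enumerate(text.splitlines(), start=1)
            if check_env_line(line) == "bad"]
```

Running find_bad_lines over the contents of the environment file gives the line numbers to correct; an empty list means that every line passed this simplified check.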


501733 scvxvmlg fatal error - _cladm() failed

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


501763 recv_request: t_alloc: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
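
Collecting the files named above can be scripted. The following Python sketch shows one possible approach and is not a Sun-supplied tool. The function name collect_logs and the root parameter are hypothetical, and the script would have to be run on each node separately.

```python
import glob
import os
import tarfile

# Paths named in the Solution text, written relative to the file system root.
PATTERNS = [
    "var/adm/messages",
    "var/cluster/ucmm/ucmm_reconf.log",
    "var/cluster/ucmm/dlm*/*logs/*",
]


def collect_logs(archive_path, root="/"):
    """Archive the files matching PATTERNS under root (hypothetical helper).

    Returns the sorted list of archived paths, relative to root.
    Patterns that match nothing are silently skipped.
    """
    archived = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for pattern in PATTERNS:
            for path in sorted(glob.glob(os.path.join(root, pattern))):
                rel = os.path.relpath(path, root)
                tar.add(path, arcname=rel)
                archived.append(rel)
    return sorted(archived)
```

The root parameter exists only so that the sketch can be tried against a scratch directory; on a cluster node the default "/" would be used.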


501917 process_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.


501981 Valid connection attempted from %s: %s

Description:

The cl_apid received a valid client connection attempt from the specified IP address.

Solution:

No action is required. If you do not wish to allow access to the client with this IP address, modify the allow_hosts or deny_hosts property as required.


502022 fatal: joiners_read_ccr: exiting early because of unexpected exception

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


502258 fatal: Resource group <%s> create failed with error <%s>; aborting node

Description:

The rgmd failed to read the new resource group from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


502438 IPMP group %s has status %s.

Description:

The specified IPMP group is not in a functional state. A LogicalHostname resource cannot be started without a functional IPMP group.

Solution:

The LogicalHostname resource will not be brought online on this node. Check the messages (pnmd errors) that occurred just before this message for any IPMP or adapter problem. Correct the problem and rerun the scrgadm command.


503048 NULL value returned for resource group property %s.

Description:

A NULL value was returned for the resource group property.

Solution:

Check the syslog message for the property name. The required action depends on which of the following situations occurred: 1) If a new resource group was created or updated, check whether the value of the property is valid. 2) In all other cases, treat it as an internal error.


503064 Method <%s> on resource <%s>: Method timed out.

Description:

A VALIDATE method execution has exceeded its configured timeout and was killed by the rgmd. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may retry the resource group update operation.


503399 Failed to parse xml: invalid attribute %s

Description:

The cl_apid was unable to parse an xml message because of an invalid attribute. This message probably represents a CRNP client error.

Solution:

No action needed.


503715 CMM: Open failed for quorum device %ld with gdevname '%s'.

Description:

The open operation on the specified quorum device failed, and this node will ignore the quorum device.

Solution:

The quorum device has failed or the path to this device may be broken. Refer to the quorum disk repair section of the administration guide for resolving this problem.


503771 reservation warning(%s) - MHIOCGRP_REGISTERANDIGNOREKEY error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


503817 No PDT Dispatcher thread.

Description:

The system has run out of resources that are required to create a thread. The system could not create the connection processing thread for scalable services.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


504363 ERROR: process_resource: resource <%s> is pending_update but no UPDATE method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


504402 CMM: Aborting due to stale sequence number. Received a message from node %ld indicating that node %ld has a stale sequence

Description:

After receiving a message from the specified remote node, the local node has concluded that it has stale state with respect to the remote node, and will therefore abort. The state of a node can get out-of-date if it has been in isolation from the nodes which have majority quorum.

Solution:

Reboot the node.


505040 Failed to allocate memory.

Description:

HA Storage Plus ran out of resources.

Solution:

Usually, this means that the system has exhausted its resources. Check if the swap file is big enough to run Sun Cluster.


505101 Found another active instance of clexecd. Exiting daemon_process.

Description:

An active instance of the clexecd program is already running on the node.

Solution:

This would usually happen if the operator tries to start the clexecd program by hand on a node which is booted in cluster mode. If that is not the case, contact your authorized Sun service provider to determine whether a workaround or patch is available.


506381 Custom action file %s contains errors.

Description:

The custom action file that is specified contains errors which need to be corrected before it can be processed completely.

Solution:

Please ensure that all entries in the custom monitor action file are valid and follow the correct syntax. After the file is corrected, validate it again to verify the syntax.


506740 Home directory is not set for user %s.

Description:

No home directory is set for the specified Broadvision user.

Solution:

Set the home directory of the Broadvision user to point to the directory containing the Broadvision config files.


507193 Queued event %lld

Description:

The cl_apid successfully queued the incoming sysevent.

Solution:

This message is informational only. No action is required.


507882 J2EE engine probe returned http code 503. Temp. not available.

Description:

The data service received http code 503, which indicates that the service is temporarily not available.

Solution:

Informational message. No user action is needed.


508391 Error: failed to load catalog %s

Description:

The cl_apid was unable to load the xml catalog for the dtds. No validation will be performed on CRNP xml messages.

Solution:

No action is required. If you want to diagnose the problem, examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


508671 mmap: %s

Description:

The rpc.pmfd server was not able to allocate shared memory for a semaphore, possibly due to low memory, and the system error is shown. The server does not perform the action requested by the client, and pmfadm returns an error. An error message is also output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


508789 Failed to read from file %s, error : %s.

Description:

There was an error reading from a file. Both the file and the exact error are described in the error message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


509069 CMM: Halting because this node has no configuration info about node %ld which is currently configured in the cluster and running.

Description:

The local node has no configuration information about the specified node. This indicates a misconfiguration problem in the cluster. The /etc/cluster/ccr/infrastructure table on this node may be out of date with respect to the other nodes in the cluster.

Solution:

Correct the misconfiguration problem or update the infrastructure table if out of date, and reboot the nodes. To update the table, boot the node in non-cluster (-x) mode, restore the table from the other nodes in the cluster or backup, and boot the node back in cluster mode.


509136 Probe failed.

Description:

The fault monitor was unable to perform a complete health check of the service.

Solution:

1) The fault monitor takes appropriate action (by restarting or failing over the service). 2) The data service could be under load; try increasing the values of the Probe_timeout and Thorough_probe_interval properties. 3) If this problem continues to occur, look at other messages in syslog to determine the root cause of the problem. If all else fails, reboot the node.


510280 check_mysql - MySQL slave instance %s is not connected to master %s with MySql error (%s)

Description:

The fault monitor has detected that the MySQL slave instance is not connected to the specified master.

Solution:

Check the MySQL log files to determine why the slave has been disconnected from the master.


510369 sema_wait parent: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


510659 Failover %s data services must be in a failover resource group.

Description:

The Scalable resource property for the data service was set to FALSE, which indicates a failover resource, but the corresponding data service resource group is not a failover resource group. Failover resources of this resource type must reside in a failover resource group.

Solution:

Decide whether this resource is to be scalable or failover. If scalable, set the Scalable property value to TRUE. If failover, leave Scalable set to FALSE and create this resource in a failover resource group. A failover resource group has its resource group property RG_mode set to Failover.
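
The rule in this entry reduces to a consistency check between two settings. The following Python sketch restates it under simplifying assumptions and is not the data service's actual validation code. The function name check_placement is hypothetical, and the property values are modeled as a Python boolean and a string.

```python
def check_placement(scalable, rg_mode):
    """Check the Scalable property against RG_mode (hypothetical helper).

    scalable: value of the resource's Scalable property, as a bool.
    rg_mode:  value of the resource group's RG_mode property, as a string.
    Returns None if the combination is acceptable under the rule in this
    entry, otherwise a short explanation.
    """
    if not scalable and rg_mode != "Failover":
        return "failover resource must be in a failover resource group"
    return None
```

A return value of None for (False, "Failover") corresponds to the second option in the Solution: the resource stays a failover resource and lives in a failover resource group.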


511177 clcomm: solaris xdoor door_info failed

Description:

A door_info operation failed. Refer to the "door_info" man page for more information.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


511749 rgm_launch_method: failed to get method <%s> timeout value from resource <%s>

Description:

Due to an internal error, the rgmd was unable to obtain the method timeout for the indicated resource. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


511810 Property <%s> does not exist in SUNW.HAStorage

Description:

A property set in the SUNW.HAStorage type resource is not defined in SUNW.HAStorage.

Solution:

Check /var/adm/messages to see which property name was used. Correct it according to the definition in SUNW.HAStorage.


511917 clcomm: orbdata: unable to add to hash table

Description:

The system records object invocation counts in a hash table. The system failed to enter a new hash table entry for a new object type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


513538 scvxvmlg error - mkdirp(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


514047 select: %s

Description:

The cl_apid received the specified error from select(3C). No action was taken by the daemon.

Solution:

No action required. If the problem persists, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


514383 ping_timeout %d

Description:

The ping_timeout value used by scdpmd.

Solution:

No action required.


514688 Invalid port number %s in the %s property.

Description:

The specified system property does not have a valid port number.

Solution:

Using scrgadm(1M), specify a valid, positive port number.
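
A port value can be sanity-checked before it is passed to scrgadm. The following Python sketch assumes that the value is a plain decimal string and that the usual TCP/UDP range of 1 to 65535 applies; the exact rules applied by the data service may differ. The function name is_valid_port is hypothetical.

```python
import re


def is_valid_port(text):
    """Return True if text is a decimal port number in 1..65535.

    Hypothetical helper. The range is the general TCP/UDP port range,
    which is an assumption about what the property accepts.
    """
    if not re.fullmatch(r"[0-9]+", text):
        return False
    return 1 <= int(text) <= 65535
```

Values such as "0", "-1", "80a", and "70000" are rejected by this check, while "80" and "65535" are accepted.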


514973 Validate - This cluster version does not support Inter-RG dependencies

Description:

This agent depends on Inter-RG dependencies.

Solution:

Upgrade the cluster to a version that supports Inter-RG dependencies.


515583 %s is not a valid IP address.

Description:

The validation method failed to validate the IP addresses. The given IP address cannot be mapped in the local host files because the specified IP address is invalid.

Solution:

Invalid hostnames or IP addresses were specified when the resource was created. Recreate the resource with valid hostnames.


516407 reservation warning(%s) - MHIOCGRP_INKEYS error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


516499 WARNING: IPMP group %s can not host %s addresses. At least one of the hostnames that were specified has %s mapping(s). All such mappings will be ignored on this node.

Description:

The IPMP group cannot host IP addresses of the type specified. One or more hostnames in the resource have mappings of that type. Those IP addresses will not be hosted on this node.


517009 lkcm_act: invalid handle was passed %s %d

Description:

Handle for communication with udlmctl is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


517036 connection from outside the cluster - rejected

Description:

There was a connection from an IP address which does not belong to the cluster. This is not allowed so the PNM daemon rejected the connection.

Solution:

This message is informational; no user action is needed. However, it is a good idea to find out who is trying to talk to the PNM daemon and why.


517343 clexecd: Error %d from pipe

Description:

The clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


517363 clconf: Unrecognized property type

Description:

An unrecognized property type was found in the configuration file.

Solution:

Check the configuration file.


518018 CMM: Node being aborted from the cluster.

Description:

This node is being excluded from the cluster.

Solution:

Resolve the problem according to the messages that precede this one, and reboot the node if required.


518291 Warning: Failed to check if scalable service group %s exists: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


519262 Validation failed. SYBASE monitor server startup file RUN_%s not found SYBASE=%s.

Description:

Monitor server was specified in the extension property Monitor_Server_Name. However, monitor server startup file was not found. Monitor server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>

Solution:

Check the monitor server name specified in the Monitor_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Monitor_Server_Name> file exists.
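
The expected location of the startup file follows directly from the three values named in this entry. The following Python sketch builds that path so that it can be compared with what is on disk. The function name monitor_run_file is hypothetical, and the example values in the usage note are invented for illustration.

```python
import posixpath


def monitor_run_file(sybase, sybase_ase, monitor_server_name):
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>.

    Hypothetical helper; the arguments stand for the SYBASE and SYBASE_ASE
    environment variables and the Monitor_Server_Name property.
    """
    return posixpath.join(sybase, sybase_ase, "install",
                          "RUN_" + monitor_server_name)
```

With the invented values "/opt/sybase", "ASE-12_5", and "mon1", the function returns "/opt/sybase/ASE-12_5/install/RUN_mon1"; checking that this file exists mirrors the last step of the Solution.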


520231 Unable to set the number of threads for the FED RPC threadpool.

Description:

The rpc.fed server was unable to set the number of threads for the RPC threadpool. This happens while the server is starting up.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


520384 %s action in bv_utils Failed.

Description:

The specified action failed. There can be several reasons for the failure (wrong configuration, orbixd not running, nonexistent config files, could not start BV processes, could not stop BV servers, and so on).

Solution:

Look for other syslog messages to find the exact failure location. If it is a Broadvision configuration error, try to run the action manually and see whether it succeeds. If it succeeds manually but the error occurs under HA, contact Sun support with the /var/adm/messages file and the BV logs.


520982 CMM: Preempting node %ld from quorum device %s failed.

Description:

This node was unable to preempt the specified node from the quorum device, indicating that the partition to which the local node belongs has been preempted and will abort. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There may be other related messages that may indicate why the partition to which the local node belongs has been preempted. Resolve the problem and reboot the node.
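
The notion of majority quorum used in the description can be made concrete with a small calculation. The following Python sketch assumes that "majority" means strictly more than half of the total configured votes. The function name has_majority and the vote counts in the usage note are illustrative.

```python
def has_majority(votes_held, total_votes):
    """Return True if votes_held is strictly more than half of total_votes.

    Hypothetical helper illustrating the majority rule; it does not model
    how the CMM actually acquires or counts votes.
    """
    return 2 * votes_held > total_votes
```

For instance, in a configuration with three votes in total, a partition that holds two votes has a majority and survives, while a partition that holds one vote does not. With four votes in total, holding two is not enough.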


521393 Backup server stopped.

Description:

The backup server was stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


521538 monitor_check: set_env_vars() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to set up environment variables for method execution, causing a MONITOR_CHECK method invocation to fail. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


521671 uri <%s> probe failed

Description:

The probing of the URI set in the monitor_uri_list extension property failed. The agent probe will take action.

Solution:

None. The agent probe will take action. However, the cause of the failure should be investigated further. Examine the log file and syslog messages for additional information.


521918 Validation failed. Connect string is NULL

Description:

The 'Connect_String' extension property used for fault monitoring is null. This has the format "username/password".

Solution:

Check for syslog messages from other system modules. Check the resource configuration and the value of the 'Connect_string' property.
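
The "username/password" format mentioned in the description can be checked with a few lines of code. The following Python sketch is illustrative and is not the data service's validation logic. The function name parse_connect_string is hypothetical, and the sketch assumes that both parts must be non-empty.

```python
def parse_connect_string(value):
    """Split a "username/password" string (hypothetical helper).

    Raises ValueError if the value is empty or does not contain both a
    user name and a password separated by "/".
    """
    if not value:
        raise ValueError("connect string is empty")
    user, sep, password = value.partition("/")
    if not sep or not user or not password:
        raise ValueError("expected the form username/password")
    return user, password
```

An empty or missing value raises ValueError, which corresponds to the NULL case reported by this message.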


522480 RGM state machine returned error %d

Description:

An error has occurred on this node while attempting to execute the rgmd state machine.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


522710 Failed to parse xml: invalid reg_type [%s]

Description:

The cl_apid was unable to parse an xml message because of an invalid registration type.

Solution:

No action needed.


522779 IP address (hostname) string %s in property %s, entry %d could not be resolved to an IP address.

Description:

The IP address (hostname) string within the named property in the message did not resolve to a real IP address.

Solution:

Change the IP address (hostname) string within the entry in the property to one that does resolve to a real IP address. Make sure the syntax of the entry is correct.
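
Whether a string resolves can be tested on the node before the property is changed. The following Python sketch uses the standard resolver through socket.getaddrinfo; it is only an approximation, because the data service may consult different name services. The function name resolves is hypothetical.

```python
import socket


def resolves(host):
    """Return True if host resolves to at least one address.

    Hypothetical helper. host may be a hostname or a literal IP address.
    """
    try:
        return len(socket.getaddrinfo(host, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not map the name to an address.
        # UnicodeError: the name is not a well-formed hostname.
        return False
```

Any entry for which resolves returns False on the node is a candidate for the correction described in the Solution.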


523302 fatal: thr_keycreate: %s (UNIX errno %d)

Description:

The rgmd failed in a call to thr_keycreate(3T). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


523643 INTERNAL ERROR: %s

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


523933 Although there are no other potential masters, RGM is failing resource group <%s> off of node <%d> because there are other current healthy masters.

Description:

The resource group was brought OFFLINE on the node specified, probably because of a public network failure on that node. The operation was performed despite the lack of a healthy candidate node to host the resource group, because the resource group was currently mastered by at least one other healthy node.

Solution:

No action required. If desired, examine other syslog messages on the node in question to determine the cause of the network failure.


524300 get_server_ip - pntadm failed rc<%s>

Description:

The DHCP resource gets the server IP address from the DHCP network table by using the pntadm command; however, this command failed.

Solution:

Please refer to the pntadm(1M) man page.


525197 No network address resources in resource group.

Description:

The cl_apid encountered an invalid property value. If it is trying to start, it will terminate. If it is trying to reload the properties, it will use the old properties instead.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


525628 CMM: Cluster has reached quorum.

Description:

Enough nodes are operational to obtain a majority quorum; the cluster is now moving into operational state.

Solution:

This is an informational message, no user action is needed.


525979 ERROR: probe_sap_j2ee Option -J not set

Description:

The -J option is missing for the probe_command.

Solution:

Add the -J option to the probe_command.


526056 Resource <%s> of Resource Group <%s> failed pingpong check on node <%s>. The resource group will not be mastered by that node.

Description:

A scha_control(1HA,3HA) call has failed because no healthy new master could be found for the resource group. A given node is considered unhealthy for a given resource if that same resource has recently initiated a failover off of that node by a previous scha_control call. In this context, "recently" means within the past Pingpong_interval seconds, where Pingpong_interval is a user-configurable property of the resource group. The default value of Pingpong_interval is 3600 seconds. This check is performed to avoid the situation where a resource group repeatedly "ping-pongs" or moves back and forth between two or more nodes, which might occur if some external problem prevents the resource group from running successfully on *any* node.

Solution:

A properly-implemented resource monitor, upon encountering the failure of a scha_control call, should sleep for a while and restart its probes. If the resource remains unhealthy, the problem that caused the scha_control call to fail (such as the pingpong check described above) will eventually resolve, permitting a later scha_control request to succeed. Therefore, no user action is required. If the system administrator wishes to permit failovers to be attempted even at the risk of ping-pong behavior, the Pingpong_interval property of the resource group should be set to a smaller value.
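
The pingpong check described in this entry is a comparison of elapsed time against Pingpong_interval. The following Python sketch restates that comparison and is not the rgmd implementation. The function name fails_pingpong_check and its parameters are hypothetical; times are plain seconds, and the default of 3600 is the default value stated in the description.

```python
def fails_pingpong_check(last_failover_off_node, now, pingpong_interval=3600):
    """Return True if the node is considered unhealthy for the resource.

    Hypothetical helper.
    last_failover_off_node: time (in seconds) at which the resource last
        initiated a failover off this node through scha_control, or None
        if it never did.
    now: current time in the same units.
    """
    if last_failover_off_node is None:
        return False
    return now - last_failover_off_node < pingpong_interval
```

With the default interval, a failover initiated 1000 seconds ago still blocks the node, while one initiated 4000 seconds ago does not. Passing a smaller pingpong_interval models the tuning option mentioned at the end of the Solution.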


526403 ff_open: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link to the failfast device, which ensures that the host aborts if the server dies. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


526492 Service object [%s, %s, %d] removed from group '%s'

Description:

The service identified by its service access point (SAP), the three-tuple shown in the message, has been removed from the designated group.

Solution:

This is an informational message, no user action is needed.


526671 Failed to initialize the DSDL.

Description:

HA Storage Plus was not able to connect to the DSDL.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


526846 Daemon <%s> is not running.

Description:

The HA-NFS fault monitor detected that the specified daemon is no longer running.

Solution:

No action is required. The fault monitor restarts the daemon. If it does not, reboot the node.


527700 stop_chi - WebSphere MQ Channel Initiator %s stopped

Description:

The WebSphere MQ Channel Initiator has been stopped.

Solution:

No user action is needed.


527795 clexecd: setrlimit returned %d

Description:

The clexecd program has encountered a failed setrlimit() system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


528020 CCR: Remove table %s failed.

Description:

The CCR failed to remove the indicated table.

Solution:

The failure can have many causes. For some of them, no user action is required because the CCR client handles the failure. The cases that require user action depend on other messages from the CCR on the node, and include the following: If the operation failed because the cluster lost quorum, reboot the cluster. If the root file system is full on the node, free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, replace it. If the cluster repository is corrupted, as indicated by other CCR messages, boot the offending node(s) in -x mode to restore the cluster repository from backup. The cluster repository is located at /etc/cluster/ccr/.


528499 scsblconfig not configured correctly.

Description:

The specified file has not been configured correctly, or it does not have all the required settings.

Solution:

Verify that the required variables (according to the installation instructions for this data service) are correctly configured in this file. Try to source this file manually in the Korn shell (". scsblconfig"), and verify that the required variables are set correctly.


528566 Method <%s> on resource <%s>, resource group <%s>, is_frozen=<%d>: Method timed out.

Description:

A method execution has exceeded its configured timeout and was killed by the rgmd. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries. Note, if the indicated value of is_frozen is 1, this might indicate an internal error in the rgmd. Please save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


529131 Method <%s> on resource <%s>: RPC connection error.

Description:

An attempted method execution failed, due to an RPC connection problem. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state; or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node. After the problem is corrected, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or retry the resource group update operation.


529191 clexecd: Sending fd to workerd returned %d. Exiting.

Description:

There was some error in setting up interprocess communication in the clexecd program.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


529407 resource group %s state on node %s change to %s

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


529502 UNRECOVERABLE ERROR: /etc/cluster/ccr/infrastructure file is corrupted

Description:

The /etc/cluster/ccr/infrastructure file is corrupted.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


529702 pthread_mutex_trylock error %d line %d\n

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


530064 reservation error(%s) - do_enfailfast() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


530492 fatal: ucmm_initialize() failed

Description:

The daemon indicated in the message tag (rgmd or ucmmd) was unable to initialize its interface to the low-level cluster membership monitor. This is a fatal error, and causes the node to be halted or rebooted to avoid data corruption. The daemon produces a core file before exiting.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


530603 Warning: Scalable service group for resource %s has already been created.

Description:

The scalable services group for the named resource already exists, which was not expected.

Solution:

Rebooting all nodes of the cluster will cause the scalable services group to be deleted.


530828 Failed to disconnect from host %s and port %d.

Description:

The data service fault monitor probe was trying to disconnect from the specified host/port and failed. The problem may be due to an overloaded system or other problems. If the failure recurs, Sun Cluster will attempt to correct the situation by either restarting the data service or failing it over.

Solution:

If this problem is due to an overloaded system, you may consider increasing the Probe_timeout property.


530938 Starting NFS daemon %s.

Description:

The specified NFS daemon is being started by the HA-NFS implementation.

Solution:

This is an informational message. No action is needed.


531148 fatal: thr_create stack allocation failure: %s (UNIX error %d)

Description:

The rgmd was unable to create a thread stack, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Rebooting the node has probably cured the problem. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


531560 No IPMP group for node %s.

Description:

No IPMP group has been specified for this node.

Solution:

If this error message has occurred during resource creation, supply valid adapter information and retry it. If this message has occurred after resource creation, remove the LogicalHostname resource and recreate it with the correct IPMP group for each node which is a potential master of the resource group.


531989 Prog <%s> step <%s>: authorization error: %s.

Description:

An attempted program execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a program failure.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


532118 All the SUNW.HAStoragePlus resources that this resource depends on are not online on the local node. Skipping the checks for the existence and permissions of the start/stop/probe commands.

Description:

The HAStoragePlus resources that this resource depends on are online on another node. Validation checks will be done on that other node.

Solution:

This message is informational; no user action is needed.


532454 file specified in USER_ENV parameter %s does not exist

Description:

The 'User_env' property was set when the resource was configured. The file specified in the 'User_env' property does not exist or is not readable. The file should be specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating the resource. If the resource has already been created, update the resource property 'User_env'.
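
The three conditions named in the description (fully qualified path, file exists, file is readable) can be checked before the property is set. The following Python sketch is illustrative only; the function name and the sample path are hypothetical, and the check runs with the permissions of whoever runs the script, which may differ from the user under which the data service runs.

```python
import os


def check_env_file(path):
    """Return a list of problems found with the given environment file path."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not fully qualified")
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable")
    return problems


# Hypothetical path, used only for illustration.
print(check_env_file("env/oracle_env"))
```

An empty list means that none of the three problems was detected for the given path.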


532557 Validate - This version of samba <%s> is not supported with this dataservice

Description:

The Samba resource checks that an appropriate version of Samba is being deployed. Versions below v2.2.2 generate this message.

Solution:

Ensure that the Samba version is v2.2.2 or later.
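
The comparison against v2.2.2 must be numeric for each component, because a plain string comparison would rank "2.2.10" below "2.2.2". The following Python sketch shows one way to make that comparison. It is not the data service's own check, and the format of the version strings is an assumption.

```python
import re


def parse_version(text):
    """Extract the leading numeric components of a version string as a tuple."""
    match = re.search(r'(\d+(?:\.\d+)*)', text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def is_supported(version_text, minimum=(2, 2, 2)):
    """Return True if the version is equal to or above the minimum."""
    version = parse_version(version_text)
    # Pad the shorter tuple with zeros so that (2, 2) compares as (2, 2, 0).
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


print(is_supported("2.2.1"), is_supported("v2.2.2"), is_supported("2.2.10"))
# prints False True True
```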


532636 Encountered an error while conducting checks of device services availability.

Description:

The start method of a HAStoragePlus resource has detected an error while verifying the availability of a device service. It is highly likely that a DCS function call returned an error.

Solution:

Contact your authorized Sun service provider for assistance in diagnosing the problem.


532654 The -c or -u flag must be specified for the %s method.

Description:

The arguments passed to the function unexpectedly omitted the given flags.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


532854 Probe for J2EE engine timed out in scds_fm_tcp_disconnect().

Description:

The data service timed out while disconnecting the socket.

Solution:

Informational message. No user action is needed.


532973 clcomm: User Error: Loading duplicate TypeId: %s.

Description:

The system records type identifiers for multiple kinds of type data. The system checks for type identifiers when loading type information. This message indicates that a type is being reloaded, most likely as a consequence of reloading an existing module.

Solution:

Reboot the node and do not load any Sun Cluster modules manually.


532979 orbixd is started outside HA Broadvision.Stop orbixd and other BV processes running outside HA BroadVision.

Description:

The orbix daemon is probably started outside HA BroadVision. There should not be any BV servers or daemons started outside HA BroadVision.

Solution:

Shut down the orbix daemon running outside HA BroadVision and also stop all BV servers started outside HA BroadVision. Delete the file /var/run/cluster/bv/bv_orbixd_lock_file if it exists, and then restart the BV resources.


532980 clcomm: Pathend %p: deferred task not allowed in state %d

Description:

The system maintains state information about a path. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


533359 pmf_monitor_suspend: Error opening procfs control file <%s> for tag <%s>: %s

Description:

The rpc.pmfd server was not able to resume the monitoring of a process because the rpc.pmfd server was not able to open a procfs control file. If the system error is 'Device busy', the procfs control file is being used by another command (like truss, pstack, dbx...) and the monitoring of this process remains suspended. If this is not the case, the monitoring of this process has been aborted and can not be resumed.

Solution:

If the system error is 'Device busy', stop the command which is using the procfs control file and issue the monitoring resume command again. Otherwise investigate if the machine is running out of memory. If this is not the case, save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


534499 Failed to take the resource out of PMF control.

Description:

Sun Cluster was unable to remove the resource from PMF control. This may cause the service to keep restarting on an unhealthy node.

Solution:

Look in /var/adm/messages for the cause of failure. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


534644 Failed to start SAP processes with %s.

Description:

The data service failed to start the SAP processes.

Solution:

Check the script and execute it manually.


534826 clexecd: Error %d from start_failfast_server

Description:

The clexecd program could not enable one of the mechanisms that cause the node to be shut down, to prevent data corruption, when the clexecd program dies.

Solution:

To avoid data corruption, the system will halt or reboot the node. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


535044 Creation of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to create a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource creation operation.


535181 Host %s is not valid.

Description:

The validation method failed to validate the IP addresses.

Solution:

Invalid hostnames or IP addresses were specified when the resource was created. Recreate the resource with valid hostnames. Check the syslog messages for specific information.


535886 Could not find a mapping for %s in %s. It is recommended that a mapping for %s be added to %s.

Description:

No mapping was found in the local hosts file for the specified IP address.

Solution:

Applications may use hostnames instead of IP addresses. It is recommended to have a mapping in the hosts file. Add an entry in the hosts file for the specified IP address.
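
To confirm whether a mapping is present, the hosts file can be scanned for a line whose first field is the address in question. The following Python sketch works on the text of a hosts file. The addresses and host names are hypothetical, and the comparison is a plain string match, so differently written forms of the same address are not recognized as equal.

```python
def find_hosts_mapping(hosts_text, ip_address):
    """Return the names mapped to ip_address in hosts-file text, or []."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == ip_address:
            return fields[1:]
    return []


# Hypothetical addresses and host names, used only for illustration.
sample = (
    "127.0.0.1   localhost\n"
    "192.168.10.5  schost-1 schost-1.example.com  # logical host\n"
)
print(find_hosts_mapping(sample, "192.168.10.5"))
# prints ['schost-1', 'schost-1.example.com']
print(find_hosts_mapping(sample, "192.168.10.6"))
# prints []
```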


536091 Failed to retrieve the cluster handle while querying for property %s: %s.

Description:

Access to the named object failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


537175 CMM: node %s (nodeid: %ld, incarnation

Description:

The cluster can communicate with the specified node. A node becomes reachable before it is declared up and having joined the cluster.

Solution:

This is an informational message, no user action is needed.


537352 reservation error(%s) - do_scsi3_preemptandabort() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


537380 Invalid option -%c for the validate method.

Description:

An invalid option was passed to the validate callback method.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


537498 Invalid value was returned for resource property %s for %s.

Description:

The value returned for the named property was not valid.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


537607 Not found clexecd on node %d for %d seconds. Retrying ...

Description:

Could not find clexecd to execute the program on a node. The message indicates how long, in seconds, clexecd has not been found. The operation is being retried.

Solution:

This is an informational message, no user action is needed.


538177 in libsecurity contacting program %s (%lu); uname sys call failed: %s

Description:

A client was not able to make an rpc connection to the specified server because the host name could not be obtained. The system error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


538570 sema_post parent: %s

Description:

The rpc.pmfd server was not able to act on a semaphore. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


538656 Restarting some BV daemons.

Description:

This message is from the BV probe. While the data service is starting up, some BV daemons may have failed to start. The probe will restart the daemons that did not start.

Solution:

Make sure the DB is available. The BV daemon startup may fail if the DB is not available. If the DB is available, no user action is needed. The BV probe will take appropriate action.


538835 ERROR: probe_mysql Option -B not set

Description:

The -B option is missing for the probe_mysql command.

Solution:

Add the -B option for the probe_mysql command.


539760 Error parsing URL: %s. Will shutdown the WLS using sigkill

Description:

There is an error in parsing the URL needed to do a smooth shutdown. The WLS stop method however will go ahead and kill the WLS process.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


540274 got unexpected exception %s

Description:

An inter-node communication failed with an unknown exception.

Solution:

Examine syslog output for related error messages. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file (if any). Contact your authorized Sun service provider for assistance in diagnosing the problem.


540376 Unable to change the directory to %s: %s. Current directory is /.

Description:

The callback method failed to change the current directory. The callback methods will now be executed in "/", so core dumps from these callbacks will be located in "/".

Solution:

No user action is needed. For a detailed error message, check the syslog messages.


540705 The DB probe script %s Timed out while executing

Description:

Probing of the URLs set in the Server_url or the Monitor_uri_list failed. Before taking any action, the WLS probe makes sure the DB is up. The database probe script set in the extension property db_probe_script timed out while executing. The probe will not take any action until the DB is up and the DB probe succeeds.

Solution:

Make sure the DB probe (set in db_probe_script) succeeds. Once the DB is started the WLS probe will take action on the failed WLS instance.


541145 No active replicas where detected for the global service %s.

Description:

The DCS did not find any information on the specified global service.

Solution:

Check the global service configuration.


541180 Sun udlmlib library called with unknown option: '%c'

Description:

An unknown option was used while starting up the Oracle UNIX DLM.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


541206 Couldn't read deleted directory: error (%d)

Description:

The file system is unable to create temporary copies of deleted files.

Solution:

Mount the affected file system as a local file system, and ensure that there is no file system entry with name "._" at the root level of that file system. Alternatively, run fsck on the device to ensure that the file system is not corrupt.


541445 Unable to compose %s path.

Description:

Unable to construct the path to the indicated file.

Solution:

Check system log messages to determine the underlying problem.


541818 Service group '%s' created

Description:

The service group by that name is now known by the scalable services framework.

Solution:

This is an informational message, no user action is needed.


541955 INITFED Error: ${SERVER} is already running.

Description:

The initfed init script found the rpc.fed already running. It will not start it again.

Solution:

No action required.


543720 Desired Primaries is %d. It should be 1.

Description:

The value of the Desired Primaries property is invalid.

Solution:

An invalid value is set for the Desired Primaries property. The value should be 1. Reset the property value by using scrgadm(1M).


544252 Method <%s> on resource <%s>: Execution failed: no such method tag.

Description:

An internal error has occurred in the rpc.fed daemon which prevents method execution. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Retry the edit operation.


544380 Failed to retrieve the resource type handle: %s.

Description:

An API operation on the resource type has failed.

Solution:

For the resource type name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.


544497 Auto_recovery_command is not set.

Description:

Auto recovery was performed and the database was reinitialized by running the hadbm clear command. However the extension property auto_recovery_command was not set so no further recovery was attempted.

Solution:

HADB is up with a fresh database. Session stores will need to be recreated. In the future the auto_recovery_command extension property can be set to a command that can do this whenever autorecovery is performed.


544592 PCSENTRY: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


544600 Reading from server timed out: server %s port %d

Description:

The probe timeout expired while the probe was reading the response to a request sent to the server.

Solution:

Check that the server is functioning correctly and whether the resource's probe timeout is set too low.


544775 libpnm system error: %s

Description:

A system error has occurred in libpnm. This could be because system resources are very low, for example, low memory.

Solution:

The user of libpnm should handle these errors. However, if the message is:

out of memory - increase the swap space, install more memory or reduce peak memory consumption. Otherwise the error is unrecoverable, and the node needs to be rebooted.
write error - check the "write" man page for possible errors.
read error - check the "read" man page for possible errors.
socket failed - check the "socket" man page for possible errors.
TCP_ANONPRIVBIND failed - check the "setsockopt" man page for possible errors.
gethostbyname failed %s - make sure entries in /etc/hosts, /etc/nsswitch.conf and /etc/netconfig are correct to get information about this host.
bind failed - check the "bind" man page for possible errors.
SIOCGLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCSLIFFLAGS failed - check the "ioctl" man page for possible errors.
SIOCSLIFADDR failed - check the "ioctl" man page for possible errors.
open failed - check the "open" man page for possible errors.
SIOCLIFADDIF failed - check the "ioctl" man page for possible errors.
SIOCLIFREMOVEIF failed - check the "ioctl" man page for possible errors.
SIOCSLIFNETMASK failed - check the "ioctl" man page for possible errors.
SIOCGLIFNUM failed - check the "ioctl" man page for possible errors.
SIOCGLIFCONF failed - check the "ioctl" man page for possible errors.
SIOCGLIFGROUPNAME failed - check the "ioctl" man page for possible errors.
wrong address family - check the "ioctl" man page for possible errors.


545135 Validate - 64 bit libloghost_64.so.1 is not secure

Description:

libloghost_64.so.1 is not found within /usr/lib/secure/libloghost_64.so.1.

Solution:

Ensure that libloghost_64.so.1 is placed within /usr/lib/secure/libloghost_64.so.1 as documented within the Sun Cluster Data Service for Oracle Application Server for Solaris OS.


546018 Class <%s> SubClass <%s> Vendor <%s> Pub <%s>

Description:

The cl_eventd is posting an event locally with the specified attributes.

Solution:

This message is informational only, and does not require user action.


546856 CCR: Could not find the CCR transaction manager.

Description:

The CCR data server could not find the CCR transaction manager in the cluster.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


547057 thr_sigsetmask: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link with the failfast device because of a system error. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


547145 Initialization error. Fault Monitor username is NULL

Description:

Internal error. Environment variable SYBASE_MONITOR_USER not set before invoking fault monitor.

Solution:

Report this problem to your authorized Sun service provider.


547301 reservation error(%s) error. Unknown internal error returned from clconf_do_execution().

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


547307 Initialization failed. Invalid command line %s

Description:

This method is expected to be called by the resource group manager. This method was not called with the expected command line arguments.

Solution:

This is an internal error. Save the contents of /var/adm/messages from all the nodes and contact your Sun service representative.


547385 dl_bind: bad ACK header %u

Description:

An unexpected error occurred. The acknowledgment header for the bind request (to bind to the physical device) is bad. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


548024 RGOffload resource cannot offload the resource group containing itself.

Description:

You may have configured an RGOffload resource to offload the resource group in which it is configured.

Solution:

Reconfigure the RGOffload resource so that it does not offload the resource group in which it is configured.


548162 Error (%s) when reading standard property <%s>.

Description:

This is an internal error. The program failed to read a standard property of the resource.

Solution:

If the problem persists, save the contents of /var/adm/messages and contact your Sun service representative.


548237 Validation failed. Connect string contains 'sa' password.

Description:

The 'Connect_String' extension property used for fault monitoring uses 'sa' as the account password combination. This is a security risk, because the extension properties are accessible by everyone.

Solution:

Check the resource configuration and the value of the 'Connect_string' property. Ensure that a dedicated account (with minimal privileges) is created for fault monitoring purposes.


548260 The ucmmd daemon will not be started due to errors in previous reconfiguration.

Description:

An error was detected during the previous reconfiguration of the RAC framework component. The error is indicated in the message. As a result of the error, the ucmmd daemon was stopped and the node was rebooted. On node reboot, the ucmmd daemon was not started on the node, to allow investigation of the problem. The RAC framework is not running on this node. Oracle Parallel Server/Real Application Clusters database instances will not be able to start on this node.

Solution:

Review logs and messages in /var/adm/messages and /var/cluster/ucmm/ucmm_reconf.log. Resolve the problem that resulted in reconfiguration error. Reboot the node to start RAC framework on the node. Refer to the documentation of Sun Cluster support for Oracle Parallel Server/ Real Application Clusters. If problem persists, contact your Sun service representative.


548691 Delegating giveover request to resource group %s, due to a +++ affinity of resource group %s (possibly transitively).

Description:

A resource from a resource group which declares a strong positive affinity with failover delegation (+++ affinity) attempted a giveover request via the scha_control command. Accordingly, the giveover request was forwarded to the delegate resource group.

Solution:

This is an informational message, no user action is needed.


549190 Text server successfully started.

Description:

The Sybase text server has been successfully started by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


549709 Local node isn't in replica nodelist of service <%s> with path <%s>. No affinity switchover can be done

Description:

Local node does not support the replica of the service.

Solution:

No user action required.


549765 Preparing to start service %s.

Description:

Sun Cluster is preparing to start the specified application.

Solution:

This is an informational message, no user action is needed.


549875 check_mysql - Couldn't get SHOW SLAVE STATUS for instance %s (%s)

Description:

The fault monitor can't retrieve the MySQL slave status for the specified instance.

Solution:

Either MySQL was already down or the fault monitor user does not have the right permissions. The defined fault monitor user should have Process, Select, Reload, and Shutdown privileges and, for MySQL 4.0.x, also Super privileges. Also check the MySQL log files for any other errors.


550471 Failed to initialize the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.


551094 reservation warning(%s) - Unable to open device %s, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


551139 Failed to initialize scalable services group: Error %d.

Description:

The data service in scalable mode was unable to register itself with the cluster networking.

Solution:

There may be prior messages in syslog indicating specific problems. Reboot the node if unable to correct the situation.


551654 Port (%d) determined from Monitor_Uri_List is invalid

Description:

The indicated port is not valid.

Solution:

Correct the port specified in Monitor_Uri_List.
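
To locate the offending entry, each URI in the list can be parsed and its port checked. The following Python sketch assumes that Monitor_Uri_List is a comma-separated list of URIs and that a valid port is an integer from 1 through 65535. Both are assumptions made for illustration, not a description of the validation that the data service performs.

```python
from urllib.parse import urlsplit


def invalid_port_uris(uri_list):
    """Return the URIs whose explicit port is not an integer in 1..65535."""
    invalid = []
    for uri in uri_list.split(","):
        uri = uri.strip()
        if not uri:
            continue
        try:
            port = urlsplit(uri).port  # None when the URI gives no port
        except ValueError:
            invalid.append(uri)  # non-numeric or out-of-range port
            continue
        if port == 0:
            invalid.append(uri)
    return invalid


# Hypothetical host names, used only for illustration.
print(invalid_port_uris("http://host1:8080/status, http://host2:70000/, http://host3/"))
# prints ['http://host2:70000/']
```

A URI without an explicit port is not reported, because the default port of its scheme applies.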


552420 INITPMF Error: Timed out waiting for ${SERVER} service to register.

Description:

The initpmf init script was unable to verify that the rpc.pmfd registered with rpcbind. This error may prevent the rgmd from starting, which will prevent this node from participating as a full member of the cluster.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


553048 Cannot determine the nodeid

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


553376 CMM: Open failed with error '(%s)' and errno = %d for quorum device '%s'\n. Unable to scrub device.

Description:

The open operation failed for the specified quorum device while it was being added to the cluster. The attempt to add this quorum device will fail.

Solution:

The quorum device has failed or the path to this device may be broken. Refer to the disk repair section of the administration guide for resolving this problem. Retry adding the quorum device after the problem has been resolved.


555134 reservation warning(%s) - MHIOCGRP_PREEMPTANDABORT returned EACCES, but actually worked; error ignored.

Description:

An attempt to remove a scsi-3 PGR key returned failure even though the key was successfully removed.

Solution:

This is an informational message, no user action is needed.


555844 Cannot send reply: invalid socket

Description:

The cl_apid experienced an internal error.

Solution:

No action required. If the problem persists, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


556113 Resource dependency on SAP enqueue server does not match the affinity setting on SAP enqueue server resource group.

Description:

Both the dependency on a SAP enqueue server resource and the affinity on a SAP enqueue server resource group are set. However, the resource group specified in the affinity property does not correspond to the resource group that the SAP enqueue server resource (specified in the resource dependency) belongs to.

Solution:

Specify a SAP enqueue server resource group in the affinity property. Make sure that SAP enqueue server resource group contains the SAP enqueue server resource that is specified in the resource dependency property.


556466 clexecd: dup2 of stdout returned with errno %d while exec'ing (%s). Exiting.

Description:

The clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


556694 Nodes %u and %u have incompatible versions and will not communicate properly.

Description:

This is an informational message from the cluster version manager and may help diagnose which systems have incompatible versions with each other during a rolling upgrade. This error may also be due to attempting to boot a cluster node in 64-bit address mode when other nodes are booted in 32-bit address mode, or vice versa.

Solution:

This message is informational; no user action is needed. However, one or more nodes may shut down in order to preserve system integrity. Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the software installed on the other cluster nodes.


556945 No permission for owner to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by the owner.
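
A minimal Python sketch of the check described above, assuming that "readable and executable by the owner" corresponds to the owner read and owner execute bits in the file mode. The function name is hypothetical.

```python
import os
import stat


def owner_can_read_and_execute(path):
    """True if the owner read and owner execute bits are both set on path."""
    mode = os.stat(path).st_mode
    needed = stat.S_IRUSR | stat.S_IXUSR
    return (mode & needed) == needed
```

A file with mode 0500 or 0755 passes this check; a file with mode 0600 does not.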


557585 No filesystem mounted on %s.

Description:

There is no file system mounted on the mount point.

Solution:

None. This is a debug message.


558184 Validate - MySQL logdirectory for mysqld does not exist

Description:

The defined (-L option) logdirectory doesn't exist.

Solution:

Make sure that the defined logdirectory exists.


558350 Validation failed. Connect string is incomplete.

Description:

The 'Connect_String' extension property used for fault monitoring has been incorrectly specified. This has the format "username/password".

Solution:

Check the resource configuration and the value of the 'Connect_String' property. Ensure that there are no spaces in the 'Connect_String' specification.
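
The format rules above can be sketched in Python as follows. This is only an illustration of the "username/password" format with no spaces; the function name and the wording of the reported problems are hypothetical, and the real validation performed by the data service may differ.

```python
def check_connect_string(value):
    """Return a list of problems found in a 'username/password' string."""
    problems = []
    if any(ch.isspace() for ch in value):
        problems.append("contains whitespace")
    # Split at the first '/' only, so the password part is kept whole
    user, sep, password = value.partition("/")
    if not sep:
        problems.append("missing '/' separator")
    elif not user or not password:
        problems.append("empty username or password")
    return problems
```

An empty list means that the string has the expected shape.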


558742 Resource group %s is online on more than one node.

Description:

The named resource group should be online on only one node, but it is actually online on more than one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


558763 Failed to start the WebLogic server configured in resource %s

Description:

The WebLogic Server configured in the resource could not be started by the agent.

Solution:

Try to start the WebLogic Server manually. Make sure the logical host is up before starting WLS manually. If the server fails to start manually, check your configuration and correct it until the server can be started manually; only then can it be started by the agent. Make sure the extension properties are correct. Make sure the port is not already in use.


559206 Either extension property <stop_signal> is not defined, or an error occurred while retrieving this property; using the default value of SIGTERM.

Description:

The property stop_signal might not be defined in the RTR file. Processing continues with the default value of SIGTERM.

Solution:

This is an informational message, no user action is needed.


559550 Error in opening /etc/vfstab: %s

Description:

Failed to open /etc/vfstab. The error message follows.

Solution:

Check with the system administrator and make sure that /etc/vfstab is properly defined.


559614 Resource <%s> of Resource Group <%s> failed monitor check on node <%s>\n

Description:

This message is logged when a scha_control monitor check method fails on the specified node.

Solution:

No user action required.


559857 Starting liveCache with command %s failed. Return code is %d.

Description:

Starting liveCache failed.

Solution:

Check SAP liveCache log files and also look for syslog error messages on the same node for potential errors.


560781 Tag %s: could not allocate history.

Description:

The rpc.pmfd server was not able to allocate memory for the history of the tag shown, probably due to low memory. The process associated with the tag is stopped and pmfadm returns an error.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


561862 PNM daemon config error: %s

Description:

A configuration error has occurred in the PNM daemon. This could be caused by an incorrect configuration or an incorrect file format.

Solution:

If the message is:

IPMP group %s not found - Either an IPMP group name has been changed or all the adapters in the IPMP group have been unplumbed. There would have been an earlier NOTICE which said that a particular IPMP group has been removed. The pnmd has to be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M).

IPMP group %s already exists - The user of libpnm (scrgadm) is trying to auto-create an IPMP group with a group name that is already being used. Typically, this should not happen, so contact your authorized Sun service provider to determine whether a workaround, patch, or suggestion is available. Make a note of all the IPMP group names on your cluster.

wrong format for /etc/hostname.%s - The format of the /etc/hostname.<adp> file is wrong. Either it has the keyword group but no group name following it, or the file has multiple lines. Correct the format of the file after going through the IPMP Admin Guide. The pnmd has to be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M). We do not support a multi-line /etc/hostname.<adp> file in auto-create, since it becomes difficult to figure out which IP address the user wanted to use as the non-failover IP address.

All adps in %s do not have similar instances (IPv4/IPv6) plumbed - An IPMP group is supposed to be homogeneous. If any adp in an IPMP group has a certain instance (IPv4 or IPv6), then all the adps in that IPMP group must have that particular instance plumbed in order to facilitate failover. See the Solaris IPMP documentation for more details.

%s is blank - This means that the /etc/hostname.<adp> file is blank. We do not support auto-create for blank v4 hostname files, since this would mean that 0.0.0.0 is plumbed as the v4 address.
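
The file format conditions named above (blank file, multiple lines, keyword group without a group name) can be sketched as a small Python check. This is an illustrative sketch only: the function name is hypothetical, the sample file contents in the usage note are made up, and ignoring empty lines when counting lines is an assumption rather than documented pnmd behavior.

```python
def check_hostname_file(text):
    """Classify the contents of an /etc/hostname.<adp> style file."""
    # Assumption: empty lines are ignored when counting lines
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return "blank"
    if len(lines) > 1:
        return "multiple lines"
    words = lines[0].split()
    # The keyword 'group' must be followed by a group name
    if "group" in words and words.index("group") == len(words) - 1:
        return "group keyword without a group name"
    return "ok"
```

For a hypothetical file containing `host1 netmask + broadcast + group sc_ipmp0 up`, the check returns "ok"; for `host1 group`, it reports the missing group name.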


562200 Application failed to stay up.

Description:

The probe for the SUNW.Event data service found that the cl_apid application failed to stay up.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


562397 Failfast: %s.

Description:

A failfast client has encountered a deferred panic timeout and is going to panic the node. This may happen if a critical userland process, as identified by the message, dies unexpectedly.

Solution:

Check for core files of the process after rebooting the node and report these to your authorized Sun service provider.


562700 unable to arm failfast.

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


563288 Validation failed. The rac_framework resource %s, specified in the RESOURCE_DEPENDENCIES property, does not belong to the group specified in the resource group property RG_AFFINITIES

Description:

The resource being created or modified should be dependent upon the rac_framework resource in the RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the rac_framework resource for this resource's RESOURCE_DEPENDENCIES property.


563343 resource type %s updated.

Description:

This is a notification from the rgmd that the operator has edited a property of a resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


563800 Failed to get all IPMP groups (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


563847 INTERNAL ERROR: POSTNET_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


563976 Unable to get socket flags: %s.

Description:

Failed to get status flags for the socket used in communicating with the application.

Solution:

This is an internal error; no corrective action by the user is required. However, report the problem to your authorized Sun service provider.


564486 INITRGM Error: rpc.pmfd is not running.

Description:

The initrgm init script was unable to verify that the rpc.pmfd is running and available. This error will prevent the rgmd from starting, which will prevent this node from participating as a full member of the cluster.

Solution:

Examine other syslog messages occurring at about the same time to determine why the rpc.pmfd is not running. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


564771 Error in reading /etc/vfstab: getvfsent() returns <%d>

Description:

Error in reading /etc/vfstab. The return code of getvfsent() follows.

Solution:

Check with the system administrator and make sure that /etc/vfstab is properly defined.

158981: Path <%s> is not a valid file system mount point specified in /etc/vfstab

Description:

The "ServicePaths" property of the hastorage resource should be a valid disk group, device special file, or global file system mount point specified in the /etc/vfstab file.

Solution:

Check the definition of the extension property "ServicePaths" of the SUNW.HAStorage type resource. If the values are file system mount points, verify that the /etc/vfstab file contains correct entries.

549969: Error doing stat on device special file <%s> corresponding to path <%s>

Description:

The file system mount point cannot be mapped to a global service correctly because stat fails on the device special file corresponding to the file system mount point.

Solution:

Check the definition of the mount point path in the extension property "ServicePaths" of the SUNW.HAStorage type resource, and make sure that each path is for a global file system with correct entries in /etc/vfstab.
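
When checking /etc/vfstab entries by hand is tedious, a short script can help. The following Python sketch parses vfstab-style text, which has seven whitespace-separated fields per entry (device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, mount options), and tests whether a path is listed with the global mount option. The function names and the sample entries are hypothetical, and the sketch does not reproduce the checks that the HAStorage methods actually perform.

```python
def parse_vfstab(text):
    """Map each mount point to its device, FS type, and mount options."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields, found {len(fields)}: {line!r}")
        device, _fsck_device, mount_point, fstype, _pass, _at_boot, options = fields
        if mount_point == "-":
            continue  # entries such as swap have no mount point
        table[mount_point] = {
            "device": device,
            "fstype": fstype,
            "options": [] if options == "-" else options.split(","),
        }
    return table


def is_global_mount_point(path, table):
    """True if path is listed in the table and carries the 'global' option."""
    entry = table.get(path)
    return entry is not None and "global" in entry["options"]


# Hypothetical sample entries, for illustration only
SAMPLE = """\
#device to mount    device to fsck       mount point   FS type  fsck pass  mount at boot  mount options
/dev/md/dg1/dsk/d10 /dev/md/dg1/rdsk/d10 /global/data  ufs      2          yes            global,logging
/dev/dsk/c0t0d0s7   /dev/rdsk/c0t0d0s7   /export/home  ufs      2          yes            -
/dev/dsk/c0t0d0s1   -                    -             swap     -          no             -
"""
```

With the sample above, `is_global_mount_point("/global/data", parse_vfstab(SAMPLE))` is true, while `/export/home` is listed but not global.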


564851 the setting of Failover_mode for resource %s doesn't allow the scha_control operation

Description:

The rgmd is enforcing the RESTART_ONLY or LOG_ONLY value for the Failover_mode system property. Those settings may prevent some operations initiated by scha_control.

Solution:

No action required. If desired, use scrgadm(1M) to change the Failover_mode setting.


564883 The Data base probe %s also failed. Will restart or failover the WLS only if the DB is UP

Description:

Probing of the URLs set in the Server_url or the Monitor_uri_list failed. Before taking any action, the WLS probe makes sure the DB is up (if a db_probe_script extension property is set). However, the DB probe also failed. The probe will not take any action until the DB is up and the DB probe succeeds.

Solution:

Start the database and make sure the DB probe (the script set in the db_probe_script property) returns 0 for success. Once the DB is started, the WLS probe will take action on the failed WLS instance.
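
To confirm by hand that a probe script returns 0 for success, its exit status can be checked with a short Python sketch such as the following. The function name and the 30-second timeout are hypothetical, and the sketch assumes the probe is invoked as a plain command line; how the agent itself invokes the script is not described here.

```python
import subprocess


def probe_succeeded(argv, timeout=30):
    """Run a probe command; return True only if it exits with status 0."""
    try:
        completed = subprocess.run(argv, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # The command could not be started, or it did not finish in time
        return False
    return completed.returncode == 0
```

The argument is the command as a list, for example the path of the probe script followed by any arguments it needs.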


565126 "Validate - can't determine Qmaster Spool dir"

Description:

The qmaster spool directory could not be determined, using the 'QmasterSpoolDir' function.

Solution:

Use 'qconf -sconf' to determine the value of 'qmaster_spool_dir'. Update/correct the value if necessary using 'qconf -mconf'.


565159 "pmfadm -s": Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent as a result of a 'pmfadm -s' command.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


565198 did subpath %s created for instance %d.

Description:

Informational message from scdidadm.

Solution:

No user action required.


565362 Transport heart beat timeout is changed to %s.

Description:

The global transport heart beat timeout is changed.

Solution:

None. This is only for information.


565438 svc_run returned

Description:

The rpc.pmfd server was not able to run, due to an rpc error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


565884 tag %s: command file %s is not executable

Description:

The rpc.fed server checked the command indicated by the tag, and this check failed because the command is not executable. An error message is output to syslog.

Solution:

Check the permission mode of the command and make sure that it is executable.


565978 Home dir is not set for user %s.

Description:

Home directory for the specified user is not set in the system.

Solution:

Check and make sure the home directory is set up correctly for the specified user.


566781 ORACLE_HOME %s does not exist

Description:

The directory specified as ORACLE_HOME does not exist. The ORACLE_HOME property is specified when creating Oracle_server and Oracle_listener resources.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource is already created, update the resource property 'ORACLE_HOME'.


567300 ERROR: resource group %s state change to %s is INVALID because we are not running at a high enough version. Aborting node.

Description:

The rgmd is trying to use features from a higher version than that at which it is running. This error indicates a possible mismatch of the rgmd version and the contents of the CCR.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


567374 Failed to stop %s.

Description:

Sun Cluster failed to stop the application.

Solution:

Use the process monitor facility (pmfadm(1M)) with the -L option to list all the tags that are running on the server. Identify the tag name for the application in this resource; the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the application. This problem may occur when the cluster is under load and Sun Cluster cannot stop the application within the specified timeout period. Consider increasing the Stop_timeout property. If the error persists, reboot the node.
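
The tag selection rule above (the tag ends in ".svc" and contains both the resource group name and the resource name) can be sketched in Python. The sketch assumes that the tag names have already been collected into a list; it does not run pmfadm, and the function name and the sample tag names in the usage note are hypothetical.

```python
def candidate_tags(tags, resource_group, resource):
    """Return tags that end in '.svc' and mention both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc") and resource_group in tag and resource in tag
    ]
```

Given hypothetical tags `web-rg,web-rs,0.svc`, `web-rg,web-rs,0.mon`, and `db-rg,db-rs,0.svc`, selecting for resource group `web-rg` and resource `web-rs` leaves only the first tag.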


567610 PARAMTER_FILE %s does not exist

Description:

Oracle parameter file (typically init<sid>.ora) specified in property 'Parameter_file' does not exist or is not readable.

Solution:

Please make sure that 'Parameter_file' property is set to the existing Oracle parameter file. Reissue command to create/update the resource using correct 'Parameter_file'.


567783 %s - %s

Description:

The first %s refers to the calling program, whereas the second %s represents the output produced by that program. Typically, these messages are produced by programs such as strmqm, endmqm, and runmqtrm.

Solution:

No user action is required if the command was successful. Otherwise, examine the other syslog messages occurring at the same time on the same node to see if the cause of the problem can be identified.


567819 clcomm: Fixed size resource_pool short server threads: pool %d for client %d total %d

Description:

The system can create a fixed number of server threads dedicated for a specific purpose. The system expects to be able to create this fixed number of threads. The system could fail under certain scenarios without the specified number of threads. The server node creates these server threads when another node joins the cluster. The system cannot create a thread when there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Application memory usage could be a factor, if the error occurs when a node joins an operational cluster and not during cluster startup.


568162 Unable to create failfast thread

Description:

A server (rpc.pmfd or rpc.fed) was not able to start because it was not able to create the failfast thread, which ensures that the host aborts if the server dies. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


568314 Failed to remove node %d from scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


569559 Start of %s completed successfully.

Description:

The resource successfully started the application.

Solution:

This message is informational; no user action is needed.


570070 There is already an instance of this daemon running\n

Description:

The server did not start because another identical server is already running.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


570394 reservation warning(%s) - USCSI_RESET failed for device %s, returned %d, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


570802 fatal: Got error <%d> trying to read CCR when disabling monitor of resource <%s>; aborting node

Description:

The rgmd failed to read the updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


571642 ucm_callback for cmmreturn generated exception %d

Description:

ucmm callback for step cmmreturn failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


571734 Validation failed. ORACLE_SID is not set

Description:

ORACLE_SID property for the resource is not set. HA-Oracle will not be able to manage Oracle server if ORACLE_SID is incorrect.

Solution:

Specify correct ORACLE_SID when creating resource. If resource is already created, please update resource property 'ORACLE_SID'.


572885 Pathprefix %s for resource group %s is not readable: %s.

Description:

A HA-NFS method attempted to access the specified Pathprefix but was unable to do so. The reason for the failure is also logged.

Solution:

This could happen if the file system on which the Pathprefix directory resides is not available. Use the HAStorage resource in the resource group to make sure that HA-NFS methods have access to the file system at the time when they are launched. Verify that the pathname is correct, and fix it if it is not. HA-NFS attempts to recover from this situation by failing over to another node.


572955 host %s: client is null

Description:

The rgm is not able to obtain an rpc client handle to connect to the rpc.fed server on the named host. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


574421 MQSeriesIntegrator2%s file deleted

Description:

The WebSphere Broker fault monitor checks to see if MQSeriesIntegrator2BrokerResourceTableLockSempahore or MQSeriesIntegrator2RetainedPubsTableLockSemaphore exists within /var/mqsi/locks and that their respective semaphore id exists.

Solution:

No user action is needed. If either MQSeriesIntegrator2%s file exists without an IPC semaphore entry, then the MQSeriesIntegrator2%s file is deleted. This prevents (a) Execution Group termination on startup with BIP2123 and (b) bipbroker termination on startup with BIP2088.


574542 clexecd: fork1 returned %d. Exiting.

Description:

clexecd program has encountered a failed fork1(2) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If the error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
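
To see what a reported error number means, the symbolic name can be looked up with Python's standard errno module. This is a small sketch; describe_errno is a hypothetical helper name, and the values it reports are those of the platform on which the code runs.

```python
import errno
import os


def describe_errno(number):
    """Return 'NAME: description' for an errno value."""
    name = errno.errorcode.get(number, "unknown")
    return f"{name}: {os.strerror(number)}"
```

Calling `describe_errno(12)` yields a string that begins with the symbolic name ENOMEM.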


574675 nodeid of ctxp is bad: %d

Description:

nodeid in the context pointer is bad.

Solution:

None. udlm takes appropriate action.


575351 Can't retrieve binding entries from node %d for GIF node %d

Description:

Failed to maintain client affinity for some sticky services running on the named server node. Connections from existing clients for those services might go to a different server node as a result.

Solution:

If client affinity is a requirement for some of the sticky services, for example for data integrity reasons, these services must be brought offline on the named node, or the node itself should be restarted.


575545 fatal: rgm_chg_freeze: INTERNAL ERROR: invalid value of rgl_is_frozen <%d> for resource group <%s>

Description:

The in-memory state of the rgmd has been corrupted due to an internal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


576196 clcomm: error loading kernel module: %d

Description:

The loading of the cl_comm module failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


576621 Command %s does not have execute permission set.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, refers to a file that does not have execute permission set. This could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.


576744 INTERNAL ERROR: Invalid resource property type <%d> on resource <%s>

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


576769 Start command %s failed with error %s.

Description:

The execution of the command for starting the data service failed.

Solution:

No user action needed.


577130 Failed to retrieve network configuration: %s.

Description:

The attempt to get the network configuration failed. The precise error string is included with the message.

Solution:

Check to make sure that the /etc/netconfig file on the system is not corrupted and that it has entries for tcp and udp. Additionally, make sure that the system has enough resources (particularly memory). HA-NFS attempts to recover from this failure by failing over to another node. If this error occurs during STOP or POSTNET_STOP method execution, the failover is forced by crashing the current node.


577140 clcomm: Exception during unmarshal_receive

Description:

The server encountered an exception while unmarshalling the arguments for a remote invocation. The system prints the exception causing this error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


578055 stop_mysql - Failed to flush MySQL tables for %s

Description:

mysqladmin command failed to flush MySQL tables.

Solution:

Either MySQL was already down or the fault monitor user does not have the right permission to flush tables. The defined fault monitor user should have Process, Select, Reload, and Shutdown privileges and, for MySQL 4.0.x, also the Super privilege.


579190 INTERNAL ERROR: resource group <%s> state <%s> node <%s> contains resource <%s> in state <%s>

Description:

The rgmd has discovered that the indicated resource group's state information appears to be incorrect. This may prevent any administrative actions from being performed on the resource group.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


579208 Global service %s associated with path %s is found to be in maintenance state.

Description:

Self explanatory.

Solution:

Check the global service configuration.


579235 Method <%s> on resource <%s> terminated abnormally

Description:

A resource method terminated without using an exit(2) call. The rgmd treats this as a method failure.

Solution:

Consult resource type documentation, or contact the resource type developer for further information.


579575 Will not start database. Waiting for %s (HADB node %d) to start resource.

Description:

The specified Sun Cluster node must be running the HADB resource before the database can be started.

Solution:

Bring the HADB resource online on the specified Sun Cluster node.


579987 Error binding '%s' in the name server. Exiting.

Description:

clexecd program was unable to start because of some problems in the low-level clustering software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


580163 reservation warning(%s) - MHIOCTKOWN error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


580416 Cannot restart monitor: Monitor is not enabled.

Description:

An update operation on the resource would have restarted the fault monitor. However, the monitor is currently disabled for the resource.

Solution:

This is an informational message. Check whether the monitor is disabled for the resource. If it is not, treat this as an internal error and contact your authorized Sun service provider.


580802 Either extension property <network_aware> is not defined, or an error occurred while retrieving this property; using the default value of TRUE.

Description:

The property Network_aware might not be defined in the RTR file. The default value of TRUE is used.

Solution:

This is an informational message, no user action is needed.


581117 Cleanup with command %s.

Description:

The listed command is used to clean up the SAPDB database instance before the instance is started, in case the database crashed and was left in a bad state.

Solution:

Informational message. No action is needed.


581180 launch_validate: call to rpc.fed failed for resource <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a VALIDATE method, due to a failure to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


581376 clcomm: solaris xdoor: too much reply data

Description:

The reply from a user level server will not fit in the available space.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


581413 Daemon %s is not running.

Description:

The HA-NFS fault monitor checks the health of the statd, lockd, mountd, and nfsd daemons on the node. It detected that one of these is not currently running.

Solution:

No action is required. The monitor restarts the daemon.


581898 Application failed to stay up. Start method Failure.

Description:

The application failed to stay up after it was started.

Solution:

Look in /var/adm/messages for the cause of failure. Try to start the WebLogic Server manually. Make sure the logical host is up before starting WLS manually. If the server fails to start manually, check your configuration and correct it until the server can be started manually; only then can it be started by the agent. Make sure the extension properties are correct. Make sure the port is not already in use. Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


581902 (%s) invalid timeout '%d'

Description:

Invalid timeout value for a method.

Solution:

Make sure udlm.conf file has correct timeouts for methods.


582276 This node has a lower preference to node %d for global service %s associated with path %s. Device switchover can still be done to this node.

Description:

HA Storage Plus determined that the node is less preferred to another node from the DCS global service view, but as the Failback setting is Off, this is not a problem.

Solution:

This is an informational message, no user action is needed.


582353 Adaptive server shutdown with wait failed. STOP_FILE %s.

Description:

The Sybase adaptive server failed to shut down with the wait option using the file specified in the STOP_FILE property.

Solution:

This is an informational message, no user action is needed.


582418 Validation failed. SYBASE ASE startserver file not found SYBASE=%s.

Description:

The Sybase Adaptive server is started by execution of the "startserver" file. This file is missing. The SYBASE directory is specified as a part of this error message.

Solution:

Verify the Sybase installation including the existence and proper permissions of the "startserver" file in the $SYBASE/$SYBASE_ASE/install directory.


582651 tag %s: does not belong to caller

Description:

The user sent a suspend/resume command to the rpc.fed server for a tag that was started by a different user. An error message is output to syslog.

Solution:

Check the tag name.


582757 No PDT Fastpath thread.

Description:

The system has run out of resources that are required to create a thread. The system could not create the Fastpath thread that is required for cluster networking.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.


583138 dfstab not readable

Description:

HA-NFS fault monitor failed to read dfstab when it detected that dfstab has been modified.

Solution:

Make sure the dfstab file exists and has read permission set appropriately. Look at the prior syslog messages for any specific problems and correct them.


583224 Rebooting this node because daemon %s is not running.

Description:

The rpcbind daemon on this node is not running.

Solution:

No action is required. The fault monitor reboots the node. Also see message ID 804791.


583542 clcomm: Pathend: would abort node because %s for %u ms

Description:

The system would have aborted the node for the specified reason if the check for send thread running was enabled.

Solution:

No user action is required.


583970 Failed to shutdown liveCache immediately with command %s.

Description:

Stopping SAP liveCache with the specified command failed to complete.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


584207 Stopping %s.

Description:

Sun Cluster is stopping the specified application.

Solution:

This is an informational message, no user action is needed.


584386 PENDING_OFFLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


585726 pthread_cond_init: %s

Description:

The cl_apid was unable to initialize a synchronization object, so it was not able to start-up. The error message is specified.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


586294 Thread creation error: %s

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


586298 clcomm: unknown type of signals message %d

Description:

The system has received a signals message of unknown type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


586344 clcomm: unable to unbind %s from name server

Description:

The name server would not unbind the specified entity.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


586689 Cannot access the %s command <%s> : <%s>

Description:

The command input to the agent builder is not accessible and executable. This may be due to the program not existing or the permissions not being set properly.

Solution:

Make sure the program in the command exists, is in the proper directory, and has read and execute permissions set appropriately.


588030 INITRGM Waiting for ${SERVER} to be ready.

Description:

The initrgm init script is waiting for the rgmd daemon to start. This warning informs the user that the startup of rgmd is abnormally long.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


588055 Failed to retrieve information for SAP xserver user %s.

Description:

The SAP xserver user is not found on the system.

Solution:

Make sure the user is available on the system.


588133 Error reading CCR table %s

Description:

The specified CCR table was not present or was corrupted.

Solution:

Reboot the node in non-cluster (-x) mode and restore the CCR table from the other nodes in the cluster or from backup. If the problem persists, save the contents of /var/adm/messages and contact your Sun service representative.


589025 ERROR: stop_mysql Option -F not set

Description:

The -F option is missing for stop_mysql command.

Solution:

Add the -F option for stop_mysql command.


589373 scha_control RESOURCE_RESTART failed with error code: %s

Description:

Fault monitor had detected problems in Oracle listener. Attempt to switchover resource to another node failed. Error returned by API call scha_control is indicated in the message. If problem persists, fault monitor will make another attempt to restart the listener.

Solution:

Check Oracle listener setup. Please make sure that Listener_name specified in the resource property is configured in listener.ora file. Check 'Host' property of listener in listener.ora file. Examine log file and syslog messages for additional information.


589594 scha_control: warning, RESOURCE_IS_RESTARTED call on resource <%s> will not increment the resource_restart count because the resource does not have a Retry_interval property

Description:

A resource monitor (or some other program) is attempting to notify the RGM that the indicated resource has been restarted, by calling scha_control(1ha),(3ha) with the RESOURCE_IS_RESTARTED option. This request will trigger restart dependencies, but will not increment the retry_count for the resource, because the resource type does not declare the Retry_interval property for its resources. To enable the NUM_RESOURCE_RESTARTS query, the resource type registration (RTR) file must declare the Retry_interval property.

Solution:

Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the warning.


589689 get_resource_dependencies - Only one WebSphere MQ Broker RDBMS resource dependency can be set

Description:

The WebSphere MQ Broker resource checks whether the correct resource dependencies exist. However, it appears that a WebSphere MQ Broker RDBMS was already defined in resource_dependencies when the WebSphere MQ Broker resource was registered.

Solution:

Check the resource_dependencies entry when you registered the WebSphere MQ Broker resource.


589719 Issuing failover request.

Description:

This is an informational message. The API function is about to be called to request a failover. In case of failure, follow the syslog messages after this message.

Solution:

No user action is needed.


589805 Fault monitor probe response times are back below 50%% of probe timeout. The timeout will be reduced to it's configured value

Description:

The resource's probe timeout had previously been increased by 10% because of slow response, but the time taken for the last fault monitor probe to complete and the average probe time are both now less than 50% of the timeout. The probe timeout will therefore be reduced to its configured value.

Solution:

This is an informational message, no user action is needed.
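
The reduction rule in the description above can be illustrated with a minimal sketch. It is not the fault monitor's actual implementation: the function name and its arguments are hypothetical, and it assumes the 50% threshold is measured against the currently active (increased) timeout, which the message does not state explicitly.

```python
def adjust_probe_timeout(configured, current, last_probe, average_probe):
    """Return the probe timeout (in seconds) to use for the next probe.

    Hypothetical illustration only. Assumes the 50% threshold applies to
    the currently active timeout rather than the configured one.
    """
    increased = current > configured
    threshold = 0.5 * current
    # Fall back to the configured value only when both the last probe and
    # the average probe time are below the threshold.
    if increased and last_probe < threshold and average_probe < threshold:
        return configured
    return current


# Configured timeout 30 s, previously raised by 10% to 33 s.
print(adjust_probe_timeout(30.0, 33.0, 12.0, 14.0))  # 30.0
print(adjust_probe_timeout(30.0, 33.0, 12.0, 20.0))  # 33.0
```

In the second call the average probe time (20 s) is still above half of the active timeout (16.5 s), so the increased timeout is kept.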


589817 clcomm: nil_sendstream::send

Description:

The system attempted to use a "send" operation for a local invocation. Local invocations do not use a "send" operation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


590263 Online check Error %s : %ld

Description:

Error detected when checking ONLINE status of RDBMS. Error number is indicated in message. This can be because of RDBMS server problems or configuration problems.

Solution:

Check RDBMS server using vendor provided tools. If server is running properly, this can be fault monitor set-up error.


590454 TCPTR: Machine with MAC address %s is using cluster private IP address %s on a network reachable from me. Path timeouts are likely.

Description:

The transport at the local node detected an arp cache entry that showed the specified MAC address for the above IP address. The IP address is in use at this cluster on the private network. However, the MAC address is a foreign MAC address. A possible cause is that this machine received an ARP request from another machine that does not belong to this cluster, but hosts the same IP address using the above MAC address on a network accessible from this machine. The transport has temporarily corrected the problem by flushing the offending arp cache entry. However, unless corrective steps are taken, TCP/IP communication over the relevant subnet of the private interconnect might break down, thus causing path downs.

Solution:

Make sure that no machine outside this cluster hosts this IP address on a network reachable from this cluster. If there are other sunclusters sharing a public network with this cluster, please make sure that their private network adapters are not miscabled to the public network. By default all sunclusters use the same set of IP addresses on their private networks.


590700 ALERT_LOG_FILE %s doesn't exist

Description:

The file specified in the resource property 'Alert_log_file' does not exist. HA-Oracle requires the correct Alert Log file for fault monitoring.

Solution:

Check the 'Alert_log_file' property of the resource. Specify the correct Oracle Alert Log file when creating the resource. If the resource is already created, update the resource property 'Alert_log_file'.


592233 setrlimit(RLIMIT_NOFILE): %s

Description:

The rpc.pmfd server was not able to set the limit of files open. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


592285 clexecd: getrlimit returned %d

Description:

clexecd program has encountered a failed getrlimit(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


592378 Resource %s is not online anywhere.

Description:

The named resource is not online on any cluster node.

Solution:

None. This is an informational message.


592738 Validate - smbclient %s non-existent executable

Description:

The Samba resource tries to validate that the smbclient exists and is executable.

Solution:

Check that the correct pathname for the Samba bin directory was entered when registering the resource and that the program exists and is executable.


592920 sigemptyset: %s

Description:

The rpc.fed server encountered an error with the sigemptyset function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


593267 ERROR: probe_sap_j2ee Option -G not set

Description:

The -G option is missing for the probe_command.

Solution:

Add the -G option to the probe_command.


593330 Resource type name is null.

Description:

This is an internal error. While attempting to retrieve the resource information, null value was retrieved for the resource type name.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


594396 Thread creation failed for node %d.

Description:

The cl_eventd failed to create a thread. This error may cause the daemon to operate in a mode of reduced functionality.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


594629 Failed to stop the fault monitor.

Description:

Process monitor facility has failed to stop the fault monitor.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource. This tag is easy to identify because it ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. If the error persists, reboot the node.
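
The tag identification step above can be sketched as a small filter. This is an illustrative helper, not part of Sun Cluster: the function name is hypothetical, the sample tag names are made up, and it assumes the output of pmfadm -L has been captured as whitespace-separated tags.

```python
def find_monitor_tags(pmfadm_output, resource_group, resource):
    """Return tags that look like the fault monitor tag for a resource.

    Assumes pmfadm_output holds whitespace-separated tag names, as
    captured from "pmfadm -L". A tag matches when it ends in ".mon" and
    contains both the resource group name and the resource name.
    """
    tags = pmfadm_output.split()
    return [
        tag for tag in tags
        if tag.endswith(".mon") and resource_group in tag and resource in tag
    ]


# Made-up sample output; real tag names depend on the configuration.
sample = "oracle-rg,oracle-rs,0.svc oracle-rg,oracle-rs,0.mon nfs-rg,nfs-rs,0.mon"
print(find_monitor_tags(sample, "oracle-rg", "oracle-rs"))
# ['oracle-rg,oracle-rs,0.mon']
```

The tag found this way is the one to pass to pmfadm with the -s option, as described in the solution.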


594675 reservation warning(%s) - MHIOCGRP_REGISTER error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.


594827 INITPMF Waiting for ${SERVER} to be ready.

Description:

The initpmf init script is waiting for the rpc.pmfd daemon to start. This warning informs the user that the startup of rpc.pmfd is abnormally long.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


595077 Error in hasp_check. Validation failed.

Description:

Internal error occurred in hasp_check.

Solution:

Check the errors logged in the syslog messages by hasp_check. Please verify existence of /usr/cluster/bin/hasp_check binary. Please report this problem.


595448 Validate - mysqld %s non-existent executable

Description:

The mysqld command doesn't exist or is not executable.

Solution:

Make sure that MySQL is installed correctly and that the correct base directory is defined.


595686 %s is %d for %s. It should be 1.

Description:

The named property has an unexpected value.

Solution:

Change the value of the property to be 1.


595926 Stopping Adaptive server with nowait option.

Description:

The Sun Cluster HA for Sybase will retry the shutdown using the nowait option.

Solution:

This is an informational message, no user action is needed.


596604 clcomm: solookup on routing socket failed with error = %d

Description:

The system prepares IP communications across the private interconnect. A lookup operation on the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


597239 The weight portion of %s at position %d in property %s is not a valid weight. The weight should be an integer between %d and %d.

Description:

The weight noted does not have a valid value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Give the weight a valid value.
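
As an illustration of how the position index in this message maps to a property list, the following sketch reports the zero-based positions of elements whose weight is invalid. The helper name, the "weight@node" element format, and the bounds used in the example are assumptions made for illustration; the actual format and limits are those reported in the message for the property in question.

```python
def invalid_weight_positions(entries, low, high):
    """Return zero-based positions of entries whose weight is invalid.

    Assumes each entry has the hypothetical form "weight@node". A weight
    is valid when it is an integer between low and high, inclusive.
    """
    bad = []
    for position, entry in enumerate(entries):
        weight_text = entry.split("@", 1)[0]
        try:
            weight = int(weight_text)
        except ValueError:
            # Not an integer at all.
            bad.append(position)
            continue
        if not low <= weight <= high:
            bad.append(position)
    return bad


# Hypothetical property value and bounds.
print(invalid_weight_positions(["3@1", "x@2", "500@3"], 1, 100))  # [1, 2]
```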


597381 setrlimit before exec: %s

Description:

rpc.pmfd was unable to set the number of file descriptors before executing a process.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


598087 PCWSTOP: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


598115 Ignoring invalid expression <%s> in custom action file.

Description:

This is an informational message indicating that an entry with an invalid expression was found in the custom action file and will be ignored.

Solution:

Remove the invalid entry from the custom action file or correct it to be a valid regular expression.


598259 scvxvmlg fatal error - ckmode received unknown mode %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


598483 Waiting for WebSphere MQ Broker RDBMS

Description:

The WebSphere MQ Broker is dependent on the WebSphere MQ Broker RDBMS, which is not available. So the WebSphere MQ Broker will wait until it is available before it is started, or until Start_timeout for the resource occurs.

Solution:

No user action is needed.


598540 clcomm: solaris xdoor: completed invo: door_return returned, errno = %d

Description:

An unusual but harmless event occurred. System operations continue unaffected.

Solution:

No user action is required.


598554 launch_validate_method: getlocalhostname() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to obtain the name of the local host, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


598979 tag %s: already suspended

Description:

The user sent a suspend command to the rpc.fed server for a tag that is already suspended. An error message is output to syslog.

Solution:

Check the tag name.


599371 Failed to stop the WebLogic server smoothly. Will try killing the process using sigkill

Description:

The Smooth shutdown of the WLS failed. The WLS stop method however will go ahead and kill the WLS process.

Solution:

Check the WLS logs for more details. Check the /var/adm/messages and the syslogs for more details to fix the problem.


599430 Failed to retrieve the resource property %s: %s.

Description:

An API operation has failed while retrieving the resource property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource name and property name, check the current syslog message.


599558 SIOCLIFADDIF of %s failed: %s.

Description:

The specified system operation failed.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.