Sun Cluster 3.0 5/02 Error Messages Guide

Error Message List

The following list is ordered by the message ID.
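
Each message in this list is a printf-style template: conversions such as %s, %d, %ld, %lld, %c, and %p are replaced with actual values when the message is logged. As a minimal sketch, a template can be turned into a regular expression so that a logged line can be matched back to its entry. The helper name template_to_regex and the sample text are hypothetical and are not part of Sun Cluster.

```python
import re

# Conversions that appear in the templates of this list, mapped to loose patterns.
# Longer specifiers come first so "%lld" is not mistaken for a shorter one.
_CONVERSIONS = [
    ("%lld", r"-?\d+"),
    ("%ld", r"-?\d+"),
    ("%d", r"-?\d+"),
    ("%c", r"."),
    ("%p", r"\S+"),
    ("%s", r".*?"),
]

def template_to_regex(template):
    """Compile a regex that matches text produced from a printf-style template."""
    pattern = ""
    i = 0
    while i < len(template):
        for spec, regex in _CONVERSIONS:
            if template.startswith(spec, i):
                pattern += "(" + regex + ")"  # capture the substituted value
                i += len(spec)
                break
        else:
            pattern += re.escape(template[i])  # literal character
            i += 1
    return re.compile(pattern + r"\Z")

match = template_to_regex("Invalid port number %s in the %s property.").match(
    "Invalid port number 80x in the Port_list property."
)
print(match.groups())
```

The patterns are deliberately loose; a template made only of conversions, such as "mmap: %s", will match many unrelated lines, so the message ID remains the more reliable key.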

500133 :Device switchover of global service %s associated with path %s to this node failed: %s.

Description:

An attempt to switch over the global service to the current node failed.

Solution:

Inspect the syslog for errors. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

501582 :in libsecurity setnetpath failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection, because it could not get the network database handle. The server does not start. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

501632 :Incorrect syntax in the environment file %s. Ignoring %s

Description:

HA-Oracle reads the file specified in the USER_ENV property and exports the variables declared in the file. The syntax for declaring a variable is VARIABLE=VALUE. Lines starting with "#" are treated as comment lines. VARIABLE is expected to be a valid Korn shell variable name that starts with a letter or "_" and contains only alphanumeric characters and "_".

Solution:

Check the environment file and correct the syntax errors. Do not use the export statement in the environment file.
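
The syntax rules above can be checked before the file is handed to the data service. The following is a minimal sketch, not the HA-Oracle implementation; the function name check_env_lines is hypothetical, and skipping blank lines is an assumption.

```python
import re

# Variable names: a letter or "_" first, then letters, digits, or "_".
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")

def check_env_lines(lines):
    """Return (variables, rejected): parsed VARIABLE=VALUE pairs and failing line numbers."""
    variables = {}
    rejected = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # blank line (assumption) or comment line
        name, sep, value = line.partition("=")
        if sep and _NAME.match(name):
            variables[name] = value
        else:
            rejected.append(number)  # for example "export PATH=/bin" or "1BAD=x"
    return variables, rejected
```

A line such as "export PATH=/bin" is rejected because "export PATH" is not a valid variable name, which mirrors the advice not to use export statements.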

501733 :scvxvmlg fatal error - _cladm() failed

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

501763 :recv_request: t_alloc: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.
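
Gathering the files named above can be scripted on each node. The following is a minimal sketch under stated assumptions: the function name collect_logs, the archive name, and the root parameter are hypothetical, and only the three paths listed in the solution are collected.

```python
import glob
import os
import tarfile

# Paths named in the solution above; the last one is a shell-style pattern.
LOG_PATTERNS = [
    "/var/adm/messages",
    "/var/cluster/ucmm/ucmm_reconf.log",
    "/var/cluster/ucmm/dlm*/*logs/*",
]

def collect_logs(archive_path, patterns=LOG_PATTERNS, root="/"):
    """Add every existing file matched by the patterns to a gzip tar archive.

    root is a hypothetical convenience for running against a copy of the tree.
    Returns the sorted list of files that were archived.
    """
    found = []
    for pattern in patterns:
        found.extend(glob.glob(os.path.join(root, pattern.lstrip("/"))))
    files = sorted(path for path in set(found) if os.path.isfile(path))
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in files:
            archive.add(path)
    return files
```

Run it once per node, for example collect_logs("/var/tmp/node1-logs.tar.gz"), and keep the archives together for the service representative.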

501917 :process_intention(): IDL exception when communicating to node %d

Description:

An inter-node communication failed, probably because a node died.

Solution:

No action is required; the rgmd should recover automatically.

502022 :fatal: joiners_read_ccr: exiting early because of unexpected exception

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

503048 :NULL value returned for resource group property %s.

Description:

A NULL value was returned for the resource group property.

Solution:

Check the syslog message for the property name. One of the following situations might have occurred, and each requires a different user action: 1) If a new resource group was created or updated, check whether the value of the property is valid. 2) In all other cases, treat it as an internal error.

503064 :Method <%s> on resource <%s>: Method timed out.

Description:

A VALIDATE method execution has exceeded its configured timeout and was killed by the rgmd. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

504363 :ERROR: process_resource: resource <%s> is pending_update but no UPDATE method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

504402 :CMM: Aborting due to stale sequence number. Received a message from node %ld indicating that node %ld has a stale sequence#. node %ld: state '%s', sequence# %lld; node %ld: state '%s', sequence# %lld.

Description:

After receiving a message from the specified remote node, the local node has concluded that it has stale state with respect to the remote node, and will therefore abort. The state of a node can get out-of-date if it has been in isolation from the nodes which have majority quorum.

Solution:

Reboot the node.

505101 :Found another active instance of clexecd. Exiting daemon_process.

Description:

An active instance of the clexecd program is already running on the node.

Solution:

This would usually happen if the operator tries to start the clexecd program by hand on a node which is booted in cluster mode. If that is not the case, contact your authorized Sun service provider to determine whether a workaround or patch is available.

506740 :Home directory is not set for user %s.

Description:

No home directory is set for the specified Sun Cluster HA for BroadVision One-To-One Enterprise user.

Solution:

Set the home directory of the Sun Cluster HA for BroadVision One-To-One Enterprise user to point to the directory containing the Sun Cluster HA for BroadVision One-To-One Enterprise configuration files.

508671 :mmap: %s

Description:

The rpc.pmfd server was not able to allocate shared memory for a semaphore, possibly due to low memory, and the system error is shown. The server does not perform the action requested by the client, and pmfadm returns error. An error message is also output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

508687 :monitor_check: method <%s> failed on resource <%s> in resource group <%s> on node <%s>, exit code <%d>

Description:

During a scha_control call, the monitor_check method of the resource failed on the specified node.

Solution:

No action is required. This is normal behavior for scha_control, which launches the corresponding monitor_check method of the resource on all candidate nodes and looks for a healthy node that passes the test. If a healthy node is found, scha_control lets that node take over the resource group. Otherwise, scha_control exits early.
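
The selection described above can be pictured as a simple search over the candidate nodes. This is an illustrative sketch only, not the rgmd logic: the function name pick_healthy_node is hypothetical, and it assumes that an exit code of 0 means the check passed and that the first passing node is chosen.

```python
def pick_healthy_node(candidates, monitor_check):
    """Return the first candidate whose monitor_check exits with 0, or None.

    monitor_check is a callable taking a node name and returning an exit code.
    """
    for node in candidates:
        if monitor_check(node) == 0:
            return node  # healthy node found; it can take over the resource group
    return None  # no node passed; the caller exits early


# Hypothetical exit codes per node; node1 fails the check, node2 passes.
exit_codes = {"node1": 1, "node2": 0}
print(pick_healthy_node(["node1", "node2"], exit_codes.get))
```

In this picture, message 508687 corresponds to a node for which the check returned a nonzero exit code, which is why it needs no action on its own.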

509069 :CMM: Halting because this node has no configuration info about node %ld which is currently configured in the cluster and running.

Description:

The local node has no configuration information about the specified node. This indicates a misconfiguration problem in the cluster. The /etc/cluster/ccr/infrastructure table on this node may be out of date with respect to the other nodes in the cluster.

Solution:

Correct the misconfiguration problem or update the infrastructure table if out of date, and reboot the nodes. To update the table, boot the node in non-cluster (-x) mode, restore the table from the other nodes in the cluster or backup, and boot the node back in cluster mode.

509136 :Probe failed.

Description:

The fault monitor was unable to perform a complete health check of the service.

Solution:

1) The fault monitor takes appropriate action (restarting or failing over the service). 2) The data service could be under load; try increasing the values of the Probe_timeout and Thorough_probe_interval properties. 3) If this problem continues to occur, look at other messages in syslog to determine the root cause of the problem. If all else fails, reboot the node.

510659 :Failover %s data services must be in a failover resource group.

Description:

The Scalable resource property for the data service was set to FALSE, which indicates a failover resource, but the corresponding data service resource group is not a failover resource group. Failover resources of this resource type must reside in a failover resource group.

Solution:

Decide whether this resource is to be scalable or failover. If scalable, set the Scalable property value to TRUE. If failover, leave Scalable set to FALSE and create this resource in a failover resource group. A failover resource group has its resource group property RG_mode set to Failover.
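
The rule behind this message can be expressed as a small consistency check. This is a sketch of the condition described above, not the actual validation code; the function name placement_error is hypothetical.

```python
def placement_error(scalable, rg_mode):
    """Return a description of the mismatch, or None if there is none.

    scalable is the resource's Scalable property (True or False);
    rg_mode is the RG_mode property of the containing resource group.
    """
    if not scalable and rg_mode.lower() != "failover":
        return "failover resource must be in a resource group with RG_mode set to Failover"
    return None
```

For example, placement_error(False, "Scalable") reports the mismatch, while placement_error(False, "Failover") returns None.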

511177 :clcomm: solaris xdoor door_info failed

Description:

A door_info operation failed. Refer to the "door_info" man page for more information.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

511749 :rgm_launch_method: failed to get method <%s> timeout value from resource <%s>

Description:

Due to an internal error, the rgmd was unable to obtain the method timeout for the indicated resource. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.

511810 :Property <%s> does not exist in SUNW.HAStorage

Description:

Property set in SUNW.HAStorage type resource is not defined in SUNW.HAStorage.

Solution:

Check /var/adm/messages to see which property name is used. Correct it according to the definition in SUNW.HAStorage.

511917 :clcomm: orbdata: unable to add to hash table

Description:

The system records object invocation counts in a hash table. The system failed to enter a new hash table entry for a new object type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

512422 :WARNING: unknown msg (type %d) was picked up by a lkcm_act, returning LKCM_NOOP

Description:

Warning for unknown message picked up during udlm state update.

Solution:

None.

513538 :scvxvmlg error - mkdirp(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

514156 :PNM: shutting down

Description:

The PNM daemon (pnmd) is shutting down, disabling all monitoring and failover of adapters.

Solution:

This message is informational; no user action is needed.

514688 :Invalid port number %s in the %s property.

Description:

The specified system property does not have a valid port number.

Solution:

Using scrgadm(1M), specify a positive, valid port number.
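
A value can be checked before it is passed to scrgadm(1M). The following is a minimal sketch, assuming that "valid" means a decimal integer in the usual TCP/UDP range of 1 to 65535; the exact rule applied by the data service is not stated here, and the function name is_valid_port is hypothetical.

```python
import re

def is_valid_port(text):
    """True if text is a plain decimal integer in the range 1..65535."""
    if not re.fullmatch(r"[0-9]{1,5}", text):
        return False  # rejects signs, spaces, letters, and empty strings
    return 1 <= int(text) <= 65535
```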

514731 :Failed to kill listener process for %s

Description:

Failed to kill listener processes.

Solution:

None

515583 :%s is not a valid IP address.

Description:

Validation method has failed to validate the IP addresses. The mapping for the given IP address in the local host files can't be done: the specified IP address is invalid.

Solution:

Invalid hostnames/IP addresses have been specified while creating the resource. Recreate the resource with valid hostnames.

516407 :reservation warning(%s) - MHIOCGRP_INKEYS error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.

517009 :lkcm_act: invalid handle was passed %s %d

Description:

Handle for communication with udlmctl is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

517343 :clexecd: Error %d from pipe

Description:

clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

517363 :clconf: Unrecognized property type

Description:

An unrecognized property type was found in the configuration file.

Solution:

Check the configuration file.

518018 :CMM: Node being aborted from the cluster.

Description:

This node is being excluded from the cluster.

Solution:

Node should be rebooted if required. Resolve the problem according to other messages preceding this message.

518291 :Warning: Failed to check if scalable service group %s exists: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

518721 :Error while retrieving resource name.

Description:

An error occurred during the invocation of a DSDL API to obtain the resource name.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

519262 :Validation failed. SYBASE monitor server startup file RUN_%s not found SYBASE=%s.

Description:

Monitor server was specified in the extension property Monitor_Server_Name. However, monitor server startup file was not found. Monitor server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>

Solution:

Check the monitor server name specified in the Monitor_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Monitor_Server_Name> file exists.
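
The expected location of the startup file can be computed from the three values named above and checked by hand. This is a minimal sketch; the function name and the example values are hypothetical.

```python
import os

def monitor_server_run_file(sybase, sybase_ase, monitor_server_name):
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Monitor_Server_Name>."""
    return os.path.join(sybase, sybase_ase, "install", "RUN_" + monitor_server_name)

# Hypothetical values; substitute the ones from the Environment_file and the resource.
path = monitor_server_run_file("/opt/sybase", "ASE-12_0", "mon_server")
print(path, "exists" if os.path.isfile(path) else "is missing")
```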

520231 :Unable to set the number of threads for the FED RPC threadpool.

Description:

The rpc.fed server was unable to set the number of threads for the RPC threadpool. This happens while the server is starting up.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

520384 :%s action in bv_utils Failed.

Description:

The specified action failed. There might be several reasons for the failure: (1) incorrect configuration, (2) the orbixd daemon is not running, (3) non-existent configuration files, (4) Sun Cluster HA for BroadVision One-To-One Enterprise processes could not start, or (5) Sun Cluster HA for BroadVision One-To-One Enterprise servers could not stop.

Solution:

Look for other syslog messages to get the exact failure location. If it is a Sun Cluster HA for BroadVision One-To-One Enterprise configuration error, manually run it. If you receive the same error message, contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.

520982 :CMM: Preempting node %ld from quorum device %s failed with error %d.

Description:

This node was unable to preempt the specified node from the quorum device, indicating that the partition to which the local node belongs has been preempted and will abort. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

There may be other related messages that may indicate why the partition to which the local node belongs has been preempted. Resolve the problem and reboot the node.
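
The majority rule mentioned in the description can be illustrated with a short calculation. This is a sketch of the arithmetic only, assuming that majority means strictly more than half of the configured votes; the function name and the vote counts are hypothetical.

```python
def has_majority(votes_held, total_votes):
    """True if a partition holds strictly more than half of all configured votes."""
    return votes_held > total_votes // 2

# Two nodes with one vote each plus one quorum device vote: 3 votes in total.
total = 3
print(has_majority(2, total))  # partition that holds its own vote and the quorum device
print(has_majority(1, total))  # partition that lost the race for the quorum device
```

Because two disjoint partitions cannot both hold more than half of the votes, at most one of them can continue as the operational cluster.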

521393 :Backup server stopped.

Description:

Sun Cluster HA for Sybase stopped the Backup Server.

Solution:

No user action required.

521538 :monitor_check: set_env_vars() failed for resource <%s>, resource group <%s>

Description:

During execution of a scha_control(1HA,3HA) function, the rgmd was unable to set up environment variables for method execution, causing a MONITOR_CHECK method invocation to fail. This in turn will prevent the attempted failover of the resource group from its current master to a new master.

Solution:

521918 :Validation failed. Connect string is NULL.

Description:

The Connect_String extension property used for fault monitoring is null. This property has the format username/password.

Solution:

Check for syslog messages from other system modules. Verify the resource configuration and the value of the Connect_string property.
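
The username/password format can be checked with a few lines. This is a minimal sketch, assuming the value is split at the first "/"; the function name split_connect_string is hypothetical.

```python
def split_connect_string(connect_string):
    """Split 'username/password' at the first '/'.

    Raises ValueError if the value is empty or either part is missing.
    """
    username, sep, password = (connect_string or "").partition("/")
    if not sep or not username or not password:
        raise ValueError("Connect_String must have the form username/password")
    return username, password
```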

522315 :resource group %s on node %s state change to RG_PENDING_ONLINE

Description:

This is a notification from the rgmd that the resource group is being brought online on the indicated node. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

522480 :RGM state machine returned error %d

Description:

An error has occurred on this node while attempting to execute the rgmd state machine.

Solution:

522779 :IP address (hostname) string %s in property %s, entry %d could not be resolved to an IP address.

Description:

The IP address (hostname) string within the named property in the message did not resolve to a real IP address.

Solution:

Change the IP address (hostname) string within the entry in the property to one that does resolve to a real IP address. Make sure the syntax of the entry is correct.
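
Each entry can be tested for resolvability on the node before the property is updated. The following is a minimal sketch that uses the standard resolver; the function name unresolved_entries is hypothetical, and numbering the entries from 1 is an assumption about how the message counts them.

```python
import socket

def unresolved_entries(entries):
    """Return (index, entry) pairs, counted from 1, that do not resolve to an IP address."""
    failures = []
    for index, entry in enumerate(entries, start=1):
        try:
            socket.getaddrinfo(entry, None)  # works for host names and literal addresses
        except (socket.gaierror, UnicodeError):
            failures.append((index, entry))
    return failures
```

The result depends on the name services configured on the node where the check runs, so run it on the node that logged the message.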

523302 :fatal: thr_keycreate: %s (UNIX errno %d)

Description:

The rgmd failed in a call to thr_keycreate(3T). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

523643 :INTERNAL ERROR: %s

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

525628 :CMM: Cluster has reached quorum.

Description:

Enough nodes are operational to obtain a majority quorum; the cluster is now moving into operational state.

Solution:

This is an informational message, no user action is needed.

526403 :ff_open: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link to the failfast device, which ensures that the host aborts if the server dies. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

526413 :Failed to verify that all NAFO groups are in a stable state. Assuming this node cannot respond to client requests.

Description:

The state of the NAFO groups on the node could not be determined.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for message from the network monitoring daemon (pnmd).

526492 :Service object [%s, %s, %d] removed from group '%s'

Description:

A specific service known by its unique name SAP (service access point), the three-tuple, has been deleted in the designated group.

Solution:

This is an informational message, no user action is needed.

526846 :Daemon <%s> is not running.

Description:

The HA-NFS fault monitor detected that the specified daemon is no longer running.

Solution:

No action is required. The fault monitor restarts the daemon. If the daemon is not restarted, reboot the node.

527795 :clexecd: setrlimit returned %d

Description:

clexecd program has encountered a failed setrlimit() system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

528020 :CCR: Remove table %s failed.

Description:

The CCR failed to remove the indicated table.

Solution:

The failure can have many causes. For some of them no user action is required, because the CCR client handles the failure. The cases that require user action depend on other messages from the CCR on the node, and include the following: If the removal failed because the cluster lost quorum, reboot the cluster. If the root file system is full on the node, free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, replace it. If the cluster repository is corrupted, as indicated by other CCR messages, boot the offending node(s) in -x mode to restore the cluster repository from backup. The cluster repository is located at /etc/cluster/ccr/.

528566 :Method <%s> on resource <%s>, resource group <%s>, is_frozen=<%d>: Method timed out.

Description:

A method execution has exceeded its configured timeout and was killed by the rgmd. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries. Note, if the indicated value of is_frozen is 1, this might indicate an internal error in the rgmd. Please save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

529131 :Method <%s> on resource <%s>: RPC connection error.

Description:

An attempted method execution failed, due to an RPC connection problem. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state; or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node. After the problem is corrected, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the resource group update operation.

529191 :clexecd: Sending fd to worker returned %d. Exiting.

Description:

There was some error in setting up interprocess communication in the clexecd program.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

529407 :resource group %s state on node %s change to %s

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

530064 :reservation error(%s) - do_enfailfast() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.

530492 :fatal: ucmm_initialize() failed

Description:

The daemon indicated in the message tag (rgmd or ucmmd) was unable to initialize its interface to the low-level cluster membership monitor. This is a fatal error, and causes the node to be halted or rebooted to avoid data corruption. The daemon produces a core file before exiting.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.

530603 :Warning: Scalable service group for resource %s has already been created.

Description:

It was not expected that the scalable services group for the named resource existed.

Solution:

Rebooting all nodes of the cluster will cause the scalable services group to be deleted.

531148 :fatal: thr_create stack allocation failure: %s (UNIX error %d)

Description:

The rgmd was unable to create a thread stack, most likely because the system has run out of swap space. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Rebooting the node has probably cured the problem. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.

531989 :Prog <%s> step <%s>: authorization error: %s.

Description:

An attempted program execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a program failure.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.

532118 :All the SUNW.HAStoragePlus resources that this resource depends on are not online on the local node. Skipping the checks for the existence and permissions of the start/stop/probe commands.

Description:

This is an informational message. It means that the SUNW.HAStoragePlus resources that this application resource depends on are not online on the local node, and therefore the validation checks related to the start/stop/probe commands cannot be carried out on the local node.

Solution:

None.

532454 :file specified in USER_ENV parameter %s does not exist

Description:

The 'User_env' property was set when configuring the resource. The file specified in the 'User_env' property does not exist or is not readable. The file should be specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating the resource. If the resource is already created, update the resource property 'User_env'.

532654 :The -c or -u flag must be specified for the %s method.

Description:

The arguments passed to the function unexpectedly omitted the given flags.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

532979 :orbixd is started outside HA Broadvision.Stop orbixd and other BV processes running outside HA BroadVision.

Description:

The orbix daemon was probably started outside Sun Cluster HA for BroadVision One-To-One Enterprise. There should not be any Sun Cluster HA for BroadVision One-To-One Enterprise servers or daemons running outside of Sun Cluster HA for BroadVision One-To-One Enterprise.

Solution:

Shut down the orbix daemon running outside of Sun Cluster HA for BroadVision One-To-One Enterprise, and stop all Sun Cluster HA for BroadVision One-To-One Enterprise servers running outside of Sun Cluster HA for BroadVision One-To-One Enterprise. Delete the /var/run/cluster/bv/bv_orbixd_lock_file file if it exists, and then restart the Sun Cluster HA for BroadVision One-To-One Enterprise resources again.

532980 :clcomm: Pathend %p: deferred task not allowed in state %d

Description:

The system maintains state information about a path. A deferred task is not allowed in this state.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

534512 :in libsecurity svc_tp_create failed for transport %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not create a rpc handle for the network specified. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

534826 :clexecd: Error %d from start_failfast_server

Description:

The clexecd program could not enable one of the mechanisms that causes the node to be shut down, to prevent data corruption, when the clexecd program dies.

Solution:

To avoid data corruption, system will halt or reboot the node. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

535044 :Creation of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to create a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource creation operation.
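
The candidate-node rule in the description can be sketched as follows. This is an illustration only, not the rgmd implementation: the function names are hypothetical, and the Init_nodes values RG_PRIMARIES and RT_INSTALLED_NODES are assumed here rather than taken from this guide.

```python
def candidate_nodes(init_nodes, rg_nodelist, rt_installed_nodes):
    """Pick the node list on which VALIDATE may run, based on Init_nodes."""
    if init_nodes == "RG_PRIMARIES":  # assumed value: use the resource group's Nodelist
        return list(rg_nodelist)
    if init_nodes == "RT_INSTALLED_NODES":  # assumed value: use the type's Installed_nodes
        return list(rt_installed_nodes)
    raise ValueError("unexpected Init_nodes value: %r" % (init_nodes,))

def can_validate(candidates, nodes_up):
    """True if at least one candidate node is currently up."""
    return any(node in nodes_up for node in candidates)
```

Message 535044 corresponds to the case in which can_validate would return False for every candidate.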

535181 :Host %s is not valid.

Description:

Validation method has failed to validate the IP addresses.

Solution:

Invalid hostnames/IP addresses have been specified while creating resource. Recreate the resource with valid hostnames. Check the syslog message for the specific information.

535182 :in libsecurity NETPATH=%s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network specified, because it couldn't find any transport. This happened because either there are no available transports at all, or there are but none is a loopback. The NETPATH environment variable is shown. This error message is informational, and appears together with other messages appropriate for this situation. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

535886 :Could not find a mapping for %s in %s. It is recommended that a mapping for %s be added to %s.

Description:

No mapping was found in the local hosts file for the specified IP address.

Solution:

Applications may use hostnames instead of IP addresses. It is recommended to have a mapping in the hosts file. Add an entry in the hosts file for the specified IP address.
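
Whether a mapping already exists can be checked by scanning the hosts file. The following is a minimal sketch that works on the text of a hosts file in the usual "address name [alias ...]" layout; the function name hosts_mapping and the sample contents are hypothetical.

```python
def hosts_mapping(hosts_text, ip_address):
    """Return the host names mapped to ip_address in hosts-file text, or [] if none."""
    names = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == ip_address:
            names.extend(fields[1:])
    return names

sample = "127.0.0.1 localhost\n192.168.10.5 app-lh app-lh.example.com  # logical host\n"
print(hosts_mapping(sample, "192.168.10.5"))
```

An empty result for the address named in the message means the recommended entry is still missing.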

536091 :Failed to retrieve the cluster handle while querying for property %s: %s.

Description:

Access to the object named failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

536488 :Error during reads of %s. getvfsent() returned %d.

Description:

An error occurred during the reading of the /etc/vfstab file.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

536838 :clconf: Node %d is not in the cluster.

Description:

A program was executed on a node that is not a current member of the cluster.

Solution:

The specified node needs to be rebooted.

537175 :CMM: node %s (nodeid: %ld, incarnation #: %ld) has become reachable.

Description:

The cluster can communicate with the specified node. A node becomes reachable before it is declared up and having joined the cluster.

Solution:

This is an informational message, no user action is needed.

537352 :reservation error(%s) - do_scsi3_preemptandabort() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

For the user action required by this message, see the user action for message 192619.

537380 :Invalid option -%c for the validate method.

Description:

An invalid option was passed to the validate callback method.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

537498 :Invalid value was returned for resource property %s for %s.

Description:

The value returned for the named property was not valid.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

537607 :Not found clexecd on node %d for %d seconds. Retrying ...

Description:

Could not find clexecd to execute the program on a node. The message indicates how long the search has been retried.

Solution:

This is an informational message, no user action is needed.

538656 :Restarting some BV daemons.

Description:

This message is from the Sun Cluster HA for BroadVision One-To-One Enterprise Probe. Some Sun Cluster HA for BroadVision One-To-One Enterprise daemons might not start when the data service starts. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should restart the daemons that did not start.

Solution:

Make sure the DB is available. The Sun Cluster HA for BroadVision One-To-One Enterprise daemons might not start if the DB is not available. If the DB is available, no user action is needed. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take appropriate action.

540274 :got unexpected exception %s

Description:

An inter-node communication failed with an unknown exception.

Solution:

Examine syslog output for related error messages. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file (if any). Contact your authorized Sun service provider for assistance in diagnosing the problem.

540376 :Unable to change the directory to %s: %s. Current directory is /.

Description:

The callback method failed to change the current directory. The callback methods will now be executed in "/", so any core dumps from these callbacks will be located in "/".

Solution:

No user action needed. For detailed error message, check the syslog message.

541180 :Sun udlmlib library called with unknown option: '%c'

Description:

An unknown option was used while starting up the Oracle UNIX DLM.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

541206 :Couldn't read deleted directory: error (%d)

Description:

The file system is unable to create temporary copies of deleted files.

Solution:

Mount the affected file system as a local file system, and ensure that there is no file system entry with name "._" at the root level of that file system. Alternatively, run fsck on the device to ensure that the file system is not corrupt.

541818 :Service group '%s' created

Description:

The service group by that name is now known by the scalable services framework.

Solution:

This is an informational message, no user action is needed.

543720 :Desired Primaries is %d. It should be 1.

Description:

The Desired Primaries property has an invalid value.

Solution:

An invalid value is set for the Desired Primaries property. The value should be 1. Reset the property value using scrgadm(1M).

544252 :Method <%s> on resource <%s>: Execution failed: no such method tag.

Description:

An internal error has occurred in the rpc.fed daemon which prevents method execution. This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

544380 :Failed to retrieve the resource type handle: %s.

Description:

An API operation on the resource type has failed.

Solution:

For the resource type name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.

544592 :PCSENTRY: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

546856 :CCR: Could not find the CCR transaction manager.

Description:

The CCR data server could not find the CCR transaction manager in the cluster.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.

547057 :thr_sigsetmask: %s

Description:

A server (rpc.pmfd or rpc.fed) was not able to establish a link with the failfast device because of a system error. The error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

547301 :reservation error(%s) error. Unknown internal error returned from clconf_do_execution().

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

547385 :dl_bind: bad ACK header %u

Description:

An unexpected error occurred. The acknowledgment header for the bind request (to bind to the physical device) is bad. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.

548024 :RGOffload resource cannot offload the resource group containing itself.

Description:

You may have configured an RGOffload resource to offload the resource group in which it is configured.

Solution:

Please reconfigure RGOffload resource to not offload the resource group in which it is configured.

549190 :Text server successfully started.

Description:

Sun Cluster HA for Sybase successfully started the Text Server.

Solution:

No user action required.

549709 :Local node isn't in replica nodelist of service <%s> with path <%s>. No affinity switchover can be done.

Description:

Local node does not support the replica of the service.

Solution:

No user action required.

549969 :Error doing stat on device special file <%s> corresponding to path <%s>.

Description:

The file system mount point cannot be mapped to global service correctly as stat fails on the device special file corresponding to the file system mount point.

Solution:

Check the mount point paths defined in the extension property "ServicePaths" of the SUNW.HAStorage type resource, and make sure that they refer to global file systems with correct entries in /etc/vfstab.

550143 :resource %s state on node %s change to R_OFFLINE

Description:

This is a notification from the rgmd that the resource has been brought offline on the indicated node. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

550471 :Failed to initialize the cluster handle: %s.

Description:

An API operation has failed while retrieving the cluster information.

Solution:

This may be solved by rebooting the node. For more details about API failure, check the messages from other components.

551094 :reservation warning(%s) - Unable to open device %s, will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.

551139 :Failed to initialize scalable services group: Error %d.

Description:

The data service in scalable mode was unable to register itself with the cluster networking.

Solution:

There may be prior messages in syslog indicating specific problems. Reboot the node if unable to correct the situation.

551436 :libsecurity: clnt_authenticate failed

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection, because it failed the authentication process. The pmfadm or scha command exits with error. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

551999 :All global device services successfully switched over to this node.

Description:

This informational message reports that all device switchovers were successful.

Solution:

An informational message only. No action is needed.

553376 :CMM: Open failed with error `(%s)' and errno = %d for quorum device %ld

Description:

The open operation on the specified quorum device failed, and this node will ignore the quorum device.

Solution:

The quorum device has failed or the path to this device may be broken. Refer to the quorum disk repair section of the administration guide for resolving this problem.

553652 :fatal: cannot create thread to wake up President

Description:

The rgmd was unable to create a thread upon starting up. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

556466 :clexecd: dup2 of stdout returned with errno %d while exec'ing (%s). Exiting.

Description:

clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

557585 :No filesystem mounted on %s.

Description:

There is no filesystem mounted on the mount point.

Solution:

None. This is a debug message.

558350 :Validation failed. Connect string is incomplete.

Description:

The Connect_String extension property used for fault monitoring has been incorrectly specified. This property has the format username/password.

Solution:

Verify the resource configuration and the value of the Connect_String property. Ensure that there are no spaces in the Connect_String specification.
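
The two conditions named above (the username/password format and the absence of spaces) can be checked with a short sketch. The function name check_connect_string and the sample values are hypothetical. This is not the validation code that the data service runs, only an illustration of the stated rules.

```python
def check_connect_string(value):
    """Return a list of problems found in a username/password string."""
    problems = []
    if any(ch.isspace() for ch in value):
        problems.append("contains whitespace")
    # Split on the first "/" only, so the password may itself contain "/".
    user, sep, password = value.partition("/")
    if not sep:
        problems.append("missing '/' separator")
    elif not user or not password:
        problems.append("username or password is empty")
    return problems

# Hypothetical values used only for illustration.
for candidate in ("scott/tiger", "scott", "scott/ tiger", "scott/"):
    print(repr(candidate), check_connect_string(candidate))
```

An empty list means that the string satisfies both rules. It does not confirm that the account exists in the database.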

558742 :Resource group %s is online on more than one node.

Description:

The named resource group should be online on only one node, but it is actually online on more than one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

559550 :Error in opening /etc/vfstab: %s

Description:

Failed to open /etc/vfstab. The error message follows.

Solution:

Check with system administrator and make sure /etc/vfstab is properly defined.

559614 :Resource <%s> of Resource Group <%s> failed monitor check on node <%s>\n

Description:

Message logged for failed scha_control monitor check methods on specific node.

Solution:

No user action required.

560047 :UNIX DLM version (%d) and SUN Unix DLM library version (%d): compatible.

Description:

The Unix DLM is compatible with the installed version of libudlm.

Solution:

None.

560781 :Tag %s: could not allocate history.

Description:

The rpc.pmfd server was not able to allocate memory for the history of the tag shown, probably due to low memory. The process associated with the tag is stopped and pmfadm returns error.

Solution:

562397 :Failfast: %s.

Description:

A failfast client has encountered a deferred panic timeout and is going to panic the node. This may happen if a critical userland process, as identified by the message, dies unexpectedly.

Solution:

Check for core files of the process after rebooting the node and report these to your authorized Sun service provider.

563343 :resource type %s updated.

Description:

This is a notification from the rgmd that the operator has edited a property of a resource type. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

563847 :INTERNAL ERROR: POSTNET_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

563976 :Unable to get socket flags: %s.

Description:

Failed to get status flags for the socket used in communicating with the application.

Solution:

This is an internal error, no user action is required. Also contact your authorized Sun service provider.

564771 :Error in reading /etc/vfstab: getvfsent() returns <%d>

Description:

An error occurred while reading /etc/vfstab. The return code of getvfsent() follows.

Solution:

Check with system administrator and make sure /etc/vfstab is properly defined.
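
A first check of whether /etc/vfstab is properly defined is to confirm that every entry has the seven whitespace-separated fields of the Solaris vfstab format (device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, mount options). The sketch below assumes that layout. The function name find_bad_vfstab_lines and the sample device names are hypothetical, and the script does not reproduce the checks that getvfsent() performs.

```python
VFSTAB_FIELDS = 7  # Fields per entry in the Solaris vfstab format.

def find_bad_vfstab_lines(text):
    """Return (line number, field count) for entries with a wrong field count."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comment lines are not entries.
        if not stripped or stripped.startswith("#"):
            continue
        count = len(stripped.split())
        if count != VFSTAB_FIELDS:
            bad.append((lineno, count))
    return bad

# Hypothetical vfstab content; the last entry is deliberately truncated.
sample = """\
#device to mount  device to fsck  mount point  FS type  fsck pass  mount at boot  options
/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0 / ufs 1 no -
/dev/global/dsk/d4s0 /dev/global/rdsk/d4s0 /global/nfs ufs 2 yes global,logging
/dev/dsk/c0t0d0s1 - swap
"""

print(find_bad_vfstab_lines(sample))
```

To check a real file, pass its contents, for example open("/etc/vfstab").read(), instead of the sample text.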

565159 :"pmfadm -s": Error signaling <%s>: %s

Description:

An error occurred while rpc.pmfd attempted to send a signal to one of the processes of the given tag. The reason for the failure is also given. The signal was sent as a result of a 'pmfadm -s' command.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

565198 :did subpath %s created for instance %d.

Description:

Informational message from scdidadm.

Solution:

No user action required.

565438 :svc_run returned

Description:

The rpc.pmfd server was not able to run, due to an rpc error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

565884 :tag %s: command file %s is not executable

Description:

The rpc.fed server checked the command indicated by the tag, and this check failed because the command is not executable. An error message is output to syslog.

Solution:

Check the permission mode of the command and make sure that it is executable.

565978 :Home dir is not set for user %s.

Description:

The home directory for the specified user is not set in the system.

Solution:

Ensure that the home directory is set up correctly for the specified user.

566781 :ORACLE_HOME %s does not exist

Description:

Directory specified as ORACLE_HOME does not exist. ORACLE_HOME property is specified when creating Oracle_server and Oracle_listener resources.

Solution:

Specify correct ORACLE_HOME when creating resource. If resource is already created, please update resource property 'ORACLE_HOME'.

567374 :Failed to stop %s.

Description:

Sun Cluster failed to stop the application.

Solution:

Use process monitor facility (pmfadm (1M)) with -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. This can be easily identified as the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm (1M) with -s option to stop the application. This problem may occur when the cluster is under load and Sun Cluster cannot stop the application within the timeout period specified. You may consider increasing the Stop_timeout property. If the error still persists, then reboot the node.
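
Picking the application tag out of the list of running tags can be sketched as a simple filter based on the rule given above: the tag ends in ".svc" and contains the resource group name and the resource name. The function name find_service_tags, the sample tag layout, and the sample names are hypothetical. The sketch works on a list of tag strings that has already been collected and does not invoke pmfadm itself.

```python
def find_service_tags(tags, rg_name, rs_name):
    """Return tags that end in '.svc' and mention both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc") and rg_name in tag and rs_name in tag
    ]

# Hypothetical tag names; the real layout may differ.
tags = [
    "oracle-rg,oracle-rs,0.svc",
    "oracle-rg,oracle-rs,0.mon",
    "nfs-rg,nfs-rs,0.svc",
]

print(find_service_tags(tags, "oracle-rg", "oracle-rs"))
```

Because the match is a plain substring test, a resource name that is a prefix of another resource name can match more than one tag, so review the result before stopping anything.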

567610 :PARAMTER_FILE %s does not exist

Description:

Oracle parameter file (typically init.ora) specified in property 'Parameter_file' does not exist or is not readable.

Solution:

Please make sure that 'Parameter_file' property is set to the existing Oracle parameter file. Reissue command to create/update the resource using correct 'Parameter_file'.

567819 :clcomm: Fixed size resource_pool short server threads: pool %d for client %d total %d

Description:

The system can create a fixed number of server threads dedicated for a specific purpose. The system expects to be able to create this fixed number of threads. The system could fail under certain scenarios without the specified number of threads. The server node creates these server threads when another node joins the cluster. The system cannot create a thread when there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Application memory usage could be a factor, if the error occurs when a node joins an operational cluster and not during cluster startup.

568162 :Unable to create failfast thread

Description:

A server (rpc.pmfd or rpc.fed) was not able to start because it was not able to create the failfast thread, which ensures that the host aborts if the server dies. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

568314 :Failed to remove node %d from scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

569559 :Start of %s completed successfully.

Description:

The start command of the application completed successfully.

Solution:

No action required.

570394 :reservation warning(%s) - USCSI_RESET failed for device %s, returned %d, will retry in %d seconds.

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.

570802 :fatal: Got error <%d> trying to read CCR when disabling monitor of resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

571642 :ucm_callback for cmmreturn generated exception %d

Description:

ucmm callback for step cmmreturn failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

571734 :Validation failed. ORACLE_SID is not set

Description:

ORACLE_SID property for the resource is not set. HA-Oracle will not be able to manage Oracle server if ORACLE_SID is incorrect.

Solution:

Specify correct ORACLE_SID when creating resource. If resource is already created, please update resource property 'ORACLE_SID'.

571825 :Stopping listener %s.

Description:

Informational message. HA-Oracle will be stopping Oracle listener.

Solution:

None

571950 :Fault monitor detected error %s: %ld Action=%s : %s

Description:

The fault monitor has detected an error. The message indicates the error that was detected and the action that the fault monitor took.

Solution:

None

572885 :Pathprefix %s for resource group %s is not readable: %s.

Description:

A HA-NFS method attempted to access the specified Pathprefix but was unable to do so. The reason for the failure is also logged.

Solution:

This could happen if the filesystem on which the Pathprefix directory resides is not available. Use the HAStorage resource in the resource group to make sure that HA-NFS methods have access to the file system at the time when they are launched. Check to see if the pathname is correct and correct it if not. HA-NFS would attempt to recover from this situation by failing over to some other node.

572955 :host %s: client is null

Description:

The rgm is not able to obtain an rpc client handle to connect to the rpc.fed server on the named host. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

574542 :clexecd: fork1 returned %d. Exiting.

Description:

clexecd program has encountered a failed fork1(2) system call. The error message indicates the error number for the failure.

Solution:

If the error number is 12 (ENOMEM), install more memory, increase swap space, or reduce peak memory consumption. If error number is something else, contact your authorized Sun service provider to determine whether a workaround or patch is available.
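
Translating the error number in the message into its symbolic name makes the choice between the two branches above explicit. The sketch below uses the standard errno and os modules. The function names describe_errno and fork_failure_action are hypothetical, and the numbers are interpreted with the tables of the system where the script runs, which is assumed to match the node that logged the message.

```python
import errno
import os

def describe_errno(num):
    """Return (symbolic name, description) for an error number."""
    return errno.errorcode.get(num, "UNKNOWN"), os.strerror(num)

def fork_failure_action(num):
    """Map a fork1 error number to the action the Solution suggests."""
    if num == errno.ENOMEM:
        return "install more memory, increase swap space, or reduce peak memory consumption"
    return "contact your authorized Sun service provider"

name, text = describe_errno(errno.ENOMEM)
print(name, "-", text)
print(fork_failure_action(errno.ENOMEM))
```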

574675 :nodeid of ctxp is bad: %d

Description:

nodeid in the context pointer is bad.

Solution:

None. udlm takes appropriate action.

575545 :fatal: rgm_chg_freeze: INTERNAL ERROR: invalid value of rgl_is_frozen <%d> for resource group <%s>

Description:

The in-memory state of the rgmd has been corrupted due to an internal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

575853 :libsecurity: create of rpc handle to program %ld failed, will keep trying

Description:

A client of the rpc.pmfd, rpc.fed or rgmd server was unable to initiate an rpc connection. The maximum time allowed for connecting (1 hr) has not been reached yet, and the pmfadm or scha command will retry to connect. An accompanying error message shows the rpc error data. The program number is shown. To find out what program corresponds to this number, use the rpcinfo command. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

575875 :CMM: Resetting bus for quorum device %s failed with error %d.

Description:

When a node connected to a quorum device goes down, the surviving node tries to reset the device's bus. That reset operation for the specified quorum device failed with the indicated error.

Solution:

Check to see if the disk identified above is accessible from the node the message was seen on. If it is accessible, then contact your authorized Sun service provider to determine whether a workaround or patch is available.

576196 :clcomm: error loading kernel module: %d

Description:

The loading of the cl_comm module failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

576621 :Command %s does not have execute permission set.

Description:

The specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start, refers to a file that does not have execute permission set. This could be the result of 1) incorrectly configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
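
The condition above (a regular file with execute permission set) can be tested with a short sketch that uses only the standard os and stat modules. The function name check_executable is hypothetical, and the check is an approximation: it looks at the permission bits and does not reproduce what the libdsdev routines do internally.

```python
import os
import stat

def check_executable(path):
    """Return 'ok' or a short reason why path is not a usable command."""
    try:
        st = os.stat(path)
    except OSError as exc:
        return f"cannot stat: {exc.strerror}"
    if not stat.S_ISREG(st.st_mode):
        return "not a regular file"
    # Any of the user, group, or other execute bits counts as set.
    if not st.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "no execute permission bit set"
    return "ok"

print(check_executable(os.curdir))  # A directory is not a regular file.
```

The same check applies to message 565884, where rpc.fed reports that a command file is not executable.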

576744 :INTERNAL ERROR: Invalid resource property type <%d> on resource <%s>

Description:

An attempted creation or update of a resource has failed because of invalid resource type data. This may indicate CCR data corruption or an internal logic error in the rgmd.

Solution:

Use scrgadm(1M) -pvv to examine resource properties. If the resource or resource type properties appear to be corrupted, the CCR might have to be rebuilt. If values appear correct, this may indicate an internal error in the rgmd. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

577140 :clcomm: Exception during unmarshal_receive

Description:

The server encountered an exception while unmarshalling the arguments for a remote invocation. The system prints the exception causing this error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

579190 :INTERNAL ERROR: resource group <%s> state <%s> node <%s> contains resource <%s> in state <%s>

Description:

The rgmd has discovered that the indicated resource group's state information appears to be incorrect. This may prevent any administrative actions from being performed on the resource group.

Solution:

579235 :Method <%s> on resource <%s> terminated abnormally

Description:

A resource method terminated without using an exit(2) call. The rgmd treats this as a method failure.

Solution:

Consult resource type documentation, or contact the resource type developer for further information.

579987 :Error binding '%s' in the name server. Exiting.

Description:

clexecd program was unable to start because of some problems in the low-level clustering software.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

580163 :reservation warning(%s) - MHIOCTKOWN error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message, no user action is needed.

580416 :Cannot restart monitor: Monitor is not enabled.

Description:

An update operation on the resource would have restarted the fault monitor, but the monitor is currently disabled for the resource.

Solution:

This is an informational message. Check whether the monitor is disabled for the resource. If it is not, treat this as an internal error and contact your authorized Sun service provider.

581180 :launch_validate: call to rpc.fed failed for resource <%s>, method <%s>

Description:

The rgmd failed in an attempt to execute a VALIDATE method, due to a failure to communicate with the rpc.fed daemon. If the rpc.fed process died, this might lead to a subsequent reboot of the node. Otherwise, this will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

581376 :clcomm: solaris xdoor: too much reply data

Description:

The reply from a user level server will not fit in the available space.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

581413 :Daemon %s is not running.

Description:

The HA-NFS fault monitor checks the health of the statd, lockd, mountd and nfsd daemons on the node. It detected that one of these is not currently running.

Solution:

No action. The monitor would restart these.

581898 :Application failed to stay up. Start method Failure.

Description:

The application being started under pmf has exited. Either the user has decided to stop monitoring this process, or the process exceeded the number of retries. An error message is output to syslog.

Solution:

Check syslog messages and correct the problems specified in prior syslog messages. If the error still persists, please report this problem.

581902 :(%s) invalid timeout '%d'

Description:

Invalid timeout value for a method.

Solution:

Make sure udlm.conf file has correct timeouts for methods.

582418 :Validation failed. SYBASE ASE startserver file not found SYBASE=%s.

Description:

The Sybase Adaptive Server starts by the execution of the startserver file. This file is missing. The SYBASE directory is specified as a part of this message.

Solution:

Verify the Sybase installation including the existence and proper permissions of the startserver file in the $SYBASE/$SYBASE_ASE/install directory.

582651 :tag %s: does not belong to caller

Description:

The user sent a suspend/resume command to the rpc.fed server for a tag that was started by a different user. An error message is output to syslog.

Solution:

Check the tag name.

582757 :No PDT Fastpath thread.

Description:

The system has run out of the resources that are required to create a thread. The system could not create the Fastpath thread that is required for cluster networking.

Solution:

If cluster networking is required, add more resources (most probably, memory) and reboot.

583138 :dfstab not readable

Description:

HA-NFS fault monitor failed to read dfstab when it detected that dfstab has been modified.

Solution:

Make sure the dfstab file exists and has read permission set appropriately. Look at the prior syslog messages for any specific problems and correct them.

583224 :Rebooting this node because daemon %s is not running.

Description:

The rpcbind daemon on this node is not running.

Solution:

No user action required. Fault monitor should reboot the node. Also see message id 804791.

583542 :clcomm: Pathend: would abort node because %s for %u ms

Description:

The system would have aborted the node for the specified reason if the check for send thread running was enabled.

Solution:

No user action is required.

583563 :fatal: rgm_run_state: internal error: bad state <%d> for resource group <%s>

Description:

An internal error has occurred. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

584386 :PENDING_OFFLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

586298 :clcomm: unknown type of signals message %d

Description:

The system has received a signals message of unknown type.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

586344 :clcomm: unable to unbind %s from name server

Description:

The name server would not unbind the specified entity.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

586689 :Cannot access the %s command <%s> : <%s>.

Description:

The command input to the agent builder is not accessible and executable. This may be due to the program not existing or the permissions not being set properly.

Solution:

Make sure the program in the command exists, is in the proper directory, and has read and execute permissions set appropriately.
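The checks named in this solution can be sketched as a short script. This is only an illustration: the function name check_command is hypothetical and is not part of the agent builder, and the path to test is whatever command was supplied to the agent builder.

```python
import os


def check_command(path):
    """Return a list of problems that would make the command unusable.

    Hypothetical helper for illustration; an empty list means the file
    exists and is both readable and executable by the calling user.
    """
    problems = []
    if not os.path.isfile(path):
        # Covers a missing program and a wrong directory alike.
        problems.append("does not exist or is not a regular file")
        return problems
    if not os.access(path, os.R_OK):
        problems.append("no read permission")
    if not os.access(path, os.X_OK):
        problems.append("no execute permission")
    return problems
```

Run the check as the same user that runs the agent builder, because os.access reports permissions for the calling user only.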

589719 :Issuing failover request.

Description:

This is an informational message. The API function is about to be called to request a failover. If the request fails, check the syslog messages that follow this message.

Solution:

No user action is needed.

589817 :clcomm: nil_sendstream::send

Description:

The system attempted to use a "send" operation for a local invocation. Local invocations do not use a "send" operation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

590263 :Online check Error %s : %ld

Description:

An error was detected when checking the ONLINE status of the RDBMS. The error number is indicated in the message. This can be caused by RDBMS server problems or by configuration problems.

Solution:

Check the RDBMS server using vendor-provided tools. If the server is running properly, the cause might be an error in the fault monitor setup.

590454 :TCPTR: Machine with MAC address %s is using cluster private IP address %s on a network reachable from me. Path timeouts are likely.

Description:

The transport at the local node detected an arp cache entry that showed the specified MAC address for the above IP address. The IP address is in use at this cluster on the private network. However, the MAC address is a foreign MAC address. A possible cause is that this machine received an ARP request from another machine that does not belong to this cluster, but hosts the same IP address using the above MAC address on a network accessible from this machine. The transport has temporarily corrected the problem by flushing the offending arp cache entry. However, unless corrective steps are taken, TCP/IP communication over the relevant subnet of the private interconnect might break down, thus causing path downs.

Solution:

Make sure that no machine outside this cluster hosts this IP address on a network reachable from this cluster. If there are other clusters sharing a public network with this cluster, please make sure that their private network adapters are not miscabled to the public network. By default all clusters use the same set of IP addresses on their private networks.
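The condition described above, a cluster private IP address that appears in the ARP cache with a MAC address that does not belong to the cluster, can be expressed as a simple comparison. The sketch below is not how the transport implements the check. The function name, the data layout, and the sample addresses are all assumptions made for illustration, and the ARP entries are expected to have been collected separately.

```python
def find_foreign_entries(arp_entries, cluster_macs_by_ip):
    """Return (ip, mac) pairs where a cluster private IP maps to a foreign MAC.

    arp_entries:        iterable of (ip, mac) pairs taken from an ARP cache.
    cluster_macs_by_ip: dict mapping each cluster private IP address to the
                        set of lowercase MAC addresses known to host it.
    Both structures are hypothetical and exist only for this example.
    """
    foreign = []
    for ip, mac in arp_entries:
        expected = cluster_macs_by_ip.get(ip)
        # Only addresses used on the private network are of interest.
        if expected is not None and mac.lower() not in expected:
            foreign.append((ip, mac))
    return foreign


# Illustrative data only; these addresses are made up.
known = {"172.16.0.129": {"08:00:20:aa:bb:01"}}
cache = [
    ("172.16.0.129", "08:00:20:AA:BB:01"),  # matches the cluster adapter
    ("172.16.0.129", "00:03:ba:11:22:33"),  # foreign machine
    ("192.168.1.5", "00:03:ba:44:55:66"),   # not a private address, ignored
]
print(find_foreign_entries(cache, known))
```

Any pair that the function returns points at a machine outside the cluster that should be traced and recabled or readdressed.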

590700 :ALERT_LOG_FILE %s doesn't exist

Description:

The file specified in the resource property 'Alert_log_file' does not exist. HA-Oracle requires the correct Alert Log file for fault monitoring.

Solution:

Check the 'Alert_log_file' property of the resource. Specify the correct Oracle Alert Log file when creating the resource. If the resource has already been created, update the resource property 'Alert_log_file'.

592233 :setrlimit(RLIMIT_NOFILE): %s

Description:

The rpc.pmfd server was not able to set the limit on the number of open files. The message contains the system error. This happens while the server is starting up, at boot time. The server does not come up, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.
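For readers unfamiliar with the system call named in this message, the following sketch shows the same kind of operation from Python's standard resource module. It does not reproduce what rpc.pmfd does internally; the function name and the returned text are assumptions made for illustration.

```python
import resource


def set_open_file_limit(requested_soft):
    """Try to set the soft RLIMIT_NOFILE limit; return (ok, detail).

    Hypothetical helper. The hard limit is left unchanged, so an
    unprivileged process can only choose a soft limit up to that value.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (requested_soft, hard))
    except (ValueError, OSError) as err:
        # Mirrors the shape of the syslog message: call name, then the error.
        return False, "setrlimit(RLIMIT_NOFILE): %s" % err
    return True, "soft limit set to %d (hard limit %d)" % (requested_soft, hard)


current_soft, current_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(set_open_file_limit(current_soft))
```

The error text that follows the call name in the syslog message corresponds to the err value in this sketch, which is why the message should be saved along with /var/adm/messages.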

592285 :clexecd: getrlimit returned %d

Description:

The clexecd program has encountered a failed getrlimit(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

592378 :Resource %s is not online anywhere.

Description:

The named resource is not online on any cluster node.

Solution:

None. This is an informational message.

593330 :Resource type name is null.

Description:

This is an internal error. While attempting to retrieve the resource information, null value was retrieved for the resource type name.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

594629 :Failed to stop the fault monitor.

Description:

Process monitor facility has failed to stop the fault monitor.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource. The tag is easy to identify because it ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. If the error persists, reboot the node.
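The tag selection rule in this solution (ends in ".mon", contains the resource group name and the resource name) can be sketched as a filter over a list of tag names. The function name and the sample tags below are made up for illustration, and the exact layout of pmfadm output is not assumed: the tag names are expected to have been extracted from it beforehand.

```python
def find_monitor_tags(tags, group, resource_name):
    """Return the tags that look like the fault monitor tag of a resource.

    Hypothetical helper: applies only the rule stated in the solution,
    namely a ".mon" suffix plus both names appearing in the tag.
    """
    return [
        tag
        for tag in tags
        if tag.endswith(".mon") and group in tag and resource_name in tag
    ]


# Made-up tag names; real tags on a cluster node may look different.
sample_tags = [
    "nfs-rg,nfs-res,0.svc",
    "nfs-rg,nfs-res,0.mon",
    "ora-rg,ora-res,0.mon",
]
print(find_monitor_tags(sample_tags, "nfs-rg", "nfs-res"))
```

If more than one tag matches, compare them by hand before stopping anything with the -s option.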

594675 :reservation warning(%s) - MHIOCGRP_REGISTER error will retry in %d seconds

Description:

The device fencing program has encountered errors while trying to access a device. The failed operation will be retried.

Solution:

This is an informational message. No user action is needed.

595077 :Error in hasp_check. Validation failed.

Description:

An internal error occurred in hasp_check.

Solution:

Check the errors that hasp_check logged in the syslog messages, and then verify the existence of the /usr/cluster/bin/hasp_check binary. Report this problem to your Sun service provider.

595101 :t_sndudata in send_reply: %s

Description:

The call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. udlm will try to resend the message.

Solution:

None.

595686 :%s is %d for %s. It should be 1.

Description:

The named property has an unexpected value.

Solution:

Change the value of the property to be 1.

596447 :UNIX DLM is asking for a reconfiguration to recover from a communication error.

Description:

A reconfiguration has been requested by udlm.

Solution:

None.

596604 :clcomm: solookup on routing socket failed with error = %d

Description:

The system prepares IP communications across the private interconnect. A lookup operation on the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

597171 :Unexpected early exit while performing: '%s'

Description:

The clexecd program encountered an error while executing the program indicated in the error message.

Solution:

Please check the error message. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

597239 :The weight portion of %s at position %d in property %s is not a valid weight. The weight should be an integer between %d and %d.

Description:

The weight noted does not have a valid value. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Give the weight a valid value.
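The description above amounts to a range check on each element of a property list, reported by zero-based position. A minimal sketch follows. It assumes, purely for illustration, that each element is written as "weight@node"; the function name and the bounds used in the example are likewise hypothetical, since the real bounds are the two %d values printed in the message.

```python
def invalid_weights(entries, min_weight, max_weight):
    """Return (position, entry) pairs whose weight portion is not valid.

    Assumes each entry has the form "weight@node" (an assumption made for
    this example). Positions start at 0, as in the message text.
    """
    bad = []
    for position, entry in enumerate(entries):
        weight_text = entry.split("@", 1)[0]
        try:
            weight = int(weight_text)
        except ValueError:
            # Not an integer at all, for example "x" or "3.5".
            bad.append((position, entry))
            continue
        if not min_weight <= weight <= max_weight:
            bad.append((position, entry))
    return bad


# Illustrative list and bounds only.
print(invalid_weights(["3@1", "x@2", "500@3"], 1, 100))
```

Each reported position matches the %d position in the syslog message, so the offending element can be corrected directly in the property value.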

597381 :setrlimit before exec: %s

Description:

rpc.pmfd was unable to set the number of file descriptors before executing a process.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

598087 :PCWSTOP: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

598259 :scvxvmlg fatal error - ckmode received unknown mode %d

Description:

Solution:

598540 :clcomm: solaris xdoor: completed invo: door_return returned, errno = %d

Description:

An unusual but harmless event occurred. System operations continue unaffected.

Solution:

No user action is required.

598554 :launch_validate_method: getlocalhostname() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to obtain the name of the local host, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

598979 :tag %s: already suspended

Description:

The user sent a suspend command to the rpc.fed server for a tag that is already suspended. An error message is output to syslog.

Solution:

Check the tag name.

599430 :Failed to retrieve the resource property %s: %s.

Description:

An API operation has failed while retrieving the resource property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components. For the resource name and property name, check the current syslog message.

599558 :SIOCLIFADDIF of %s failed: %s.

Description:

The specified system operation failed.

Solution:

This is an internal error. Contact your authorized Sun service provider with the following information. 1) Saved copy of /var/adm/messages file. 2) Output of "ifconfig -a" command.