Sun Cluster 3.1 10/03 Error Messages Guide

Message IDs 600000–699999


600967 Could not allocate buffer for DBMS log messages: %m

Description:

The fault monitor could not allocate memory for reading the RDBMS log file. As a result, the fault monitor will not scan the log file for errors. However, it will continue fault monitoring.

Solution:

Check whether the system is low on memory. If the problem persists, stop and start the fault monitor.


601852 rpcbind is not responding, however /tmp/portmap.file and /tmp/rpcbind.file exist. rpcbind can be restarted with /usr/sbin/rpcbind -w. Not taking any action.


601901 Failed to retrieve the resource property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


603096 resource %s disabled.

Description:

This is a notification from the rgmd that the operator has disabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


603149 Error reading file %s

Description:

Specified file could not be read or opened.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified.


603490 Only a single path to the WLS Home directory has to be set in Confdir_list

Description:

Only a single path to the WLS home directory may be set in the Confdir_list property. Resource creation fails if multiple home directories are configured in Confdir_list.

Solution:

The Confdir_list extension property takes a single path to the WLS home directory. Set a single path and create the resource again.


604153 clcomm: Path %s errors during initiation

Description:

Communication could not be established over the path. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


604784 Probing SAP xserver timed out with command %s.

Description:

Probing the SAP xserver with the listed command timed out.

Solution:

Other syslog messages occurring just before this one might indicate the reason for the failure. Consider increasing the timeout value for the method that generated the error.


605102 This node can be a primary for scalable resource %s, but there is no IPMP group defined on this node. A IPMP group must be created on this node.

Description:

The node does not have an IPMP group defined.

Solution:

Any adapters on the node that are connected to the public network should be put under IPMP control by placing them in an IPMP group. See the ifconfig(1M) man page for details.


605301 lkcm_sync: invalid handle was passed %s%d

Description:

Invalid handle passed during lockstep execution.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


605330 clapi_mod: Class<%s> SubClass<%s> Pub<%s> Seq<%lld>

Description:

The clapi_mod in the syseventd is delivering the specified event to its subscribers.

Solution:

This message is informational only, and does not require user action.


606138 SCDPMD Error: ${SERVER} is already running.

Description:

The scdpmd init script found the daemon scdpmd already running. It will not start it again.

Solution:

No action required.


606203 Couldn't get the root vnode: error (%d)

Description:

The file system is corrupt or was not mounted correctly.

Solution:

Run fsck, and mount the affected file system again.


606362 The stop command <%s> failed to stop the application. Will now use SIGKILL to stop the application.

Description:

The stop command was unable to stop the application. The STOP method will now stop the application by sending it SIGKILL.

Solution:

Look at application and system logs for the cause of the failure.


606362 The stop command <%s> failed to stop the application. Will now use SIGKILL to stop the application.

Description:

The user provided stop command cannot stop the application. Will re-attempt to stop the application by sending SIGKILL to the pmf tag.

Solution:

No action required.


606467 CMM: Initialization for quorum device %s failed with error EACCES. Will retry later.

Description:

This node is not able to access the specified quorum device because the node is still fenced off. An attempt will be made to access the quorum device again after the node's CCR has been recovered.

Solution:

This is an informational message, no user action is needed.


607054 %s not found.

Description:

Could not find the binary to start up udlm.

Solution:

Make sure the UNIX DLM package is installed properly.


607613 transition '%s' timed out for cluster, as did attempts to reconfigure.

Description:

Step transition failed. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


607678 clconf: No valid quorum_resv_key field for node %u

Description:

The quorum_resv_key field was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


607726 sysevent_subscribe_event(): %s

Description:

The cl_apid or cl_eventd was unable to create the channel by which it receives sysevent messages. It will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


608202 scha_control: resource group <%s> was frozen on Global_resources_used within the past %d seconds; exiting

Description:

A scha_control call has failed with a SCHA_ERR_CHECKS error because the resource group has a non-null Global_resources_used property, and a global device group was failing over within the indicated recent time interval. The resource fault probe is presumed to have failed because of the temporary unavailability of the device group. A properly written resource monitor, upon getting the SCHA_ERR_CHECKS error code from a scha_control call, should sleep for a while and then restart its probes.

Solution:

No user action is required. Either the resource should become healthy again after the device group is back online, or a subsequent scha_control call should succeed in failing over the resource group to a new master.
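
The monitor behavior described for this message can be sketched as follows. This is a minimal illustration in Python, not the RGM API: request_failover is a hypothetical callable standing in for whatever wrapper a monitor uses around scha_control, the error-code strings are stand-ins for the real codes, and the 30-second back-off is an arbitrary assumption.

```python
import time

# Hypothetical stand-ins; the real error codes come from the scha_control API.
SCHA_ERR_NOERR = "SCHA_ERR_NOERR"
SCHA_ERR_CHECKS = "SCHA_ERR_CHECKS"


def handle_probe_failure(request_failover, backoff_seconds=30, sleep=time.sleep):
    """React to a failed probe.

    request_failover: hypothetical callable wrapping the scha_control call;
        it returns one of the error-code strings above.
    Returns "failover_requested" if the call succeeded, "resume_probing" if
    the resource group was frozen and the monitor should probe again, and
    "error" for any other result.
    """
    result = request_failover()
    if result == SCHA_ERR_NOERR:
        return "failover_requested"
    if result == SCHA_ERR_CHECKS:
        # The group is frozen (for example, a device group is failing over).
        # Sleep for a while, then let the caller restart its probes.
        sleep(backoff_seconds)
        return "resume_probing"
    return "error"
```

Passing sleep as a parameter keeps the sketch easy to exercise without real delays; a real monitor would simply use the default.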


608286 Stopping the text server.

Description:

The Text server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


608453 failfast disarm error: %d

Description:

Error during a failfast device disarm operation.

Solution:

None.


608876 PCRUN: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


609118 Error creating deleted directory: error (%d)

Description:

While mounting this file system, PXFS was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28(ENOSPC), then mount this FS non-globally, make some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
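
Several messages in this range report a bare error number, as in "error (%d)". The following is a small sketch, assuming Python is available on the node, of how such a number can be translated into its symbolic name and description. Errno values are platform-specific, so it should be run on the affected system; describe_errno is a name chosen here for illustration only.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: description' for a numeric errno value."""
    # errno.errorcode maps numbers to symbolic names on the current platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Error 28 is the value this entry identifies as ENOSPC.
print(describe_errno(28))
```

The description text comes from the operating system, so its exact wording varies between platforms.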


610273 IPMP group %s has failed, so scalable resource %s in resource group %s may not be able to respond to client requests. A request will be issued to relocate resource %s off of this node.

Description:

The named IPMP group has failed, so the node may not be able to respond to client requests. It would be desirable to move the resource to another node that has functioning IPMP groups. A request will be issued on behalf of this resource to relocate the resource to another node.

Solution:

Check the status of the IPMP group on the node. Try to fix the adapters in the IPMP group.


611101 ucmmd startup program %s not found

Description:

Unable to locate the required program indicated in the message. The ucmmd daemon and the RAC framework are not started on this node. Oracle Parallel Server/Real Application Clusters database instances will not be able to start on this node.

Solution:

Verify the installation of the SUNWscucm package. Refer to the documentation of Sun Cluster support for Oracle Parallel Server/Real Application Clusters for the installation procedure. If the problem persists, contact your Sun service representative.


612049 resource <%s> in resource group <%s> depends on disabled network address resource <%s>

Description:

An enabled application resource was found to implicitly depend on a network address resource that is disabled. This error is non-fatal but may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


612117 Failed to stop Text server.

Description:

Sun Cluster HA for Sybase failed to stop text server.

Solution:

Check whether any Sybase server processes are still running on the server, and shut the server down manually.


612124 Volume configuration daemon not running.

Description:

Volume manager is not running.

Solution:

Bring up the volume manager.


612562 Error on line %ld

Description:

Indicates the line number on which the error occurred.

Solution:

Please ensure that all entries in the custom monitor action file are valid and follow the correct syntax. After the file is corrected, repeat the operation that was being performed.


612931 Unable to get device major number for %s driver: %s.

Description:

The system was unable to translate the given driver name into a device major number.

Solution:

Check whether the /etc/name_to_major file is corrupted. Reboot the node if problem persists.


613458 All devices services started successfully.

Description:

All device services specified directly or indirectly via the GlobalDevicePath and FilesystemMountPoint extension properties respectively are started on a given node.

Solution:

This is an informational message, no user action is needed.


613522 clexecd: Error %d from poll. Exiting.

Description:

clexecd program has encountered a failed poll(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


613896 INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_OFFLINE or PENDING_DISABLED resource group

Description:

The rgmd is attempting to bring a resource group offline on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


613984 scha_control: request failed because the given resource group <%s> does not contain the given resource <%s>

Description:

A resource monitor (or some other program) is attempting to initiate a restart or failover on the indicated resource and group by calling scha_control(1ha),(3ha). However, the indicated resource group does not contain the indicated resource, so the request is rejected. This represents a bug in the calling program.

Solution:

The resource group may be restarted manually on the same node or switched to another node by using scswitch(1M) or the equivalent GUI command. Contact the author of the data service (or of whatever program is attempting to call scha_control) and report the error.


614706 Failed to create directory: <%s>

Description:

Not available at this time.

Solution:

Not available at this time.


615120 fatal: unknown scheduling class '%s'

Description:

An internal error has occurred. The daemon indicated in the message tag (rgmd or ucmmd) will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


615814 INITUCMM Notice: not OK to join, retry in ${RETRY_INTERVAL} seconds (

Description:

This is an informational message. It can be seen when ucmm reconfiguration is in progress on other cluster nodes. The operation will be retried.

Solution:

This message is informational; no user action is needed.


616999 did reconfiguration discovered invalid diskpath This path must be removed before a new path can be added. Please run did cleanup (-C) then re-run did reconfiguration (-r).\n

Description:

During scdidadm -r reconfiguration, a non-existent diskpath was found in the current namespace. This must be cleaned up before any new subpath can be added by scdidadm.

Solution:

Run devfsadm -C, then scdidadm -C, and then re-run scdidadm -r.


617643 Unable to fork(): %s.

Description:

Upon an IPMP failure, the system was unable to take any action, because it failed to fork another process.

Solution:

This might result from a lack of system resources. Check whether the system is low on memory or the process table is full, and take appropriate action. For specific error information, check the syslog message.


617917 Initialization failed. Invalid command line %s %s

Description:

Unable to process parameters passed to the call back method. This is an internal error.

Solution:

Please report this problem.


618107 Path %s initiation encountered errors, errno = %d. Remote node may be down or unreachable through this path.

Description:

Communication with another node could not be established over the path.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


618466 Unix DLM no longer running

Description:

UNIX DLM is expected to be running, but is not. This will result in a udlmstep1 failure.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


618585 clexecd: getmsg returned %d. Exiting.

Description:

clexecd program has encountered a failed getmsg(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


618637 The port number %d from entry %s in property %s was not found in config file <%s>.

Description:

All entries in the list property must have port numbers that correspond to ports configured in the configuration file. The port number from the list entry does not correspond to a port in the configuration file.

Solution:

Remove the entry or change its port number to correspond to a port in the configuration file.


618764 fe_set_env_vars() failed for Resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for a method execution, causing the method invocation to fail. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


619171 Failed to retrieve information for user %s for SAP system %s.

Description:

Failed to retrieve home directory for the specified SAP user for the specified system ID.

Solution:

Check the system ID for SAP. SAPSID is case sensitive.


619184 %s: Unable to register callback function.

Description:

The daemon has encountered an error in RPC.

Solution:

Not available at this time.


619213 t_alloc (recv_request) failed with error %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


619312 "%s" restarting too often ... sleeping %d seconds.

Description:

The tag shown, run by rpc.pmfd server, is restarting and exiting too often. This means more than once a minute. This can happen if the application is restarting, then immediately exiting for some reason, then the action is executed and returns OK (0), which causes the server to restart the application. When this happens, the rpc.pmfd server waits for up to 1 min before it restarts the application. An error message is output to syslog.

Solution:

Examine the state of the application, try to figure out why the application doesn't stay up, and yet the action returns OK.
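
The throttling described for this message can be pictured with a short sketch. This is only one plausible reading of the description, not the rpc.pmfd implementation: it assumes the server enforces a minimum of 60 seconds between starts and sleeps for whatever remains of that interval. The function name and parameters are chosen here for illustration.

```python
MIN_RESTART_INTERVAL = 60  # seconds; "more than once a minute" per the description


def restart_delay(last_start, now, min_interval=MIN_RESTART_INTERVAL):
    """Seconds to wait before restarting so starts are at least min_interval apart.

    last_start and now are timestamps in seconds (for example, from time.time()).
    """
    elapsed = now - last_start
    if elapsed >= min_interval:
        # The application stayed up long enough; restart immediately.
        return 0
    # Restarting too often; wait out the rest of the interval.
    return min_interval - elapsed
```

Under this reading, an application that exits 10 seconds after it was started is held back for the remaining 50 seconds, which matches the "up to 1 min" wait in the description.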


620204 Failed to start scalable service.

Description:

Unable to configure service for scalability.

Solution:

The start method on this node will fail. Sun Cluster resource management will attempt to start the service on some other node.


621686 CCR: Invalid checksum length %d in table %s, expected %d.

Description:

The checksum of the indicated table has a wrong size. This causes the consistency check of the indicated table to fail.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


622387 const char *fmt

Description:

Function definition. Please ignore.

Solution:

None


623528 clcomm: Unregister of adapter state proxy failed

Description:

The system failed to unregister an adapter state proxy.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


623635 Warning: Failed to configure client affinity for group %s: %s

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


623759 svc_setschedprio: Could not lookup RT (real time) scheduling class info: %s

Description:

The server was not able to determine the scheduling mode info, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


624265 Text server terminated.

Description:

Text server processes were stopped in STOP method.

Solution:

None


624447 fatal: sigaction: %s (UNIX errno %d)

Description:

The rgmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


626478 stat of file %s failed: <%s>.

Description:

There was a failure to stat the specified file.

Solution:

Make sure that the file exists.


627610 clconf: Invalid clconf_obj type

Description:

An invalid clconf_obj type has been encountered while converting a clconf_obj type to a group name. Valid objtypes are "CL_CLUSTER", "CL_NODE", "CL_ADAPTER", "CL_PORT", "CL_BLACKBOX", "CL_CABLE", "CL_QUORUM_DEVICE".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


628771 CCR: Can't read CCR metadata.

Description:

Reading the CCR metadata failed on this node during the CCR data server initialization.

Solution:

There may be other related messages on this node, which may help diagnose the problem. For example: If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, then boot this node in -x mode to restore the cluster repository from backup or other nodes in the cluster. The cluster repository is located at /etc/cluster/ccr/.


629154 Validation failed. Resource group property RG_AFFINITIES should specify a SCALABLE resource group containing the RAC framework resources

Description:

The resource being created or modified must belong to a group that has an affinity with the SCALABLE RAC framework resource group.

Solution:

If not already created, create the RAC framework resource group and its associated resources. Then specify the RAC resource group in the RG_AFFINITIES property of this resource's group.


629584 pthread_mutex_init: %s

Description:

The cl_apid was unable to initialize a synchronization object, so it was not able to start up. The error message is specified.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


630250 in libsecurity for program %s (%lu); setnetpath failed: %s

Description:

The specified server was not able to initiate an rpc connection, because it could not get the network database handle. The server does not start. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


630462 Error trying to get logical hostname: <%s>.

Description:

There was an error while trying to get the logical hostname. The reason for the error is specified.

Solution:

Save a copy of /var/adm/messages from all nodes of the cluster and contact your Sun support representative for assistance.


630653 Failed to initialize DCS

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


630971 No memory

Description:

Not available at this time.

Solution:

Not available at this time.


631373 Fatal error; aborting the rpc.fed daemon.

Description:

The rpc.fed server experienced an unrecoverable error, and is aborting the node.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631408 PCSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631429 huge address size %d

Description:

Size of MAC address in acknowledgment of the bind request exceeds the maximum size allotted. We are trying to open a fast path to the private transport adapters.

Solution:

Reboot of the node might fix the problem.


631648 Retrying to retrieve the resource group information.

Description:

An update to the cluster configuration occurred while resource group properties were being retrieved.

Solution:

Ignore the message.


632435 Error: attempting to copy larger list to smaller

Description:

The cl_apid experienced an internal error.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


632645 Timeout retrieving result of bind to %s port %d for non-secure resource %s

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing a restart or failover. For more details about the error, look at the syslog messages.


633457 reservation fatal error(%s) - my_map_to_did_device() error in is_scsi3_disk()

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error.

If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes.

If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group.

If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


633745 pthread_kill: %s

Description:

The rpc.fed server encountered an error with the pthread_kill function. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


634957 thr_keycreate failed in init_signal_handlers

Description:

The ucmmd failed in a call to thr_keycreate(3T). ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


636851 INTERNAL ERROR: usage: $0 <gateway_root>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


637372 invalid IP address in hosts list: %s

Description:

The allow_hosts or deny_hosts for the CRNP service contained an invalid IP address. This error may cause the validate method to fail or prevent the cl_apid from starting up.

Solution:

Remove the offending IP address from the allow_hosts or deny_hosts property.
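
To find the offending entry before editing the property, a list of candidate values can be checked with a short script. The sketch below uses Python's standard ipaddress module and assumes that every entry is meant to be a literal IPv4 or IPv6 address; if the property also accepts other forms, those would be reported here too. The function name is chosen for illustration only.

```python
import ipaddress


def invalid_ip_entries(hosts):
    """Return the entries that do not parse as IPv4 or IPv6 addresses."""
    bad = []
    for entry in hosts:
        try:
            ipaddress.ip_address(entry.strip())
        except ValueError:
            # Not a well-formed address; report it in its original form.
            bad.append(entry)
    return bad


# Example values are made up for illustration.
print(invalid_ip_entries(["192.168.0.1", "10.0.0.256", "::1", "not-an-ip"]))
```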


637677 (%s) t_alloc: tli error: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


638868 %s does not exist or is not mounted.

Description:

Not available at this time.

Solution:

Not available at this time.


639087 reservation warning() - Failure fencing in progress for %s, retrying...

Description:

Device fencing is still in progress, so shared data cannot yet be accessed from this node. Starting of device groups will be delayed until the fencing has completed.

Solution:

This is an informational message, no user action is needed.


639855 IPMP group %s has status %s. Assuming this node cannot respond to client requests.

Description:

The state of the IPMP group named is degraded.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for messages from the network monitoring daemon (pnmd).


640029 PENDING_ONLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


640087 udlmctl: incorrect comand line

Description:

udlmctl will not start up because of incorrect command line options.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


640090 CMM: Initialization for quorum device %s failed with error %d.

Description:

The initialization of the specified quorum device failed with the specified error, and this node will ignore this quorum device.

Solution:

There may be other related messages on this node which may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


640484 clconf: No valid votecount field for quorum device %d

Description:

The votecount field for the quorum device was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


640799 pmf_alloc_thread: ENOMEM

Description:

The rpc.pmfd server was not able to allocate a new monitor thread, probably due to low memory. As a consequence, the rpc.pmfd server was not able to monitor a process. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


641686 sigaction: %s

Description:

The rpc.fed server encountered an error with the sigaction function, and was not able to start. The message contains the system error.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


642678 INTERNAL ERROR: usage: $0 <logicalhost> <server_root> <siebel_enterprise> <siebel_servername>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


643472 fatal: Got error <%d> trying to read CCR when enabling resource <%s>; aborting node

Description:

The rgmd failed to read the updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


643802 Resource group is online on more than one node.

Description:

An internal error has occurred. Resource group should be online on only one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


644140 Fault monitor is not running.

Description:

Sun Cluster tried to stop the fault monitor for this resource, but the fault monitor was not running. This is most likely because the fault monitor was unable to start.

Solution:

Look for prior syslog messages relating to the starting of the fault monitor and take corrective action. No other action is needed.


644850 File %s is not readable: %s.

Description:

Unable to open the file in read only mode.

Solution:

Make sure the specified file exists and has the correct permissions. For the file name and details, check the syslog messages.
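
The checks described above (existence, then permissions) could be sketched as follows. This is only an illustration, not part of Sun Cluster: the function name check_readable and its return strings are hypothetical, and the result reflects the user running the script, which may differ from the user the data service runs as.

```python
import os


def check_readable(path):
    """Return a short diagnosis for a file reported as not readable.

    Hypothetical helper: the order of checks mirrors the Solution text
    (does the file exist, then are the permissions correct).
    """
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.R_OK):
        return "permission denied"
    return "ok"
```

Run the check with the file name taken from the syslog message, on the node that logged it.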


645309 Thread already running for node %d.

Description:

The cl_eventd does not need to create a thread because it found one that it could use.

Solution:

This message is informational only, and does not require user action.


645501 %s initialization failure

Description:

Failed to initialize the hafoip or hascip callback method.

Solution:

Retry the operation. If the error persists, contact your Sun service representative.


646037 Probe timed out.

Description:

The data service fault monitor probe could not complete all actions in Probe_timeout. This may be due to an overloaded system or other problems. Repeated timeouts will cause a restart or failover of the data service.

Solution:

If this problem is due to an overloaded system, you may consider increasing the Probe_timeout property.


646037 Probe timed out.

Description:

The simple probe on the network aware application timed out.

Solution:

This problem may occur when the cluster is under heavy load. You may consider increasing the Probe_timeout property.


646664 Online check Error %s: %ld: %s

Description:

Error detected when checking ONLINE status of RDBMS. Error number is indicated in message. This can be because of RDBMS server problems or configuration problems.

Solution:

Check the RDBMS server using vendor-provided tools. If the server is running properly, this may be a fault monitor setup error.


646815 PCUNSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


646950 clcomm: Path %s being cleaned up

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


647339 (%s) scan of dlmmap failed on "%s", idx =%d

Description:

Failed to scan dlmmap.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


647673 scvxvmlg error - dcs_get_service_parameters() failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


648339 Failed to retrieve ip addresses configured on adapter %s.

Description:

System was attempting to list all the ip addresses configured on the specified adapter, but it was unable to do that.

Solution:

Check the messages that are logged just before this message for possible causes. For more help, contact your authorized Sun service provider with the following information: the contents of the /var/adm/messages file and the output of the "ifconfig -a" command.


649584 Modification of resource group <%s> failed because none of the nodes on which VALIDATE would have run for resource <%s> are currently up

Description:

Before it will permit the properties of a resource group to be edited, the rgmd runs the VALIDATE method on each resource in the group for which a VALIDATE method is registered. For each such resource, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource creation operation.


649648 svc_probe used entire timeout of %d seconds during read operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe timed out while reading from the application.

Solution:

If the problem persists, investigate why the application is responding slowly or whether the Probe_timeout property needs to be increased.


649860 RGM isn't failing resource group <%s> off of node <%d>, because no current or potential master is healthy enough

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed on all potential masters, because no candidate node was healthy enough to host the resource group.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the MONITOR_CHECK failure can be identified. Repair the condition that is preventing any potential master from hosting the resource.


650276 Failed to get port numbers from config file <%s>.

Description:

An error occurred while parsing the configuration file to extract port numbers.

Solution:

Check that the configuration file path exists and is accessible. Check that port keywords and values exist in the file.


650390 Validation failed. init<sid>.ora file does not exist: %s

Description:

Oracle Parameter file has not been specified. Default parameter file indicated in the message does not exist. Cannot start Oracle server.

Solution:

Please make sure that parameter file exists at the location indicated in message or specify 'Parameter_file' property for the resource. Clear START_FAILED flag on the resource and bring the resource online.


650825 Method <%s> on resource <%s> terminated due to receipt of signal <%d>

Description:

A resource method was terminated by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

No action is required. The operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.


650932 malloc failed for ipaddr string

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


651091 INTERNAL ERROR: Invalid upgrade-from tunablity flag <%d>; aborting node

Description:

A fatal internal error has occurred in the RGM.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


651093 reservation message(%s) - Fencing node %d from disk %s

Description:

The device fencing program is taking access to the specified device away from a non-cluster node.

Solution:

This is an informational message, no user action is needed.


651327 Failed to delete scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


651865 Failed to stop liveCache gracefully with command %s. Will stop it immediately with db_stop.

Description:

Failed to stop liveCache with 'lcinit stop'. Will shutdown immediately with 'dbmcli db_stop'.

Solution:

Informative message. No user action is needed.


652399 Ignoring the SCHA_ERR_SEQID while retrieving %s

Description:

An update to the cluster configuration tables occurred while trying to retrieve certain cluster related information. However, the update does not affect the property that is being retrieved.

Solution:

Ignore the message.


652662 libsecurity: program %s (%lu) rpc_createerror: %s

Description:

A client of the specified server was not able to initiate an rpc connection. The error message generated with a call to clnt_spcreateerror(3NSL) is appended.

Solution:

Save the /var/adm/messages file. Check the messages file for earlier errors related to the rpc.pmfd, rpc.fed, or rgmd server.


653058 Adapter not specified for node %s.

Description:

Not available at this time.

Solution:

Not available at this time.


653062 Syntax error on line %s in dfstab file.

Description:

The specified share command is incorrect.

Solution:

Correct the share command using the dfstab(4) man pages.
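
As a rough first pass before consulting the man page, a small script can flag lines that do not even begin with a share command. This sketch assumes that every non-blank, non-comment line in the dfstab file should start with share (or /usr/sbin/share); it does not validate the options or the path, and the function name is hypothetical.

```python
def find_suspect_dfstab_lines(text):
    """Return the 1-based numbers of lines that do not start with 'share'.

    Blank lines and lines beginning with '#' are skipped. This is a
    coarse check only; it does not parse share options.
    """
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.split()[0] not in ("share", "/usr/sbin/share"):
            suspects.append(number)
    return suspects
```

A line that passes this check can still be rejected by the share command itself, so the line number in the syslog message remains the authoritative pointer.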


653183 Unable to create the directory %s: %s. Current directory is /.

Description:

The callback method failed to create the specified directory. The callback methods will now be executed in "/", so any core dumps from these callbacks will be located in "/".

Solution:

No user action needed. For detailed error message, check the syslog message.


654276 realloc failed with error code: %d

Description:

The call to realloc(3C) in the cl_apid failed with the specified error code.

Solution:

Increase swap space, install more memory, or reduce peak memory consumption.


654520 INTERNAL ERROR: rgm_run_state: bad state <%d> for resource group <%s>

Description:

The rgmd state machine on this node has discovered that the indicated resource group's state information is corrupted. The state machine will not launch any methods on resources in this resource group. This may indicate an internal logic error in the rgmd.

Solution:

Other syslog messages occurring before or after this one might provide further evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd crashes) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


654546 Probe_timeout is not set.

Description:

The resource property Probe_timeout is not set. This property controls the probe time interval.

Solution:

Check whether this property is set. If it is not, set it using scrgadm(1M).


654567 Failed to retrieve SAP binary path.

Description:

Cannot retrieve the path to SAP binaries.

Solution:

This is an internal error. There may be prior messages in syslog indicating specific problems. Make sure that the system has enough memory and swap space available. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


655410 getsockopt: %s

Description:

The cl_apid received the following error while trying to deliver an event to a CRNP client. This error probably represents a CRNP client error or temporary network congestion.

Solution:

No action is required unless the problem persists (that is, there are many messages of this form). In that case, examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


655416 setsockopt: %s

Description:

The cl_apid experienced an error while configuring a socket. This error may prohibit event delivery to CRNP clients.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


655523 Write to server failed: server %s port %d: %s.

Description:

The agent could not send data to the server at the specified server and port.

Solution:

This is an informational message, no user action is needed. If the problem persists the fault monitor will restart or failover the resource group the server is part of.


656721 clexecd: %s: sigdelset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigdelset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


656795 CMM: Unable to bind <%s> to nameserver.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


657495 Tag %s: error number %d in throttle wait; process will not be requeued.

Description:

An internal error has occurred in the rpc.pmfd server while waiting before restarting the specified tag. rpc.pmfd will delete this tag from its tag list and discontinue retry attempts.

Solution:

If desired, restart the tag under pmf using the 'pmfadm -c' command.


657560 CMM: Reading reservations from quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to read reservations on the specified quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


657885 sigwait: %s

Description:

The cl_apid was unable to configure its signal handling functionality, so it is unable to run.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


658329 CMM: Waiting for initial handshake to complete.

Description:

The userland CMM has not been able to complete its initial handshake protocol with its counterparts on the other cluster nodes, and will only be able to join the cluster after this is completed.

Solution:

This is an informational message, no user action is needed.


658555 Retrying to retrieve the resource information.

Description:

An update to the cluster configuration occurred while resource properties were being retrieved.

Solution:

Ignore the message.


659665 kill -KILL: %s

Description:

The rpc.fed server is not able to stop a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


659827 CCR: Can't access CCR metadata on node %s errno = %d.

Description:

The indicated error occurred while the CCR was trying to access the CCR metadata on the indicated node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO).

Solution:

There may be other related messages on the node where the failure occurred. These may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, boot the indicated node in -x mode to restore it from backup. The cluster repository is located at /etc/cluster/ccr/.
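
To translate the errno value in the message into its symbolic name without opening errno.h, a short lookup such as the following can help. It is a sketch only: describe_errno is a hypothetical helper, and the names and numbers come from the system running the script, so run it on the affected Solaris node (or confirm against /usr/include/sys/errno.h) rather than on a different platform.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (number): description' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"
```

For example, describe_errno(errno.ENOSPC) yields a string that begins with ENOSPC, matching the full root file system case described above.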


660332 launch_validate: fe_set_env_vars() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for method execution, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Re-try the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


660368 CCR: CCR service not available, service is %s.

Description:

The CCR service is not available due to the indicated failure.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


660974 file specified in USER_ENV %s does not exist

Description:

'User_env' property was set when configuring the resource. File specified in 'User_env' property does not exist or is not readable. File should be specified with fully qualified path.

Solution:

Specify existing file with fully qualified file name when creating resource. If resource is already created, please update resource property 'User_env'.


661084 liveCache was stopped by the user outside of Sun Cluster. Sun Cluster will suspend monitoring until liveCache is again started up successfully outside of Sun Cluster.

Description:

When Sun Cluster tries to bring up liveCache, it detects that liveCache was intentionally brought down by the user outside of Sun Cluster. Sun Cluster will not try to restart it under the control of Sun Cluster until liveCache is started up successfully again by the user. This behaviour is enforced across nodes in the cluster.

Solution:

Informative message. No action is needed.


661560 All the SUNW.HAStoragePlus resources that this resource depends on are online on the local node. Proceeding with the checks for the existence and permissions of the start/stop/probe commands.

Description:

The HAStoragePlus resource that this resource depends on is local to this node. Proceeding with the rest of the validation checks.

Solution:

This message is informational; no user action is needed.


661560 All the SUNW.HAStoragePlus resources that this resource depends on are online on the local node. Proceeding with the checks for the existence and permissions of the start/stop/probe commands.

Description:

This is an informational message which means that the SUNW.HAStoragePlus resource(s) that this application resource depends on is online on the local node and therefore the validation checks related to start/stop/probe commands will be carried out on the local node.

Solution:

None.


661778 clcomm: memory low: freemem 0x%x

Description:

The system is reporting that the system has a very low level of free memory.

Solution:

If the system fails soon after this message, it is likely that the system ran out of memory. In that case, either install more memory or reduce the system load. If the system continues to function, it has recovered and no user action is required.


662056 Failed to shutdown lockd gracefully.

Description:

Not available at this time.

Solution:

Not available at this time.


662516 SIOCGLIFNUM: %s

Description:

The ioctl command with this option failed in the cl_apid. This error may prevent the cl_apid from starting up.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


663089 clexecd: %s: sigwait returned %d. Exiting.

Description:

clexecd program has encountered a failed sigwait(3C) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663293 reservation error(%s) - do_status() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.


663851 Failover %s data services must have exactly one value for extension property %s.

Description:

Failover data services must have one and only one value for Confdir_list.

Solution:

Create a failover resource group for each configuration file.


663897 clcomm: Endpoint %p: %d is not an endpoint state

Description:

The system maintains information about the state of an Endpoint. The Endpoint state is invalid.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663943 Quorum: Unable to reset node information on quorum disk.

Description:

This node was unable to reset some information on the quorum device. This will lead the node to believe that its partition has been preempted. This is an internal error. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.
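
The majority rule behind preemption, as described above, can be illustrated with a few lines of arithmetic. This is a simplified sketch under stated assumptions: the function name and the vote counts are hypothetical, and the real CMM quorum algorithm involves reservations and timing that are not modeled here.

```python
def has_majority(partition_votes, total_votes):
    """Return True if a subcluster holds more than half of all votes."""
    return 2 * partition_votes > total_votes
```

In a hypothetical two-node cluster with one quorum device (three votes in total), the subcluster that acquires the device holds two votes and has_majority(2, 3) is true, while the other subcluster holds one vote and must abort.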


664371 %s: Not in cluster mode. Exiting...

Description:

Must be in cluster mode to execute this command.

Solution:

Reboot into cluster mode and retry the command.


665015 Scalable service instance [%s,%s,%d] registered on node %s.

Description:

The specified scalable service has been registered on the specified node. Now, the gif node can redirect packets for the specified service to this node.

Solution:

This is an informational message, no user action is needed.


665090 libsecurity: program %s (%lu); getnetconfigent error: %s

Description:

A client of the specified server was not able to initiate an rpc connection, because it could not get the network information. The pmfadm or scha command exits with error. The rpc error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


665195 INTERNAL ERROR: rebalance: invalid node name in Nodelist of resource group <%s>

Description:

An internal error has occurred in the rgmd. This error may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


665297 Failed to validate BV configuration.

Description:

The Validation of the BV extension properties or Broadvision configuration has failed.

Solution:

Look for other error messages generated while validating the extension properties or Broadvision configuration to identify the exact error. Look for the appropriate action for that error message.


665845 Node list for resource group %s is empty.

Description:

Not available at this time.

Solution:

Not available at this time.


665931 Initialization error. CONNECT_STRING is NULL

Description:

Error occurred in monitor initialization. Monitor is unable to get resource property 'Connect_string'.

Solution:

Check syslog messages for errors logged from other system modules. Check the resource configuration and the value of the 'Connect_string' property. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


666391 clcomm: invalid invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


666443 unix DLM already running

Description:

UNIX DLM is already running. Another dlm will not be started.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


666603 clexecd: Error %d in fcntl(F_GETFD). Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


667020 Invalid shared path

Description:

The HA-NFS fault monitor detected that one or more shared paths in dfstab are invalid paths.

Solution:

Make sure all paths in dfstab are correct. Look at the prior syslog messages for any specific problems and correct them.


667429 Multiple entries in %s have the same port number: %d.

Description:

Multiple entries in the specified property have the same port number.

Solution:

Remove one of the entries or change its port number.
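
To find which entries collide before editing the property, a check along these lines may be useful. It assumes, for illustration only, that each entry has the form port/protocol (for example 80/tcp); the function name is hypothetical, and the actual property named in the message may use a different entry format.

```python
from collections import Counter


def duplicate_ports(entries):
    """Return the sorted port numbers that appear in more than one entry.

    Each entry is assumed to look like 'port/protocol'; only the port
    number before the slash is compared.
    """
    ports = [int(entry.split("/")[0]) for entry in entries]
    return sorted(port for port, count in Counter(ports).items() if count > 1)
```

Any port returned by the function appears in at least two entries, so one of those entries must be removed or given a different port number.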


668866 Successful shutdown; terminating daemon

Description:

The cl_apid daemon is shutting down normally due to a SIGTERM.

Solution:

No action required.


669026 fcntl(F_SETFD) failed in close_on_exec

Description:

A fcntl operation failed. The "fcntl" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


670753 reservation fatal error(%s) - unable to determine node id for node %s

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


670799 CMM: Registering reservation key on quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to place the local node's reservation key on the specified quorum device. This node will ignore this quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


671376 cl_apid internal error: unable to update client registrations.

Description:

The cl_apid experienced an internal error that prevented it from modifying the client registrations as requested.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


671954 waitpid: %s

Description:

The rpc.pmfd or rpc.fed server was not able to wait for a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


672019 Stop method failed. Error: %d.

Description:

The Stop method failed while attempting to restart the data service.

Solution:

Check the Stop_timeout and adjust it if it is not appropriate. For the detailed explanation of failure, check the syslog messages that occurred just before this message.


672372 dl_attach: bad ACK header %u

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


672511 Failed to start Text server.

Description:

Sun Cluster HA for Sybase failed to start the text server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Check whether the server can be started manually. Examine the HA-Sybase log files, text server log files, and setup.


674359 load balancer deleted

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.


674415 svc_restore_priority: %s

Description:

The rpc.pmfd or rpc.fed server was not able to run the application in the correct scheduling mode, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


674848 fatal: Failed to read CCR

Description:

The rgmd is unable to read the cluster configuration repository. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


675221 clcomm:Cannot fork1() after ORB server initialization.

Description:

A user level process attempted to fork1 after ORB server initialization. This is not allowed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


675432 pmf_monitor_suspend: poll: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. This error occurred for a process whose monitoring had been suspended. The monitoring of this process has been aborted and cannot be resumed.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


675776 Stopped the fault monitor.

Description:

The fault monitor for this data service was stopped successfully.

Solution:

No action needed.


676558 WARNING: Global_resources_used property of resource group <%s> is set to non-null string, assuming wildcard

Description:

The Global_resources_used property of the resource group was set to a specific non-null string. The only supported settings of this property in the current release are null ("") or wildcard ("*").

Solution:

No user action is required; the rgmd will interpret this value as wildcard. This means that method timeouts for this resource group will be suspended while any device group temporarily goes offline during a switchover or failover. This is usually the desired setting, except when a resource group has no dependency on any global device service or pxfs file system.
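The interpretation described above can be sketched as follows. This is an illustration only, not rgmd source code; the function name `effective_global_resources_used` is hypothetical.

```python
def effective_global_resources_used(value):
    """Return the setting the rgmd is described as acting on.

    Hypothetical helper: only "" (null) and "*" (wildcard) are supported
    settings, and any other non-null string is interpreted as the wildcard.
    """
    if value == "":
        return ""   # null: the supported non-wildcard setting
    return "*"      # wildcard, including any unsupported non-null string
```

For example, a value naming a specific device group would be treated the same as "*".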


677278 No network address resource in resource group.

Description:

A resource has no associated network address.

Solution:

For a failover data service, add a network address resource to the resource group. For a scalable data service, add a network resource to the resource group referenced by the RG_dependencies property.


677428 %s can't UP %s

Description:

This means that the Logical IP address could not be set to UP.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


677759 Unknown status code %d.

Description:

Checking for HAStoragePlus resources returned an unknown status code.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


677759 Unknown status code %d.

Description:

This message indicates that an unknown status code was returned by one of the underlying subsystems and an internal error has occurred.

Solution:

Report this problem.


678041 lkcm_sync: cm_reconfigure failed: %s

Description:

ucmm reconfiguration failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678319 (%s) getenv of "%s" failed.

Description:

Failed to get the value of an environment variable. udlm will fail to go through a transition.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678755 dl_bind: DL_BIND_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


679912 uaddr2taddr: %s

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. udlm will exit and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


680437 Start method failed. Error: %d.

Description:

Restart of the data service failed.

Solution:

Check the syslog messages that occurred just before this message to see whether there is any internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations may have occurred. 1) Check the Start_timeout value and adjust it if it is not appropriate. 2) Check whether the application's configuration is correct. 3) This might be the result of a lack of system resources. Check whether the system is low on memory or the process table is full, and take appropriate action.


680675 clcomm: thread_create failed for monitor

Description:

The system could not create the needed thread, because there is inadequate memory.

Solution:

There are two possible solutions. Install more memory. Alternatively, reduce memory usage. Since this happens during system startup, application memory usage is normally not a factor.


680960 Unable to write data: %s.

Description:

Failed to write the data to the socket. The reason might be expiration of the timeout, a hung application, or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.


681547 fatal: Method <%s> on resource <%s>: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


682887 CMM: Initialization for quorum device %s failed with error EACCES. Will issue a SCSI2 Tkown and retry.

Description:

This node is not able to access the specified quorum device because the node is still fenced off. A retry will be attempted.

Solution:

This is an informational message, no user action is needed.


683997 Failed to retrieve the resource group property %s: %s

Description:

Unable to retrieve the resource group property.

Solution:

For the property name and the reason for failure, check the syslog message. For more details about the API failure, check the syslog messages from the RGM.


684383 Development system shut down successfully.

Description:

Informational message.

Solution:

No action needed.


684753 store_binding: <%s> bad bind type <%d>

Description:

During a name server binding store an unknown binding type was encountered.

Solution:

No action required. This is an informational message.


684895 Failed to validate scalable service configuration: Error %d.

Description:

An error was detected in the Load_balancing_weights property for the data service.

Solution:

Use the scrgadm command to change the Load_balancing_weights property to a valid value.


685886 Failed to communicate: %s.

Description:

While determining the health of the data service, the fault monitor failed to communicate with the process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For more details about the error, check the syslog messages.


687457 Attempting to kill pid %d name %s resulted in error: %s.

Description:

HA-NFS callback method attempted to stop the specified NFS process with SIGKILL but was unable to do so because of the specified error.

Solution:

The failure of the method is handled by Sun Cluster. If the failure happened while starting an HA-NFS resource, the resource is failed over to another node. If it happened during stopping, the node is rebooted and the HA-NFS service continues on another node. If this error persists, contact your local Sun service provider for assistance.


687543 shutdown abort did not succeed.

Description:

HA-Oracle failed to shut down the Oracle server using 'shutdown abort'.

Solution:

Examine log files and syslog messages to determine the cause of failure.


687929 daemon %s did not respond to null rpc call: %s.

Description:

HA-NFS fault monitor failed to ping an nfs daemon.

Solution:

No action required. The fault monitor will restart the daemon if necessary.


688163 clexecd: pipe returned %d. Exiting.

Description:

clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


689075 Failed to delete scalable services group: Error %d.

Description:

Not available at this time.

Solution:

Not available at this time.


689538 Listener %s did not stop.(%s)

Description:

Failed to stop the Oracle listener using the 'lsnrctl' command. HA-Oracle will attempt to kill the listener process.

Solution:

None


689887 Failed to stop the process with: %s. Retry with SIGKILL.

Description:

The process monitor facility failed to stop the data service. It is reattempting to stop the data service.

Solution:

This is an informational message. Check the Stop_timeout value and adjust it if it is not appropriate.


689989 Invalid device group name <%s> supplied

Description:

The diskgroup name defined in the SUNW.HAStorage type resource is invalid.

Solution:

Check and set the correct diskgroup name in extension property "ServicePaths" of SUNW.HAStorage type resource.


690417 Protocol is missing in system defined property %s.

Description:

The specified system property does not have a valid format. The value of the property must include a protocol.

Solution:

Use scrgadm(1M) to specify the property value with protocol. For example: TCP.
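Before changing the property, it can help to find which list elements lack a protocol. The following is a minimal sketch, assuming a comma-separated value whose elements have the form port/protocol (for example 80/tcp). That format is an assumption not stated in this guide, and both function names are hypothetical.

```python
def has_protocol(element):
    """Return True if a list element carries a non-empty protocol part.

    Assumes the form "<port>/<protocol>", e.g. "80/tcp" (an assumption).
    """
    _port, sep, protocol = element.partition("/")
    return sep == "/" and protocol.strip() != ""


def elements_missing_protocol(value):
    """Return the comma-separated elements of a property value that lack a protocol."""
    elements = [e.strip() for e in value.split(",") if e.strip()]
    return [e for e in elements if not has_protocol(e)]
```

With a value of "80/tcp,8080", the second element would be reported as missing its protocol.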


690463 Cannot bring server online on this node.

Description:

Oracle server is running but it cannot be brought online on this node. START method for the resource has failed.

Solution:

Check if Oracle server can be started manually. Examine the log files and setup. Clear START_FAILED flag on the resource and bring the resource online.


690463 Cannot bring server online on this node.

Description:

The Adaptive server cannot be brought online on this node.

Solution:

See if the Adaptive server can be started manually. Examine the log files and setup.


691493 One or more of the SUNW.HAStoragePlus resources that this resource depends on is in a different resource group. Failing validate method configuration checks.

Description:

The HAStoragePlus resource that this resource depends on must be configured into the same resource group.

Solution:

Move the HAStoragePlus resource into this resource's resource group.


691493 One or more of the SUNW.HAStoragePlus resources that this resource depends on is in a different resource group.

Description:

It is an invalid configuration to have an application resource depend on one or more SUNW.HAStoragePlus resource(s) that are in a different resource group.

Solution:

Change the resource/resource group configuration such that the application resource and the SUNW.HAStoragePlus resource(s) are in the same resource group.


691736 CMM: Quorum device %ld (%s) with votecount = %d removed.

Description:

The specified quorum device with the specified votecount has been removed from the cluster. A quorum device being placed in maintenance state is equivalent to it being removed from the quorum subsystem's perspective, so this message will be logged when a quorum device is put in maintenance state as well as when it is actually removed.

Solution:

This is an informational message, no user action is needed.


692203 Failed to stop development system.

Description:

Stopping the development system failed.

Solution:

Informational message. Check previous messages in the system log for more details regarding why it failed.


692806 INITUCMM Error: ${RECONF_PROG} does not exist or not an executible.

Description:

The /usr/cluster/lib/ucmm/ucmm_reconf program does not exist on the node or is not executable. This file is installed as a part of SUNWscucm package. This error message indicates that there can be a problem in installation of SUNWscucm package or patches.

Solution:

Check installation of SUNWscucm package using pkgchk command. Correct the installation problems and reboot the cluster node. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


695728 Skipping checks dependant on HAStoragePlus resources on this node.

Description:

This resource will not perform some filesystem specific checks (during VALIDATE or MONITOR_CHECK) on this node because at least one SUNW.HAStoragePlus resource that it depends on is online on some other node.

Solution:

None.


696186 This list element in System property %s has an invalid port number: %s.

Description:

The system property that was named does not have a valid port number.

Solution:

Change the value of the property to use a valid port number.


696463 rgm_clear_util called on resource <%s> with incorrect flag <%d>

Description:

An internal rgmd error has occurred while attempting to carry out an operator request to clear an error flag on a resource. The attempted clear action will fail.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.


697026 did instance %d created.

Description:

Informational message from scdidadm.

Solution:

No user action required.


697108 t_sndudata in send_reply failed.

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


697588 Nodeid must be less than %d. Nodeid passed: '%s'

Description:

Incorrect nodeid passed to Oracle unix dlm. Oracle unix dlm will not start.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


697663 INTERNAL ERROR:BV extension property structure is NULL.

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


697725 Error retrieving result of bind to %s port %d for non-secure resource %s: %s

Description:

An error occurred while fault monitor attempted to probe the health of the data service.

Solution:

Wait for the fault monitor to correct this by doing restart or failover. For more error description, look at the syslog messages.


698239 Monitor server stopped.

Description:

The Monitor server has been stopped by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


698512 Directory %s is not readable: %s.

Description:

The specified path doesn't exist or is not readable.

Solution:

Consult the HA-NFS configuration guide on how to configure the dfstab.<resource_name> file for HA-NFS resources.


698526 scvxvmlg error - service %s has service_class %s, not %s, ignoring it

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


698744 scvxvmlg error - lstat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


699104 VALIDATE failed on resource <%s>, resource group <%s>, time used: %d%% of timeout <%d, seconds>

Description:

The resource's VALIDATE method exited with a non-zero exit code. This indicates that an attempted update of a resource or resource group is invalid.

Solution:

Examine syslog messages occurring just before this one to determine the cause of validation failure. Re-try the update.
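When scanning saved syslog files for this message, the fields can be pulled out with a regular expression built from the message format shown above. This is a minimal sketch; the function name `parse_validate_failure` and the resource names in the usage note are hypothetical.

```python
import re

# Pattern derived from the message format:
# VALIDATE failed on resource <%s>, resource group <%s>,
# time used: %d%% of timeout <%d, seconds>
VALIDATE_FAILED = re.compile(
    r"VALIDATE failed on resource <(?P<resource>[^>]*)>, "
    r"resource group <(?P<group>[^>]*)>, "
    r"time used: (?P<percent>\d+)% of timeout <(?P<timeout>\d+), seconds>"
)


def parse_validate_failure(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = VALIDATE_FAILED.search(line)
    if match is None:
        return None
    return {
        "resource": match.group("resource"),
        "group": match.group("group"),
        "percent_used": int(match.group("percent")),
        "timeout_seconds": int(match.group("timeout")),
    }
```

For a line naming a hypothetical resource web-rs in resource group web-rg, the returned dict carries those two names along with the percentage of the timeout used and the timeout in seconds.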


699455 Error in creating udlmctl link: %s. Error (%s)

Description:

The udlmreconfig program received an error when creating a link using /bin/ln. This error will prevent startup of the RAC framework on this node.

Solution:

Check the error code, correct the error and reboot the cluster node to start UNIX Distributed Lock Manager on this node.


699689 (%s) poll failed: %s (UNIX errno %d)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.