Sun Cluster 3.0 5/02 Error Messages Guide

Chapter 7 Message IDs 600000 - 699999

Error Message List

The following list is ordered by the message ID.
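
Because the list is keyed by message ID, a small script can pull the IDs out of saved log files before looking them up here. The sketch below assumes that each syslog line carries a tag of the form "[ID 603096 daemon.notice]"; that tag shape, the sample line, and the function names are illustrative assumptions, not something this guide specifies.

```python
import re

# Assumed shape of the ID tag in a syslog line, for illustration only:
#   "... [ID 603096 daemon.notice] resource web-rs disabled."
MSG_ID = re.compile(r"\[ID (\d{6}) ([\w.]+)\]")


def extract_message_ids(lines):
    """Yield (message_id, facility_level, text) for each line with an ID tag."""
    for line in lines:
        match = MSG_ID.search(line)
        if match:
            text = line[match.end():].strip()
            yield int(match.group(1)), match.group(2), text


def in_this_chapter(message_id):
    """True if the ID falls in the range covered by this chapter."""
    return 600000 <= message_id <= 699999
```

Lines without an ID tag are skipped, so the helper can be run over a whole messages file.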


600967 :Could not allocate buffer for DBMS log messages: %m

Description:

The fault monitor could not allocate memory for reading the RDBMS log file. As a result, the fault monitor will not scan the log file for errors. However, it will continue fault monitoring.

Solution:

Check whether the system is low on memory. If the problem persists, stop and restart the fault monitor.


601901 :Failed to retrieve the resource property %s for %s: %s.

Description:

The query for a property failed. The reason for the failure is given in the message.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


602324 :Backup server shutdown did not succeed.

Description:

Sun Cluster HA for Sybase did not successfully shut down the Backup Server.

Solution:

Manually stop the Backup Server. Examine the log files and setup. See if the STOP method timeout value is set too low.


603096 :resource %s disabled.

Description:

This is a notification from the rgmd that the operator has disabled a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


603479 :This node can be a primary for scalable resource %s, but there is no NAFO group defined on this node. A NAFO group must be created on this node.

Description:

The node does not have a NAFO group defined.

Solution:

Any adapters on the node which are connected to the public network should be put under NAFO control by placing them in a NAFO group. See the pnmset(1M) man page for details.


604153 :clcomm: Path %s errors during initiation

Description:

Communication could not be established over the path. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


605301 :lkcm_sync: invalid handle was passed %s%d

Description:

Invalid handle passed during lockstep execution.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


606203 :Couldn't get the root vnode: error (%d)

Description:

The file system is corrupt or was not mounted correctly.

Solution:

Run fsck, and mount the affected file system again.


606362 :The stop command <%s> failed to stop the application. Will now use SIGKILL to stop the application.

Description:

The user-provided stop command could not stop the application. Another attempt to stop the application will be made by sending SIGKILL to the pmf tag.

Solution:

No action required.
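
The escalation described above (try the stop command first, then fall back to SIGKILL) can be sketched as follows. This is an illustration only, not the data service's implementation: it kills a single child process rather than every process under a pmf tag, and the function name and grace period are assumptions.

```python
import signal
import subprocess


def stop_with_fallback(app, stop_cmd, grace_seconds=5.0):
    """Run stop_cmd; if the application is still alive afterwards, send SIGKILL.

    app is a subprocess.Popen handle for the application (hypothetical stand-in
    for a pmf tag). Returns "stopped" if the stop command was enough and
    "killed" if SIGKILL had to be used.
    """
    subprocess.run(stop_cmd, check=False)
    try:
        app.wait(timeout=grace_seconds)
        return "stopped"
    except subprocess.TimeoutExpired:
        # The stop command did not bring the application down; escalate.
        app.send_signal(signal.SIGKILL)
        app.wait()
        return "killed"
```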


606467 :CMM: Initialization for quorum device %s failed with error EACCES. Will retry later.

Description:

This node is not able to access the specified quorum device because the node is still fenced off. An attempt will be made to access the quorum device again after the node's CCR has been recovered.

Solution:

This is an informational message, no user action is needed.


607054 :%s not found.

Description:

Could not find the binary to start up udlm.

Solution:

Make sure the UNIX DLM package is installed properly.


607054 :Stopsap development system script is NULL.

Description:

The stopsap script for the development system is not provided.

Solution:

Provide the script to shut down the development system if the property Shutdown_dev is set to TRUE.


607498 :Stopsap development system script is NULL.

Description:

The stopsap script for the development system is not provided.

Solution:

Provide the script to shut down the development system if the property Shutdown_dev is set to TRUE.


607613 :transition '%s' timed out for cluster, as did attempts to reconfigure.

Description:

Step transition failed. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


607678 :clconf: No valid quorum_resv_key field for node %u

Description:

The quorum_resv_key field was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


608202 :scha_control: resource group <%s> was frozen on Global_resources_used within the past %d seconds; exiting

Description:

A scha_control call has failed with a SCHA_ERR_CHECKS error because the resource group has a non-null Global_resources_used property, and a global device group was failing over within the indicated recent time interval. The resource fault probe is presumed to have failed because of the temporary unavailability of the device group. A properly written resource monitor, upon getting the SCHA_ERR_CHECKS error code from a scha_control call, should sleep for a while and restart its probes.

Solution:

No user action is required. Either the resource should become healthy again after the device group is back online, or a subsequent scha_control call should succeed in failing over the resource group to a new master.
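
The monitor behavior recommended in the description (on SCHA_ERR_CHECKS, sleep and then probe again) might look like the sketch below. The probe and request_failover callables, the string status values, and the bounded number of rounds are all hypothetical; a real monitor would call scha_control itself and would normally keep running.

```python
import time

# Symbolic stand-in for the error code named in the message (hypothetical value).
SCHA_ERR_CHECKS = "SCHA_ERR_CHECKS"


def monitor_rounds(probe, request_failover, sleep_seconds=30,
                   max_rounds=3, sleep=time.sleep):
    """Probe the resource; on failure request a failover.

    If the failover request is rejected with SCHA_ERR_CHECKS, sleep and probe
    again instead of treating the rejection as fatal.
    """
    for _ in range(max_rounds):
        if probe():
            return "healthy"
        status = request_failover()
        if status != SCHA_ERR_CHECKS:
            # Either the failover was accepted or a different error occurred.
            return status
        sleep(sleep_seconds)
    return "gave_up"
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays.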


608286 :Stopping the text server.

Description:

Sun Cluster HA for Sybase is stopping the Text Server.

Solution:

No user action required.


608453 :failfast disarm error: %d

Description:

Error during a failfast device disarm operation.

Solution:

None.


608453 :Stopping the text server.

Description:

The Text server is about to be brought down by Sun Cluster HA for Sybase.

Solution:

This is an informational message, no user action is needed.


608876 :PCRUN: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


609118 :Error creating deleted directory: error (%d)

Description:

While mounting this file system, cluster file system was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28 (ENOSPC), then mount this file system non-globally, make some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
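
Many messages in this chapter report a bare error number (%d). As a rough aid, the number can be translated into its symbolic name and description with a few lines of Python. This is only a sketch: error numbers are platform specific, so it should be run on the same operating system that logged the message, and the helper name is an assumption.

```python
import errno
import os


def describe_errno(code):
    """Return "<number> (<NAME>): <description>" for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{code} ({name}): {os.strerror(code)}"
```

For example, describe_errno(errno.ENOSPC) yields a string that begins with the numeric value followed by "(ENOSPC)".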


612049 :resource <%s> in resource group <%s> depends on disabled network address resource <%s>

Description:

An enabled application resource was found to implicitly depend on a network address resource that is disabled. This error is non-fatal but may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


612117 :Failed to stop Text server.

Description:

Sun Cluster HA for Sybase failed to stop the Text Server.

Solution:

Check whether any Sybase server processes are running on the server, and manually shut down the server.


612124 :Volume configuration daemon not running.

Description:

Volume manager is not running.

Solution:

Bring up the volume manager.


612931 :Unable to get device major number for %s driver: %s.

Description:

System was unable to translate the given driver name into device major number.

Solution:

Check whether the /etc/name_to_major file is corrupted. Reboot the node if problem persists.


613522 :clexecd: Error %d from poll. Exiting.

Description:

The clexecd program has encountered a failed poll(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


613896 :INTERNAL ERROR: process_resource: Resource <%s> is R_BOOTING in PENDING_OFFLINE or PENDING_DISABLED resource group

Description:

The rgmd is attempting to bring a resource group offline on a node where BOOT methods are still being run on its resources. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


615120 :fatal: unknown scheduling class '%s'

Description:

An internal error has occurred. The daemon indicated in the message tag (rgmd or ucmmd) will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the core file generated by the daemon. Contact your authorized Sun service provider for assistance in diagnosing the problem.


617643 :Unable to fork(): %s.

Description:

Upon a NAFO failure, the system was unable to take any action, because it failed to fork another process.

Solution:

This might be the result of a lack of system resources. Check whether the system is low on memory or the process table is full, and take appropriate action. For specific error information, check the syslog message.


617917 :Initialization failed. Invalid command line %s %s

Description:

Unable to process parameters passed to the call back method. This is an internal error.

Solution:

Please report this problem.


618107 :Path %s initiation encountered errors, errno = %d. Remote node may be down or unreachable through this path.

Description:

Communication with another node could not be established over the path.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


618466 :Unix DLM no longer running

Description:

UNIX DLM is expected to be running, but is not. This will result in a udlmstep1 failure.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


618585 :clexecd: getmsg returned %d. Exiting.

Description:

The clexecd program has encountered a failed getmsg(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


618637 :The port number %d from entry %s in property %s was not found in config file <%s>.

Description:

All entries in the list property must have port numbers that correspond to ports configured in the configuration file. The port number from the list entry does not correspond to a port in the configuration file.

Solution:

Remove the entry or change its port number to correspond to a port in the configuration file.


618764 :fe_set_env_vars() failed for Resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for a method execution, causing the method invocation to fail. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


619171 :Failed to retrieve information for user %s for SAP system %s.

Description:

Failed to retrieve the home directory for the specified Sun Cluster HA for SAP user for the specified system ID.

Solution:

Check the system ID for SAP. SAPSID is case sensitive.


619213 :t_alloc (recv_request) failed with error %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


619312 :"%s" restarting too often ... sleeping %d seconds.

Description:

The tag shown, run by rpc.pmfd server, is restarting and exiting too often. This means more than once a minute. This can happen if the application is restarting, then immediately exiting for some reason, then the action is executed and returns OK (0), which causes the server to restart the application. When this happens, the rpc.pmfd server waits for up to 1 minute before it restarts the application. An error message is output to syslog.

Solution:

Examine the state of the application, try to figure out why the application doesn't stay up, and yet the action returns OK.
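
The throttling described above (a restart less than a minute after the previous one is delayed, by up to one minute) can be illustrated with the following sketch. It is not the rpc.pmfd implementation; the class name and the exact delay rule are assumptions chosen to match the description.

```python
class RestartThrottle:
    """Delay restarts that come less than window_seconds after the previous one."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.last_restart = None  # time of the most recent (possibly delayed) restart

    def delay_before_restart(self, now):
        """Return how many seconds to sleep before restarting at time `now`."""
        if self.last_restart is None or now - self.last_restart >= self.window:
            delay = 0.0
        else:
            # Too soon: wait out the remainder of the window.
            delay = self.window - (now - self.last_restart)
        self.last_restart = now + delay
        return delay
```

A restart requested 10 seconds after the previous one would be delayed by 50 seconds under this rule, while a restart requested after the window has elapsed proceeds immediately.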


620204 :Failed to start scalable service.

Description:

Unable to configure service for scalability.

Solution:

The start method on this node will fail. Sun Cluster resource management will attempt to start the service on some other node.


621264 :clconf: Not found clexecd on node %d for %d seconds. Retrying ...

Description:

Could not find clexecd to execute the program on a node. The message indicates the node and how many seconds the search has been retried.

Solution:

No action required. This is an informational message.


621686 :CCR: Invalid checksum length %d in table %s, expected %d.

Description:

The checksum of the indicated table has a wrong size. This causes the consistency check of the indicated table to fail.

Solution:

Boot the offending node in -x mode to restore the indicated table from backup or other nodes in the cluster. The CCR tables are located at /etc/cluster/ccr/.


622387 :const char *fmt

Description:

Function definition. Please ignore.

Solution:

None.


623528 :clcomm: Unregister of adapter state proxy failed

Description:

The system failed to unregister an adapter state proxy.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


623759 :svc_setschedprio: Could not lookup RT (real time) scheduling class info: %s

Description:

The server was not able to determine the scheduling mode info, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


624265 :Text server terminated.

Description:

Text server processes were stopped in STOP method.

Solution:

None


624447 :fatal: sigaction: %s (UNIX errno %d)

Description:

The rgmd has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


627610 :clconf: Invalid clconf_obj type

Description:

An invalid clconf_obj type has been encountered while converting a clconf_obj type to a group name. Valid objtypes are "CL_CLUSTER", "CL_NODE", "CL_ADAPTER", "CL_PORT", "CL_BLACKBOX", "CL_CABLE", "CL_QUORUM_DEVICE".

Solution:

This is an unrecoverable error, and the cluster needs to be rebooted. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


628203 :in libsecurity could not find any tcp transport

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because it could not find a tcp transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


628771 :CCR: Can't read CCR metadata.

Description:

Reading the CCR metadata failed on this node during the CCR data server initialization.

Solution:

There may be other related messages on this node, which may help diagnose the problem. For example: If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, then boot this node in -x mode to restore the cluster repository from backup or other nodes in the cluster. The cluster repository is located at /etc/cluster/ccr/.


629702 :Unable to locate ldapsearch binary.

Description:

The ldapsearch program, which is required by the fault monitor, can not be located.

Solution:

Ensure that the directory server has been installed correctly and that the ldapsearch binary is available.


630653 :Failed to initialize DCS

Description:

There was a fatal error while this node was booting.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631408 :PCSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


631429 :huge address size %d

Description:

Size of MAC address in acknowledgment of the bind request exceeds the maximum size allotted. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


631648 :Retrying to retrieve the resource group information.

Description:

An update to the cluster configuration occurred while resource group properties were being retrieved.

Solution:

Ignore the message.


633457 :reservation error(%s) - my_map_to_did_device() error in is_scsi3_disk()

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


634957 :thr_keycreate failed in init_signal_handlers

Description:

The ucmmd failed in a call to thr_keycreate(3T). ucmmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes and of the ucmmd core. Contact your authorized Sun service provider for assistance in diagnosing the problem.


635859 :Started the fault monitor.

Description:

The fault monitor successfully started.

Solution:

No user action required.


636456 :Monitor server shutdown did not succeed.

Description:

Sun Cluster HA for Sybase did not successfully shut down the Monitor Server.

Solution:

Manually stop the Monitor Server. Examine the log files and setup. See if the STOP method timeout values are set too low.


636848 :Failed to get all NAFO groups (request failed with %d).

Description:

An unexpected error occurred while trying to communicate with the network monitoring daemon (pnmd).

Solution:

Make sure the network monitoring daemon (pnmd) is running.


637677 :(%s) t_alloc: tli error: %s

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


640029 :PENDING_ONLINE: bad resource state <%s> (%d) for resource <%s>

Description:

The rgmd state machine has discovered a resource in an unexpected state on the local node. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


640087 :udlmctl: incorrect command line

Description:

udlmctl will not startup because of incorrect command line options.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


640090 :CMM: Initialization for quorum device %s failed with error %d.

Description:

The initialization of the specified quorum device failed with the specified error, and this node will ignore this quorum device.

Solution:

There may be other related messages on this node which may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


640484 :clconf: No valid votecount field for quorum device %d

Description:

The votecount field for the quorum device was found to be incorrect while converting the quorum configuration information into the quorum table.

Solution:

Check the quorum configuration information.


640799 :pmf_alloc_thread: ENOMEM

Description:

The rpc.pmfd server was not able to allocate a new monitor thread, probably due to low memory. As a consequence, the rpc.pmfd server was not able to monitor a process. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


643472 :fatal: Got error <%d> trying to read CCR when enabling resource <%s>; aborting node

Description:

The rgmd failed to read the updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


643802 :Resource group is online on more than one node.

Description:

An internal error has occurred. Resource group should be online on only one node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


644140 :Fault monitor is not running.

Description:

Sun Cluster tried to stop the fault monitor for this resource, but the fault monitor was not running. This is most likely because the fault monitor was unable to start.

Solution:

Look for prior syslog messages relating to starting the fault monitor and take corrective action. No other action is needed.


644850 :File %s is not readable: %s.

Description:

Unable to open the file in read only mode.

Solution:

Make sure the specified file exists and has the correct permissions. For the file name and details, check the syslog messages.


645665 :resource group %s state on node %s change to RG_OFFLINE

Description:

This is a notification from the rgmd that a resource group's state has changed. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.


646037 :Probe timed out.

Description:

The simple probe on the network aware application timed out.

Solution:

This problem may occur when the cluster is under heavy load. You may consider increasing the Probe_timeout property.
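
To see how a connect-style probe interacts with a timeout, consider the sketch below. It is not the probe used by any Sun Cluster data service; the function name and return values are assumptions, and timeout_seconds plays the role that Probe_timeout plays for the real probe.

```python
import socket


def simple_probe(host, port, timeout_seconds):
    """Try a TCP connect; report "ok", "timed_out", or "failed"."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return "ok"
    except socket.timeout:
        # The service did not answer within the allowed time.
        return "timed_out"
    except OSError:
        # Refused, unreachable, or some other connection error.
        return "failed"
```

Under heavy load a healthy service may simply answer slowly, which is why raising the timeout can turn a "timed_out" result back into "ok".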


646815 :PCUNSET: %s

Description:

The rpc.pmfd server was not able to monitor a process, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


646950 :clcomm: Path %s being cleaned up

Description:

A communication link is being removed with another node. The interconnect may have failed or the remote node may be down.

Solution:

Any interconnect failure should be resolved, and/or the failed node rebooted.


647339 :(%s) scan of dlmmap failed on "%s", idx =%d

Description:

Failed to scan dlmmap.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


647673 :scvxvmlg error - dcs_get_service_parameters() failed, returned %d

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


648339 :Failed to retrieve ip addresses configured on adapter %s.

Description:

System was attempting to list all the ip addresses configured on the specified adapter, but it was unable to do that.

Solution:

Check the messages that are logged just before this message for possible causes. For more help, contact your authorized Sun service provider with the following information: the /var/adm/messages file and the output of the "ifconfig -a" command.


648814 :Loading transport %s failed.

Description:

Topology Manager could not load the specified transport module.

Solution:

Check that the transport modules exist in the correct directories with the correct permissions.


649584 :Modification of resource group <%s> failed because none of the nodes on which VALIDATE would have run for resource <%s> are currently up

Description:

Before it will permit the properties of a resource group to be edited, the rgmd runs the VALIDATE method on each resource in the group for which a VALIDATE method is registered. For each such resource, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource creation operation.


649860 :RGM isn't failing resource group <%s> off of node <%d>, because no current or potential master is healthy enough

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed on all potential masters, because no candidate node was healthy enough to host the resource group.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the MONITOR_CHECK failure can be identified. Repair the condition that is preventing any potential master from hosting the resource.


650276 :Failed to get port numbers from config file <%s>.

Description:

An error occurred while parsing the configuration file to extract port numbers.

Solution:

Check that the configuration file path exists and is accessible. Check that port keywords and values exist in the file.


650390 :Validation failed. init.ora file does not exist: %s

Description:

The Oracle parameter file has not been specified, and the default parameter file indicated in the message does not exist. The Oracle server cannot be started.

Solution:

Make sure that the parameter file exists at the location indicated in the message, or specify the 'Parameter_file' property for the resource. Clear the START_FAILED flag on the resource and bring the resource online.


650825 :Method <%s> on resource <%s> terminated due to receipt of signal <%d>

Description:

A resource method was terminated by a signal, most likely resulting from an operator-issued kill(1). The method is considered to have failed.

Solution:

No action is required. The operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries, or re-try the administrative action that was interrupted by the method failure.
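
As an illustration of how a parent process can tell that a child was terminated by a signal rather than exiting normally, the sketch below uses Python's subprocess convention, in which a negative return code is the negated signal number. The helper name is an assumption, and this is not how the rgmd itself detects the condition.

```python
import signal


def describe_exit(returncode):
    """Describe a subprocess return code, distinguishing signal termination."""
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return f"terminated by signal {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"
```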


650932 :malloc failed for ipaddr string

Description:

Call to malloc failed. The "malloc" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


651093 :reservation message(%s) - Fencing node %d from disk %s

Description:

The device fencing program is taking access to the specified device away from a non-cluster node.

Solution:

This is an informational message, no user action is needed.


651327 :Failed to delete scalable service group %s: %s.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


652399 :Ignoring the SCHA_ERR_SEQID while retrieving %s

Description:

An update to the cluster configuration tables occurred while trying to retrieve certain cluster related information. However, the update does not affect the property that is being retrieved.

Solution:

Ignore the message.


653062 :Syntax error on line %s in dfstab file.

Description:

The specified share command is incorrect.

Solution:

Correct the share command using the dfstab(4) man pages.


653183 :Unable to create the directory %s: %s. Current directory is /.

Description:

The callback method failed to create the specified directory. The callback methods will now be executed in "/", so any core dumps from these callbacks will be located in "/".

Solution:

No user action needed. For detailed error message, check the syslog message.


654520 :INTERNAL ERROR: rgm_run_state: bad state <%d> for resource group <%s>

Description:

The rgmd state machine on this node has discovered that the indicated resource group's state information is corrupted. The state machine will not launch any methods on resources in this resource group. This may indicate an internal logic error in the rgmd.

Solution:

Other syslog messages occurring before or after this one might provide further evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd crashes) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


654546 :Probe_timeout is not set.

Description:

The resource property Probe_timeout is not set. This property controls the Sun Cluster HA for BroadVision One-To-One Enterprise Probe time interval.

Solution:

Ensure that this property is set. Use the scrgadm(1M) command to set this property.


654567 :Failed to retrieve SAP binary path.

Description:

Cannot retrieve the path to Sun Cluster HA for SAP binaries. This is an internal error.

Solution:

There might be prior messages in syslog indicating specific problems. Make sure that the system has enough memory and swap space available. Save the /var/adm/messages files from all nodes. Contact your authorized Sun service provider.


655512:checkdaemon failed for $HOSTNAME.

Description:

This message is from the Sun Cluster HA for BroadVision One-To-One Enterprise Probe. The Sun Cluster HA for BroadVision One-To-One Enterprise checkdaemon command on the specified host failed.

Solution:

No user action required. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take appropriate action.


656721 :clexecd: %s: sigdelset returned %d. Exiting.

Description:

clexecd program has encountered a failed sigdelset(3C) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


656795 :CMM: Unable to bind <%s> to nameserver.

Description:

An instance of the userland CMM encountered an internal initialization error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


657495:Tag %s: error number %d in throttle wait; process will not be requeued.

Description:

An internal error has occurred in the rpc.pmfd server while waiting before restarting the specified tag. rpc.pmfd will delete this tag from its tag list and discontinue retry attempts.

Solution:

If desired, restart the tag under pmf using the `pmfadm -c' command.


657560 :CMM: Reading reservations from quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to read reservations on the specified quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


657875 :Could not reset SCSI buses on CMM reconfiguration. User program did not execute cleanly.

Description:

An error occurred when the SC 3.0 software was in the process of resetting SCSI buses with shared nodes that are down.

Solution:

Look in /var/adm/messages for other messages before this that may help to pinpoint the exact cause of the failure. If no such message is available, then contact your authorized Sun service provider to determine whether a workaround or patch is available.


658329 :CMM: Waiting for initial handshake to complete.

Description:

The userland CMM has not been able to complete its initial handshake protocol with its counterparts on the other cluster nodes, and will only be able to join the cluster after this is completed.

Solution:

This is an informational message, no user action is needed.


658555 :Retrying to retrieve the resource information.

Description:

An update to the cluster configuration occurred while resource properties were being retrieved.

Solution:

Ignore the message.


659665 :kill -KILL: %s

Description:

The rpc.fed server is not able to stop a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


659827 :CCR: Can't access CCR metadata on node %s errno = %d.

Description:

The indicated error occurred when the CCR was trying to access the CCR metadata on the indicated node. The errno value indicates the nature of the problem. errno values are defined in the file /usr/include/sys/errno.h. An errno value of 28 (ENOSPC) indicates that the root file system on the node is full. Other values of errno can be returned when the root disk has failed (EIO).

Solution:

There may be other related messages on the node where the failure occurred. These may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced. If the cluster repository is corrupted, boot the indicated node in -x mode to restore it from backup. The cluster repository is located at /etc/cluster/ccr/.
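
When only the numeric errno from the message is at hand, a short script can translate it into its symbolic name and description. The sketch below is a minimal illustration using the Python standard library. The helper name describe_errno is an assumption, and the values it reports come from the system on which the script runs, so it should be run on the same platform as the node (or checked against /usr/include/sys/errno.h).

```python
import errno
import os


def describe_errno(value):
    """Return (symbolic name, system description) for an errno value."""
    name = errno.errorcode.get(value, "UNKNOWN")
    return name, os.strerror(value)


# The two conditions called out in the description above.
for value in (errno.ENOSPC, errno.EIO):
    name, text = describe_errno(value)
    print(f"errno {value} ({name}): {text}")
```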


660332 :launch_validate: fe_set_env_vars() failed for resource <%s>, resource group <%s>, method <%s>

Description:

The rgmd was unable to set up environment variables for method execution, causing a VALIDATE method invocation to fail. This in turn will cause the failure of a creation or update operation on a resource or resource group.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Retry the creation or update operation. If the problem recurs, save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


660368:CCR: CCR service not available, service is %s.

Description:

The CCR service is not available due to the indicated failure.

Solution:

Reboot the cluster. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


660974 :file specified in USER_ENV %s does not exist

Description:

The 'User_env' property was set when the resource was configured. The file specified in the 'User_env' property does not exist or is not readable. The file should be specified with a fully qualified path.

Solution:

Specify an existing file with a fully qualified file name when creating the resource. If the resource has already been created, update the resource property 'User_env'.


661560 :All the SUNW.HAStoragePlus resources that this resource depends on are online on the local node. Proceeding with the checks for the existence and permissions of the start/stop/probe commands.

Description:

This is an informational message which means that the SUNW.HAStoragePlus resource(s) that this application resource depends on is online on the local node and therefore the validation checks related to start/stop/probe commands will be carried out on the local node.

Solution:

None.


661614 :Method <%s> failed on resource <%s> in resource group <%s>, exit code <%d>

Description:

A resource method exited with a non-zero exit code; this is considered a method failure. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. After correcting the problem that caused the method to fail, the operator may choose to issue an scswitch(1M) command to bring resource groups onto desired primaries.


661782:Could not clear stale entries in the orbixd checkpoint file $ORBIXD_CHECKPOINT_FILE.

Description:

This is an internal error.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.


663089 :clexecd: %s: sigwait returned %d. Exiting.

Description:

clexecd program has encountered a failed sigwait(3C) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663293 :reservation error(%s) - do_status() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.
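
The transition-specific follow-up actions above can be summarized as a lookup table. The sketch below restates them in Python purely as a reading aid. The names FOLLOW_UP and follow_up_for are hypothetical, and the table does not replace the full guidance in the paragraph above.

```python
# Command quoted from the solution text above.
RUN_RESERVE = "/usr/cluster/lib/sc/run_reserve -c node_join"

# Transition name (as it appears in the message) -> suggested follow-up.
FOLLOW_UP = {
    "node_join": f"Run '{RUN_RESERVE}' on all cluster nodes.",
    "release_shared_scsi2": f"Run '{RUN_RESERVE}' on all cluster nodes.",
    "make_primary": "Use scswitch to move or start the device group.",
    "primary_to_secondary": "Retry the shutdown or switchover of the device group.",
}


def follow_up_for(transition):
    """Return the follow-up hint for a transition, or a neutral default."""
    return FOLLOW_UP.get(transition, "No transition-specific follow-up is listed.")


print(follow_up_for("make_primary"))
```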


663835 :in libsecurity create of file %s failed: %s

Description:

The rpc.pmfd, rpc.fed or rgmd server was unable to create a cache file for rpcbind information. The affected component should continue to function by calling rpcbind directly.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663851 :Failover %s data services must have exactly one value for extension property %s.

Description:

Failover data services must have one and only one value for Confdir_list.

Solution:

Create a failover resource group for each configuration file.


663897 :clcomm: Endpoint %p: %d is not an endpoint state

Description:

The system maintains information about the state of an Endpoint. The Endpoint state is invalid.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


663943 :Quorum: Unable to reset node information on quorum disk.

Description:

This node was unable to reset some information on the quorum device. This will lead the node to believe that its partition has been preempted. This is an internal error. If a cluster gets divided into two or more disjoint subclusters, exactly one of these must survive as the operational cluster. The surviving cluster forces the other subclusters to abort by grabbing enough votes to grant it majority quorum. This is referred to as preemption of the losing subclusters.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.
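
The majority rule behind preemption can be illustrated with a small calculation. The sketch below assumes a simple model in which a partition survives only if it holds more than half of all configured votes. The function name has_majority and the vote counts in the example are hypothetical and are not taken from any cluster configuration.

```python
def has_majority(partition_votes, total_votes):
    """True when a partition holds more than half of the configured votes."""
    return 2 * partition_votes > total_votes


# Hypothetical example: two nodes with one vote each plus a quorum
# device with one vote, for three votes in total.
total = 3
print(has_majority(2, total))  # partition holding a node and the quorum device
print(has_majority(1, total))  # partition holding only the other node
```

In this model at most one partition can satisfy the test at a time, which is why exactly one subcluster survives.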


665015 :Scalable service instance [%s,%s,%d] registered on node %s.

Description:

The specified scalable service has been registered on the specified node. Now, the gif node can redirect packets for the specified service to this node.

Solution:

This is an informational message, no user action is needed.


665195 :INTERNAL ERROR: rebalance: invalid node name in Nodelist of resource group <%s>

Description:

An internal error has occurred in the rgmd. This error may prevent the rgmd from bringing the affected resource group online.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


665297:Failed to validate BV configuration.

Description:

The validation of the Sun Cluster HA for BroadVision One-To-One Enterprise extension properties or Sun Cluster HA for BroadVision One-To-One Enterprise configuration has failed.

Solution:

Look for other error messages generated while validating the extension properties or Sun Cluster HA for BroadVision One-To-One Enterprise configuration to identify the exact error. Look for appropriate action for that error message.


665931 :Initialization error. CONNECT_STRING is NULL

Description:

Error occurred in monitor initialization. Monitor is unable to get resource property 'Connect_string'.

Solution:

Check syslog messages for errors logged from other system modules. Check the resource configuration and the value of the 'Connect_string' property. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


666391 :clcomm: invalid invocation result status %d

Description:

An invocation completed with an invalid result status.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


666443 :unix DLM already running

Description:

UNIX DLM is already running. Another dlm will not be started.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


666603 :clexecd: Error %d in fcntl(F_GETFD). Exiting.

Description:

clexecd program has encountered a failed fcntl(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


667020 :Invalid shared path

Description:

The HA-NFS fault monitor detected that one or more shared paths in dfstab are invalid.

Solution:

Make sure all paths in dfstab are correct. Look at the prior syslog messages for any specific problems and correct them.


677759 :Unknown status code %d.

Description:

This message indicates that an unknown status code was returned by one of the underlying subsystems and an internal error has occurred.

Solution:

Report this problem.


669026 :fcntl(F_SETFD) failed in close_on_exec

Description:

A fcntl operation failed. The "fcntl" man page describes possible error codes.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


670753:reservation fatal error(%s) - unable to determine node id for nodes %s.

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the `node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the `release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing `/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the `make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the `primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


670799 :CMM: Registering reservation key on quorum device %s failed with error %d.

Description:

The specified error was encountered while trying to place the local node's reservation key on the specified quorum device. This node will ignore this quorum device.

Solution:

There may be other related messages on this and other nodes connected to this quorum device that may indicate the cause of this problem. Refer to the quorum disk repair section of the administration guide for resolving this problem.


671954 :waitpid: %s

Description:

The rpc.pmfd server was not able to wait for a process. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


672013:Waiting for orbixd to start.

Description:

The method should wait until the orbixd daemon starts.

Solution:

No user action required.


672019 :Stop method failed. Error: %d.

Description:

Stop method failed, while attempting to restart the data service.

Solution:

Check the Stop_timeout and adjust it if it is not appropriate. For the detailed explanation of failure, check the syslog messages that occurred just before this message.


672372 :dl_attach: bad ACK header %u

Description:

Could not attach to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


672511 :Failed to start Text server.

Description:

Sun Cluster HA for Sybase failed to start the text server. Other syslog messages and the log file will provide additional information on possible reasons for the failure.

Solution:

Determine whether the server can be started manually. Examine the HA-Sybase log files, text server log files and setup.


674359 :load balancer deleted

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.


674415 :svc_restore_priority: %s

Description:

The rpc.pmfd or rpc.fed server was not able to run the application in the correct scheduling mode, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


674848 :fatal: Failed to read CCR

Description:

The rgmd is unable to read the cluster configuration repository. This is a fatal error. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


675776 :Stopped the fault monitor.

Description:

The fault monitor for this data service was stopped successfully.

Solution:

No action needed.


676141 :in libsecurity could not copy host name

Description:

A client was not able to make an rpc connection to a server (rpc.pmfd, rpc.fed or rgmd) because the host name could not be saved, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


676379 :NAFO group %s has status %s. Assuming this node cannot respond to client requests.

Description:

The state of the NAFO group named is degraded.

Solution:

Make sure all adapters and cables are working. Look in the /var/adm/messages file for messages from the network monitoring daemon (pnmd).


676478:kill -TERM: %s

Description:

The rpc.fed server is not able to kill a tag that timed out, and the error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


676558 :WARNING: Global_resources_used property of resource group <%s> is set to non-null string, assuming wildcard

Description:

The Global_resources_used property of the resource group was set to a specific non-null string. The only supported settings of this property in the current release are null ("") or wildcard ("*").

Solution:

No user action is required; the rgmd will interpret this value as wildcard. This means that method timeouts for this resource group will be suspended while any device group temporarily goes offline during a switchover or failover. This is usually the desired setting, except when a resource group has no dependency on any global device service or cluster file system.
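
The interpretation described above amounts to a two-way rule, sketched below. The function name effective_global_resources_used is hypothetical. The code only models what this entry states and is not the rgmd implementation.

```python
def effective_global_resources_used(configured):
    """Model the setting as this entry says the rgmd interprets it."""
    if configured == "":
        return ""   # null stays null
    return "*"      # wildcard, and any other non-null string, acts as wildcard


for value in ("", "*", "dg-schost-1"):  # the last value is a made-up example
    print(repr(value), "->", repr(effective_global_resources_used(value)))
```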


677278 :No network address resource in resource group.

Description:

A resource has no associated network address.

Solution:

For a failover data service, add a network address resource to the resource group. For a scalable data service, add a network resource to the resource group referenced by the RG_dependencies property.


677785 :PNM: nafo%d: could not failover static routes in %s

Description:

A failover has happened for the named NAFO group, but the static routes commands contained in the named file cannot be successfully executed. Some static routes therefore could not be restored after an adapter failover.

Solution:

Check that the named file has the right permission to be executed by the PNM daemon (pnmd), and that the file consists of valid route(1M) commands. Execute the file manually to restore the static routes. If no static routes need to be failed over explicitly, remove the named file.


678041 :lkcm_sync: cm_reconfigure failed: %s

Description:

ucmm reconfiguration failed.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678319 :(%s) getenv of "%s" failed.

Description:

Failed to get the value of an environment variable. udlm will fail to go through a transition.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


678755 :dl_bind: DL_BIND_ACK protocol error

Description:

Could not bind to the physical device. We are trying to open a fast path to the private transport adapters.

Solution:

Rebooting the node might fix the problem.


679912 :uaddr2taddr: %s

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. udlm will exit and the node will abort and panic.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.



680437 :Start method failed. Error: %d.

Description:

Restart of the data service failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there is an internal error. In case of an internal error, contact your Sun service provider. Otherwise, check the following: 1) Check the Start_timeout value and adjust it if it is not appropriate. 2) Check whether the application's configuration is correct. 3) The failure might be the result of a lack of system resources. Check whether the system is low on memory or the process table is full, and take appropriate action.


680960 :Unable to write data: %s.

Description:

Failed to write the data to the socket. The reason might be expiration of timeout, hung application, or heavy load.

Solution:

Check if the application is hung. If this is the case, restart the application.


681547 :fatal: Method <%s> on resource <%s>: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


683763 :TCPTR: Attempt to join from remote node %u that has incompatible cluster software. "%s" on node %u not compatible with "%s" on node %u.

Description:

Transport at the local node received an initial handshake message from the remote node that is not running a compatible version of the cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.


683997 :Failed to retrieve the resource group property %s: %s

Description:

Unable to retrieve the resource group property.

Solution:

For the property name and the reason for failure, check the syslog message. For more details about the API failure, check the syslog messages from the RGM.


684383:Development system shut down successfully.

Description:

The development system shut down successfully.

Solution:

No user action required.


684753 :store_binding: <%s> bad bind type <%d>

Description:

During a name server binding store an unknown binding type was encountered.

Solution:

No action required. This is an informational message.


684895 :Failed to validate scalable service configuration: Error %d.

Description:

An error was detected in the Load_balancing_weights property for the data service.

Solution:

Use the scrgadm command to change the Load_balancing_weights property to a valid value.


685886 :Failed to communicate: %s.

Description:

While determining the health of the data service, the fault monitor failed to communicate with the process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your authorized Sun service provider. For more details about the error, check the syslog messages.


686220 :Node attempted to join with invalid version message

Description:

Initial handshake message from a cluster node did not have a valid format.

Solution:

Check if all cluster nodes are running the same version of the clustering software.


687543 :shutdown abort did not succeed.

Description:

HA-Oracle failed to shut down the Oracle server using 'shutdown abort'.

Solution:

Examine log files and syslog messages to determine the cause of failure.


687929 :daemon %s did not respond to null rpc call: %s.

Description:

HA-NFS fault monitor failed to ping an nfs daemon.

Solution:

No action required. The fault monitor will restart the daemon if necessary.


688163 :clexecd: pipe returned %d. Exiting.

Description:

clexecd program has encountered a failed pipe(2) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


688525:$daemon_status daemons are not running on $HOSTNAME.

Description:

This message is from the Sun Cluster HA for BroadVision One-To-One Enterprise Probe. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe detected that the specified number of daemons are not running.

Solution:

No user action required. The Sun Cluster HA for BroadVision One-To-One Enterprise Probe should take appropriate action.


689538 :Listener %s did not stop.(%s)

Description:

Failed to stop the Oracle listener using the 'lsnrctl' command. HA-Oracle will attempt to kill the listener process.

Solution:

None


689989 :Invalid device group name <%s> supplied

Description:

The diskgroup name defined in SUNW.HAStorage type resource is invalid.

Solution:

Check and set the correct diskgroup name in extension property "ServicePaths" of SUNW.HAStorage type resource.


690417 :Protocol is missing in system defined property %s.

Description:

The specified system property does not have a valid format. The value of the property must include a protocol.

Solution:

Use scrgadm(1M) to specify the property value with a protocol, for example, TCP.


690463:Cannot bring server online on this node.

Description:

The Sybase Adaptive Server cannot be brought online on this node.

Solution:

Manually start the Sybase Adaptive Server. Examine the log files and setup.


691493 :One or more of the SUNW.HAStoragePlus resources that this resource depends on is in a different resource group.

Description:

The application resource and the SUNW.HAStoragePlus resource(s) that it depends on are in different resource groups.

Solution:

Change the resource or resource group configuration so that the application resource and the SUNW.HAStoragePlus resource(s) are in the same resource group.


691736 :CMM: Quorum device %ld (%s) with votecount = %d removed.

Description:

The specified quorum device with the specified votecount has been removed from the cluster. A quorum device being placed in maintenance state is equivalent to it being removed from the quorum subsystem's perspective, so this message will be logged when a quorum device is put in maintenance state as well as when it is actually removed.

Solution:

This is an informational message, no user action is needed.


692203:Failed to stop development system.

Description:

The development system did not stop.

Solution:

Informational message. Check previous messages in the system log for more details regarding why it failed.


695728:Skipping checks dependant on HAStoragePlus resources on this node.

Description:

This resource will not perform some of the filesystem-specific checks (during VALIDATE or MONITOR_CHECK) on this node because at least one SUNW.HAStoragePlus resource that it depends on is online on a different node.

Solution:

None.


696186 :This list element in System property %s has an invalid port number: %s.

Description:

The system property that was named does not have a valid port number.

Solution:

Change the value of the property to use a valid port number.
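
Before changing the property, each list element can be checked for a usable port number and a protocol. The sketch below assumes, for illustration only, that an element is written as "port/protocol" (for example "80/tcp") and that valid ports range from 1 to 65535. Both the syntax and the helper name check_port_element are assumptions, so confirm the actual format in the property's documentation.

```python
def check_port_element(element):
    """Return a list of problems found in one 'port/protocol' element.

    Assumed syntax (not confirmed by this guide): "<port>/<protocol>".
    """
    problems = []
    port_text, _, protocol = element.strip().partition("/")
    is_number = port_text.isascii() and port_text.isdigit()
    if not is_number or not 1 <= int(port_text) <= 65535:
        problems.append("invalid port number")
    if not protocol:
        problems.append("protocol is missing")
    return problems


for element in ("80/tcp", "80", "http/tcp", "70000/tcp"):
    print(element, "->", check_port_element(element) or "ok")
```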


697026 :did instance %d created.

Description:

Informational message from scdidadm.

Solution:

No user action required.


697108 :t_sndudata in send_reply failed.

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


697588 :Nodeid must be less than %d. Nodeid passed: '%s'

Description:

Incorrect nodeid passed to Oracle unix dlm. Oracle unix dlm will not start.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


697663:INTERNAL ERROR:BV extension property structure is NULL.

Description:

An internal error occurred.

Solution:

Contact your authorized Sun service provider for assistance. Provide your authorized Sun service provider a copy of the /var/adm/messages files from all nodes.


698239:Monitor server stopped.

Description:

Sun Cluster HA for Sybase stopped the Monitor Server.

Solution:

No user action required.


698512 :Directory %s is not readable: %s.

Description:

The specified path does not exist or is not readable.

Solution:

Consult the HA-NFS configuration guide on how to configure the dfstab._name> file for HA-NFS resources.




698526 :scvxvmlg error - service %s has service_class %s, not %s, ignoring it

Description:

The program responsible for maintaining the VxVM namespace has suffered an internal error. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


698744 :scvxvmlg error - lstat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


699689 :(%s) poll failed: %s (UNIX errno %d)

Description:

Call to poll() failed. The "poll" man page describes possible error codes. udlmctl will exit.

Solution:

Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.