Sun Cluster Error Messages Guide for Solaris OS

Message IDs 900000–999999

This section contains message IDs 900000–999999.
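
As a rough aid for locating these messages in a log, the following Python sketch filters log lines by message ID range. It assumes the Solaris syslog layout in which each line carries a tag of the form "[ID nnnnnn facility.level]"; the sample lines and the helper name ids_in_range are made up for illustration only.

```python
import re

# Assumed tag layout: "[ID 900954 daemon.error]" somewhere in the line.
ID_TAG = re.compile(r"\[ID (\d+) [\w.]+\]")

def ids_in_range(lines, low=900000, high=999999):
    """Return (message_id, line) pairs whose ID falls within [low, high]."""
    matches = []
    for line in lines:
        found = ID_TAG.search(line)
        if found and low <= int(found.group(1)) <= high:
            matches.append((int(found.group(1)), line.rstrip("\n")))
    return matches

# Invented sample lines, not captured from a real system.
sample = [
    "Jan  1 00:00:00 node1 example: [ID 900954 daemon.error] fatal: Unable to open CCR\n",
    "Jan  1 00:00:01 node1 example: [ID 123456 kern.notice] unrelated message\n",
]
print(ids_in_range(sample))
```

In practice the lines would come from a file such as /var/adm/messages rather than from an in-memory list.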

900102 Failed to retrieve the resource type property %s: %s.

Description:

An API operation has failed while retrieving the resource type property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.

900198 validate: Port is not set but it is required

Description:

The parameter Port is not set in the parameter file.

Solution:

Set the variable Port to a valid value in the parameter file that is specified by the -N option of the start, stop, and probe commands.

900371 pnmd has requested an immediate failover of all HA IP addresses hosted on IPMP group %s

Description:

All network interfaces in the IPMP group have failed. All of these failures were detected by the hardware drivers for the network interfaces, and not by in.mpathd. In such a situation pnmd requests all highly available IP addresses (LogicalHostname and SharedAddress) to fail over to another node without any delay.

Solution:

No user action is needed. This is an informational message that indicates that the IP addresses will be failed over to another node immediately.

900499 Error: low memory

Description:

The rpc.fed or cl_apid server was not able to allocate memory. If the message is from the rpc.fed, the server may not be able to capture the output from methods it runs.

Solution:

Investigate whether the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

900501 pthread_sigmask: %s

Description:

The cl_apid was unable to configure its signal handler, so it is unable to run.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

900508 Internal error; SAP instance id set to NULL.

Description:

Extension property could not be retrieved and is set to NULL. Internal error.

Solution:

No user action needed.

900954 fatal: Unable to open CCR

Description:

The rgmd was unable to open the cluster configuration repository (CCR). The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

901030 clconf: Data length is more than max supported length in clconf_file_io

Description:

While reading configuration data through the CCR FILE interface, the data length was found to exceed the maximum supported length.

Solution:

Check the CCR configuration information.

901066 Monitor_retry_count is not set.

Description:

The resource property Monitor_retry_count is not set. This property controls the number of restart attempts of the fault monitor.

Solution:

Check whether this property is set. If it is not, set it using scrgadm(1M).

902017 clexecd: can allocate execit_msg

Description:

Could not allocate memory. Node is too low on memory.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

902721 Switching over resource group using scha_control GIVEOVER

Description:

The fault monitor has detected problems in the RDBMS server and has determined that the RDBMS server cannot be restarted on this node. An attempt will be made to switch the resource over to another node, if a healthy node is available.

Solution:

Check the cause of RDBMS failure.

903317 Entry at position %d in property %s was invalid.

Description:

An invalid entry was found in the named property. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Make sure the property has a valid value.
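
The zero-based position index can be illustrated with a short Python sketch. The property list and the validity rule below are hypothetical, chosen only to show how a reported position maps to a list element.

```python
def first_invalid_position(entries, is_valid):
    """Return the zero-based index of the first invalid entry, or None."""
    for position, entry in enumerate(entries):
        if not is_valid(entry):
            return position
    return None

# Hypothetical property list; the assumed rule is "entries must be non-blank".
values = ["alpha", "", "gamma"]
position = first_invalid_position(values, lambda entry: bool(entry.strip()))
print(position)  # position 1 is the second element in the list
```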

903370 Command %s failed to run: %s.

Description:

HA-NFS attempted to run the specified command to perform some action which failed. The specific reason for failure is also provided.

Solution:

HA-NFS will take action to recover from this failure, if possible. If the failure persists and service does not recover, contact your service provider. If an immediate repair is desired, reboot the cluster node where this failure is occurring repeatedly.

903666 Some other object bound to <%s>.

Description:

The cl_eventd found an unexpected object in the nameserver. It may be unable to forward events to one or more nodes in the cluster.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

903734 Failed to create lock directory %s: %s.

Description:

This network resource failed to create a directory in which to store lock files. These lock files are needed to serialize the running of the same callback method on the same adapter for multiple resources.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.

904401 INITPNM Error: Can't run pmfadm. Exiting without starting in.mpathd

Description:

The pnm startup script was not able to run in.mpathd.

Solution:

See if in.mpathd is running at all. See if it is already under pmf. If it is already running under pmf, then do nothing. If it is running outside pmf, then kill in.mpathd. If it is not running at all, then check the other error messages related to pmf and in.mpathd.
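
The three cases in this solution can be summarized as a small decision function. This Python sketch is only a restatement of the text; the function name and the returned strings are illustrative, and it does not query the system.

```python
def mpathd_action(running, under_pmf):
    """Map the observed state of in.mpathd to the suggested action."""
    if not running:
        return "check other error messages related to pmf and in.mpathd"
    if under_pmf:
        return "do nothing"
    return "kill in.mpathd"

print(mpathd_action(running=True, under_pmf=False))
```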

904778 Number of entries exceeds %ld: Remaining entries will be ignored.

Description:

The file being processed exceeded the maximum number of entries permitted as indicated in this message.

Solution:

Remove any redundant or duplicate entries to avoid exceeding the maximum limit. Too many entries will cause the excess number of entries to be ignored.
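
A minimal Python sketch of the suggested cleanup follows: it drops blank and duplicate entries, then reports how many entries would still be ignored under a given limit. The limit and the entry format are assumptions; the real limit is the value printed in the message.

```python
def dedupe_and_limit(lines, max_entries):
    """Drop blank and duplicate entries, then keep at most max_entries of them."""
    unique = []
    for line in lines:
        entry = line.strip()
        if entry and entry not in unique:
            unique.append(entry)
    kept = unique[:max_entries]
    ignored = len(unique) - len(kept)
    return kept, ignored

# Hypothetical file contents and limit.
kept, ignored = dedupe_and_limit(["a", "a", "b", "", "c"], max_entries=2)
print(kept, ignored)
```

An ignored count of zero after deduplication suggests that the file would fit within the limit.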

905020 %s is already running on this node outside of Sun Cluster. The start of %s from Sun Cluster will be aborted.

Description:

The specified application is already running on the node outside of Sun Cluster software. The attempt to start it up under Sun Cluster software will be aborted.

Solution:

No user action is needed.

905023 clexecd: dup2 of stderr returned with errno %d while exec'ing (%s). Exiting.

Description:

The clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

905132 CL_EVENTLOG Error: ${LOGGER} is already running.

Description:

The cl_eventlog init script found the cl_eventlogd already running. It will not start it again.

Solution:

No action required.

905591 Could not determine volume configuration daemon mode

Description:

Could not get information about the volume manager daemon mode.

Solution:

Check whether the volume manager has been started correctly.

906435 Encountered an error while validating configuration.

Description:

An error occurred during the validations of specified HAStoragePlus global device paths and/or FilesystemMountPoints.

Solution:

Investigate possible RGM, DSDL, DCS errors. Contact your authorized Sun service provider for assistance in diagnosing the problem.

906589 Error retrieving network address resource in resource group.

Description:

An error occurred reading the indicated extension property.

Solution:

Check syslog messages for errors logged from other system modules. If error persists, please report the problem.

906838 reservation warning(%s) - do_scsi3_registerandignorekey() error for disk %s, attempting do_scsi3_register()

Description:

The device fencing program has encountered errors while trying to access a device. It is now trying to run do_scsi3_register(). This is an informational message, no user action is needed.

Solution:

This is an informational message, no user action is needed.

906922 Started NFS daemon %s.

Description:

The specified NFS daemon has been started by the HA-NFS implementation.

Solution:

This is an informational message. No action is needed.

907341 CL_EVENT Error: Can't start ${SERVER}.

Description:

An attempt to start the cl_eventd server failed. This error will prevent the event subsystem from functioning correctly.

Solution:

907960 scvxvmlg error - stat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

908240 in libsecurity realloc failed

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server, probably due to low memory. An error message is output to syslog.

Solution:

Investigate if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

908387 ucm_callback for step %d generated exception %d

Description:

ucmm callback for a step failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

908591 Failed to stop fault monitor.

Description:

An attempt was made to stop the fault monitor and it failed. There may be prior messages in syslog indicating specific problems.

Solution:

If prior messages in syslog indicate specific problems, correct them. If that does not resolve the issue, try the following. Use the process monitor facility, pmfadm(1M), with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource: the tag ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. This problem may occur when the cluster is under load and Sun Cluster cannot stop the fault monitor within the specified timeout period. Consider increasing the Monitor_Stop_timeout property. If the error still persists, reboot the node.
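
The tag selection rule described above (ends in ".mon", contains both the resource group name and the resource name) can be sketched in Python. The tag strings below are hypothetical; the exact format printed by pmfadm -L is not shown in this guide and may differ.

```python
def find_monitor_tags(tags, resource_group, resource):
    """Return tags that end in '.mon' and contain both names."""
    return [
        tag for tag in tags
        if tag.endswith(".mon") and resource_group in tag and resource in tag
    ]

# Hypothetical tag names, for illustration only.
tags = ["nfs-rg,nfs-rs,0.svc", "nfs-rg,nfs-rs,0.mon", "web-rg,web-rs,0.mon"]
print(find_monitor_tags(tags, "nfs-rg", "nfs-rs"))
```

The matching tag is the one to pass to pmfadm with the -s option.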

908716 fatal: Aborting this node because method <%s> failed on resource <%s> and Failover_mode is set to HARD

Description:

A STOP method has failed or timed out on a resource, and the Failover_mode property of that resource is set to HARD.

Solution:

No action is required. This is normal behavior of the RGM. With Failover_mode set to HARD, the rgmd reboots the node to force the resource group offline so that it can be switched onto another node. Other syslog messages that occurred just before this one might indicate the cause of the STOP method failure.

909656 Unable to open /dev/kmem:%s

Description:

The HA-NFS fault monitor attempted to open the device but failed. The specific cause of the failure is logged with the message. The /dev/kmem interface is used to read NFS activity counters from the kernel.

Solution:

No action is required. The HA-NFS fault monitor ignores this error and tries to open the device again later. Because it is unable to read NFS activity counters from the kernel, HA-NFS attempts to contact nfsd by means of a NULL RPC. A likely cause of this error is lack of resources. Attempt to free memory by terminating any programs that are using large amounts of memory and swap. If this error persists, reboot the node.

909728 Start command %s timed out.

Description:

Start-up of the data service timed out.

Solution:

No user action needed.

909737 Error loading dtd for %s

Description:

The cl_apid was unable to load the specified dtd. No validation will be performed on CRNP xml messages.

Solution:

No action is required. If you want to diagnose the problem, examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

910546 Although there are no other potential masters, RGM is failing resource group <%s> off of node <%d> because there are other current masters.

Description:

A scha_control(1HA,3HA) GIVEOVER attempt succeeded, even though no candidate node was available to host the resource group, because the resource group was currently mastered by another node.

Solution:

No action required.

911176 Successfully started BV on %s

Description:

This is an informational message that the BV servers and daemons on the specified host have started.

Solution:

No action needed.

912352 Could not unplumb some IP addresses.

Description:

Some of the IP addresses managed by the LogicalHostname resource were not successfully brought offline on this node.

Solution:

Use the ifconfig command to make sure that the IP addresses are indeed absent. Check for any error message before this error message for a more precise reason for this error. Use the scswitch command to move the resource group to a different node. If the problem persists, reboot.

912393 This node is not in the replica node list of global service %s associated with path %s. This is acceptable because the existing affinity settings do not require resource and device colocation.

Description:

Self explanatory.

Solution:

This is an informational message, no user action is needed.

912866 Could not validate CCR tables; halting node

Description:

The rgmd was unable to check the validity of the CCR tables representing Resource Types and Resource Groups. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

913654 %s not specified on command line.

Description:

Required property not included in a scrgadm command.

Solution:

Reissue the scrgadm command with the required property and value.

913825 reservation error() - Failure fencing did not complete in the alotted time: %d seconds

Description:

Device fencing appears to be taking an exceptional amount of time to complete. The system may be hung or experiencing other problems. This node will be unable to act as the primary node for device groups until the device fencing completes.

Solution:

A reboot of the node may be required if no forward progress is made.

914248 Listener status probe timed out after %s seconds.

Description:

An attempt to query the status of the Oracle listener using command 'lsnrctl status <listener_name>' did not complete in the time indicated, and was abandoned. HA-Oracle will attempt to kill the listener and then restart it.

Solution:

None, HA-Oracle will attempt to restart the listener. However, the cause of the listener hang should be investigated further. Examine the log file and syslog messages for additional information.

914519 Error when sending message to child %m

Description:

Error occurred when communicating with fault monitor child process. Child process will be stopped and restarted.

Solution:

If error persists, then disable the fault monitor and report the problem.

914529 ERROR: probe_mysql Option -L not set

Description:

The -L option is missing for probe_mysql command.

Solution:

Add the -L option for probe_mysql command.

914655 Restarting the resource %s.

Description:

The process monitoring facility tried to send a message to the fault monitor noting that the data service application died. It was unable to do so.

Solution:

Since some part (daemon) of the application has failed, it would be restarted. If fault monitor is not yet started, wait for it to be started by Sun Cluster framework. If fault monitor has been disabled, enable it using scswitch.

914866 Unable to complete some unshare commands.

Description:

HA-NFS postnet_stop method was unable to complete the unshare(1M) command for some of the paths specified in the dfstab file.

Solution:

The exact pathnames which failed to be unshared would have been logged in earlier messages. Run those unshare commands by hand. If problem persists, reboot the node.

916067 validate: Directory $Directoryname does not exist but it is required

Description:

The directory with the name $Directoryname does not exist.

Solution:

Set the variable Basepath to a valid value in the parameter file that is specified by the -N option of the start, stop, and probe commands.

916361 in libsecurity for program %s (%lu); could not find any tcp transport

Description:

A client was not able to make an rpc connection to the specified server because it could not find a tcp transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

916786 last probe for N1 Grid Service Provisioning Systems Tomcat failed, N1 Grid Service Provisioning System considered as unavailable

Description:

The probe on the Tomcat port of the master server failed.

Solution:

None, the cluster will restart the master server.

917103 %s has the keyword group already.

Description:

This means that auto-create was called even though the /etc/hostname.adp file has the keyword "group". Someone might have hand edited that file. This could also happen if someone deletes an IPMP group - A notification should have been provided for this.

Solution:

Please change the file back to its original contents. Try the scrgadm command again. We do not allow IPMP groups to be deleted once they are configured. If the problem persists contact your authorized Sun service provider for suggestions.

917145 Only one Sun Cluster node will be offline, stopping node.

Description:

The resource will be offline on only one node so the database can continue to run.

Solution:

This is an informational message, no user action is needed.

917389 priocntl returned %d.

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

917591 fatal: Resource type <%s> update failed with error <%d>; aborting node

Description:

Rgmd failed to read updated resource type from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

918018 load balancer for group '%s' released

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.

918488 Validation failed. Invalid command line parameter %s %s.

Description:

Unable to process parameters passed to the callback method specified. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.

918563 stop_sap_j2ee - Failed to stop J2EE instance %s returned %s

Description:

The agent failed to stop the specified J2EE instance.

Solution:

Check the logfile produced by the stopsap script.

919740 WARNING: error in translating address (%s) for nodeid %d

Description:

Could not get address for a node.

Solution:

Make sure the node is booted as part of a cluster.

919860 scvxvmlg warning - %s does not link to %s, changing it

Description:

The program responsible for maintaining the VxVM device namespace has discovered inconsistencies between the VxVM device namespace on this node and the VxVM configuration information stored in the cluster device configuration system. If configuration changes were made recently, then this message should reflect one of the configuration changes. If no changes were made recently or if this message does not correctly reflect a change that has been made, the VxVM device namespace on this node may be in an inconsistent state. VxVM volumes may be inaccessible from this node.

Solution:

If this message correctly reflects a configuration change to VxVM diskgroups then no action is required. If the change this message reflects is not correct, then the information stored in the device configuration system for each VxVM diskgroup should be examined for correctness. If the information in the device configuration system is accurate, then executing '/usr/cluster/lib/dcs/scvxvmlg' on this node should restore the device namespace. If the information stored in the device configuration system is not accurate, it must be updated by executing '/usr/cluster/bin/scconf -c -D name=diskgroup_name' for each VxVM diskgroup with inconsistent information.

920103 created %d threads to handle resource group switchback; desired number = %d

Description:

The rgmd was unable to create the desired number of threads upon starting up. This is not a fatal error, but it may cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.

920286 Peer node %d attempted to contact us with an invalid version message, source IP %s.

Description:

Sun Cluster software at the local node received an initial handshake message from the remote node that is not running a compatible version of the Sun Cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.

922085 INTERNAL ERROR CMM: Memory allocation error.

Description:

The CMM failed to allocate memory during initialization.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

922276 J2EE probe determined wrong port number, %d.

Description:

The data service detected an invalid port number for the J2EE engine probe.

Solution:

Informational message. No user action is needed.

922330 getservbyname() failed : %s.

Description:

Entries for required services are missing.

Solution:

Verify the sources for the services specified in the /etc/nsswitch.conf file. These should have the entry for the required service.
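
One way to confirm whether a service name resolves is to call the same lookup from a script. This Python sketch uses the standard library's socket.getservbyname; the service name shown is a deliberately nonexistent placeholder, not one that the data service requires.

```python
import socket

def lookup_port(service, protocol="tcp"):
    """Return the port number for a service name, or None if no entry exists."""
    try:
        return socket.getservbyname(service, protocol)
    except OSError:
        # Raised when no configured source has an entry for the service.
        return None

# Placeholder name that is assumed not to exist on any system.
print(lookup_port("no-such-service-example"))
```

A result of None for a required service suggests that the entry is missing from the sources listed in /etc/nsswitch.conf.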

922363 resource %s status msg on node %s change to <%s>

Description:

This is a notification from the rgmd that a resource's fault monitor status message has changed.

Solution:

This is an informational message, no user action is needed.

922726 The status of device: %s is set to MONITORED

Description:

A device is monitored.

Solution:

No action required.

922870 tag %s: unable to kill process with SIGKILL

Description:

The rpc.fed server is not able to kill the process with a SIGKILL. This means the process is stuck in the kernel.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.

923106 sysevent_bind_handle(): %s

Description:

The cl_apid or cl_eventd was unable to create the channel by which it receives sysevent messages. It will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

923184 CMM: Scrub failed for quorum device %s.

Description:

The scrub operation for the specified quorum device failed, due to which this quorum device will not be added to the cluster.

Solution:

There may be other related messages on this node that may indicate the cause of this problem. Refer to the disk repair section of the administration guide for resolving this problem. After the problem has been resolved, retry adding the quorum device.

923412 The path name %s associated with the FilesystemCheckCommand extension property is detected to be a relative pathname. Only absolute path names are allowed.

Description:

Self explanatory. For security reasons, this is not allowed.

Solution:

Correct the FilesystemCheckCommand extension property by specifying an absolute path name.
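
The distinction between absolute and relative path names can be checked as in this Python sketch. The example paths are hypothetical, and the check is purely textual: it does not verify that the file exists.

```python
import posixpath

def is_absolute_path(path):
    """True if the path name starts at the root directory."""
    return posixpath.isabs(path)

# Hypothetical values for the extension property.
for candidate in ("/usr/sbin/fsck", "fsck", "./bin/fsck"):
    print(candidate, is_absolute_path(candidate))
```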

923618 Prog <%s>: unknown command.

Description:

An internal error in ucmmd has prevented it from successfully executing a program.

Solution:

Save a copy of /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

923648 Archive log destination error condition is now clear.Fault monitor test transactions are now enabled again

Description:

All error conditions detected earlier with archive log destinations are now clear. The fault monitor will commence test transactions again.

Solution:

This is an informational message, no user action is needed.

923712 CCR: Table %s on joining node %s has the same version but different checksum as the copy in the current membership. The table on the joining node will be replaced by the one in the current membership.

Description:

The indicated table on the joining node has the same version but different contents as the one in the current membership. It will be replaced by the one in the current membership.

Solution:

This is an informational message, no user action is needed.
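
The condition that this message reports can be modeled as a comparison of two table copies. The Python sketch below is not the CCR implementation; the TableCopy type and its fields are assumptions, and it covers only the case described here (equal versions, different checksums).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TableCopy:
    version: int
    checksum: str

def must_replace_joining_copy(joining, membership):
    """True when versions match but the contents, by checksum, differ."""
    return (joining.version == membership.version
            and joining.checksum != membership.checksum)

# Hypothetical version numbers and checksums.
current = TableCopy(version=7, checksum="abc123")
joiner = TableCopy(version=7, checksum="def456")
print(must_replace_joining_copy(joiner, current))
```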

924002 check_samba - Samba server <%s> not working, failed to connect to samba-resource <%s>

Description:

The Samba resource's fault monitor checks that the Samba server is working by using the smbclient program. However this test failed to connect to the Samba server.

Solution:

No user action is needed. The Samba server will be restarted. However, examine the other syslog messages occurring at the same time on the same node, to see if the cause of the problem can be identified.

924059 Validate - MySQL database directory %s does not exist

Description:

The defined database directory (-D option) doesn't exist.

Solution:

Make sure that defined database directory exists.

924489 Error was detected in previous reconfiguration: %s

Description:

An error was detected during the previous reconfiguration of the RAC framework component. The error is indicated in the message. As a result of the error, the ucmmd daemon was stopped and the node was rebooted. On node reboot, the ucmmd daemon was not started on the node to allow investigation of the problem. The RAC framework is not running on this node. Oracle Parallel Server/Real Application Clusters database instances will not be able to start on this node.

Solution:

Review logs and messages in /var/adm/messages and /var/cluster/ucmm/ucmm_reconf.log. Resolve the problem that resulted in the reconfiguration error. Reboot the node to start the RAC framework on the node. Refer to the documentation of Sun Cluster support for Oracle Parallel Server/Real Application Clusters. If the problem persists, contact your Sun service representative.

924950 No reference to any remote nodes: Not queueing event %lld.

Description:

The cl_eventd does not have references to any remote nodes.

Solution:

This message is informational only, and does not require user action.

925337 Stop of HADB database did not complete: %s.

Description:

The resource was unable to successfully run the hadbm stop command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.

925379 resource group <%s> in illegal state <%s>, will not run %s on resource <%s>

Description:

While creating or deleting a resource, the rgmd discovered the containing resource group to be in an unexpected state on the local node. As a result, the rgmd did not run the INIT or FINI method (as indicated in the message) on that resource on the local node. This should not occur, and may indicate an internal logic error in the rgmd.

Solution:

The error is non-fatal, but it may prevent the indicated resource from functioning correctly on the local node. Try deleting the resource, and if appropriate, recreating it. If those actions succeed, then the problem was probably transitory. Since this problem may indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

925953 reservation error(%s) - do_scsi3_register() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.

926201 RGM aborting

Description:

A fatal error has occurred in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

926550 scha_cluster_get failed for (%s) with %d

Description:

A call to obtain cluster information failed. The second part of the message gives the error code.

Solution:

The calling program should handle this error. If the error is not recoverable, the calling program exits. No action required.

926749 Resource group nodelist is empty.

Description:

Empty value was specified for the nodelist property of the resource group.

Solution:

Any of the following situations might have occurred. Different user action is required for these different scenarios. 1) If a new resource is created or updated, check the value of the nodelist property. If it is empty or not valid, then provide valid value using scrgadm(1M) command. 2) For all other cases, treat it as an Internal error. Contact your authorized Sun service provider.

927042 Validation failed. SYBASE backup server startup file RUN_%s not found SYBASE=%s.

Description:

A backup server was specified in the extension property Backup_Server_Name. However, the backup server startup file was not found. The backup server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>

Solution:

Check the backup server name specified in the Backup_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Backup_Server_Name> file exists.
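
As a rough aid for this check, the expected startup file path can be assembled from the three values named above. This is a minimal sketch; the function name and the sample values in the usage note are hypothetical and are not part of the data service.

```python
import posixpath

def backup_server_run_file(sybase, sybase_ase, backup_server_name):
    """Build $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>."""
    # posixpath is used so the result always has Solaris-style separators.
    return posixpath.join(sybase, sybase_ase, "install",
                          "RUN_" + backup_server_name)
```

For example, with the hypothetical values SYBASE=/opt/sybase, SYBASE_ASE=ASE and Backup_Server_Name=SYB_BACKUP, the function returns /opt/sybase/ASE/install/RUN_SYB_BACKUP, which can then be checked for existence on each node.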

927753 Fault monitor does not exist or is not executable

Description:

The fault monitor program specified in the support file is not executable or does not exist. Recheck your installation.

Solution:

Please report this problem.

927846 fatal: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed while attempting to execute a VALIDATE method. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

928230 "Validate - sge_qmaster file does not exist or is not executable at ${bin_dir}/sge_qmaster"

Description:

The file 'sge_qmaster' cannot be found, or is not executable.

Solution:

Confirm the file '$SGE_ROOT/bin/<arch>/sge_qmaster' both exists at that location, and is executable.

928235 Validation failed. Adaptive_Server_Log_File %s not found.

Description:

The file specified in the Adaptive_Server_Log_File extension property was not found. The Adaptive_Server_Log_File is used by the fault monitor for monitoring the server.

Solution:

Please check that the file specified in the Adaptive_Server_Log_File extension property is accessible on all the nodes.

928339 Syntax error in file %s

Description:

The specified file contains syntax errors.

Solution:

Please ensure that all entries in the specified file are valid and follow the correct syntax. After the file is corrected, repeat the operation that was being performed.

928382 CCR: Failed to read table %s on node %s.

Description:

The CCR failed to read the indicated table on the indicated node. The CCR will attempt to recover this table from other nodes in the cluster.

Solution:

This is an informational message, no user action is needed.

928455 clcomm: Couldn't write to routing socket: %d

Description:

The system prepares IP communications across the private interconnect. A write operation to the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

929100 No permission for others to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by others (world).
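
The permission check described here can be sketched as follows. This is an illustration only: the function names are hypothetical, and the sketch assumes a POSIX file system where the "others" read and execute bits apply.

```python
import os
import stat

def others_can_read_and_execute(path):
    """Return True if the 'others' class has both read and execute bits."""
    mode = os.stat(path).st_mode
    required = stat.S_IROTH | stat.S_IXOTH
    return (mode & required) == required

def grant_others_read_execute(path):
    """Add o+rx, leaving all other permission bits unchanged."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IROTH | stat.S_IXOTH)
```

The second function has the same effect as running chmod o+rx on the file.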

929252 Failed to start HA-NFS system fault monitor.

Description:

Process monitor facility has failed to start the HA-NFS system fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.

929425 Validate - %s could not be found

Description:

The program bin/java couldn't be found in the defined JAVA_HOME.

Solution:

Correct the defined JAVA_HOME in /opt/SUNWscswa/util/ha_sap_j2ee_config and reregister the agent.

929712 Share path %s: file system %s is not global.

Description:

The specified share path exists on a file system which is neither mounted via /etc/vfstab nor is a global file system.

Solution:

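Share paths in HA-NFS must reside on a file system that is either mounted via /etc/vfstab or is a global file system. Move the share path to a file system that satisfies this requirement.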

930059 %s: %s.

Description:

HA-SAP failed to access a file. The file in question is specified with the first '%s'. The reason it failed is provided with the second '%s'.

Solution:

Check and make sure the file is accessible via the path list.

930322 reservation fatal error(%s) - malloc() error

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist.

930339 cl_event_bind_channel(): %s

Description:

The cl_eventd was unable to create the channel by which it receives sysevent messages. It will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

930535 Ignoring string with misplaced quotes in the entry for %s

Description:

HA-Oracle reads the file specified in the USER_ENV property and exports the variables declared in the file. The syntax for declaring the variables is: VARIABLE=VALUE. VALUE may be a single-quoted or double-quoted string. The string itself may not contain any quotes.

Solution:

Please check the environment file and correct the syntax errors.
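
To find offending entries, the syntax rule in the description can be approximated with a small parser. This is a sketch under stated assumptions, not the HA-Oracle implementation: the function name is hypothetical, and details such as whitespace trimming are guesses.

```python
def parse_env_line(line):
    """Parse one VARIABLE=VALUE line.

    Returns (name, value), or None if the line would be ignored.
    """
    line = line.strip()
    if not line or "=" not in line:
        return None
    name, value = line.split("=", 1)
    name, value = name.strip(), value.strip()
    # Remove one matching pair of surrounding single or double quotes.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    # Any quote still present is misplaced, so the entry is rejected.
    if not name or "'" in value or '"' in value:
        return None
    return name, value
```

Running each line of the environment file through such a function shows which entries have unbalanced or embedded quotes.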

930636 Sun Cluster boot: reset vote returns $res

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

930851 ERROR: process_resource: resource <%s> is pending_fini but no FINI method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, please save a copy of the /var/adm/messages files on all nodes, the output of an scstat -g command, and the output of a scrgadm -pvv command. Report the problem to your authorized Sun service provider.

931677 Could not reset SCSI buses on CMM reconfiguration. Could not find clexecd in nameserver.

Description:

An error occurred when the SC 3.0 software was in the process of resetting SCSI buses with shared nodes that are down.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

932026 WebSphere MQ Broker RDBMS has been restarted

Description:

The WebSphere MQ Broker fault monitor has detected that the WebSphere MQ Broker RDBMS has been restarted.

Solution:

No user action is needed. The resource group will be restarted.

932126 CCR: Quorum regained.

Description:

The cluster lost quorum at some time after it came up, and quorum has now been regained.

Solution:

This is an informational message, no user action is needed.

932584 Service failed and the fault monitor is not running on this node. Not restarting service because Failover_mode is set to LOG_ONLY or RESTART_ONLY.

Description:

The action script for the process is trying to contact the probe, and is unable to do so. Due to the setting of the Failover_mode system property, the action script is not restarting the application.

Solution:

This is an informational message, no user action is needed.

934249 Invalid value for property %s: %d.

Description:

An invalid value may have been specified for the property.

Solution:

Please set correct value for the property and retry the operation.

935470 validate: EnvScript not set but it is required

Description:

The parameter EnvScript is not set in the parameter file.

Solution:

Set the variable EnvScript to valid contents in the parameter file that is passed with the -N option to the start, stop, and probe commands.

936306 svc_setschedprio: Could not setup RT (real time) scheduling parameters: %s

Description:

The server was not able to set the scheduling mode parameters, and the system error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

937669 CCR: Failed to update table %s.

Description:

The CCR data server failed to update the indicated table.

Solution:

There may be other related messages on this node, which may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced.

938012 INTERNAL ERROR: usage: $0 <Independent_prog_path> <DB_Name> <Project_name>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

938189 rpc.fed: program not registered or server not started

Description:

The rpc.fed daemon was not correctly initialized, or has died. This has caused a step invocation failure, and may cause the node to reboot.

Solution:

Check that the Solaris Clustering software has been installed correctly. Use ps(1) to check if rpc.fed is running. If the installation is correct, the reboot should restart all the required daemons including rpc.fed.

938318 Method <%s> failed on resource <%s> in resource group <%s> [exit code <%d>, time used: %d%% of timeout <%d seconds>] %s

Description:

A resource method exited with a non-zero exit code; this is considered a method failure. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state. Note that failures of FINI, INIT, BOOT, and UPDATE methods do not cause the associated administrative actions (if any) to fail.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. If the failed method was one of START, PRENET_START, MONITOR_START, STOP, POSTNET_STOP, or MONITOR_STOP, you may issue an scswitch(1M) command to bring resource groups onto desired primaries after fixing the problem that caused the method to fail. If the failed method was UPDATE, note that the resource might not take note of the update until it is restarted. If the failed method was FINI, INIT, or BOOT, you may need to initialize or de-configure the resource manually as appropriate on the affected node.

938604 File system associated with %s is already mounted.

Description:

HA Storage Plus detected an already mounted file system while mounting the file systems, hence it will not try to remount it.

Solution:

This is an informational message, no user action is needed.

938618 Couldn't create deleted subdir %s: error (%d)

Description:

While mounting this file system, PXFS was unable to create some directories that it reserves for internal use.

Solution:

If the error is 28 (ENOSPC), then mount this FS non-globally, make some space, and then mount it globally. If there is some other error, and you are unable to correct it, contact your authorized Sun service provider to determine whether a workaround or patch is available.
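
When the numeric error in the message is not familiar, it can be translated to its symbolic name. The following sketch uses the standard errno table of the platform where it runs; the function name is hypothetical, and the values are assumed to match those on the cluster node.

```python
import errno
import os

def describe_errno(code):
    """Format an error number as 'number (NAME): description'."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%d (%s): %s" % (code, name, os.strerror(code))

def is_out_of_space(code):
    """True when the error number means the file system is full."""
    return code == errno.ENOSPC
```

Only the ENOSPC case has the documented remedy of remounting non-globally and freeing space.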

938664 INITRGM Error: Sun Cluster does not support transitioning from run-level 3 to levels S (single-user), 1, or 2, halting

Description:

A transition that Sun Cluster does not allow has been requested. Sun Cluster does not support transitions from run-level 3 to run levels 1, 2, or S.

Solution:

938773 Pid_Dir_Path %s is not readable: %s.

Description:

The path which is listed in the message is not readable. The reason for the failure is also listed in the message.

Solution:

Make sure the path which is listed exists. Use an HAStoragePlus resource in the same resource group as the HA-SAPDB resource, so that the HA-SAPDB methods have access to the file system at the time when they are launched.

939374 CCR: Failed to access cluster repository during synchronization. ABORT node.

Description:

This node failed to access its cluster repository when it first came up in cluster mode and tried to synchronize its repository with other nodes in the cluster.

Solution:

This is usually caused by an unrecoverable failure such as disk failure. There may be other related messages on this node, which may help diagnose the problem. If the root disk on the afflicted node has failed, then it needs to be replaced. If the root disk is full on this node, boot the node into non-cluster mode and free up some space by removing unnecessary files.

939576 WebSphere MQ bipservice UserNameServer failed

Description:

The WebSphere MQ UserNameServer fault monitor has detected that the WebSphere MQ UserNameServer bipservice process has failed.

Solution:

No user action is needed. The WebSphere MQ UserNameServer will be restarted.

939614 Incorrect resource group properties for RAC framework resource group %s. Value of RG_mode must be Scalable.

Description:

The RAC framework resource group must be a scalable resource group. The RG_mode property of this resource group must be set to 'scalable'. RAC framework will not function correctly without this value.

Solution:

Refer to the documentation of Sun Cluster support for Oracle Parallel Server/Real Application Clusters for the installation procedure.

940685 Configuration file %s missing for NetBackup.

Description:

The configuration file for NetBackup is missing or does not have correct permissions.

Solution:

Check whether the NetBackup configuration file bp.conf, or a link to it, exists under /usr/openv/netbackup, and that the file has correct permissions.

941071 Cannot retrieve service <%s>.

Description:

The data service cannot retrieve the required service.

Solution:

Check the service entries on all relevant cluster nodes.

941267 Cannot determine command passed in: <%s>.

Description:

An invalid pathname, displayed within the angle brackets, was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. This could be the result of mis-configuring the name of a START or MONITOR_START method or other property, or a programming error made by the resource type developer.

Solution:

Supply a valid pathname to a regular, executable file.

941318 ERROR: start_sap_j2ee Option -G not set

Description:

The -G option is missing for the start_command.

Solution:

Add the -G option to the start command.

941367 open failed: %s

Description:

Failed to open /dev/console. The "open" man page describes possible error codes.

Solution:

None. ucmmd will exit.

941416 One or more of the SUNW.HAStoragePlus resources that this resource depends on is not online anywhere.

Description:

It is an invalid configuration to create an application resource that depends on one or more SUNW.HAStoragePlus resource(s) that are not online on any node.

Solution:

Bring the SUNW.HAStoragePlus resource(s) online before creating the application resource that depends on them, and then try the command again.

941693 "%s" Failed to stay up.

Description:

The tag shown, being run by the rpc.pmfd server, has exited. Either the user has decided to stop monitoring this process, or the process exceeded the number of retries. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.

942465 Validate - The sap user %s does not exist

Description:

The user <SAP SYSTEMNAME>adm does not exist.

Solution:

Add the user to /etc/passwd.

942855 check_mysql - Couldn't do show tables for defined database %s (%s)

Description:

The fault monitor can't issue show tables for the specified database.

Solution:

Either MySQL was already down or the fault monitor user does not have the right permissions. The defined fault monitor user should have Process, Select, Reload, and Shutdown privileges and, for MySQL 4.0.x, also Super privileges. Also check the MySQL log files for any other errors.

943168 pmf_monitor_suspend: pmf_add_triggers: %s

Description:

The rpc.pmfd server was not able to resume the monitoring of a process, and the monitoring of this process has been aborted. An error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

943520 Invalid node ID in infrastructure table

Description:

The internal cluster configuration is erroneous and a node carries an invalid ID. The error is not fatal and the process keeps running, but more problems can be expected.

Solution:

Report the problem to your authorized Sun service provider.

944068 clcomm: validate_policy: invalid relationship moderate %d high %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be higher than the high server thread level.

Solution:

No user action required.

944096 Validate - This version of MySQL <%s> is not supported with this data service

Description:

An unsupported MySQL version is being used.

Solution:

Make sure that a supported MySQL version is being used.

944121 Incorrect permissions set for %s

Description:

This file does not have the expected default execute permissions.

Solution:

Reset the permissions to allow execute permissions using the chmod command.

944181 HA: exception %s (major=%d) from flush_to().

Description:

An unexpected return value was encountered when performing an internal operation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

944451 Cluster undergoing reconfiguration.

Description:

The cluster is undergoing a reconfiguration.

Solution:

None. This is an informational message only.

946660 Failed to create sap state file %s:%s Might put sap resource in stop-failed state.

Description:

If SAP is brought up outside the control of Sun Cluster, HA-SAP will create the state file to signal the stop method not to try to stop sap via Sun Cluster. Now if SAP was brought up outside of Sun Cluster, and the state file creation failed, then the SAP resource might end in the stop-failed state when Sun Cluster tries to stop SAP.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.

947007 Error initializing the cluster version manager (error %d).

Description:

This message can occur when the system is booting if incompatible versions of cluster software are installed.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.

947401 reservation error(%s) - Unable to open device %s, errno %d

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

947882 WebSphere MQ Queue Manager not available - will try later

Description:

Some WebSphere MQ processes depend on the WebSphere MQ Queue Manager to start; however, the WebSphere MQ Queue Manager is not available.

Solution:

No user action is needed. The resource will try again and will be restarted.

948424 Stopping NFS daemon %s.

Description:

The specified NFS daemon is being stopped by the HA-NFS implementation.

Solution:

This is an informational message. No action is needed.

948847 ucm_callback for start_trans generated exception %d

Description:

The ucmm callback for the start transition failed. The step may have timed out.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

948865 Unblock parent, write errno %d

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

948872 Errors detected in %s. Entries with error and subsequent entries may get ignored.

Description:

The custom action file that is specified contains errors which need to be corrected before it can be processed completely. This message indicates that the entry in error and any entries that follow it may or may not be accepted.

Solution:

Please ensure that all entries in the custom monitor action file are valid and follow the correct syntax. After the file is corrected, validate it again to verify the syntax. If all errors are not corrected, the acceptance of entries from this file may not be guaranteed.

949148 "%s" requeued

Description:

The tag shown has exited and was restarted by the rpc.pmfd server. An error message is output to syslog.

Solution:

This message is informational; no user action is needed.

949565 reservation error(%s) - do_scsi2_tkown() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group may have failed to start on this node. If the device group was started on another node, it may be moved to this node with the scswitch command. If the device group was not started, it may be started with the scswitch command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group may have failed. If so, the desired action may be retried.

949937 Out of memory.

Description:

A process has failed to allocate new memory, most likely because the system has run out of swap space.

Solution:

The problem will probably be cured by rebooting. If the problem reoccurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.

950747 resource %s monitor disabled.

Description:

This is a notification from the rgmd that the operator has disabled monitoring on a resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

951501 CCR: Could not initialize CCR data server.

Description:

The CCR data server could not initialize on this node. This usually happens when the CCR is unable to read its metadata entries on this node. There is a CCR data server per cluster node.

Solution:

There may be other related messages on this node, which may help diagnose this problem. If the root disk failed, it needs to be replaced. If there was cluster repository corruption, then the cluster repository needs to be restored from backup or other nodes in the cluster. Boot the offending node in -x mode to restore the repository. The cluster repository is located at /etc/cluster/ccr/.

951520 Validation failed. SYBASE ASE runserver file RUN_%s not foundSYBASE=%s.

Description:

Sybase Adaptive Server is started by specifying the Adaptive Server "runserver" file named RUN_<Server Name> located under $SYBASE/$SYBASE_ASE/install. This file is missing.

Solution:

Verify that the Sybase installation includes the "runserver" file and that permissions are set correctly on the file. The file should reside in the $SYBASE/$SYBASE_ASE/install directory.

951634 INTERNAL ERROR CMM: clconf_get_quorum_table() returned error %d.

Description:

The node encountered an internal error during initialization of the quorum subsystem object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

951733 Incorrect usage: %s.

Description:

The usage of the program was incorrect for the reason given.

Solution:

Use the correct syntax for the program.

951742 Validate - The user username does not belongs to project projectname

Description:

The user does not belong to the project name specified in the Resource_project_name or in the Resourcegroup_project_name.

Solution:

Correct the user project information or fix either the Resource_project_name or the Resourcegroup_project_name.

951914 Error: unable to initialize CCR.

Description:

The cl_apid was unable to initialize the CCR during start-up. This error will prevent the cl_apid from starting.

Solution:

952006 received signal %d: exiting

Description:

The cl_apid daemon is terminating abnormally due to receipt of a signal.

Solution:

No action required.

952226 Error with allow_hosts or deny_hosts

Description:

The allow_hosts or deny_hosts for the CRNP service contains an error. This error may prevent the cl_apid from starting up.

Solution:

952237 Method <%s>: unknown command.

Description:

An internal error has occurred in the interface between the rgmd and fed daemons. This in turn will cause a method invocation to fail. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.

952465 HA: exception adding secondary

Description:

A failure occurred while attempting to add a secondary provider for an HA service.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

952811 Validation failed. CONNECT_STRING not in specified format

Description:

The CONNECT_STRING property for the resource is not specified in the correct format. The format could be either 'username/password' or '/' (if operating system authentication is used).

Solution:

Specify CONNECT_STRING in the specified format.
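
The two accepted forms can be checked before the property is set. This is a minimal sketch of such a check, not the validation that the data service performs: the function name is hypothetical, and splitting at the first '/' is an assumption.

```python
def is_valid_connect_string(value):
    """Accept 'username/password' or '/' (operating system authentication)."""
    if value == "/":
        return True
    # Assumption: the user name ends at the first '/' in the string.
    username, separator, password = value.partition("/")
    return bool(separator) and bool(username) and bool(password)
```

A value with no '/' at all, or with an empty user name or password, is rejected.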

952942 Retrying retrieve of resource information: %s.

Description:

An update to resource configuration occurred while resource properties were being retrieved.

Solution:

This is an informational message, no user action is needed.

952974 %s: cannot open directory %s

Description:

The ucmmd was unable to open the directory identified.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

953818 There are no disks to monitor

Description:

An attempt to start the scdpmd failed.

Solution:

The node should have at least one disk. Check the local disk list with the scdidadm -l command.

954255 Unable to retrieve version from version manager: %s

Description:

The rgmd was unable to retrieve the current version at which it is supposed to run from the version manager. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

954497 clcomm: Unable to find %s in name server

Description:

The specified entity is unknown to the name server.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

954831 Start of HADB node %d did not complete: %s.

Description:

The resource was unable to successfully run the hadbm start command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.

955930 Attempt to connect from addr %s port %d

Description:

There was a connection from the named IP address and port number (> 1024), which means that a non-privileged process is trying to talk to the PNM daemon.

Solution:

This message is informational; no user action is needed. However, it would be a good idea to see which non-privileged process is trying to talk to the PNM daemon and why.

956110 Restart operation failed: %s.

Description:

This message indicates that the RGM did not process a restart request, most likely due to the configuration settings.

Solution:

This is an informational message.

956501 Issuing a failover request.

Description:

This message indicates that the function is about to make a failover request to the RGM. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message, no user action is required.

956796 TM: Failed to retrieve RSM address for %s adapter at %s.

Description:

The Sun Cluster Topology Manager (TM) has failed to obtain the RSM address of the Sun Fire Link adapter. The Sun Fire Link adapter may be configured with a hostname different from that of the cluster node. This must be fixed; otherwise, the Sun Fire Link adapter will fail to operate in RSM mode.

Solution:

Make sure that the Sun Fire Link adapter is configured with the correct hostname. Please refer to the Sun Fire Link configuration guide or contact your authorized Sun service provider for assistance.

957086 Prog <%s> failed to execute step <%s> - error=<%d>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.

957505 Validate - WebSphere MQ Broker /var/mqsi/locks is not a symbolic link

Description:

The WebSphere MQ Broker failed to validate that /var/mqsi/locks is a symbolic link.

Solution:

Ensure that /var/mqsi/locks is a symbolic link. Please refer to the data service documentation to determine how to do this.
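
Before consulting the data service documentation, the current state of the path can be inspected. This is a small sketch; the function name and its return strings are hypothetical.

```python
import os

def describe_path(path):
    """Report whether a path is a symbolic link, something else, or absent."""
    if os.path.islink(path):
        return "symbolic link to " + os.readlink(path)
    if os.path.lexists(path):
        return "not a symbolic link"
    return "does not exist"
```

Calling describe_path("/var/mqsi/locks") on each node shows whether the validation would pass there.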

957535 Smooth_shutdown flag is not set to TRUE. The WebLogic Server will be shutdown using sigkill.

Description:

This is an informational message. The Smooth_shutdown flag is not set to TRUE, and hence the WLS will be stopped using SIGKILL.

Solution:

None. If smooth shutdown has to be enabled, set the Smooth_shutdown extension property to TRUE. To enable smooth shutdown, the username and password that are passed to "java weblogic.Admin .." must be set in the start script. Refer to your Admin guide for details.

958832 INTERNAL ERROR: monitoring is enabled, but MONITOR_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

958888 clcomm: Failed to allocate simple xdoor client %d

Description:

The system could not allocate a simple xdoor client. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

959261 Error parsing node status line (%s).

Description:

The resource had an error while parsing the specified line of output from the hadbm status --nodes command.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified. Otherwise contact your authorized Sun service provider to determine whether a workaround or patch is available.

959384 Possible syntax error in hosts entry in %s.

Description:

The validation callback method has failed to validate the hostname list. There may be a syntax error in the nsswitch.conf file.

Solution:

Check for the following syntax rules in the nsswitch.conf file. 1) Check if the lookup order for "hosts" has "files". 2) "cluster" is the only entry that can come before "files". 3) Everything in between '[' and ']' is ignored. 4) It is illegal to have any leading whitespace character at the beginning of the line; these lines are skipped. Correct the syntax in the nsswitch.conf file and try again.
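
The four rules above can be applied mechanically to the contents of the file. The following is a sketch under stated assumptions, not the validation code of the data service: the function name and the wording of the returned problems are hypothetical, and the entry is assumed to begin with the literal text "hosts:".

```python
import re

def check_hosts_entry(text):
    """Return a list of problems with the "hosts" entry; empty means OK."""
    for line in text.splitlines():
        # Rule 4: a line with leading whitespace is skipped.
        if not line or line[0].isspace():
            continue
        if not line.startswith("hosts:"):
            continue
        spec = line[len("hosts:"):]
        # Rule 3: everything between '[' and ']' is ignored.
        spec = re.sub(r"\[[^\]]*\]", " ", spec)
        sources = spec.split()
        # Rule 1: the lookup order must contain "files".
        if "files" not in sources:
            return ['"files" is missing from the hosts lookup order']
        # Rule 2: only "cluster" may come before "files".
        early = [s for s in sources[:sources.index("files")] if s != "cluster"]
        if early:
            return ['only "cluster" may precede "files", found: '
                    + ", ".join(early)]
        return []
    return ['no usable "hosts" entry found']
```

Passing the text of nsswitch.conf to this function gives a quick indication of which rule the hosts entry breaks.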

959610 Property %s should have only one value.

Description:

A multi-valued (comma-separated) list was provided to the scrgadm command for the property, while the implementation supports only one value for this property.

Solution:

Specify a single value for the property on the scrgadm command.

959930 Desired_primaries must equal the number of nodes in Nodelist.

Description:

The resource group properties desired_primaries and maximum_primaries must be equal and they must equal the number of Sun Cluster nodes in the nodelist of the resource group.

Solution:

Set the desired and maximum primaries to be equal and to the number of Sun Cluster nodes in the nodelist.
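
The condition described above can be written as a single comparison. This is a minimal sketch for illustration only; the function name and the node names in the usage note are hypothetical, and the real check is performed by the cluster software.

```python
def primaries_match_nodelist(desired_primaries, maximum_primaries, nodelist):
    """True when both properties are equal and match the Nodelist size."""
    return desired_primaries == maximum_primaries == len(nodelist)
```

With a hypothetical three-node Nodelist such as ["node-a", "node-b", "node-c"], the check passes only when both properties are set to 3.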

959983 Unmounted global file system (%s) detected. Incorrect AffinityOn specification.

Description:

HA Storage Plus detected that the specified global file system is unmounted and that AffinityOn is set to False. This is a misconfiguration.

Solution:

Either correct the AffinityOn extension property or manually mount the global file system.

960228 HADB node %d for host %s does not match any Sun Cluster interconnect hostname.

Description:

The HADB database must be created using the hostnames of the Sun Cluster private interconnect. The hostname for the specified HADB node is not a private cluster hostname.

Solution:

Recreate the HADB and specify cluster private interconnect hostnames.

960308 clcomm: Pathend %p: remove_path called twice

Description:

The system maintains state information about a path. The remove_path operation is not allowed in this state.

Solution:

No user action is required.

960344 ERROR: process_resource: resource <%s> is pending_init but no INIT method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

960862 (%s) sigaction failed: %s (UNIX errno %d)

Description:

The udlm has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

960932 Switchover (%s) error: failed to fsck disk

Description:

The file system specified in the message could not be hosted on the node the message came from because an fsck on the file system revealed errors.

Solution:

Unmount the PXFS file system (if mounted), fsck the device, and then mount the PXFS file system again.

961551 Signal %d terminated the scalable service configuration process.

Description:

An unexpected signal caused the termination of the program that configures the networking components for a scalable resource. This premature termination will cause the scalable service configuration to be aborted for this resource.

Solution:

Save a copy of the /var/adm/messages files on all nodes. If a core file was generated, submit the core to your service provider. Contact your authorized Sun service provider for assistance in diagnosing the problem.

962746 Usage: %s [-c|-u] -R <resource-name> -T <type-name> -G <group-name> [-r sys_def_prop=values ...] [-x ext_prop=values ...].

Description:

Incorrect arguments are passed to the callback methods.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

962775 Node %u attempting to join cluster has incompatible cluster software.

Description:

A node is attempting to join the cluster but it is either using an incompatible software version or is booted in a different mode (32-bit vs. 64-bit).

Solution:

Ensure that all nodes have the same clustering software installed and are booted in the same mode.

963465 fatal: rpc_control() failed to set automatic MT mode; aborting node

Description:

The rgmd failed in a call to rpc_control(3N). This error should never occur. If it did, it would cause the failure of subsequent invocations of scha_cmds(1HA) and scha_calls(3HA). This would most likely lead to resource method failures and prevent RGM reconfigurations from occurring. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Examine other syslog messages occurring at about the same time to see if the source of the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Reboot the node to restart the clustering daemons.

963755 lkcm_cfg: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

964072 Unable to resolve %s.

Description:

The data service has failed to resolve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, then check the health of the name server. For more error information, check the syslog messages.

964083 t_open (open_cmd_port) failed

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

964399 udlm seq no (%d) does not match library's (%d).

Description:

Mismatch in sequence numbers between udlm and the library code is causing an abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

964521 Failed to retrieve the resource handle: %s.

Description:

An API operation on the resource has failed.

Solution:

For the resource name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.

964829 Failed to add events to client

Description:

The cl_apid experienced an internal error that prevented proper updates to a CRNP client.

Solution:

965261 HTTP GET probe used entire timeout of %d seconds during connect operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe used its entire timeout to connect to the HTTP port.

Solution:

Check that the web server is functioning correctly and whether the resource's probe timeout is set too low.

965722 Failed to retrieve the resource group Failback property: %s.

Description:

HA Storage Plus was not able to retrieve the resource group Failback property from the CCR.

Solution:

Check the cluster configuration. If the problem persists, contact your authorized Sun service provider.

965841 "validate_options - Fatal: SGE_ROOT not a directory"

Description:

The SGE_ROOT variable contains a value whose location does not exist.

Solution:

Determine where the Sun Grid Engine software is installed (the directory containing 'inst_sge'); initialize SGE_ROOT with this value.

965873 CMM: Node %s (nodeid = %d) with votecount = %d added.

Description:

The specified node with the specified votecount has been added to the cluster.

Solution:

This is an informational message, no user action is needed.

966112 UNRECOVERABLE ERROR: Sun Cluster boot: /usr/cluster/lib/sc/failfastd not found

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

966245 %d entries found in property %s. For a secure Netscape Directory Server instance %s should have one or two entries.

Description:

Since a secure Netscape Directory Server instance can listen on only one or two ports, the list property should have either one or two entries. A different number of entries was found.

Solution:

Change the number of entries to be either one or two.

966335 SAP enqueue server is not running. See %s/ensmon%s.out.%s for output.

Description:

The SAP enqueue server is down. See the output file listed in the error message for details. The data service will fail over the SAP enqueue server.

Solution:

No user action is needed.

966416 This list element in System property %s has an invalid protocol: %s.

Description:

The system property that was named does not have a valid protocol.

Solution:

Change the value of the property to use a valid protocol.

966670 did discovered faulty path, ignoring: %s

Description:

scdidadm has discovered a suspect logical path under /dev/rdsk. It will not add it to subpaths for a given instance.

Solution:

Check to see that the symbolic links under /dev/rdsk are correct.

966682 WARNING: Share path %s may be on a root file system or any file system that does not have an /etc/vfstab entry

Description:

The path indicated is not on a non-root mounted file system. This could be damaging because, upon failover, NFS clients will start seeing ESTALE errors. However, there is a possibility that this share path is legitimate. The best example is a share path on a root file system that is a symbolic link to a mounted file system path.

Solution:

This is a warning. However, the administrator must make sure that share paths on all the primary nodes access the same file system. This validation is useful when there are no HAStorage/HAStoragePlus resources in the HA-NFS RG.

966842 in libsecurity unknown security flag %d

Description:

This is an internal error which shouldn't happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

967050 Validation failed. Listener binaries not found ORACLE_HOME=%s

Description:

The Oracle listener binaries were not found under ORACLE_HOME. The ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage the Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource is already created, update the resource property 'ORACLE_HOME'.

967080 This resource depends on a HAStoragePlus resource that is not online on this node. Ignoring validation errors.

Description:

The resource depends on a HAStoragePlus resource. Some of the files required for validation checks are not accessible from this node, because the HAStoragePlus resource is not online on this node. Validations will be performed on the node that has the HAStoragePlus resource online. Validation errors are being ignored on this node by this callback method.

Solution:

Check the validation errors logged in the syslog messages. Please verify that these errors are not configuration errors.

967624 priocntl to set rt returned %d.

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

967970 Modification of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to change the properties of a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource change operation.

968299 NFS daemon %s died. Will restart in 2 seconds.

Description:

While attempting to start the specified NFS daemon, the daemon started up but exited before it could complete its network configuration.

Solution:

This is an informational message. No action is needed. HA-NFS will attempt to correct the problem by restarting the daemon. To avoid spinning, HA-NFS imposes a delay of 2 seconds between restart attempts.

968557 Could not unplumb any ip addresses.

Description:

Failed to unplumb any IP addresses. The resource cannot be brought offline. The node will be rebooted by Sun Cluster.

Solution:

Check the syslog messages from other components for a possible root cause. Save a copy of /var/adm/messages and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

968853 scha_resource_get error (%d) when reading system property %s

Description:

Error occurred in API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.

969008 t_alloc (open_cmd_port-T_ADDR) %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

969264 Unable to get CMM control.

Description:

The cl_eventd was unable to obtain a list of cluster nodes from the CMM. It will exit.

Solution:

969360 Error: Can't start CMASS Agent.

Description:

Internal error. Management Agent unable to start. This agent is used for remote management and is used by SunPlex Manager.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

969827 Failover attempt has failed.

Description:

The failover attempt of the resource is rejected or encountered an error.

Solution:

For a more detailed error message, check the syslog messages. Check whether Pingpong_interval has an appropriate value. If not, adjust it using scrgadm(1M). Otherwise, use scswitch to switch the resource group to a healthy node.

Description:

LogicalHostname resource was unable to register with IPMP for status updates.

Solution:

This is most likely the result of a lack of system resources. Check for memory availability on the node. Reboot the node if the problem persists.

970229 Failed to remove sci%d adapter

Description:

The Sun Cluster Topology Manager (TM) has failed to remove the SCI adapter.

Solution:

Make sure that the SCI adapter is installed correctly on the system or contact your authorized Sun service provider for assistance.

971233 Property %s is not set.

Description:

The property has not been set by the user and must be.

Solution:

Reissue the scrgadm command with the required property and value.

971412 Error in getting global service name for path <%s>

Description:

The path cannot be mapped to a valid service name.

Solution:

Check the path passed into extension property "ServicePaths" of SUNW.HAStorage type resource.

972580 CCR: Highest epoch is < 0, highest_epoch = %d.

Description:

The epoch indicates the number of times a cluster has come up. It should not be less than 0. It could happen due to corruption in the cluster repository.

Solution:

Boot the cluster in -x mode to restore the cluster repository on all the members of the cluster from backup. The cluster repository is located at /etc/cluster/ccr/.

972610 fork: %s

Description:

The rgmd, rpc.pmfd or rpc.fed daemon was not able to fork a process, possibly due to low swap space. The message contains the system error. This can happen while the daemon is starting up (during the node boot process), or when executing a client call. If it happens when starting up, the daemon does not come up. If it happens during a client call, the server does not perform the action requested by the client.

Solution:

Investigate if the machine is running out of swap space. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

972716 Failed to stop the application with SIGKILL. Returning with failure from stop method.

Description:

The stop method failed to stop the application with SIGKILL.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. This can be easily identified as the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the application. If the error still persists, then reboot the node.
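
The tag-identification step above can be sketched as a simple filter over the tags reported by pmfadm -L. This is an illustration under stated assumptions: the function name is hypothetical, the tag strings in the usage note are made-up examples rather than real output, and substring matching is a simplification of "contains the resource group name and the resource name".

```python
def find_service_tags(tags, group_name, resource_name):
    """Return the tags that end in ".svc" and mention both names."""
    return [
        tag for tag in tags
        if tag.endswith(".svc")
        and group_name in tag
        and resource_name in tag
    ]
```

Given the hypothetical tags ["web-rg,web-res,0.svc", "web-rg,web-res,0.mon", "db-rg,db-res,0.svc"], calling find_service_tags(tags, "web-rg", "web-res") returns only the first entry. A list is returned so that an ambiguous match is visible before any tag is passed to pmfadm -s.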

972908 Unable to get the name of the local cluster node: %s.

Description:

An internal error occurred while attempting to obtain the local cluster nodename.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

973243 Validation failed. USERNAME missing in CONNECT_STRING

Description:

USERNAME is missing in the specified CONNECT_STRING. The format could be either 'username/password' or '/' (if operating system authentication is used).

Solution:

Specify CONNECT_STRING in the specified format.
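
The two accepted formats can be sketched as a short check. This is an illustration only, not the validation code used by the data service; the function name is hypothetical, and treating a string without any '/' as invalid is an assumption.

```python
def connect_string_has_username(connect_string):
    """True for '/' (OS authentication) or a 'username/password' form
    whose username part is not empty."""
    if connect_string == "/":
        return True
    username, separator, _password = connect_string.partition("/")
    return bool(separator) and bool(username)
```

With illustrative values, "/" and "appuser/secret" pass this check, while "/secret" fails because the username part is empty.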

973615 Node %s: weight %d

Description:

The load balancer set the specified weight for the specified node.

Solution:

This is an informational message, no user action is needed.

973933 resource %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource. This may be used by system monitoring tools.

Solution:

This is an informational message, no user action is needed.

974080 RGM isn't failing resource group <%s> off of node <%d>, because resource %s has one or more of its strong or restart dependencies not satisfied anywhere.

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed on all potential masters, because the resource requesting the GIVEOVER has unsatisfied dependencies. A properly written resource monitor, upon getting the SCHA_ERR_CHECKS error code from a scha_control call, should sleep for a while and restart its probes.

Solution:

Usually no user action is required, because the dependee resource is switching or failing over and will come back online automatically. At that point, either the probes will start to succeed again, or the next GIVEOVER attempt will succeed. If that does not appear to be happening, you can use scrgadm(1M) and scstat(1M) to determine the resources on which the specified resource depends that are not online, and bring them online.

974106 lkcm_parm: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

974129 Cannot stat %s: %s.

Description:

The stat(2) system call failed on the specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. The reason for the failure is stated in the message. The error could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
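
One way to verify this condition before retrying is sketched below. It is a minimal illustration, not what libdsdev does internally; the function name and the returned message strings are hypothetical.

```python
import os
import stat


def is_regular_executable(pathname):
    """Return (ok, reason) for a pathname used as a method or command."""
    try:
        info = os.stat(pathname)  # follows symbolic links, like stat(2)
    except OSError as err:
        return False, "cannot stat: " + str(err.strerror)
    if not stat.S_ISREG(info.st_mode):
        return False, "not a regular file"
    if not os.access(pathname, os.X_OK):
        return False, "not executable by the current user"
    return True, "ok"
```

The first element of the result tells whether the pathname qualifies, and the second gives the reason when it does not, which separates a missing path from a directory or a file without execute permission.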

974664 HA: no valid secondary provider in rmm - aborting

Description:

This node joined an existing cluster. Then all of the other nodes in the cluster died before the HA framework components on this node could be properly initialized.

Solution:

This node must be rebooted.

975488 INITUCMM Error: rgmd is not running, not starting ucmmd.

Description:

The rgmd process is started before starting the ucmmd process. This message indicates that there is some problem in starting the rgmd daemon on this node.

Solution:

Examine other syslog messages occurring at about the same time to determine why the rgmd is not running. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

975775 Publish event error: %d

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

976495 fork failed: %s

Description:

Failed to run the "fork" command. The "fork" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.

976914 fctl: %s

Description:

The cl_apid received the specified error while attempting to deliver an event to a CRNP client.

Solution:

977371 Backup server terminated.

Description:

Graceful shutdown did not succeed. Backup server processes were killed in STOP method. It is likely that adaptive server terminated prior to shutdown of backup server.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.

977412 The state of the path to device: %s has changed to FAILED

Description:

A device is seen as FAILED.

Solution:

Check the device.

978081 rebalance: resource group <%s> is quiescing.

Description:

The indicated resource group was quiesced as a result of scswitch -Q being executed on a node.

Solution:

Use scstat(1M) -g to determine the state of the resource group. If the resource group is in ERROR_STOP_FAILED state on a node, you must manually kill the resource and its monitor, and clear the error condition, before the resource group can be started. Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide. If the resource group is in ONLINE_FAULTED or ONLINE state then you may switch it offline or restart it. If the resource group is OFFLINE on all nodes then you may switch it online. If the resource group is in a non-quiescent state such as PENDING_OFFLINE or PENDING_ONLINE, this indicates that another event, such as the death of a node, occurred during or immediately after execution of the scswitch -Q (quiesce) command. In this case, you can re-execute the scswitch -Q command to quiesce the resource group.
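
The state-by-state guidance above can be summarized as a lookup table. This sketch only restates the text of this entry; the names GUIDANCE and next_step are hypothetical, and the state must still be read from scstat(1M) -g output by the administrator.

```python
# Guidance per resource group state, restated from the Solution text above.
GUIDANCE = {
    "ERROR_STOP_FAILED": "Manually kill the resource and its monitor, then "
                         "clear the error condition before starting the group.",
    "ONLINE_FAULTED": "Switch the group offline or restart it.",
    "ONLINE": "Switch the group offline or restart it.",
    "OFFLINE": "If offline on all nodes, switch the group online.",
    "PENDING_OFFLINE": "Re-execute scswitch -Q to quiesce the group.",
    "PENDING_ONLINE": "Re-execute scswitch -Q to quiesce the group.",
}


def next_step(state):
    """Return the guidance for a state, or a note that none is given here."""
    return GUIDANCE.get(state, "No guidance is given in this entry for " + state)
```

For example, next_step("PENDING_ONLINE") returns the advice to re-execute scswitch -Q.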

978125 in libsecurity setnetconfig failed when initializing the server: %s - %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish an RPC connection for the network specified. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

978534 %s: lookup failed.

Description:

Could not get the hostname for a node. This could also be because the node is not booted as part of a cluster.

Solution:

Make sure the node is booted as part of a cluster.

978736 ERROR: stop_mysql Option -B not set

Description:

The -B option is missing for stop_mysql command.

Solution:

Add the -B option for stop_mysql command.

978812 Validation failed. CUSTOM_ACTION_FILE: %s does not exist

Description:

The file specified in property 'Custom_action_file' does not exist.

Solution:

Please make sure that 'Custom_action_file' property is set to an existing action file. Reissue command to create/update.

978829 t_bind, did not bind to desired addr

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

979343 Error: duplicate prog <%s> launched step <%s>

Description:

Due to an internal error, ucmmd has attempted to launch the same step by duplicate programs. ucmmd will reject the second program and treat it as a step failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

979803 CMM: Node being shut down.

Description:

This node is being shut down.

Solution:

This is an informational message, no user action is needed.

980162 ERROR: start_mysql Option -F not set

Description:

The -F option is missing for start_mysql command.

Solution:

Add the -F option for start_mysql command.

980307 reservation fatal error(%s) - Illegal command

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it may be possible to switch the device group to this node with the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

980425 Aborting startup: could not determine whether failover of NFS resource groups is in progress.

Description:

Startup of an NFS resource was aborted because it was not possible to determine if failover of any NFS resource groups is in progress.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

980477 LogicalHostname online.

Description:

The status of the logicalhost resource is online.

Solution:

This is an informational message. No user action is required.

980681 clconf: CSR removal failed

Description:

While executing a task in clconf and modifying the state of a proxy, a failure was received when trying to remove the CSR.

Solution:

This is an informational message. No user action is required.

980942 CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

Description:

Not enough nodes are operational to obtain a majority quorum; the cluster is waiting for more nodes before starting.

Solution:

If nodes are booting, wait for them to finish booting and join the cluster. Boot nodes that are down.

981211 'ensmon 2' timed out. Enqueue server is running. But status for replica server is unknown. See %s/ensmon%s.out.%s for output.

Description:

SAP utility 'ensmon' with option 2 cannot be completed. However, 'ensmon' with option 1 finished successfully. This can happen if the network of the cluster node where the SAP replica server was running becomes unavailable.

Solution:

No user action is needed.

981739 CCR: Updating invalid table %s.

Description:

This joining node carries a valid copy of the indicated table with override flag set while the current cluster membership doesn't have a valid copy of this table. This node will update its copy of the indicated table to other nodes in the cluster.

Solution:

This is an informational message, no user action is needed.

981931 INTERNAL ERROR: postpone_start_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

983530 can not find alliance_init in ~ALL_ADM for instance $INST_NAME

Description:

Configuration error in configuration file.

Solution:

Verify and correct SAA configuration file.

984704 reset_rg_state: unable to change state of resource group <%s> on node <%d>; assuming that node died

Description:

The rgmd was unable to reset the state of the specified resource group to offline on the specified node, presumably because the node died.

Solution:

Examine syslog output on the specified node to determine the cause of node death. The syslog output might indicate further remedial actions.

985111 lkcm_reg: illegal %s value

Description:

Cluster information that is being used during udlm registration with ucmm is incorrect.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

985242 "Stop - can't determine Qmaster pid file - $qmaster_spool_dir/qmaster.pid does not exist."

Description:

The file '$qmaster_spool_dir/qmaster.pid' was not found; '$qmaster_spool_dir/qmaster.pid' is the second argument to the 'Shutdown' function. 'Shutdown' is called within the stop script of the sge_qmaster resource.

Solution:

Confirm the file 'qmaster.pid' exists. Said file should be found in the respective spool directory of the daemon in question.

985352 Unhandled return code from scds_timerun() for %s

Description:

The data service detected an error from scds_timerun().

Solution:

Informational message. No user action is needed.

985417 %s: Invalid arguments, restarting service.

Description:

The PMF action script supplied by the DSDL while launching the process tree was called with invalid arguments.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

985746 Failed to retrieve the resource group Nodelist property: %s.

Description:

HA Storage Plus was not able to retrieve the resource group Nodelist property from the CCR.

Solution:

Check the cluster configuration. If the problem persists, contact your authorized Sun service provider.

986150 check_mysql - Sql-command %s returned error (%s)

Description:

The fault monitor can't execute the specified SQL command.

Solution:

986157 "Validate - can't determine path to Grid Engine binaries"

Description:

The SGE binary 'qstat' was found in neither 'binary_path' nor 'binary_path/<arch>'. 'qstat' is used only as a representative; if it is not found, the other SGE binaries are presumed to be misplaced as well. 'binary_path' is reported by 'qconf -sconf'.

Solution:

Find 'qstat' in the SGE installation; then update 'binary_path' using 'qconf -mconf'.

986190 Entry at position %d in property %s with value %s is not a valid node identifier or node name.

Description:

The value given for the named property has an invalid node specified for it. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Specify a valid node for the property.
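
The zero-based position index reported in the message can be illustrated with a short sketch. The function name, the node names, and the node identifiers below are hypothetical; the set of valid nodes would come from the actual cluster configuration.

```python
def invalid_node_entries(values, valid_nodes):
    """Return (position, value) pairs for entries that are not valid nodes.

    Positions start at 0 for the first element, as in the message.
    """
    return [
        (position, value)
        for position, value in enumerate(values)
        if value not in valid_nodes
    ]
```

With the hypothetical valid set {"node-a", "node-b", "1", "2"}, the property value ["node-a", "bogus", "2"] yields [(1, "bogus")], which corresponds to "Entry at position 1".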

986197 reservation fatal error(%s) - malloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken: If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the scswitch command. If no other node was available, then the device group will not have been started. The scswitch command may be used to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.

986466 clexecd: stat of '%s' failed

Description:

The clexecd program failed to stat the directory indicated in the error message.

Solution:

Make sure the directory exists. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.

987455 in libsecurity weak Unix authorization failed

Description:

A server (rgmd) refused an rpc connection from a client because it failed the Unix authentication. This happens if a caller program using scha public api, either in its C form or its CLI form, is not running as root. An error message is output to syslog.

Solution:

Check that the calling program using the scha public api is running as root. If the program is running as root, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

987601 scvxvmlg error - opendir(%s) failed

Description:

Solution:

988719 Warning: Unexpected result returned while checking for the existence of scalable service group %s: %d.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

988762 Invalid connection attempted from %s: %s

Description:

The cl_apid received a CRNP client connection attempt from an IP address that it could not accept because of its allow_hosts or deny_hosts settings.

Solution:

If you wish to allow access to the CRNP service for the client at the specified IP address, modify the allow_hosts or deny_hosts as required. Otherwise, no action is required.

988885 libpnm error: %s

Description:

An error occurred either when libpnm tried to send the command to the PNM daemon or when libpnm tried to receive a response from the PNM daemon.

Solution:

The user of libpnm should handle these errors. However, the following messages have specific meanings:

network is too slow - libpnm was not able to read data from the network. Either the network is congested or the resources on the node are dangerously low.

scha_cluster_open failed - the call to initialize a handle to get cluster information failed. The command will not be sent to the PNM daemon.

scha_cluster_get failed - the call to get cluster information failed. The command will not be sent to the PNM daemon.

can't connect to PNMd on %s - libpnm was not able to connect to the PNM daemon through the private interconnect on the given host. The given host might be down, or there might be other related error messages.

wrong version of PNMd - libpnm connected to a PNM daemon that did not report the correct version number.

no LOGICAL PERNODE IP for %s - the private interconnect LOGICAL PERNODE IP address was not found.

IPMP group %s not found - either an IPMP group name has been changed or all the adapters in the IPMP group have been unplumbed. An earlier NOTICE message would have reported that a particular IPMP group was removed. The pnmd must be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with scswitch -S and shutdown(1M).

no adapters in IPMP group %s - the given IPMP group does not have any adapters. Look at the error messages from LogicalHostname/SharedAddress.

no public adapters on this node - this node does not have any public adapters. Look at the error messages from LogicalHostname/SharedAddress.

989602 INITPMF Waiting for ${SERVER} to register with rpcbind.

Description:

The initpmf init script is attempting to verify that the rpc.pmfd registered with rpcbind.

Solution:

No user action required.

989693 thr_create failed

Description:

Could not create a new thread. The "thr_create" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.

989846 ERROR: unpack_rg_seq(): rgname_to_rg failed <%s>

Description:

Due to an internal error, the rgmd was unable to find the specified resource group data in memory.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.

989958 wrong command length received %d

Description:

The PNM daemon received a command from libpnm, but not all of the bytes were received.

Solution:

This is not a serious error. It might be caused by network problems. If the error persists, send a KILL (9) signal to pnmd. PMF will restart pnmd automatically. If the problem still persists, restart the node with scswitch -S and shutdown(1M).

990215 HA: repl_mgr: exception while invoking RMA reconf object

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

990418 received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a signal, possibly caused by an operator-initiated kill(1) command. The signal is ignored.

Solution:

The operator must use scswitch(1M) and shutdown(1M) to take down a node, rather than directly killing the daemon.

990711 %s: Could not call Disk Path Monitoring daemon to add path(s)

Description:

scdidadm -r was run and some disk paths may have been added, but the DPM daemon on the local node may not have them in its list of paths to be monitored.

Solution:

Kill and restart the daemon on the local node. If the local DPM daemon does not show the status of the new local disks, check whether the paths are present in the persistent state that the daemon maintains in the CCR. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991103 Cannot open /proc directory

Description:

The rpc.pmfd server was unable to open the /proc directory to find a list of the current processes.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991108 uaddr2taddr (open_cmd_port) failed

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991130 pthread_create: %s

Description:

The rpc.pmfd server was not able to allocate a new thread, probably due to low memory, and the system error is shown. This can happen when a new tag is started, or when monitoring for a process is set up. If the error occurs when a new tag is started, the tag is not started and pmfadm returns an error. If the error occurs when monitoring for a process is set up, the process is not monitored. An error message is output to syslog.

Solution:

Investigate if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991627 Failed to retrieve property %s: %s. Ignored.

Description:

The PMF action script supplied by the DSDL could not retrieve the given property of the resource. The script ignored the error and followed its default behavior.

Solution:

Check the syslog messages around the time of the error for messages indicating the cause of the failure. If this error persists, contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.

991800 in libsecurity transport %s is not a loopback transport

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an rpc connection from a client because the named transport is not a loopback. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991856 in libsecurity for program %s (%lu); could not register on any transport in NETPATH

Description:

The specified server was not able to start because it could not establish an rpc connection for the network specified, because it could not find any transport. An error message is output to syslog. This happened because either no transports are available at all, or transports are available but none is a loopback.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991864 putenv: %s

Description:

The rpc.pmfd server was not able to change environment variables. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

991954 clexecd: wait_for_ready daemon

Description:

The clexecd program has encountered a problem with the wait_for_ready thread at initialization time.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

992415 Validate - Couldn't retrieve MySQL-user <%s> from the nameservice

Description:

The defined user could not be retrieved from the name service.

Solution:

Make sure that the right user is defined and that the user exists. Use getent passwd 'username' to verify that the defined user exists.
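The getent check can be wrapped in a small function for use in scripts. This is a minimal sketch; the function name user_in_nameservice is hypothetical and is not part of the data service.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given user name can be resolved
# through the name service (as consulted by getent), nonzero otherwise.
user_in_nameservice() {
    getent passwd "$1" >/dev/null 2>&1
}

# Example: verify the user before registering the resource.
# 'mysql' is only a placeholder for the user defined in your configuration.
if user_in_nameservice mysql; then
    echo "user found"
else
    echo "user not found in the name service"
fi
```

A nonzero return means that the name service could not resolve the user, which is the condition this message reports.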

992912 clexecd: thr_sigsetmask returned %d. Exiting.

Description:

The clexecd program has encountered a failed thr_sigsetmask(3THR) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.

992998 clconf: CSR registration failed

Description:

While executing a task in clconf and modifying the state of the proxy, a failure was received from CSR registration.

Solution:

This is an informational message. No user action required.

994589 ERROR: start_sap_j2ee Option -R not set

Description:

The -R option is missing for the start_command.

Solution:

Add the -R option to the start_command.

994926 Mount of %s failed: (%d) %s.

Description:

HA Storage Plus was not able to mount the specified file system.

Solution:

Check the system configuration. Also check that the FilesystemCheckCommand is not empty. Leaving it empty is not advisable because file system inconsistency may occur.

994988 in libsecurity for program %s (%lu); svc_tp_create failed for transport %s

Description:

The specified server was not able to start because it could not create an rpc handle for the network specified. The rpc error message is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

995026 lkcm_cfg: invalid handle was passed %s %d

Description:

Handle for communication with udlmctl during a call to return the current DLM configuration is invalid.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.

995331 start_broker - Waiting for WebSphere MQ Broker Queue Manager

Description:

The WebSphere MQ Broker is dependent on the WebSphere MQ Broker Queue Manager, which is not available. So the WebSphere MQ Broker will wait until it is available before it is started, or until Start_timeout for the resource occurs.

Solution:

No user action is needed.

995339 Restarting using scha_control RESTART

Description:

The fault monitor has detected problems in the RDBMS server. An attempt will be made to restart the RDBMS server on the same node.

Solution:

Determine the cause of the RDBMS failure.

995859 scha_cluster_get failed

Description:

Call to get cluster information failed. This means that the incoming connection to the PNM daemon will not be accepted.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

996075 fatal: Unable to resolve %s from nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.

996303 Validate - winbindd %s non-existent executable

Description:

The Winbind resource failed to validate that the winbindd program exists and is executable.

Solution:

Check that the correct bin directory for the winbindd program was entered when registering the winbind resource. Please refer to the data service documentation to determine how to do this.

996887 reservation message(%s) - attempted removal of scsi-3 keys from non-scsi-3 device %s

Description:

The device fencing program has detected scsi-3 registration keys on a device which is not configured for scsi-3 PGR use. The keys have been removed.

Solution:

This is an informational message, no user action is needed.

996897 Method <%s> on resource <%s>: stat of program file failed.

Description:

The rgmd was unable to access the indicated resource method file. This may be caused by incorrect installation of the resource type.

Solution:

Consult resource type documentation; [re-]install the resource type, if necessary.

996902 Stopped the HA-NFS system fault monitor.

Description:

The HA-NFS system fault monitor was stopped successfully.

Solution:

No action required.

996942 Stop of HADB database completed successfully.

Description:

The resource was able to successfully stop the HADB database.

Solution:

This is an informational message, no user action is needed.

997689 IP address %s is an IP address in resource %s and in resource %s.

Description:

The same IP address is being used in two resources. This is not a correct configuration.

Solution:

Delete one of the resources that is using the duplicated IP address.

998022 Failed to restart the service: %s.

Description:

Restart attempt of the data service has failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there was an internal error. In case of an internal error, contact your Sun service provider. Otherwise, either of the following situations may have occurred:

1) The Start_timeout and Stop_timeout values are not appropriate. Check these values and adjust them if necessary.

2) The system lacks resources. Check whether the system is low on memory or the process table is full, and take appropriate action.

998478 "start_sge_schedd failed"

Description:

The process 'sge_schedd' failed to start for a reason other than already running.

Solution:

Check '/var/adm/messages' for any relevant cluster messages. Respond accordingly, then retry bringing the resource online.
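One way to locate the relevant entries is to search the log for the message ID. The sketch below assumes that message-ID logging is enabled and that entries carry a tag of the form "[ID <number> <facility>.<level>]"; the function name find_message_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the log lines (with line numbers) that carry
# the given message ID.
#   $1 = message ID, for example 998478
#   $2 = log file to search (defaults to /var/adm/messages)
# Assumes the tag format "[ID <number> <facility>.<level>]".
find_message_id() {
    msgid=$1
    logfile=${2:-/var/adm/messages}
    grep -n "\[ID $msgid " "$logfile"
}
```

Running find_message_id 998478 prints each matching line prefixed by its line number, which makes it easier to inspect the messages logged just before and after it.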

998759 Database is ready for auto recovery but the Auto_recovery property is false.

Description:

All the Sun Cluster nodes able to run the HADB resource are running the resource, but the database cannot be started. If the auto_recovery extension property were set to true, the resource would attempt to start the database by running hadbm clear and the command, if any, specified in the auto_recovery_command extension property.

Solution:

The database must be manually recovered, or if autorecovery is desired the auto_recovery extension property can be set to true and auto_recovery_command can optionally also be set.

999882 clnt_tp_create failed for program %s (%lu): %s

Description:

A client was not able to make an rpc connection to the specified server because it could not create the rpc handle. The rpc error is shown. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.

999960 NFS daemon %s has registered with TCP transport but not with UDP transport. Will restart the daemon.

Description:

The specified NFS daemon started up. However, it registered with TCP transport before it registered with UDP transport. This indicates that the daemon was unable to register with UDP transport.

Solution:

This is an informational message, no user action is needed. Make sure that the order of entries in /etc/netconfig is not changed on cluster nodes where HA-NFS is running.
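To compare the order of entries between cluster nodes, the network IDs can be listed in the order in which they appear in the file. This is a minimal sketch: the function name netconfig_order is hypothetical, and it assumes that the network ID is the first whitespace-separated field of each non-comment line.

```shell
#!/bin/sh
# Hypothetical helper: print the network IDs from a netconfig-style file in
# the order they appear, skipping blank lines and lines that start with '#'.
#   $1 = file to read (defaults to /etc/netconfig)
netconfig_order() {
    awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "${1:-/etc/netconfig}"
}
```

Saving the output on each node and comparing the results with diff shows whether the order differs anywhere in the cluster.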
