Sun Cluster Error Messages Guide for Solaris OS

Message IDs 900000–999999

This section contains message IDs 900000–999999.


900058 Failed to write to test-file %s : %s

Description:

A write operation on the test file for I/O operations failed. One possible cause of this error is that the test file has been deleted. The deletion of this file might be caused by malicious activity in your system.

Solution:

Restart the ScalMountPoint resource.


900102 Failed to retrieve the resource type property %s: %s.

Description:

An API operation has failed while retrieving the resource type property. Low memory or API call failure might be the reasons.

Solution:

In case of low memory, the problem will probably be cured by rebooting. If the problem recurs, you might need to increase swap space by configuring additional swap devices. Otherwise, if it is an API call failure, check the syslog messages from other components.


900184 Error in scha_resource_get: %s

Description:

An error occurred while reading the resource type of the resource.

Solution:

Check whether the RGM is functioning correctly and whether the resources are present. Contact your Sun vendor for more help.


900198 validate: Port is not set but it is required

Description:

The parameter Port is not set in the parameter file.

Solution:

Set the variable Port, in the parameter file specified by option -N of the start, stop, and probe commands, to a valid value.


900371 pnmd has requested an immediate failover of all HA IP addresses hosted on IPMP group %s

Description:

All network interfaces in the IPMP group have failed. All of these failures were detected by the hardware drivers for the network interfaces, and not by in.mpathd. In such a situation pnmd requests all highly available IP addresses (LogicalHostname and SharedAddress) to fail over to another node without any delay.

Solution:

No user action is needed. This is an informational message that indicates that the IP addresses will be failed over to another node immediately.


900499 Error: low memory

Description:

The rpc.fed or cl_apid server was not able to allocate memory. If the message is from the rpc.fed, the server might not be able to capture the output from methods it runs.

Solution:

Determine if the host is running out of memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


900501 pthread_sigmask: %s

Description:

The daemon was unable to configure its signal handler, so it is unable to run.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


900508 Internal error; SAP instance id set to NULL.

Description:

Extension property could not be retrieved and is set to NULL. Internal error.

Solution:

No user action needed.


900626 Volume (%s) not valid!

Description:

The volume is not a valid volume.

Solution:

Ensure that the volumes that are specified for monitoring are valid.


900675 cluster volume manager shared access mode enabled

Description:

Message indicating shared access availability of the volume manager.

Solution:

None.


900709 pid %u <%s> project <%s> user <%s> setproject() ret %d errno %d

Description:

Should never occur.

Solution:

Verify project database. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


900843 Retrying to retrieve the cluster information.

Description:

An update to the cluster configuration occurred while cluster properties were being retrieved.

Solution:

Ignore the message.


900954 fatal: Unable to open CCR

Description:

The rgmd daemon was unable to open the cluster configuration repository (CCR). The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


901546 SCTP services should depend on SharedAddress resources belonging to the same resource group

Description:

The shared addresses used by the SCTP service belong to different resource groups.

Solution:

Make sure the shared addresses belong to the same resource group.


902017 clexecd: can allocate execit_msg

Description:

Could not allocate memory. Node is too low on memory.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


902721 Switching over resource group using scha_control GIVEOVER

Description:

The fault monitor has detected problems in the RDBMS server and has determined that the RDBMS server cannot be restarted on this node. An attempt will be made to switch over the resource to another node, if a healthy node is available.

Solution:

Check the cause of RDBMS failure.


903007 txmit_common: udp is null!

Description:

Cannot transmit a message and communicate with udlmctl because the address to send to is null.

Solution:

None.


903317 Entry at position %d in property %s was invalid.

Description:

An invalid entry was found in the named property. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Make sure the property has a valid value.
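
As a minimal illustration of the zero-based position reported in this message, the following Python sketch finds the invalid entries in a list-valued property. The function name, the validator, and the sample values are hypothetical and are not part of Sun Cluster.

```python
def invalid_positions(entries, is_valid):
    """Return the zero-based positions of entries that fail validation."""
    return [pos for pos, entry in enumerate(entries) if not is_valid(entry)]


# Hypothetical property value: a list of port numbers given as strings.
ports = ["80", "eighty", "443"]

# Position 0 is the first element, so the bad entry "eighty" is at position 1.
print(invalid_positions(ports, str.isdigit))
```

A position of 1 in the message therefore refers to the second element of the property list.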


903370 Command %s failed to run: %s.

Description:

HA-NFS attempted to run the specified command to perform some action which failed. The specific reason for failure is also provided.

Solution:

HA-NFS will take action to recover from this failure, if possible. If the failure persists and service does not recover, contact your service provider. If an immediate repair is desired, reboot the cluster node where this failure is occurring repeatedly.


903626 Internal Error: Could not validate the SCTP bind address list due to array overflow.

Description:

The number of scalable services multiplied by the number of shared addresses exceeds the maximum allowed value of 1024.

Solution:

Either reduce the number of scalable applications in the cluster or reduce the number of shared addresses that are used.
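
The limit described above can be checked before adding services or addresses. The following Python sketch is only an arithmetic illustration of the stated rule (services multiplied by shared addresses must not exceed 1024); the function name is hypothetical and the sketch does not query the cluster.

```python
MAX_BIND_ENTRIES = 1024  # maximum stated in the description of message 903626


def exceeds_bind_limit(num_scalable_services, num_shared_addresses):
    """Return True if the product is above the documented maximum."""
    return num_scalable_services * num_shared_addresses > MAX_BIND_ENTRIES


# 32 services with 32 shared addresses is exactly 1024, which is still allowed.
print(exceeds_bind_limit(32, 32))
# One more service pushes the product to 1056, which is over the limit.
print(exceeds_bind_limit(33, 32))
```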


903666 Some other object bound to <%s>.

Description:

The cl_eventd found an unexpected object in the nameserver. It may be unable to forward events to one or more nodes in the cluster.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


903734 Failed to create lock directory %s: %s.

Description:

This network resource failed to create a directory in which to store lock files. These lock files are needed to serialize the running of the same callback method on the same adapter for multiple resources.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action. For specific error information, check the syslog message.


904778 Number of entries exceeds %ld: Remaining entries will be ignored.

Description:

The file being processed exceeded the maximum number of entries permitted as indicated in this message.

Solution:

Remove any redundant or duplicate entries to avoid exceeding the maximum limit. Too many entries will cause the excess number of entries to be ignored.


904914 fatal: Aborting this node because method <%s> on resource <%s> for node <%s> is unkillable

Description:

The specified callback method for the specified resource became stuck in the kernel, and could not be killed with a SIGKILL. The RGM reboots the node to force the data service to fail over to a different node, and to avoid data corruption.

Solution:

No action is required. This is normal behavior of the RGM. Other syslog messages that occurred just before this one might indicate the cause of the method failure.


905020 %s is already running on this node outside of Sun Cluster. The start of %s from Sun Cluster will be aborted.

Description:

The specified application is already running on the node outside of Sun Cluster software. The attempt to start it up under Sun Cluster software will be aborted.

Solution:

No user action is needed.


905023 clexecd: dup2 of stderr returned with errno %d while exec'ing (%s). Exiting.

Description:

The clexecd program has encountered a failed dup2(2) system call. The error message indicates the error number for the failure.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


905056 Validate - Only SUNWfiles or SUNWbinfiles are supported

Description:

The DHCP resource requires that the /etc/inet/dhcpsvc.conf file has RESOURCE=SUNWfiles or SUNWbinfiles.

Solution:

Ensure that /etc/inet/dhcpsvc.conf has RESOURCE=SUNWfiles or SUNWbinfiles by configuring DHCP appropriately, i.e. as defined within the Sun Cluster 3.0 Data Service for DHCP.
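
A quick way to confirm the setting is to read the RESOURCE value from the file contents. The Python sketch below assumes that the file consists of simple KEY=value lines with optional "#" comments; that layout, the function names, and the sample text are assumptions made for illustration, not a description of the data service's own validation.

```python
SUPPORTED = {"SUNWfiles", "SUNWbinfiles"}  # values named in message 905056


def dhcp_resource(conf_text):
    """Return the RESOURCE value from KEY=value text, or None if absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (assumed syntax)
        key, sep, value = line.partition("=")
        if sep and key.strip() == "RESOURCE":
            return value.strip()
    return None


def resource_is_supported(conf_text):
    """Return True only if RESOURCE is one of the supported data stores."""
    return dhcp_resource(conf_text) in SUPPORTED


# Hypothetical file contents used only for this example.
sample = "RUN_MODE=server\nRESOURCE=SUNWfiles\n"
print(resource_is_supported(sample))
```

To check a live node, the text of /etc/inet/dhcpsvc.conf could be read and passed to resource_is_supported().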


905151 Unable to lock device %s. Error (%s).

Description:

Sun Cluster was unable to lock a device.

Solution:

Check the error returned for why this happened. In cases like an interrupted system call, no user action is required.


905591 Could not determine volume configuration daemon mode

Description:

Could not get information about the volume manager daemon mode.

Solution:

Check whether the volume manager has been started correctly.


906044 get_cluster_name() call failed.

Description:

An error prevented the cluster name from being retrieved.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


906589 Error retrieving network address resource in resource group.

Description:

An error occurred reading the indicated extension property.

Solution:

Check syslog messages for errors logged from other system modules. If the error persists, report the problem.


906658 CCR: Updating invalid table %s.

Description:

This joining node carries a valid copy of the indicated table with the override flag set, while the current cluster membership doesn't have a valid copy of this table. This node will update its copy of the indicated table to other nodes in the cluster.

Solution:

This is an informational message, no user action is needed.


906838 reservation warning(%s) - do_scsi3_registerandignorekey() error for disk %s, attempting do_scsi3_register()

Description:

The device fencing program has encountered errors while trying to access a device. It is now trying to run do_scsi3_register().

Solution:

This is an informational message, no user action is needed.


906922 Started NFS daemon %s.

Description:

The specified NFS daemon has been started by the HA-NFS implementation.

Solution:

This is an informational message. No action is needed.


907356 Method <%s> on resource <%s>, node <%s>: authorization error: %s.

Description:

An attempted method execution failed, apparently due to a security violation; this error should not occur. The last portion of the message describes the error. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state.

Solution:

Correct the problem identified in the error message. If necessary, examine other syslog messages occurring at about the same time to see if the problem can be diagnosed. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem.


907626 Could not initialize the contract info for svc :%s

Description:

A problem occurred while initializing the contract for the specified service.

Solution:

Depending on the error, check the SMF man page. Contact your Sun vendor for help.


907960 Attempting to restart listener %s.

Description:

Listener monitor has either timed out or detected failure of listener. Monitor will attempt to restart the listener.

Solution:

None


908240 in libsecurity realloc failed

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start, or a client was not able to make an rpc connection to the server. This problem can occur if the machine has low memory. An error message is output to syslog.

Solution:

Determine if the host is low on memory. If not, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


908387 ucm_callback for step %d generated exception %d

Description:

ucmm callback for a step failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


908591 Failed to stop fault monitor.

Description:

An attempt was made to stop the fault monitor and it failed. There may be prior messages in syslog indicating specific problems.

Solution:

If there are prior messages in syslog indicating specific problems, these should be corrected. If that doesn't resolve the issue, try the following. Use the process monitor facility, pmfadm(1M), with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the fault monitor of this resource. The tag is easy to identify because it ends in the string ".mon" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the fault monitor. This problem may occur when the cluster is under load and Sun Cluster cannot stop the fault monitor within the timeout period specified. Consider increasing the Monitor_Stop_timeout property. If the error still persists, reboot the node.
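
The tag-matching rule in the solution above can be sketched in Python as follows. The sketch assumes the tag names have already been collected into a list of strings; the exact tag layout, the sample tags, and the function name are hypothetical, and only the ".mon" suffix and the presence of the resource group and resource names come from the text above.

```python
def fault_monitor_tags(tags, resource_group, resource):
    """Return tags that end in ".mon" and mention both names."""
    return [
        tag for tag in tags
        if tag.endswith(".mon") and resource_group in tag and resource in tag
    ]


# Hypothetical tag names; real tags on a node may be formatted differently.
tags = [
    "nfs-rg,nfs-res,0.svc",
    "nfs-rg,nfs-res,0.mon",
    "ora-rg,ora-res,0.mon",
]
print(fault_monitor_tags(tags, "nfs-rg", "nfs-res"))
```

The matching tag is the one to pass to pmfadm(1M) with the -s option.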


909069 tcpmodopen: Could not allocate private data

Description:

Machine is out of memory.

Solution:

Need a user action for this message.


909656 Unable to open /dev/kmem:%s

Description:

The HA-NFS fault monitor attempted to open the device but failed. The specific cause of the failure is logged with the message. The /dev/kmem interface is used to read NFS activity counters from the kernel.

Solution:

No action is required. The HA-NFS fault monitor ignores this error and tries to open the device again later. Because it is unable to read NFS activity counters from the kernel, HA-NFS attempts to contact nfsd by means of a NULL RPC. A likely cause of this error is lack of resources. Attempt to free memory by terminating any programs that are using large amounts of memory and swap. If this error persists, reboot the node.


909728 Start command %s timed out.

Description:

Start-up of the data service timed out.

Solution:

No user action needed.


909731 Failed to get service state %s:%s

Description:

The Solaris service management facility failed to get the name of the service instance's current state that the fault management resource identifier (FMRI) in the /var/adm/messages specifies.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


909737 Error loading dtd for %s

Description:

The cl_apid was unable to load the specified dtd. No validation will be performed on CRNP xml messages.

Solution:

No action is required. If you want to diagnose the problem, examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


910155 WebSphere MQ Trigger Monitor %s restarted

Description:

The specified WebSphere MQ Trigger Monitor has been restarted.

Solution:

None required. Informational message.


910284 libscha XDR Buffer Shortfall while encoding arguments API num : %d, will retry

Description:

A non-fatal error occurred in a libscha.so function while marshalling arguments for a remote procedure call. The operation will be re-tried with a larger buffer.

Solution:

No user action is required. If the message recurs frequently, contact your authorized Sun service provider to determine whether a workaround or patch is available.


910339 scvxvmlg error - remove(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


911450 Probe failed in the prenet start method. Diskgroup (%s) not available.

Description:

The disk set or disk group failed its availability check.

Solution:

Verify the status of the disk set or disk group and perform maintenance if required.


911550 Method <%s> failed to execute on resource <%s> in resource group <%s>, on node <%s>, error: <%s>

Description:

A resource method failed to execute, due to a system error described in the message. For an explanation of the error message, consult intro(2). This is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state, or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

If the error message is not self-explanatory, other syslog messages occurring at about the same time might provide evidence of the source of the problem. If not, save a copy of the /var/adm/messages files on all nodes, and (if the rgmd did crash) a copy of the rgmd core file, and contact your authorized Sun service provider for assistance.


912393 This node is not in the replica node list of global service %s associated with path %s. This is acceptable because the existing affinity settings do not require resource and device colocation.

Description:

Self explanatory.

Solution:

This is an informational message, no user action is needed.


912747 Failed to un-mount file-system %s from mountpoint %s : %s

Description:

The specified file system could not be unmounted from its mount point during a postnet-stop operation.

Solution:

Manually unmount the file system.


912866 Could not validate CCR tables; halting node

Description:

The rgmd daemon was unable to check the validity of the CCR tables representing Resource Types and Resource Groups. The node will be halted to prevent further errors.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


913776 Error: unable to register for zone state change events: zones functionality will not work correctly on this node

Description:

The sc_zonesd is unable to register for zone state change events: zones functionality will not work correctly on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


914059 Volume (%s) has gone bad

Description:

The volume is not accessible.

Solution:

Check if the volume is in an errored state. Contact your authorized Sun service provider for further assistance in diagnosing the problem.


914248 Listener status probe timed out after %s seconds.

Description:

An attempt to query the status of the Oracle listener using command 'lsnrctl status <listener_name>' did not complete in the time indicated, and was abandoned. HA-Oracle will attempt to kill the listener and then restart it.

Solution:

None, HA-Oracle will attempt to restart the listener. However, the cause of the listener hang should be investigated further. Examine the log file and syslog messages for additional information.


914260 Failed to retrieve global fencing status from the global name server

Description:

An error was encountered when run_reserve was retrieving the global fencing status from the global name server. It will then try to read it from the CCR directly.

Solution:

This is an informational message, no user action is needed.


914440 %s has been deleted.\nIf %s was hosting any HA IP addresses then these should be restarted.

Description:

We do not allow deleting of an IPMP group which is hosting Logical IP addresses registered by RGM. Therefore we notify the user of the possible error.

Solution:

This message is informational; no user action is needed.


914519 Error when sending message to child %m

Description:

Error occurred when communicating with fault monitor child process. Child process will be stopped and restarted.

Solution:

If the error persists, disable the fault monitor and report the problem.


914866 Unable to complete some unshare commands.

Description:

HA-NFS postnet_stop method was unable to complete the unshare(1M) command for some of the paths specified in the dfstab file.

Solution:

The exact pathnames which failed to be unshared would have been logged in earlier messages. Run those unshare commands by hand. If problem persists, reboot the node.


915051 Failed to register zone state change notification callbacks

Description:

The scprivipd failed to register with sc_zonesd for zone state change notification.

Solution:

The private IP communication for local zones depends on scprivipd and the registration for zone state change notification. So, this feature will not work as expected as a result of this error. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


915091 Probe - Fatal: %s/checkprog not found or not executable

Description:

The binary file ${SGE_ROOT}/utilbin/<arch>/checkprog does not exist, or is not executable.

Solution:

Confirm the binary file ${SGE_ROOT}/utilbin/<arch>/checkprog both exists in that location, and is executable.
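
The two conditions in the solution above (the file exists and is executable) can be tested together. The following Python sketch uses only the standard library; the function name is hypothetical, and the caller must supply the real path because the <arch> component depends on the installation.

```python
import os


def is_executable_file(path):
    """Return True if path is an existing regular file that can be executed."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# A path that does not exist fails the check.
print(is_executable_file("/nonexistent/utilbin/checkprog"))
```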


915389 Failed to create socket: %s.

Description:

Failure in communication between fault monitor and process monitor facility.

Solution:

This is an internal error. Save the /var/adm/messages file and contact your Sun service provider.


915652 %s group %s can host %s addresses only. Host %s has no mapping of this type.

Description:

The IPMP group cannot host IP addresses of the type that the hostname maps to. For example, the hostname maps to IPv6 addresses only and the IPMP group is set up to host IPv4 addresses only.

Solution:

Use ifconfig -a to determine if the network adapter in the resource's IPMP group can host IPv4, IPv6 or both kinds of addresses. Make sure that the hostname specified has at least one IP address that can be hosted by the underlying IPMP group.
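
The mismatch described above comes down to comparing the IP version of each address the hostname maps to with the versions the IPMP group can host. A minimal Python sketch of that comparison follows; the function name and the sample addresses (taken from documentation-reserved ranges) are hypothetical, and the sketch does not inspect any real adapter or name service.

```python
import ipaddress


def hostable_addresses(host_addresses, group_versions):
    """Return the addresses whose IP version (4 or 6) the group can host."""
    return [
        addr for addr in host_addresses
        if ipaddress.ip_address(addr).version in group_versions
    ]


# Hostname maps to IPv6 only, group hosts IPv4 only: nothing can be hosted.
print(hostable_addresses(["2001:db8::10"], {4}))
# Hostname maps to both kinds: the IPv4 address can be hosted.
print(hostable_addresses(["192.0.2.10", "2001:db8::10"], {4}))
```

An empty result corresponds to the condition that this message reports.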


915691 DATAGUARD_ROLE has been set to IN_TRANSITION. Switchover operation may be in progress on the database. Database shutdown will not be attempted.

Description:

The DATAGUARD_ROLE extension property of the HA Oracle resource has been set to the value IN_TRANSITION, indicating that a role switchover may be in progress on the database. The HA Oracle data service will not proceed with a database shutdown during this time.

Solution:

If you are sure that the database is not undergoing a switchover operation, set the DATAGUARD_ROLE and STANDBY_MODE extension properties to reflect the current dataguard mode of the Oracle database. Because the database is not shut down by the data service, clear the STOP_FAILED flag on the Sun Cluster resource and bring the resource online again. Then bring the Sun Cluster resource offline. This operation will shut down the database.


916067 validate: Directory %s does not exist but it is required

Description:

The directory with the name $Directoryname does not exist.

Solution:

Set the variable Basepath, in the parameter file specified by option -N of the start, stop, and probe commands, to a valid value.


916361 in libsecurity for program %s (%lu); could not find any tcp transport

Description:

A client was not able to make an rpc connection to the specified server because it could not find a tcp transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


917103 %s has the keyword group already.

Description:

This means that auto-create was called even though the /etc/hostname.adp file has the keyword "group". Someone might have hand edited that file. This could also happen if someone deletes an IPMP group - A notification should have been provided for this.

Solution:

Change the file back to its original contents. Try the clreslogicalhostname or clressharedaddress command again. We do not allow IPMP groups to be deleted once they are configured. If the problem persists contact your authorized Sun service provider for suggestions.


917145 Only one Sun Cluster node will be offline, stopping node.

Description:

The resource will be offline on only one node so the database can continue to run.

Solution:

This is an informational message, no user action is needed.


917173 fatal: Invalid node ID in infrastructure table

Description:

The internal cluster configuration is erroneous and a node carries an invalid ID. RGMD will exit causing the node to reboot.

Solution:

Report the problem to your authorized Sun service provider.


917467 Got cluster_stop from cluster node %d

Description:

This can be seen only on DEBUG binaries. When a CL_PANIC or ASSERT failure happens on any cluster node, the other nodes are also forced to go down to help in analysing the root cause.

Solution:

Save the cores from all the nodes to check for a potential bug.


917591 fatal: Resource type <%s> update failed with error <%d>; aborting node

Description:

The rgmd failed to read the updated resource type from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


917682 Invalid name size for CVM

Description:

The length of the name of a VxVM shared-disk group is less than or equal to zero.

Solution:

Contact your authorized Sun service provider for assistance in diagnosing the problem.


917991 scf_pg_update failed: %s

Description:

An API call failed.

Solution:

Examine log files and syslog messages to determine the cause of the failure. Take corrective action based on any related messages. If the problem persists, report it to your Sun support representative for further assistance.


918018 load balancer for group '%s' released

Description:

This message indicates that the service group has been deleted.

Solution:

This is an informational message, no user action is needed.


918488 Validation failed. Invalid command line parameter %s %s.

Description:

Unable to process parameters passed to the call back method specified. This is a Sun Cluster HA for Sybase internal error.

Solution:

Report this problem to your authorized Sun service provider.


919740 WARNING: error in translating address (%s) for nodeid %d

Description:

Could not get address for a node.

Solution:

Make sure the node is booted as part of a cluster.


920103 created %d threads to handle resource group switchback; desired number = %d

Description:

The rgmd daemon was unable to create the desired number of threads upon starting up. This is not a fatal error, but it might cause RGM reconfigurations to take longer because it will limit the number of tasks that the rgmd can perform concurrently.

Solution:

Make sure that the hardware configuration meets documented minimum requirements. Examine other syslog messages on the same node to see if the cause of the problem can be determined.


920286 Peer node %d attempted to contact us with an invalid version message, source IP %s.

Description:

Sun Cluster software at the local node received an initial handshake message from the remote node that is not running a compatible version of the Sun Cluster software.

Solution:

Make sure all nodes in the cluster are running compatible versions of Sun Cluster software.


920429 Deferring tcp_rio_client %p with generation %x

Description:

The replyio subsystem detected an out-of-sequence arrival of replyio segments.

Solution:

This message is for diagnostic purposes only; no action required.


920736 Unknown transport type: %s

Description:

The transport type used is not known to Solaris Clustering.

Solution:

Need a user action for this message.


920939 About to lofs mount of %s on %s in local zone '%s'. Underlying files/directories will be inaccessible.

Description:

HAStoragePlus detected that the directory in the local zone on which the loopback mount is about to happen is not empty. Once the file system is mounted, the underlying files will be inaccessible.

Solution:

This is an informational message, no user action is needed.


921054 door_callback: invalid argument size

Description:

The zone state change callback from the sc_zonesd daemon was improperly formatted. The callback will be ignored.

Solution:

Save a copy of the /var/adm/messages files on all nodes. If the problem persists, contact your authorized Sun service provider for assistance in diagnosing the problem.


921508 Failed to activate contract template: %s

Description:

Not able to activate the contract template.

Solution:

Check the contract man page to learn more about the error. Also make sure that basic contract functionality is working correctly. Contact your Sun vendor for more help.


922085 INTERNAL ERROR CMM: Memory allocation error.

Description:

The CMM failed to allocate memory during initialization.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


922150 Regular expression compile error %s: %s

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


922276 J2EE probe determined wrong port number, %d.

Description:

The data service detected an invalid port number for the J2EE engine probe.

Solution:

Informational message. No user action is needed.


922363 resource %s status msg on node %s change to <%s>

Description:

This is a notification from the rgmd that a resource's fault monitor status message has changed.

Solution:

This is an informational message; no user action is needed.


922442 Failed to open contract ctl file to ack%s: %s

Description:

Not able to open the contract control file to acknowledge the event.

Solution:

Check the contract man page to learn more about the error. Also make sure that basic contract functionality is working correctly. Contact your Sun vendor for more help. This error should not seriously affect the working of the delegated restarter.


922726 The status of device: %s is set to MONITORED

Description:

A device is monitored.

Solution:

No action required.


922870 tag %s: unable to kill process with SIGKILL

Description:

The rpc.fed server is not able to kill the process with a SIGKILL. This means the process is stuck in the kernel.

Solution:

Save the syslog messages file. Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified.


923046 Could not initialize dcs library. Error %d

Description:

An error prevented the initialization of the DCS library.

Solution:

Contact your authorized Sun service provider for assistance in diagnosing the problem.


923106 sysevent_bind_handle(): %s

Description:

The cl_apid or cl_eventd was unable to create the channel by which it receives sysevent messages. It will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


923412 The path name %s associated with the FilesystemCheckCommand extension property is detected to be a relative pathname. Only absolute path names are allowed.

Description:

Self explanatory. For security reasons, this is not allowed.

Solution:

Correct the FilesystemCheckCommand extension property by specifying an absolute path name.
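
The rule above can be checked before the property is set. The following is a minimal Python sketch, not the data service's own validation code; the function name and the sample paths are assumptions chosen for illustration.

```python
import posixpath


def is_absolute_path(path: str) -> bool:
    """Return True if the given POSIX path name is absolute (starts with '/')."""
    return posixpath.isabs(path)


# Hypothetical candidate values for the FilesystemCheckCommand property.
for candidate in ("/usr/sbin/fsck", "./fsck", "bin/fsck"):
    verdict = "absolute" if is_absolute_path(candidate) else "relative"
    print(candidate, "is", verdict)
```

Only values reported as absolute would pass the check that this message describes.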


923618 Prog <%s>: unknown command.

Description:

An internal error in ucmmd has prevented it from successfully executing a program.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


923675 Resource "%s" could not be disabled: %s

Description:

The scha_control() call to disable this ScalMountPoint resource failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


923712 CCR: Table %s on the joining node %s has the same version but different checksum as the copy in the current membership. The table on the joining node will be replaced by the one in the current membership.

Description:

The indicated table on the joining node has the same version but different contents as the one in the current membership. It will be replaced by the one in the current membership.

Solution:

This is an informational message, no user action is needed.


924059 Validate - MySQL database directory %s does not exist

Description:

The defined database directory (-D option) does not exist.

Solution:

Make sure that the defined database directory exists.


924303 Failed to update the upgrade state in the CCR

Description:

The upgrade process on the root node has failed to update the CCR.

Solution:

Cluster upgrade has failed. Reboot all the nodes out of cluster mode and recover from upgrade. Finish the cluster upgrade by using the standard upgrade method.


924950 No reference to any remote nodes: Not queueing event %lld.

Description:

The cl_eventd does not have references to any remote nodes.

Solution:

This message is informational only, and does not require user action.


925337 Stop of HADB database did not complete: %s.

Description:

The resource was unable to successfully run the hadbm stop command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.


925379 resource group <%s> in illegal state <%s>, will not run %s on resource <%s>

Description:

While creating or deleting a resource, the rgmd discovered the containing resource group to be in an unexpected state on the local node. As a result, the rgmd did not run the INIT or FINI method (as indicated in the message) on that resource on the local node. This should not occur, and may indicate an internal logic error in the rgmd.

Solution:

The error is non-fatal, but it may prevent the indicated resource from functioning correctly on the local node. Try deleting the resource, and if appropriate, re-creating it. If those actions succeed, then the problem was probably transitory. Since this problem may indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, and the output of clresourcegroup status +, clresourcetype show -v and clresourcegroup show -v +. Report the problem to your authorized Sun service provider.


925692 scha_control failed. %s: %s

Description:

The scha_control (SCHA_RESOURCE_IS_RESTARTED) in the PMF action script supplied by DSDL just failed.

Solution:

Check syslog to see if there is a message from the RGM that could explain why this failed.


925780 Creation of temporary file failed.

Description:

A temporary keytab file that is used in thorough probing of the HA-KDC service could not be created.

Solution:

Check the state of /etc/krb5.


925953 reservation error(%s) - do_scsi3_register() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group might have failed to start on this node. If the device group was started on another node, move it to this node by using the cldevicegroup command. If the device group was not started, you can start it by using the cldevicegroup command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group might have failed. If so, the desired action may be retried.


926201 RGM aborting

Description:

A fatal error has occurred in the rgmd. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


926749 Resource group nodelist is empty.

Description:

Empty value was specified for the nodelist property of the resource group.

Solution:

Any of the following situations might have occurred. Different user action is required for these different scenarios. 1) If a resource group is created or updated, check the value of the nodelist property. If it is empty or not valid, then provide a valid value by using the clresourcegroup command. 2) For all other cases, treat it as an internal error. Contact your authorized Sun service provider.


927042 Validation failed. SYBASE backup server startup file RUN_%s not found SYBASE=%s.

Description:

Backup server was specified in the extension property Backup_Server_Name. However, Backup server startup file was not found. Backup server startup file is expected to be: $SYBASE/$SYBASE_ASE/install/RUN_<Backup_Server_Name>

Solution:

Check the Backup server name specified in the Backup_Server_Name property. Verify that the SYBASE and SYBASE_ASE environment variables are set properly in the Environment_file. Verify that the RUN_<Backup_Server_Name> file exists.


927753 Fault monitor does not exist or is not executable : %s

Description:

Fault monitor program specified in support file is not executable or does not exist. Recheck your installation.

Solution:

Please report this problem.


927836 The file %s which the telemetry data service needs to execute is not executable

Description:

A program or script of the telemetry data service could not execute because a file is not executable. This should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


927846 fatal: Received unexpected result <%d> from rpc.fed, aborting node

Description:

A serious error has occurred in the communication between rgmd and rpc.fed while attempting to execute a VALIDATE method. The rgmd will produce a core file and will force the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


928235 Validation failed. Adaptive_Server_Log_File %s not found.

Description:

File specified in the Adaptive_Server_Log_File extension property was not found. The Adaptive_Server_Log_File is used by the fault monitor for monitoring the server.

Solution:

Please check that the file specified in the Adaptive_Server_Log_File extension property is accessible on all the nodes.


928339 Syntax error in file %s

Description:

Specified file contains syntax errors.

Solution:

Please ensure that all entries in the specified file are valid and follow the correct syntax. After the file is corrected, repeat the operation that was being performed.


928382 CCR: Failed to read table %s on node %s.

Description:

The CCR failed to read the indicated table on the indicated node. The CCR will attempt to recover this table from other nodes in the cluster.

Solution:

This is an informational message, no user action is needed.


928448 Failed to retrieve property "Resource_name".

Description:

The name of the ScalMountPoint resource that represents the specified file system could not be retrieved.

Solution:

Investigate possible RGM errors or DSDL errors. Contact your authorized Sun service provider for assistance in diagnosing the problem.


928455 clcomm: Couldn't write to routing socket: %d

Description:

The system prepares IP communications across the private interconnect. A write operation to the routing socket failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


928943 scha_cluster_get() failed: %s

Description:

A call to scha_cluster_get() failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


929100 No permission for others to execute %s.

Description:

The specified path does not have the correct permissions as expected by a program.

Solution:

Set the permissions for the file so that it is readable and executable by others (world).
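
To confirm the permissions after changing them, the "others" bits of the file mode can be inspected. The following is a minimal Python sketch under the assumption that "readable and executable by others" means both the other-read and other-execute bits are set; the function name and the temporary file are illustrative only.

```python
import os
import stat
import tempfile


def others_can_read_and_execute(path: str) -> bool:
    """Return True if both the other-read and other-execute bits are set."""
    mode = os.stat(path).st_mode
    required = stat.S_IROTH | stat.S_IXOTH
    return (mode & required) == required


# Demonstration on a temporary file rather than a real cluster program.
with tempfile.NamedTemporaryFile() as handle:
    os.chmod(handle.name, 0o750)   # no access for others
    print(others_can_read_and_execute(handle.name))
    os.chmod(handle.name, 0o755)   # read and execute for others
    print(others_can_read_and_execute(handle.name))
```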


929252 Failed to start HA-NFS system fault monitor.

Description:

Process monitor facility has failed to start the HA-NFS system fault monitor.

Solution:

Check whether the system is low in memory or the process table is full and correct these problems. If the error persists, use scswitch to switch the resource group to another node.


929697 fatal: Got error <%d> trying to read CCR when <%s> <%s> resource <%s>; aborting node

Description:

Rgmd failed to read updated resource from the CCR on this node.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


929846 Error getting service names of class (%s) from DCS.

Description:

An error prevented DCS service names from being retrieved.

Solution:

Determine whether the node is in cluster mode. Contact your authorized Sun service provider for assistance in diagnosing the problem.


930059 %s: %s.

Description:

HA-SAP failed to access a file. The file in question is specified with the first '%s'. The reason it failed is provided with the second '%s'.

Solution:

Check and make sure the file is accessible via the path list.


930073 getzonenamebyid failed: %s

Description:

getzonenamebyid failed for the specified reason.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


930284 SCSLM fopen <%s> error <%s>

Description:

Should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


930322 reservation fatal error(%s) - malloc() error

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist.


930339 cl_event_bind_channel(): %s

Description:

The cl_eventd was unable to create the channel by which it receives sysevent messages. It will exit.

Solution:

Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


930535 Ignoring string with misplaced quotes in the entry for %s

Description:

HA-Oracle reads the file specified in the USER_ENV property and exports the variables declared in the file. The syntax for declaring the variables is: VARIABLE=VALUE. VALUE may be a single-quoted or double-quoted string. The string itself may not contain any quotes.

Solution:

Please check the environment file and correct the syntax errors.
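
The quoting rule described for this message can be sketched as a small checker to run against an environment file before it is used. This Python sketch is an illustration of the documented VARIABLE=VALUE rule only, not the HA-Oracle parser; the function name, the treatment of unquoted values, and the sample variable names are assumptions.

```python
from typing import Optional, Tuple


def parse_env_line(line: str) -> Optional[Tuple[str, str]]:
    """Parse one VARIABLE=VALUE entry; return None if the quotes are misplaced."""
    name, sep, value = line.strip().partition("=")
    if not sep or not name:
        return None
    # Strip one matching pair of surrounding single or double quotes.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        value = value[1:-1]
    # Any quote character that remains is misplaced, so the entry is rejected.
    if "'" in value or '"' in value:
        return None
    return name, value


# Hypothetical entries: the first two are well formed, the last two are not.
for entry in ('VAR_A="one two"', "VAR_B='three'", 'VAR_C="four', 'VAR_D="fi"ve"'):
    print(entry, "->", parse_env_line(entry))
```

Entries for which the function returns None correspond to the strings that this message reports as ignored.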


930636 Sun Cluster boot: reset vote returns $res

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


930851 ERROR: process_resource: resource <%s> is pending_fini but no FINI method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, and the output of clresourcetype show -v, clresourcegroup show -v +, and clresourcegroup status +. Report the problem to your authorized Sun service provider.


931362 RGMD XDR Buffer Shortfall while encoding return arguments API num = %d. Will retry

Description:

A non-fatal error occurred while rgmd daemon was marshalling arguments for a remote procedure call. The operation will be re-tried with a larger buffer.

Solution:

No user action is required. If the message recurs frequently, contact your authorized Sun service provider to determine whether a workaround or patch is available.


931652 scha_cluster_open failed with %d

Description:

Call to initialize the cluster information handle failed. The second part of the message gives the error code.

Solution:

There could be other related error messages which might be helpful. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


931921 Unable to allocate memory at door server

Description:

The rgmd process has failed to allocate new memory, most likely because the system has run out of swap space.

Solution:

Investigate the cause of swap space depletion and correct the problem, if possible. Reboot your system. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


932126 CCR: Quorum regained.

Description:

The cluster lost quorum at some time after the cluster came up, and quorum has now been regained.

Solution:

This is an informational message, no user action is needed.


934271 The directory %s needed by the telemetry data service does not exist

Description:

A program or script of the telemetry data service could not execute because a directory does not exist. This should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


934736 !!!! WARNING !!!! Dependency between resource %s and its MDS resource %s has not been defined !

Description:

The specified ScalMountPoint resource requires an explicit dependency on the metadata server resource that represents the mount point. Validation of the ScalMountPoint resource succeeded, but the dependency must be defined before any attempt to start the ScalMountPoint resource.

Solution:

Before you attempt to start the ScalMountPoint resource, define the required dependency between the ScalMountPoint resource and the metadata server resource.


934965 Error: Can't stop ${SERVER} outside of run control environment.

Description:

The initrgm init script was run manually (not automatically by init(1M)). This is not supported by Sun Cluster software. The "initrgm stop" command was not successful. No action was performed.

Solution:

This is an informational message; no user action is needed.


935088 Error %d starting quota for %s

Description:

The /usr/sbin/quotaon command failed for the UFS mount point specified by %s.

Solution:

Contact your Sun service representative.


935470 validate: EnvScript not set but it is required

Description:

The parameter EnvScript is not set in the parameter file

Solution:

Set the variable EnvScript to valid contents in the parameter file that is specified with the -N option of the start, stop, and probe commands.


935729 <%s>:<%s>: Core location couldn't be configured for method <%s>.

Description:

The rpc.fed was unable to modify coredump settings for the indicated method.

Solution:

Low memory might be a transient error. If it persists, rebooting might help. If that fails to help, swap space might need to be increased. If it is a popen() failure, refer to coreadm(1M)/popen(3C), or contact your authorized Sun service provider.


937669 CCR: Failed to update table %s.

Description:

The CCR data server failed to update the indicated table.

Solution:

There may be other related messages on this node, which may help diagnose the problem. If the root file system is full on the node, then free up some space by removing unnecessary files. If the root disk on the afflicted node has failed, then it needs to be replaced.


937795 Nodelist of RG "%s" is empty.

Description:

The node list of the resource group that contains the specified metadata server resource is empty.

Solution:

Ensure that the node list of the resource group that contains the metadata server resource is correct. For information about how to configure the shared QFS file system with Sun Cluster, see your Sun Cluster documentation and your Sun StorEdge QFS documentation.


938083 match_online_key failed strdup for (%s)

Description:

Call to strdup failed. The "strdup" man page describes possible reasons.

Solution:

Install more memory, increase swap space or reduce peak memory consumption.


938163 %s %s server startup encountered errors, errno = %d.

Description:

TCP server to accept connections on the private interconnect could not be started.

Solution:

Need a user action for this message.


938189 rpc.fed: program not registered or server not started

Description:

The rpc.fed daemon was not correctly initialized, or has died. This has caused a step invocation failure, and may cause the node to reboot.

Solution:

Check that the Solaris Clustering software has been installed correctly. Use ps(1) to check if rpc.fed is running. If the installation is correct, the reboot should restart all the required daemons including rpc.fed.


938318 Method <%s> failed on resource <%s> in resource group <%s> [exit code <%d>, time used: %d%% of timeout <%d seconds>] %s

Description:

A resource method exited with a non-zero exit code; this is considered a method failure. Depending on which method is being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state. Note that failures of FINI, INIT, BOOT, and UPDATE methods do not cause the associated administrative actions (if any) to fail.

Solution:

Consult resource type documentation to diagnose the cause of the method failure. Other syslog messages occurring just before this one might indicate the reason for the failure. If the failed method was one of START, PRENET_START, MONITOR_START, STOP, POSTNET_STOP, or MONITOR_STOP, you can issue a clresourcegroup command to bring resource groups onto desired primaries after fixing the problem that caused the method to fail. If the failed method was UPDATE, note that the resource might not take note of the update until it is restarted. If the failed method was FINI, INIT, or BOOT, you may need to initialize or de-configure the resource manually as appropriate on the affected node.
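
When many such failures are logged, it can help to pull the method, resource, resource group, exit code, and timeout figures out of each line. The following Python sketch builds a regular expression from the message format shown for ID 938318; the function name and the sample line (including its method and resource names) are hypothetical, and real syslog lines carry additional prefixes that the search simply skips.

```python
import re

# Pattern derived from the documented format string for message 938318.
METHOD_FAILURE = re.compile(
    r"Method <(?P<method>[^>]*)> failed on resource <(?P<resource>[^>]*)> "
    r"in resource group <(?P<group>[^>]*)> "
    r"\[exit code <(?P<exit_code>-?\d+)>, "
    r"time used: (?P<percent>\d+)% of timeout <(?P<timeout>\d+) seconds>\]"
)


def parse_method_failure(line):
    """Return the fields of a 938318 message as a dict, or None if no match."""
    match = METHOD_FAILURE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from text to integers.
    for key in ("exit_code", "percent", "timeout"):
        fields[key] = int(fields[key])
    return fields


# Hypothetical log line used only to exercise the pattern.
sample = ("Method <my_start> failed on resource <my-rs> in resource group "
          "<my-rg> [exit code <1>, time used: 12% of timeout <300 seconds>]")
print(parse_method_failure(sample))
```

A low percentage of the timeout together with a non-zero exit code suggests that the method failed quickly rather than timing out.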


938626 All PostgreSQL non-attached IPC shared memory segments removed

Description:

Remaining IPC shared memory segments have been removed. These segments are a leftover of the previous PostgreSQL instance.

Solution:

None


938701 Could not kill the telemetry data service monitor

Description:

The monitor program of the telemetry data service could not be stopped. This should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


938773 Pid_Dir_Path %s is not readable: %s.

Description:

The path which is listed in the message is not readable. The reason for the failure is also listed in the message.

Solution:

Make sure the path which is listed exists. Use the HAStoragePlus resource in the same resource group as the HA-SAPDB resource, so that the HA-SAPDB methods have access to the file system at the time when they are launched.


939122 startebs - %s was manually started

Description:

The specified Oracle E-Business Suite component has been started manually.

Solution:

None required. Informational message.


939198 Validation failed. The ucmmd daemon will not be started on this node.

Description:

At least one of the modules of Sun Cluster support for Oracle OPS/RAC returned an error during validation. The ucmmd daemon will not be started on this node and this node will not be able to run Oracle OPS/RAC.

Solution:

This message can be ignored if this node is not configured to run Oracle OPS/RAC. Examine other syslog messages logged at about the same time to determine the configuration errors. Examine the ucmm reconfiguration log file /var/cluster/ucmm/ucmm_reconf.log. Correct the problem and reboot the node. If the problem persists, save a copy of the log files on this node and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


939374 CCR: Failed to access cluster repository during synchronization. ABORTING node.

Description:

This node failed to access its cluster repository when it first came up in cluster mode and tried to synchronize its repository with other nodes in the cluster.

Solution:

This is usually caused by an unrecoverable failure such as disk failure. There may be other related messages on this node, which may help diagnose the problem. If the root disk on the afflicted node has failed, then it needs to be replaced. If the root disk is full on this node, boot the node into non-cluster mode and free up some space by removing unnecessary files.


939575 Method <%s> on resource <%s>, node <%s>: RPC connection error.

Description:

An attempted method execution failed, due to an RPC connection problem. This failure is considered a method failure. Depending on which method was being invoked and the Failover_mode setting on the resource, this might cause the resource group to fail over or move to an error state; or it might cause an attempted edit of a resource group or its resources to fail.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the cause of the problem can be identified. If the same error recurs, you might have to reboot the affected node. After the problem is corrected, the operator can choose to issue a clresourcegroup command to bring resource groups onto desired primaries, or re-try the resource group update operation.


939614 Incorrect resource group properties for RAC framework resource group %s. Value of RG_mode must be Scalable.

Description:

The RAC framework resource group must be a scalable resource group. The RG_mode property of this resource group must be set to 'scalable'. RAC framework will not function correctly without this value.

Solution:

Refer to the documentation of Sun Cluster support for Oracle Parallel Server/Real Application Clusters for the installation procedure.


939851 Failed to open contract ctlfile %s: %s

Description:

Not able to open the contract control file.

Solution:

Check the contract man page to learn more about the error. Also make sure that the basic contract functionality is working correctly. Contact your Sun vendor for more help.


940545 scnetapp fatal error - Unable to open output file

Description:

The program responsible for retrieving NetApp NAS configuration information from the CCR has suffered an internal error. Continued errors of this type may lead to a compromise in data integrity.

Solution:

Contact your authorized Sun service provider as soon as possible to determine whether a workaround or patch is available.


940969 Failed to remove the private IP address for a zone

Description:

The private IP address assigned to a zone was not unplumbed correctly.

Solution:

The private IP address assigned to the zone is used for private IP communication for the zone. So, this feature may not work as expected as a result of this error. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


941071 Cannot retrieve service <%s>.

Description:

The data service cannot retrieve the required service.

Solution:

Check the service entries on all relevant cluster nodes.


941267 Cannot determine command passed in: <%s>.

Description:

An invalid pathname, displayed within the angle brackets, was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. This could be the result of mis-configuring the name of a START or MONITOR_START method or other property, or a programming error made by the resource type developer.

Solution:

Supply a valid pathname to a regular, executable file.
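
Before updating the property, the candidate path name can be checked locally. The following Python sketch assumes that "regular, executable file" can be approximated as a regular file with at least one execute permission bit set; it is not the check that libdsdev performs, and the function name and file names are illustrative.

```python
import os
import stat
import tempfile


def is_regular_executable(path: str) -> bool:
    """Return True if path is a regular file with at least one execute bit set."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False  # missing or inaccessible path
    any_exec = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return stat.S_ISREG(mode) and bool(mode & any_exec)


# Demonstration with a temporary directory and a hypothetical method file.
with tempfile.TemporaryDirectory() as workdir:
    script = os.path.join(workdir, "start_method")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(script, 0o644)
    print(is_regular_executable(script))    # regular file, no execute bit
    os.chmod(script, 0o755)
    print(is_regular_executable(script))    # regular file, execute bits set
    print(is_regular_executable(workdir))   # directory, not a regular file
```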


941367 open failed: %s

Description:

Failed to open /dev/console. The "open" man page describes possible error codes.

Solution:

None. ucmmd will exit.


941416 One or more of the SUNW.HAStoragePlus resources that this resource depends on is not online anywhere. Failing validate method.

Description:

The HAStoragePlus resource this resource needs is not online anywhere in the cluster.

Solution:

Bring the HAStoragePlus resource online.


941997 Validate - Userid %s is not a valid userid

Description:

The specified userid does not exist within /etc/passwd.

Solution:

Ensure that the userid has been added.


942233 Resource %s has failed on node %s. The resource group is being switched offline by the RGM on this node; the resource group remains online on other nodes.

Description:

The resource group was brought OFFLINE on the node specified, probably because of a public network failure on that node. Although there is no spare node on which to relocate this instance of the resource group, the resource group is currently mastered by at least one other healthy node.

Solution:

No action required. If desired, examine other syslog messages on the node in question to determine the cause of the network failure.


942844 Resource <%s> of Resource Group <%s> failed sanity check on node <%s>

Description:

Message logged for failed scha_control sanity check methods on specific node.

Solution:

No user action required.


942855 check_mysql - Couldn't do show tables for defined database %s (%s)

Description:

The fault monitor cannot issue show tables for the specified database.

Solution:

Either MySQL was already down or the fault monitor user does not have the right permissions. The defined fault monitor user should have Process, Select, Reload, and Shutdown privileges and, for MySQL 4.0.x, also Super privileges. Also check the MySQL log files for any other errors.


943168 pmf_monitor_suspend: pmf_add_triggers: %s

Description:

The rpc.pmfd server was not able to resume the monitoring of a process, and the monitoring of this process has been aborted. An error message is output to syslog.

Solution:

Save the syslog messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


944068 clcomm: validate_policy: invalid relationship moderate %d high %d pool %d

Description:

The system checks the proposed flow control policy parameters at system startup and when processing a change request. The moderate server thread level cannot be higher than the high server thread level.

Solution:

No user action required.


944096 Validate - This version of MySQL <%s> is not supported with this dataservice

Description:

An unsupported MySQL version is being used.

Solution:

Make sure that supported MySQL version is being used.


944121 Incorrect permissions set for %s.

Description:

This file does not have the expected default execute permissions.

Solution:

Reset the permissions to allow execute permissions using the chmod command.


944181 HA: exception %s (major=%d) from flush_to().

Description:

An unexpected return value was encountered when performing an internal operation.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


944973 Unrecoverable error: unable to read contract events.

Description:

The rpc.pmfd is permanently unable to read contract events. It will exit, causing a node panic.

Solution:

Search for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


945739 %s has been removed from %s.\nMake sure that all HA IP addresses hosted on %s are moved.

Description:

Removing an adapter from an IPMP group is not allowed. The correct way to DR an adapter is to use if_mpadm(1M). Therefore the user is notified of the potential error.

Solution:

This message is informational; no user action is needed if the DR was done properly (using if_mpadm).


946660 Failed to create sap state file %s:%s Might put sap resource in stop-failed state.

Description:

If SAP is brought up outside the control of Sun Cluster, HA-SAP will create the state file to signal the stop method not to try to stop sap via Sun Cluster. Now if SAP was brought up outside of Sun Cluster, and the state file creation failed, then the SAP resource might end in the stop-failed state when Sun Cluster tries to stop SAP.

Solution:

This is an internal error. No user action needed. Save the /var/adm/messages from all nodes. Contact your authorized Sun service provider.


946807 Only a single path to the WLS Home directory has to be set in Confdir_list

Description:

Each WLS resource can handle only a single installation of WLS. You cannot create a resource to handle multiple installations.

Solution:

Create a separate resource for making each installation of WLS Highly Available.


946839 Failed to open the pbundle event listener FD: %s

Description:

Cannot open the contract file for event listening

Solution:

Check the contract man page to learn more about the error. Also make sure that the basic contract functionality is working correctly. Contact your Sun vendor for more help.


947007 Error initializing the cluster version manager (error %d).

Description:

This message can occur when the system is booting if incompatible versions of cluster software are installed.

Solution:

Verify that any recent software installations completed without errors and that the installed packages or patches are compatible with the rest of the installed software. Also contact your authorized Sun service provider to determine whether a workaround or patch is available.


947217 Reading root node of second partition from CCR failed

Description:

The upgrade process was unable to find needed information in the CCR. The CCR might be missing this information.

Solution:

Cluster upgrade has failed. Reboot all the nodes out of cluster mode and recover from upgrade. Finish the cluster upgrade by using the standard upgrade method.


947401 reservation error(%s) - Unable to open device %s, errno %d

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

This may be indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group might have failed to start on this node. If the device group was started on another node, move it to this node by using the cldevicegroup command. If the device group was not started, you can start it by using the cldevicegroup command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group might have failed. If so, the desired action may be retried.


948105 Failed to find the removed mount points: (%d) %s.

Description:

The online update of the HAStoragePlus resource is not successful because of failure in evaluating the removed file systems during the update operation.

Solution:

Check the syslog messages and try to resolve the problem. Try again to update the resource. If the problem persists, contact your authorized Sun service provider.


948424 Stopping NFS daemon %s.

Description:

The specified NFS daemon is being stopped by the HA-NFS implementation.

Solution:

This is an informational message. No action is needed.


948847 ucm_callback for start_trans generated exception %d

Description:

The ucmm callback for the start transition failed. The step might have timed out.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


948865 Unblock parent, write errno %d

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


948872 Errors detected in %s. Entries with error and subsequent entries may get ignored.

Description:

The specified custom action file contains errors that must be corrected before the file can be processed completely. Entries that follow the first error might or might not be accepted.

Solution:

Please ensure that all entries in the custom monitor action file are valid and follow the correct syntax. After the file is corrected, validate it again to verify the syntax. If all errors are not corrected, the acceptance of entries from this file may not be guaranteed.


949565 reservation error(%s) - do_scsi2_tkown() error for disk %s

Description:

The device fencing program has encountered errors while trying to access a device. All retry attempts have failed.

Solution:

The action which failed is a scsi-2 ioctl. These can fail if there are scsi-3 keys on the disk. To remove invalid scsi-3 keys from a device, use 'scdidadm -R' to repair the disk (see the scdidadm man page for details). If there were no scsi-3 keys present on the device, then this error is indicative of a hardware problem, which should be resolved as soon as possible. Once the problem has been resolved, the following actions may be necessary: If the message specifies the 'node_join' transition, then this node may be unable to access the specified device. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access the device. In either case, access can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group might have failed to start on this node. If the device group was started on another node, move it to this node by using the cldevicegroup command. If the device group was not started, start it by using the cldevicegroup command. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group might have failed. If so, you can retry the desired action.


949621 Derby data service requested a failover

Description:

The Derby data service requested a failover.

Solution:

This message is informational; no user action needed.


949924 Out of memory!

Description:

A user program ran out of memory.

Solution:

Determine why the user program ran out of memory. Contact your authorized Sun service provider for assistance in diagnosing the problem.


949937 Out of memory.

Description:

A process has failed to allocate new memory, most likely because the system has run out of swap space.

Solution:

Reboot your system. If the problem recurs, you might need to increase swap space by configuring additional swap devices. See swap(1M) for more information.


949979 Validation failed. Oracle_config_file %s not found

Description:

Unable to locate the file specified in the Oracle_config_file property. This message is given by the validate method when creating the resource for UNIX Distributed Lock Manager or when modifying the property value. This file is created when the ORCLudlm package is installed on the node.

Solution:

Verify the file name specified in Oracle_config_file property. Verify installation of ORCLudlm package on this node.


950747 resource %s monitor disabled.

Description:

This is a notification from the rgmd that the operator has disabled monitoring on a resource. This message can be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.


951501 CCR: Could not initialize CCR data server.

Description:

The CCR data server could not initialize on this node. This usually happens when the CCR is unable to read its metadata entries on this node. There is a CCR data server per cluster node.

Solution:

There may be other related messages on this node, which may help diagnose this problem. If the root disk failed, it needs to be replaced. If there was cluster repository corruption, then the cluster repository needs to be restored from backup or other nodes in the cluster. Boot the offending node in -x mode to restore the repository. The cluster repository is located at /etc/cluster/ccr/.


951520 Validation failed. SYBASE ASE runserver file RUN_%s not found SYBASE=%s.

Description:

Sybase Adaptive Server is started by specifying the Adaptive Server "runserver" file named RUN_<Server Name> located under $SYBASE/$SYBASE_ASE/install. This file is missing.

Solution:

Verify that the Sybase installation includes the "runserver" file and that permissions are set correctly on the file. The file should reside in the $SYBASE/$SYBASE_ASE/install directory.
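
The location check described above can be sketched in Python. This is only an illustration, not the data service's own validation: the function names are hypothetical, and treating "permissions are set correctly" as "the file is readable" is an assumption.

```python
import os


def runserver_path(sybase, sybase_ase, server_name):
    """Build the expected path $SYBASE/$SYBASE_ASE/install/RUN_<Server Name>."""
    return os.path.join(sybase, sybase_ase, "install", "RUN_" + server_name)


def check_runserver(sybase, sybase_ase, server_name):
    """Return (path, problem); problem is None when the file looks usable.

    Assumption: "usable" here only means the file exists and is readable.
    """
    path = runserver_path(sybase, sybase_ase, server_name)
    if not os.path.isfile(path):
        return path, "not found"
    if not os.access(path, os.R_OK):
        return path, "not readable"
    return path, None
```

The two directory arguments would be the values of $SYBASE and $SYBASE_ASE for the installation being checked.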


951634 INTERNAL ERROR CMM: clconf_get_quorum_table() returned error %d.

Description:

The node encountered an internal error during initialization of the quorum subsystem object.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


951733 Incorrect usage: %s.

Description:

The usage of the program was incorrect for the reason given.

Solution:

Use the correct syntax for the program.


951914 Error: unable to initialize CCR.

Description:

The cl_apid was unable to initialize the CCR during start-up. This error will prevent the cl_apid from starting.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


952006 received signal %d: exiting

Description:

The cl_apid daemon is terminating abnormally due to receipt of a signal.

Solution:

No action required.


952226 Error with allow_hosts or deny_hosts

Description:

The allow_hosts or deny_hosts for the CRNP service contains an error. This error may prevent the cl_apid from starting up.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


952237 Method <%s>: unknown command.

Description:

An internal error has occurred in the interface between the rgmd and fed daemons. This in turn will cause a method invocation to fail. This should not occur and may indicate an internal logic error in the rgmd.

Solution:

Look for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


952465 HA: exception adding secondary

Description:

A failure occurred while attempting to add a secondary provider for an HA service.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


952811 Validation failed. CONNECT_STRING not in specified format

Description:

The CONNECT_STRING property for the resource is not specified in the correct format. The format can be either 'username/password' or '/' (if operating system authentication is used).

Solution:

Specify CONNECT_STRING in the specified format.
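
As a rough illustration of the two accepted formats, the following Python sketch checks a candidate value before it is set on the resource. The function name is hypothetical, and the rule that both the user name and the password must be non-empty is an assumption rather than documented validate behavior.

```python
def is_valid_connect_string(value):
    """Accept '/' (operating system authentication) or 'username/password'.

    Assumption: both parts of 'username/password' must be non-empty.
    """
    if value == "/":
        return True
    # Split on the first '/' only; a password may itself contain '/'.
    user, sep, password = value.partition("/")
    return sep == "/" and user != "" and password != ""
```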


952827 Starting Oracle instance %s

Description:

Oracle instance startup is in progress.

Solution:

None required. Informational message.


952942 Retrying retrieve of resource information: %s.

Description:

An update to the resource configuration occurred while resource properties were being retrieved.

Solution:

This is an informational message; no user action is needed.


952974 %s: cannot open directory %s

Description:

The ucmmd was unable to open the directory identified.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


953277 monitor_check: the failover requested by scha_control for resource <%s>, resource group <%s>, node <%s> was not completed because of error: <%s>

Description:

A scha_control(1HA,3HA) GIVEOVER attempt failed, due to the error listed.

Solution:

Examine other syslog messages on all cluster members that occurred about the same time as this message, to see if the problem that caused the error can be identified and repaired.


953470 SCSLM file <%s> duplicate project name

Description:

Should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


953614 %s: Could not signal all contract members: %s

Description:

During the stopping of the resource, could not clear all the contracts that belong to the services under that resource.

Solution:

This is unlikely to cause a significant problem. Contact your Sun vendor for more help.


953642 Server is not running. Calling shutdown abort to clear shared memory (if any)

Description:

Informational message. The Oracle server is not running. However, if Oracle processes are aborted without clearing shared memory, problems can occur when the Oracle server is started. Any leftover shared memory is being cleared.

Solution:

None


953818 There are no disks to monitor

Description:

An attempt to start the scdpmd failed.

Solution:

The node should have at least one disk. Check the local disk list with the scdidadm -l command.


954831 Start of HADB node %d did not complete: %s.

Description:

The resource was unable to successfully run the hadbm start command either because it was unable to execute the program, or the hadbm command received a signal.

Solution:

This might be the result of a lack of system resources. Check whether the system is low in memory and take appropriate action.


955269 Mountpoint %s in zone %s falls outside its zone root.

Description:

HAStoragePlus detected that the local zone mount point falls outside of the zone root. HAStoragePlus avoids such mount points for security reasons.

Solution:

Change the specified local zone mount point so that it falls inside the zone root.
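
To check a candidate mount point before you change the resource, a purely lexical test along the following lines can help. This is a sketch under stated assumptions, not the HAStoragePlus implementation: the function name is hypothetical, the mount point is assumed to be interpreted relative to the zone root, and symbolic links are not resolved.

```python
import posixpath


def falls_inside_zone_root(zone_root, mount_point):
    """Return True if mount_point, taken relative to zone_root, stays inside it.

    Assumption: only ".." components are considered; symlinks are ignored.
    """
    root = posixpath.normpath(zone_root)
    # Join the mount point onto the zone root, then collapse "." and "..".
    target = posixpath.normpath(posixpath.join(root, mount_point.lstrip("/")))
    return target == root or target.startswith(root.rstrip("/") + "/")
```

With a zone root of /zones/z1/root, the mount point /data/app stays inside the zone root, whereas /../../../etc resolves to /etc and is rejected.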


955366 scvxvmlg error - stat(%s) failed with errno %d

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


955441 The file %s which the derby data service needs to execute is not executable

Description:

A program or script of the derby data service could not execute because a file is not executable. This should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


955895 Function: validate - ServiceStartCommand (%s) not a fully qualified path.

Description:

The command specified for the variable ServiceStartCommand within the /opt/SUNWsczone/sczsh/util/sczsh_config configuration file does not contain the fully qualified path to the command.

Solution:

Make sure the fully qualified path is specified for the ServiceStartCommand, e.g. "/full/path/to/mycommand" rather than just "mycommand". This fully qualified path must be accessible within the zone in which the command is called.
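
A value can be checked for this condition before it is written to the configuration file. The Python sketch below is illustrative only: the function name is hypothetical, and the assumption that the value may carry arguments after the program name is not taken from the validate method itself.

```python
import posixpath


def is_fully_qualified(command):
    """Return True when the program named in command is an absolute path.

    Assumption: the first whitespace-separated word is the program and
    anything after it is an argument.
    """
    parts = command.split()
    return bool(parts) and posixpath.isabs(parts[0])
```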


955930 Attempt to connect from addr %s port %d.

Description:

There was a connection from the named IP address and port number (> 1024), which means that a non-privileged process is trying to talk to the PNM daemon.

Solution:

This message is informational; no user action is needed. However, it is a good idea to determine which non-privileged process is trying to talk to the PNM daemon and why.


956110 Restart operation failed: %s.

Description:

This message indicates that the RGM did not process a restart request, most likely because of the configuration settings.

Solution:

This is an informational message.


956324 Failed to do dual-partition upgrade tasks on second partition

Description:

During a dual-partition upgrade in a live upgrade scenario, some dual-partition upgrade related tasks have failed. Upgrade process has stopped and cannot proceed.

Solution:

Cluster upgrade has failed. Reboot all the nodes out of cluster mode and recover from upgrade. Finish the cluster upgrade by using the standard upgrade method.


956501 Issuing a failover request.

Description:

This message indicates that the fault monitor is about to make a failover request to the RGM. If the request fails, refer to the syslog messages that appear after this message.

Solution:

This is an informational message; no user action is required.


957086 Prog <%s> failed to execute step <%s> - error=<%d>

Description:

ucmmd failed to execute a step.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified and if it recurs. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance.


957314 Cannot import '%s' : pool may be in use from other system.

Description:

Self explanatory.

Solution:

Check the cluster node that has imported the specified pool. Try to export the zpool again by issuing "zpool export -f <poolname>". If the problem persists, contact your authorized Sun service provider.


957535 Smooth_shutdown flag is not set to TRUE. The WebLogic Server will be shutdown using sigkill.

Description:

This is an informational message. The Smooth_shutdown flag is not set to TRUE, so the WLS will be stopped by using SIGKILL.

Solution:

None. To enable smooth shutdown, set the Smooth_shutdown extension property to TRUE. Smooth shutdown also requires that the username and password that are passed to "java weblogic.Admin .." be set in the start script. Refer to your Admin guide for details.


958395 Validate - qconf file does not exist or is not executable at %s/qconf

Description:

The binary file qconf cannot be found, or is not executable.

Solution:

Confirm that the binary file ${SGE_ROOT}/bin/<arch>/qconf exists in that location and is executable.
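
The two conditions named in this message can be checked with a short Python sketch. The function name and the bin_dir parameter are hypothetical; pass the directory that the message reports before /qconf.

```python
import os


def qconf_problem(bin_dir):
    """Return None if <bin_dir>/qconf is an executable file, else a reason."""
    path = os.path.join(bin_dir, "qconf")
    if not os.path.isfile(path):
        return "does not exist"
    if not os.access(path, os.X_OK):
        return "is not executable"
    return None
```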


958832 INTERNAL ERROR: monitoring is enabled, but MONITOR_STOP method is not registered for resource <%s>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, and the output of clresourcetype show -v, clresourcegroup show -v +, and clresourcegroup status +. Report the problem to your authorized Sun service provider.


958888 clcomm: Failed to allocate simple xdoor client %d

Description:

The system could not allocate a simple xdoor client. This can happen when the xdoor number is already in use. This message is only possible on debug systems.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


959261 Error parsing node status line (%s).

Description:

The resource had an error while parsing the specified line of output from the hadbm status --nodes command.

Solution:

Examine other syslog messages occurring around the same time on the same node, to see if the source of the problem can be identified. Otherwise contact your authorized Sun service provider to determine whether a workaround or patch is available.


959384 Possible syntax error in hosts entry in %s.

Description:

The validation callback method has failed to validate the hostname list. There may be a syntax error in the nsswitch.conf file.

Solution:

Check for the following syntax rules in the nsswitch.conf file. 1) Check if the lookup order for "hosts" has "files". 2) "cluster" is the only entry that can come before "files". 3) Everything in between '[' and ']' is ignored. 4) It is illegal to have any leading whitespace character at the beginning of the line; these lines are skipped. Correct the syntax in the nsswitch.conf file and try again.
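
The four rules above can be expressed as a small Python check. This is a sketch of the rules as stated, not the validation callback itself: the function name is hypothetical, and stripping a trailing "#" comment is an assumption.

```python
import re


def hosts_entry_problem(text):
    """Return None if the hosts entry in nsswitch.conf text follows the
    rules, otherwise a short description of the problem."""
    for line in text.splitlines():
        # Rule 4: lines that begin with whitespace are skipped.
        if line[:1].isspace():
            continue
        if not line.startswith("hosts:"):
            continue
        spec = line[len("hosts:"):].split("#", 1)[0]  # assumption: drop comments
        # Rule 3: everything between '[' and ']' is ignored.
        spec = re.sub(r"\[[^\]]*\]", " ", spec)
        sources = spec.split()
        # Rule 1: the lookup order must include "files".
        if "files" not in sources:
            return '"files" is missing from the hosts lookup order'
        # Rule 2: only "cluster" may come before "files".
        extra = [s for s in sources[:sources.index("files")] if s != "cluster"]
        if extra:
            return "entries before files: " + ", ".join(extra)
        return None
    return "no usable hosts entry found"
```

For example, "hosts: cluster files dns" passes, while "hosts: dns files" is reported because dns precedes files.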


959610 Property %s should have only one value.

Description:

A multi-valued (comma-separated) list was provided to the scrgadm command for the property, while the implementation supports only one value for this property.

Solution:

Specify a single value for the property on the scrgadm command.


959930 Desired_primaries must equal the number of nodes in Nodelist.

Description:

The resource group properties desired_primaries and maximum_primaries must be equal and they must equal the number of Sun Cluster nodes in the nodelist of the resource group.

Solution:

Set the desired and maximum primaries to be equal and to the number of Sun Cluster nodes in the nodelist.
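
The constraint can be stated compactly in Python. The function and parameter names below are hypothetical and the sketch only restates the rule from the description; it does not query the cluster.

```python
def primaries_problem(desired_primaries, maximum_primaries, nodelist):
    """Return None when both values equal the number of nodes in nodelist,
    otherwise a short description of the mismatch."""
    expected = len(nodelist)
    if desired_primaries != maximum_primaries:
        return "desired and maximum primaries differ"
    if desired_primaries != expected:
        return "primaries must equal %d" % expected
    return None
```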


960228 HADB node %d for host %s does not match any Sun Cluster interconnect hostname.

Description:

The HADB database must be created using the hostnames of the Sun Cluster private interconnect. The hostname for the specified HADB node is not a private cluster hostname.

Solution:

Recreate the HADB and specify cluster private interconnect hostnames.


960344 ERROR: process_resource: resource <%s> is pending_init but no INIT method is registered

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, and the output of clresourcetype show -v, clresourcegroup show -v +, and clresourcegroup status +. Report the problem to your authorized Sun service provider.


960359 Diskgroup (%s) is available

Description:

The disk group is available. This message is an informational message.

Solution:

No user action is required.


960862 (%s) sigaction failed: %s (UNIX errno %d)

Description:

The udlm has failed to initialize signal handlers by a call to sigaction(2). The error message indicates the reason for the failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


960932 Switchover (%s) error: failed to fsck disk

Description:

The file system specified in the message could not be hosted on the node the message came from because an fsck on the file system revealed errors.

Solution:

Unmount the PXFS file system (if mounted), fsck the device, and then mount the PXFS file system again.


961070 SCSLM pool_set_binding <%s> zone <%d> error PO_FAIL

Description:

Should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


961669 Error in processing and creating fmri

Description:

Cannot process the smf fmri under the resource by sc delegated restarter

Solution:

The messages that were displayed before this one should help identify the problem. Contact your Sun vendor for more information.


962775 Node %u attempting to join cluster has incompatible cluster software.

Description:

A node is attempting to join the cluster but it is either using an incompatible software version or is booted in a different mode (32-bit vs. 64-bit).

Solution:

Ensure that all nodes have the same clustering software installed and are booted in the same mode.


963000 Devfsadm successfully configured did

Description:

Devfsadm command successfully configured did devices.

Solution:

This is an informational message; no user action is needed.


963224 Unable to determine whether the node can safely join the cluster. Giving up after %s retries. The ucmmd will not be started on this node.

Description:

After retrying the operation, the RAC framework cannot determine whether the node can safely join the cluster. The node is not allowed to join the cluster now.

Solution:

Wait for all reconfiguration activity to stop on existing cluster members. Then repeat the attempt to enable the node to join the cluster. If the node is still unable to join, contact your Sun service representative for assistance in diagnosing and correcting the problem.


963309 Monitor for sckrb5 successfully stopped. PMF will restart it.

Description:

The HA-KDC's update method was able to stop the monitor.

Solution:

PMF will restart the monitor, so no intervention is required.


963375 gethostbyname failed for (%s)

Description:

Failed to get information about a host. The "gethostbyname" man page describes possible reasons.

Solution:

Make sure entries in /etc/hosts, /etc/nsswitch.conf and /etc/netconfig are correct to get information about this host.
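
After you correct those files, you can confirm that the name now resolves with a small lookup test. The sketch below uses Python's standard socket.gethostbyname call; the wrapper name resolve is hypothetical.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return None
```

A return value of None for the host named in the message suggests that the name service configuration still needs attention.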


963465 fatal: rpc_control() failed to set automatic MT mode; aborting node

Description:

The rgmd failed in a call to rpc_control(3N). This error should never occur. If it did, it would cause the failure of subsequent invocations of scha_cmds(1HA) and scha_calls(3HA). This would most likely lead to resource method failures and prevent RGM reconfigurations from occurring. The rgmd will produce a core file and will force the node to halt or reboot.

Solution:

Examine other syslog messages occurring at about the same time to see if the source of the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing the problem. Reboot the node to restart the clustering daemons.


963755 lkcm_cfg: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


964072 Unable to resolve %s.

Description:

The data service has failed to resolve the host information.

Solution:

If the logical host and shared address entries are specified in the /etc/inet/hosts file, check that these entries are correct. If this is not the reason, then check the health of the name server. For more error information, check the syslog messages.


964083 t_open (open_cmd_port) failed

Description:

Call to t_open() failed. The "t_open" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


964399 udlm seq no (%d) does not match library's (%d).

Description:

Mismatch in sequence numbers between udlm and the library code is causing an abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


964521 Failed to retrieve the resource handle: %s.

Description:

An API operation on the resource has failed.

Solution:

For the resource name, check the syslog tag. For more details, check the syslog messages from other components. If the error persists, reboot the node.


964829 Failed to add events to client

Description:

The cl_apid experienced an internal error that prevented proper updates to a CRNP client.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


965174 Function: validate - %s configured with autoboot true, it needs to be false

Description:

The referenced zone is configured with autoboot = true. This option needs to be set to false.

Solution:

Set the autoboot property of the configured zone to false. You need to run the zonecfg command to complete this task.


965261 HTTP GET probe used entire timeout of %d seconds during connect operation and exceeded the timeout by %d seconds. Attempting disconnect with timeout %d

Description:

The probe used its entire timeout to connect to the HTTP port.

Solution:

Check that the web server is functioning correctly and whether the resource's probe timeout is set too low.


965278 The probe has requested an immediate failover. Attempting to failover this resource group subject to other property settings.

Description:

An immediate failover will be performed.

Solution:

This is an informational message; no user action is needed.


965554 scnetapp fatal error - CCR references are not NULL

Description:

The program responsible for retrieving NetApp NAS configuration information from the CCR has suffered an internal error. Continued errors of this type may lead to a compromise in data integrity.

Solution:

Contact your authorized Sun service provider as soon as possible to determine whether a workaround or patch is available.


965722 Failed to retrieve the resource group Failback property: %s.

Description:

HA Storage Plus was not able to retrieve the resource group Failback property from the CCR.

Solution:

Check the cluster configuration. If the problem persists, contact your authorized Sun service provider.


965873 CMM: Node %s (nodeid = %d) with votecount = %d added.

Description:

The specified node with the specified votecount has been added to the cluster.

Solution:

This is an informational message; no user action is needed.


966112 UNRECOVERABLE ERROR: Sun Cluster boot: /usr/cluster/lib/sc/failfastd not found

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


966335 SAP enqueue server is not running. See %s/ensmon%s.out.%s for output.

Description:

The SAP enqueue server is down. See the output file listed in the error message for details. The data service will fail over the SAP enqueue server.

Solution:

No user action is needed.


966670 did discovered faulty path, ignoring: %s

Description:

scdidadm has discovered a suspect logical path under /dev/rdsk. It will not add it to subpaths for a given instance.

Solution:

Check to see that the symbolic links under /dev/rdsk are correct.


966682 WARNING: Share path %s may be on a root file system or any file system that does not have an /etc/vfstab entry.

Description:

The path indicated is not on a non-root mounted file system. This could be damaging because, upon failover, NFS clients will start seeing ESTALE errors. However, there is a possibility that this share path is legitimate. The best example is a share path on a root file system that is a symbolic link to a path on a mounted file system.

Solution:

This is a warning. However, the administrator must make sure that share paths on all the primary nodes access the same file system. This validation is useful when there are no HAStorage/HAStoragePlus resources in the HA-NFS RG.


966842 in libsecurity unknown security flag %d

Description:

This is an internal error which shouldn't happen. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


967050 Validation failed. Listener binaries not found ORACLE_HOME=%s

Description:

Oracle listener binaries not found under ORACLE_HOME. ORACLE_HOME specified for the resource is indicated in the message. HA-Oracle will not be able to manage Oracle listener if ORACLE_HOME is incorrect.

Solution:

Specify the correct ORACLE_HOME when creating the resource. If the resource is already created, update the resource property 'ORACLE_HOME'.


967080 This resource depends on a HAStoragePlus resouce that is not online on this node. Ignoring validation errors.

Description:

The resource depends on a HAStoragePlus resource. Some of the files required for validation checks are not accessible from this node, because the HAStoragePlus resource is not online on this node. Validations will be performed on the node that has the HAStoragePlus resource online. Validation errors are being ignored on this node by this callback method.

Solution:

Check the validation errors logged in the syslog messages. Please verify that these errors are not configuration errors.


967139 IO to file-system %s, through mountpoint %s failed due to UCMM being not in good state.

Description:

The UCMM is hanging and is not responding.

Solution:

Determine whether another problem with the cluster caused the UCMM to hang. Bring the UCMM online again and restart the ScalMountPoint resource.


967417 validate_options - Fatal: SGE_CELL %s/%s not a directory

Description:

The ${SGE_ROOT}/${SGE_CELL} combination variable produces a value whose location does not exist.

Solution:

The SGE_CELL value is decided when the Sun Grid Engine data service is installed. The default value is 'default' and can be configured in /opt/SUNWscsge/util/sge_config. Run sge_remove and then sge_register after verifying the value in /opt/SUNWscsge/util/sge_config is correct.


967645 Starting the JSAS Domian Admin Server under pmf

Description:

This is just an info message indicating that the Sun Cluster PMF is now starting the Domain Admin Server.

Solution:

None


967970 Modification of resource <%s> failed because none of the nodes on which VALIDATE would have run are currently up

Description:

In order to change the properties of a resource whose type has a registered VALIDATE method, the rgmd must be able to run VALIDATE on at least one node. However, all of the candidate nodes are down. "Candidate nodes" are either members of the resource group's Nodelist or members of the resource type's Installed_nodes list, depending on the setting of the resource's Init_nodes property.

Solution:

Boot one of the resource group's potential masters and retry the resource change operation.
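
The candidate-node rule in the description can be sketched as follows. This is an illustrative model only, not rgmd code: the function name, its arguments, and the node names are hypothetical, and the two Init_nodes values shown (RG_PRIMARIES and RT_INSTALLED_NODES) are assumed to be the ones in use on your release.

```python
def validate_candidates(init_nodes, rg_nodelist, rt_installed_nodes, nodes_up):
    """Return the candidate nodes that are currently up (hypothetical helper)."""
    if init_nodes == "RG_PRIMARIES":
        candidates = rg_nodelist          # members of the resource group's Nodelist
    elif init_nodes == "RT_INSTALLED_NODES":
        candidates = rt_installed_nodes   # members of the type's Installed_nodes
    else:
        raise ValueError(f"unexpected Init_nodes value: {init_nodes!r}")
    return [node for node in candidates if node in nodes_up]


# Both potential masters are down, so VALIDATE has nowhere to run.
print(validate_candidates("RG_PRIMARIES", ["phys-1", "phys-2"],
                          ["phys-1", "phys-2", "phys-3"], {"phys-3"}))  # prints []
```

An empty result corresponds to the condition that this message reports.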


968427 Applied dual-partition upgrade changes successfully

Description:

Changes required for Dual Partition Software Swap were done successfully on this node.

Solution:

This is an informational message; no user action is needed.


968557 Could not unplumb any ipaddresses.

Description:

Failed to unplumb any IP addresses. The resource cannot be brought offline. The node will be rebooted by Sun Cluster.

Solution:

Check the syslog messages from other components for a possible root cause. Save a copy of /var/adm/messages and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


968853 scha_resource_get error (%d) when reading system property %s

Description:

An error occurred in the API call scha_resource_get.

Solution:

Check syslog messages for errors logged from other system modules. Stop and start the fault monitor. If the error persists, disable the fault monitor and report the problem.


968858 SCSLM <%s> pool_conf_commit static error <%s>

Description:

Should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


968941 Failed to bind delegated restarter handle: %s

Description:

The sc delegated restarter is not able to bind the handle with the SMF.

Solution:

Check whether the SMF default restarter "svc:/system/svc/restarter:default" is online. Check whether other SMF services are functioning properly. Contact your Sun vendor for more help.


969008 t_alloc (open_cmd_port-T_ADDR) %d

Description:

Call to t_alloc() failed. The "t_alloc" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


969264 Unable to get CMM control.

Description:

The cl_eventd was unable to obtain a list of cluster nodes from the CMM. It will exit.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


969578 Internal error: trying to start nametag %s that is already running

Description:

The rpc.pmfd encountered an internal logic error. There should be no impact on the monitored processes.

Solution:

Search for other syslog error messages on the same node. Save a copy of the /var/adm/messages files on all nodes, and report the problem to your authorized Sun service provider.


969827 Failover attempt has failed.

Description:

LogicalHostname resource was unable to register with IPMP for status updates.

Solution:

This failure is most likely the result of a lack of system resources. Check for memory availability on the node. Reboot the node if the problem persists.


970018 Probe for %s returned error.

Description:

Probe for the specified service returned error.

Solution:

Need a user action for this message.


970364 Command %s returned with non-zero exit status %d, HA-SAP will continue to start SAP.

Description:

The command exited normally but the exit status was set to a non-zero value.

Solution:

Contact your authorized SAP service provider.


970567 CMM: Quorum server %s cannot be reached. Check if qshost %s is specified in /etc/hosts file, /etc/inet/ipnodes file, or both.

Description:

The quorum server hostname could not be resolved to an address.

Solution:

Add the quorum server hostname to the /etc/hosts file, /etc/inet/ipnodes file, or both. Verify that the settings in the /etc/nsswitch.conf file include "files" for host lookup.
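
The two checks in this solution can be approximated with a short script. This is a minimal sketch, assuming the usual nsswitch.conf layout of one "hosts:" line followed by whitespace-separated sources; the helper names and the sample text are illustrative and not part of Sun Cluster.

```python
import socket


def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            # Skip action items such as [NOTFOUND=return].
            return [tok for tok in line[len("hosts:"):].split()
                    if not tok.startswith("[")]
    return []


def resolves(hostname):
    """True if the local resolver can map hostname to an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


sample = "passwd: files\nhosts: files dns   # local files first\n"
print("files" in hosts_sources(sample))  # prints True
```

On a cluster node, you could pass the contents of /etc/nsswitch.conf to hosts_sources() and the quorum server hostname to resolves().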


970705 scnetapp fatal error - Cannot bind to nameserver

Description:

The program responsible for retrieving NetApp NAS configuration information from the CCR has suffered an internal error. Continued errors of this type may lead to a compromise in data integrity.

Solution:

Contact your authorized Sun service provider as soon as possible to determine whether a workaround or patch is available.


971233 Property %s is not set.

Description:

The property has not been set by the user and must be.

Solution:

Reissue the clresource command with the required property and value.


972479 ucmm is undergoing reconfiguration.

Description:

The UCMM is being reconfigured.

Solution:

No user action is required.


972580 CCR: Highest epoch is < 0, highest_epoch = %d.

Description:

The epoch indicates the number of times a cluster has come up. It should not be less than 0. It could happen due to corruption in the cluster repository.

Solution:

Boot the cluster in -x mode to restore the cluster repository on all the members of the cluster from backup. The cluster repository is located at /etc/cluster/ccr/.


972610 fork: %s

Description:

The rgmd or rpc.pmfd daemon was not able to fork a process, possibly due to low swap space. The message contains the system error. This can happen while the daemon is starting up (during the node boot process), or when executing a client call. If it happens when starting up, the daemon does not come up. If it happens during a client call, the server does not perform the action requested by the client.

Solution:

Determine if the machine is running out of swap space. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


972716 Failed to stop the application with SIGKILL. Returning with failure from stop method.

Description:

The stop method failed to stop the application with SIGKILL.

Solution:

Use pmfadm(1M) with the -L option to retrieve all the tags that are running on the server. Identify the tag name for the application in this resource. This can be easily identified as the tag ends in the string ".svc" and contains the resource group name and the resource name. Then use pmfadm(1M) with the -s option to stop the application. If the error still persists, then reboot the node.
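
Picking the application tag out of a list of tags can be sketched as a simple filter. The tag strings below are hypothetical; the real output of pmfadm -L may be laid out differently, so treat this only as an illustration of the rule stated above (the tag ends in ".svc" and contains both the resource group name and the resource name).

```python
def candidate_tags(tags, resource_group, resource):
    """Pick PMF tags that look like the application tag for a resource."""
    return [tag for tag in tags
            if tag.endswith(".svc")
            and resource_group in tag
            and resource in tag]


# Hypothetical tag names, for illustration only.
tags = ["web-rg,apache-rs,0.svc", "web-rg,apache-rs,0.mon", "db-rg,ora-rs,0.svc"]
print(candidate_tags(tags, "web-rg", "apache-rs"))  # prints ['web-rg,apache-rs,0.svc']
```

Because the match is by substring, names that are prefixes of other names can produce extra candidates; confirm the tag by eye before stopping it with pmfadm -s.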


973102 ucmm in stable state.

Description:

The UCMM is in stable state. This message is an informational message.

Solution:

No user action is required.


973243 Validation failed. USERNAME missing in CONNECT_STRING

Description:

USERNAME is missing in the specified CONNECT_STRING. The format could be either 'username/password' or '/' (if operating system authentication is used).

Solution:

Specify CONNECT_STRING in the specified format.
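
The two accepted formats can be checked with a few lines of code. This is a sketch of the rule as described above, not the data service's own validation; in particular, rejecting an empty password is an assumption, and the function name is hypothetical.

```python
def connect_string_ok(connect_string):
    """Accept '/' or 'username/password' with a non-empty username."""
    if connect_string == "/":
        return True  # operating system authentication
    username, sep, password = connect_string.partition("/")
    # Assumption: both parts must be present and non-empty.
    return bool(username) and sep == "/" and bool(password)


for value in ("/", "scott/tiger", "/tiger", "scott"):
    print(repr(value), connect_string_ok(value))
```

The third value, '/tiger', is the case this message reports: a password is present but USERNAME is missing.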


973299 Error in scha_res_get of retry intvl prop,res:%s, Err: %s

Description:

Cannot read the resource retry interval property in the sc delegated restarter.

Solution:

Take action depending on the error and contact your Sun vendor for more help.


973615 Node %s: weight %d

Description:

The load balancer set the specified weight for the specified node.

Solution:

This is an informational message; no user action is needed.


973933 resource %s added.

Description:

This is a notification from the rgmd that the operator has created a new resource. This message can be used by system monitoring tools.

Solution:

This is an informational message; no user action is needed.


974106 lkcm_parm: caller is not registered

Description:

udlm is not registered with ucmm.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


974129 Cannot stat %s: %s.

Description:

The stat(2) system call failed on the specified pathname, which was passed to a libdsdev routine such as scds_timerun or scds_pmf_start. The reason for the failure is stated in the message. The error could be the result of 1) mis-configuring the name of a START or MONITOR_START method or other property, 2) a programming error made by the resource type developer, or 3) a problem with the specified pathname in the file system itself.

Solution:

Ensure that the pathname refers to a regular, executable file.
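
A quick way to test a pathname for both conditions is shown below. This is a minimal sketch using standard stat(2) semantics; the helper name and the example path are illustrative, and the check runs with the permissions of the calling user, which may differ from the user that runs the method.

```python
import os
import stat


def is_regular_executable(path):
    """True if path exists, is a regular file, and is executable by the caller."""
    try:
        mode = os.stat(path).st_mode
    except OSError as err:
        print(f"Cannot stat {path}: {err.strerror}")
        return False
    return stat.S_ISREG(mode) and os.access(path, os.X_OK)


# Hypothetical path, standing in for a START or MONITOR_START method.
print(is_regular_executable("/opt/example/bin/start_method"))
```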


974664 HA: no valid secondary provider in rmm - aborting

Description:

This node joined an existing cluster. Then all of the other nodes in the cluster died before the HA framework components on this node could be properly initialized.

Solution:

This node must be rebooted.


974898 %s->In scf_simple_prop_get:%s, method: %s

Description:

There is no property in the service or there was an error in reading the property.

Solution:

Check the SMF man page to learn more about the error. Also make sure that the basic SMF functionalities are working fine. Contact your Sun vendor for more help.


975654 Failover %s data service must have exactly one value for extension property %s.

Description:

Failover data service must have one and only one value for Confdir_list.

Solution:

Create a failover resource group for each configuration file.


975775 Publish event error: %d

Description:

Internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


976495 fork failed: %s

Description:

Failed to run the "fork" command. The "fork" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


976914 fctl: %s

Description:

The cl_apid received the specified error while attempting to deliver an event to a CRNP client.

Solution:

Examine other syslog messages occurring at about the same time to see if the problem can be identified. Save a copy of the /var/adm/messages files on all nodes and contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


977371 Backup server terminated.

Description:

Graceful shutdown did not succeed. Backup server processes were killed in the STOP method. It is likely that the adaptive server terminated prior to shutdown of the backup server.

Solution:

Check the permissions of the file specified in the STOP_FILE extension property. The file should be executable by the Sybase owner and the root user.


977379 Failed to reboot the nodes of the partition

Description:

The upgrade process reboots the nodes of the second partition in the alternate boot environment. Rebooting the nodes failed.

Solution:

Cluster upgrade has failed. Reboot all the nodes out of cluster mode and recover from upgrade. Finish the cluster upgrade by using the standard upgrade method.


977412 The state of the path to device: %s has changed to FAILED

Description:

A device is seen as FAILED.

Solution:

Check the device.


978081 rebalance: resource group <%s> is quiescing.

Description:

The indicated resource group was quiesced as a result of clresourcegroup quiesce or scswitch -Q being executed on a node.

Solution:

Use clresourcegroup status to determine the state of the resource group. If the resource group is in ERROR_STOP_FAILED state on a node, you must manually kill the resource and its monitor, and clear the error condition, before the resource group can be started. Refer to the procedure for clearing the ERROR_STOP_FAILED condition on a resource group in the Sun Cluster Administration Guide. If the resource group is in ONLINE_FAULTED or ONLINE state then you can switch it offline or restart it. If the resource group is OFFLINE on all nodes then you can switch it online. If the resource group is in a non-quiescent state such as PENDING_OFFLINE or PENDING_ONLINE, this indicates that another event, such as the death of a node, occurred during or immediately after execution of the clresourcegroup quiesce command. In this case, you can re-execute the clresourcegroup quiesce command to quiesce the resource group.


978125 in libsecurity setnetconfig failed when initializing the server: %s - %s

Description:

A server (rpc.pmfd, rpc.fed or rgmd) was not able to start because it could not establish a rpc connection for the network specified. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


978812 Validation failed. CUSTOM_ACTION_FILE: %s does not exist

Description:

The file specified in property 'Custom_action_file' does not exist.

Solution:

Make sure that the 'Custom_action_file' property is set to an existing action file. Reissue the command to create or update the resource.


978829 t_bind, did not bind to desired addr

Description:

Call to t_bind() failed. The "t_bind" man page describes possible error codes. udlm will exit and the node will abort.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


979343 Error: duplicate prog <%s> launched step <%s>

Description:

Due to an internal error, ucmmd has attempted to launch the same step by duplicate programs. ucmmd will reject the second program and treat it as a step failure.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


979803 CMM: Node being shut down.

Description:

This node is being shut down.

Solution:

This is an informational message; no user action is needed.


980226 File %s is missing or not executable.

Description:

The specified file does not exist or does not have its executable bit set.

Solution:

Ensure that the file exists and is executable.


980307 reservation fatal error(%s) - Illegal command

Description:

The device fencing program has suffered an internal error.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available. Copies of /var/adm/messages from all nodes should be provided for diagnosis. It may be possible to retry the failed operation, depending on the nature of the error. If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, it may be possible to reacquire access to shared devices by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. If desired, it might be possible to switch the device group to this node by using the cldevicegroup command. If no other node was available, then the device group will not have been started. You can use the cldevicegroup command to retry the attempt to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. The desired action may be retried.


980425 Aborting startup: could not determine whether failover of NFS resource groups is in progress.

Description:

Startup of an NFS resource was aborted because it was not possible to determine if failover of any NFS resource groups is in progress.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


980477 LogicalHostname online.

Description:

The status of the LogicalHostname resource is online.

Solution:

This is an informational message. No user action is required.


980942 CMM: Cluster doesn't have operational quorum yet; waiting for quorum.

Description:

Not enough nodes are operational to obtain a majority quorum; the cluster is waiting for more nodes before starting.

Solution:

If nodes are booting, wait for them to finish booting and join the cluster. Boot nodes that are down.
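
The majority rule behind this message can be illustrated with simple arithmetic. The sketch below assumes that quorum requires a strict majority of the total configured votes; the vote counts in the example are illustrative and do not describe how votes are assigned in any particular configuration.

```python
def votes_needed(total_possible_votes):
    """Smallest vote count that is a strict majority of the configured total."""
    return total_possible_votes // 2 + 1


def has_quorum(present_votes, total_possible_votes):
    """True if the votes present reach a strict majority."""
    return present_votes >= votes_needed(total_possible_votes)


# Example: 3 possible votes in total, so 2 are needed.
print(votes_needed(3), has_quorum(1, 3), has_quorum(2, 3))  # prints 2 False True
```

With one vote present out of three, the cluster keeps waiting; it can proceed once a second vote is available.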


981211 'ensmon 2' timed out. Enqueue server is running. But status for replica server is unknown. See %s/ensmon%s.out.%s for output.

Description:

SAP utility 'ensmon' with option 2 cannot be completed. However, 'ensmon' with option 1 finished successfully. This can happen if the network of the cluster node where the SAP replica server was running becomes unavailable.

Solution:

No user action is needed.


981931 INTERNAL ERROR: postpone_start_r: meth type <%d>

Description:

A non-fatal internal error has occurred in the rgmd state machine.

Solution:

Since this problem might indicate an internal logic error in the rgmd, save a copy of the /var/adm/messages files on all nodes, and the output of clresourcetype show -v, clresourcegroup show -v +, and clresourcegroup status +. Report the problem to your authorized Sun service provider.


983613 INTERNAL ERROR: error occurred while launching the probe command <%s>

Description:

An internal error has occurred.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


983924 Method timeouts are being suspended for Resource group <%s> until device group switchovers have completed.

Description:

Timeout monitoring for resource methods in the resource group is being suspended while disk device groups are reconfiguring. This prevents unnecessary failovers that might be caused by the temporary unavailability of a device group.

Solution:

This is just an informational message.


984438 File-system has got un-mounted! Please check if it needs maintenance and then re-mount it.

Description:

The I/O probe detected that the file system is not mounted. One possible cause of this error is that the mount-point directory was deleted.

Solution:

Check the condition of the file system and determine whether the mount-point directory exists. If the mount-point directory exists, restart the ScalMountPoint resource.


984634 reservation notice(%s) - MHIOCGRP_INRESV success during retry attempt: %d

Description:

Informational message from reserve on ioctl success during retry.

Solution:

No user action required.


984776 Instance %s already running.

Description:

While attempting to start up the instance, the agent detected that the Oracle instance is already running.

Solution:

None required. Informational message.


985103 scvxvmlg error - readlink(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


985111 lkcm_reg: illegal %s value

Description:

Cluster information that is being used during udlm registration with ucmm is incorrect.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


985158 SCSLM <%s> pool_conf_close() error <%s>

Description:

Should never occur.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


985291 Monitor for the telemetry data service stopped successfully. PMF will restart it

Description:

When the data service properties are updated, the data service restarts the monitor.

Solution:

This message is informational; no user action needed.


985352 Unhandled return code from scds_timerun() for %s

Description:

The data service detected an error from scds_timerun().

Solution:

Informational message. No user action is needed.


985417 %s: Invalid arguments, restarting service.

Description:

The PMF action script supplied by the DSDL while launching the process tree was called with invalid arguments.

Solution:

This is an internal error. Contact your authorized Sun service provider for assistance in diagnosing and correcting the problem.


985746 Failed to retrieve the resource group Nodelist property: %s.

Description:

HA Storage Plus was not able to retrieve the resource group Nodelist property from the CCR.

Solution:

Check the cluster configuration. If the problem persists, contact your authorized Sun service provider.


986150 check_mysql - Sql-command %s returned error (%s)

Description:

The faultmonitor can't execute the specified SQL command.

Solution:

Either MySQL was already down or the fault monitor user does not have the right permissions. The defined fault monitor user should have Process, Select, Reload, and Shutdown privileges and, for MySQL 4.0.x, also Super privileges. Also check the MySQL log files for any other errors.


986157 Validate - can't determine path to Grid Engine binaries

Description:

The Sun Grid Engine binary qstat was not found in either ${binary_path} or ${binary_path}/<arch>. qstat is checked only as a representative binary. If it is not found, the other Sun Grid Engine binaries are presumed to be misplaced as well. ${binary_path} is reported by the command 'qconf -sconf'.

Solution:

Find 'qstat' in the Sun Grid Engine installation. Then update ${binary_path} using the command 'qconf -mconf' if appropriate.
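
The two-location search described above can be reproduced by hand with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and the architecture string in the usage example is only a placeholder for whatever your installation reports.

```python
import os


def find_binary(binary_path, arch, name="qstat"):
    """Return the first executable match in binary_path or binary_path/arch."""
    for directory in (binary_path, os.path.join(binary_path, arch)):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None  # not found in either location


# Hypothetical values; substitute the binary_path shown by 'qconf -sconf'.
print(find_binary("/opt/sge/bin", "sol-sparc64"))
```

A result of None matches the condition that this message reports.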


986190 Entry at position %d in property %s with value %s is not a valid node identifier or node name.

Description:

The value given for the named property has an invalid node specified for it. The position index, which starts at 0 for the first element in the list, indicates which element in the property list was invalid.

Solution:

Specify a valid node for the property.
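
Because the reported position is zero-based, it can help to list a property's entries alongside their positions. The following sketch assumes a simple mapping of node names to node identifiers; that structure, the helper name, and the node names are all hypothetical.

```python
def invalid_entries(values, known_nodes):
    """Return (position, value) pairs for entries that name no known node.

    known_nodes maps node names to node identifiers (hypothetical structure).
    """
    valid = set(known_nodes) | {str(node_id) for node_id in known_nodes.values()}
    return [(pos, val) for pos, val in enumerate(values) if val not in valid]


nodes = {"phys-1": 1, "phys-2": 2}
print(invalid_entries(["phys-1", "2", "phys-9"], nodes))  # prints [(2, 'phys-9')]
```

Here the third element is reported at position 2, which is the value that would appear in the message.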


986197 reservation fatal error(%s) - malloc() error, errno %d

Description:

The device fencing program has been unable to allocate required memory.

Solution:

Memory usage should be monitored on this node and steps taken to provide more available memory if problems persist. Once memory has been made available, the following steps may need to be taken: If the message specifies the 'node_join' transition, then this node may be unable to access shared devices. If the failure occurred during the 'release_shared_scsi2' transition, then a node which was joining the cluster may be unable to access shared devices. In either case, access to shared devices can be reacquired by executing '/usr/cluster/lib/sc/run_reserve -c node_join' on all cluster nodes. If the failure occurred during the 'make_primary' transition, then a device group has failed to start on this node. If another node was available to host the device group, then it should have been started on that node. The device group can be switched back to this node if desired by using the cldevicegroup command. If no other node was available, then the device group will not have been started. Use the cldevicegroup command to start the device group. If the failure occurred during the 'primary_to_secondary' transition, then the shutdown or switchover of a device group has failed. You can retry the desired action.


987357 check_dhcp - tftp transfer test timed out exeeded %s seconds

Description:

The tftp transfer has exceeded its timeout allowance.

Solution:

None required. This is an informational message; an immediate failover is being requested.


987412 Error in creating svc_fmri %s

Description:

Could not create the named SMF service FMRI to manage under the SC delegated restarter.

Solution:

Check the previous messages for the reason.


987543 functions - Fatal: %s/util/arch not found or not executable

Description:

The file '${SGE_ROOT}/util/arch' was not found or is not executable.

Solution:

Make certain the shell script file '${SGE_ROOT}/util/arch' is both in that location, and executable.


987592 Telemetry data service validate method completed successfully

Description:

The telemetry data service could be validated.

Solution:

This message is informational; no user action is needed.


987637 Failed to save old mount points.

Description:

The online update of the HAStoragePlus resource is not successful because of failure in saving the old mount points of file systems that were present before updating the resource.

Solution:

Check the syslog messages and try to resolve the problem. Try again to update the resource. If the problem persists, contact your authorized Sun service provider.


988416 t_sndudata (2) in send_reply: %s

Description:

Call to t_sndudata() failed. The "t_sndudata" man page describes possible error codes.

Solution:

None.


988719 Warning: Unexpected result returned while checking for the existence of scalable service group %s: %d.

Description:

A call to the underlying scalable networking code failed.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


988762 Invalid connection attempted from %s: %s

Description:

The cl_apid received a CRNP client connection attempt from an IP address that it could not accept because of its allow_hosts or deny_hosts.

Solution:

If you wish to allow access to the CRNP service for the client at the specified IP address, modify the allow_hosts or deny_hosts as required. Otherwise, no action is required.


988779 File-system %s mounted on %s is healthy

Description:

An I/O probe of the file system succeeded. The file system is available. This message is an informational message.

Solution:

No user action is required.


988802 UNRECOVERABLE ERROR: Cluster Configuration Repository transformation failed with error code %s

Description:

Cluster Configuration Repository transformation failed with error code ${retval}.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


988885 libpnm error: %s

Description:

This means that there is an error either in libpnm being able to send the command to the PNM daemon or in libpnm receiving a response from the PNM daemon.

Solution:

The user of libpnm should handle these errors. However, if the message is:

network is too slow - it means that libpnm was not able to read data from the network - either the network is congested or the resources on the node are dangerously low.

scha_cluster_open failed - it means that the call to initialize a handle to get cluster information failed. This means that the command will not be sent to the PNM daemon.

scha_cluster_get failed - it means that the call to get cluster information failed. This means that the command will not be sent to the PNM daemon.

can't connect to PNMd on %s - it means that libpnm was not able to connect to the PNM daemon through the private interconnect on the given host. It could be that the given host is down or there could be other related error messages.

wrong version of PNMd - it means that we connected to a PNM daemon which did not give us the correct version number.

no LOGICAL PERNODE IP for %s - it means that the private interconnect LOGICAL PERNODE IP address was not found.

IPMP group %s not found - either an IPMP group name has been changed or all the adapters in the IPMP group have been unplumbed. There would have been an earlier NOTICE which said that a particular IPMP group has been removed. The pnmd has to be restarted. Send a KILL (9) signal to the PNM daemon. Because pnmd is under PMF control, it will be restarted automatically. If the problem persists, restart the node with clnode evacuate and shutdown.

no adapters in IPMP group %s - this means that the given IPMP group does not have any adapters. Please look at the error messages from LogicalHostname/SharedAddress.

no public adapters on this node - this means that this node does not have any public adapters. Please look at the error messages from LogicalHostname/SharedAddress.


989577 (%s) scan of seqnum failed on "%s", ret = %d

Description:

Could not get the sequence number from the udlm message received.

Solution:

This is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


989693 thr_create failed

Description:

Could not create a new thread. The "thr_create" man page describes possible error codes.

Solution:

Some system resource has been exceeded. Install more memory, increase swap space or reduce peak memory consumption.


989846 ERROR: unpack_rg_seq(): rgname_to_rg failed <%s>

Description:

Due to an internal error, the rgmd daemon was unable to find the specified resource group data in memory.

Solution:

Save a copy of the /var/adm/messages files on all nodes. Contact your authorized Sun service provider for assistance in diagnosing the problem.


989958 wrong command length received %d.

Description:

This means that the PNM daemon received a command from libpnm, but all the bytes were not received.

Solution:

This is not a serious error. It could be caused by network problems. If the error persists, send a KILL (9) signal to pnmd. PMF will restart pnmd automatically. If the problem persists, restart the node with clnode evacuate and shutdown.


990215 HA: repl_mgr: exception while invoking RMA reconf object

Description:

An unrecoverable failure occurred in the HA framework.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


990226 WebSphere MQ Listener for port %s stopped

Description:

The specified WebSphere MQ Listener has been stopped.

Solution:

None required. Informational message.


990418 received signal %d

Description:

The daemon indicated in the message tag (rgmd or ucmmd) has received a signal, possibly caused by an operator-initiated kill(1) command. The signal is ignored.

Solution:

The operator must use clnode and shutdown to take down a node, rather than directly killing the daemon.


990711 %s: Could not call Disk Path Monitoring daemon to add path(s)

Description:

scdidadm -r was run and some disk paths may have been added, but DPM daemon on the local node may not have them in its list of paths to be monitored.

Solution:

Kill and restart the daemon on the local node. If the status of the new local disks is not shown by the local DPM daemon, check whether the paths are present in the persistent state maintained by the daemon in the CCR. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991103 Cannot open /proc directory

Description:

The rpc.pmfd server was unable to open the /proc directory to find a list of the current processes.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991108 uaddr2taddr (open_cmd_port) failed

Description:

Call to uaddr2taddr() failed. The "uaddr2taddr" man page describes possible error codes. ucmmd will exit and the node will abort.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991130 pthread_create: %s

Description:

The sc_zonesd daemon was not able to allocate a new thread. This problem can occur if the machine has low memory.

Solution:

Determine if the machine is running out of memory. If this is not the case, save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991800 in libsecurity transport %s is not a loopback transport

Description:

A server (rpc.pmfd, rpc.fed or rgmd) refused an RPC connection from a client because the named transport is not a loopback transport. An error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991864 putenv: %s

Description:

The rpc.pmfd server was not able to change environment variables. The message contains the system error. The server does not perform the action requested by the client, and an error message is output to syslog.

Solution:

Save the /var/adm/messages file. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


991954 clexecd: wait_for_ready daemon

Description:

The clexecd program has encountered a problem with the wait_for_ready thread at initialization time.

Solution:

The clexecd program will exit and the node will be halted or rebooted to prevent data corruption. Contact your authorized Sun service provider to determine whether a workaround or patch is available.


992415 Validate - Couldn't retrieve MySQL-user <%s> from the nameservice

Description:

Couldn't retrieve the defined user from nameservice.

Solution:

Make sure that the right user is defined and that the user exists. Use getent passwd 'username' to verify that the defined user exists.
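
The same name service lookup can be scripted. The following is a minimal sketch, assuming Python is available on the node. The function name user_exists and the user name mysql are illustrative only and are not part of the data service.

```python
import pwd


def user_exists(username):
    """Return True if the passwd database can resolve username."""
    try:
        # pwd.getpwnam consults the same passwd database that
        # "getent passwd <username>" queries.
        pwd.getpwnam(username)
        return True
    except KeyError:
        # getpwnam raises KeyError when no entry is found.
        return False


if __name__ == "__main__":
    # "mysql" is a placeholder; substitute the user defined for the resource.
    print(user_exists("mysql"))
```

A result of False corresponds to getent passwd returning no entry for the user.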


992912 clexecd: thr_sigsetmask returned %d. Exiting.

Description:

The clexecd program has encountered a failed thr_sigsetmask(3THR) system call. The error message indicates the error number for the failure.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


994776 check_broker - sc3inq %s CURDEPTH(%s)

Description:

The WebSphere Broker fault monitor checks to see if the message flow was successful, by inquiring on the current queue depth for the output queue within the simple message flow.

Solution:

No user action is needed. The fault monitor displays the current queue depth until it successfully checks that the simple message flow has worked.


994915 %s: Cannot get transport information.

Description:

The daemon is unable to get the information that it needs about the transport over which it provides RPC service.

Solution:

Need a user action for this message.


994926 Mount of %s failed: (%d) %s.

Description:

HA Storage Plus was not able to mount the specified file system.

Solution:

Check the system configuration. Also verify that FilesystemCheckCommand is not empty. Leaving it empty is not advisable because file system inconsistency may occur.


995026 lkcm_cfg: invalid handle was passed %s %d

Description:

The handle for communication with udlmctl during a call to return the current DLM configuration is invalid.

Solution:

This error is an internal error. Save the contents of /var/adm/messages, /var/cluster/ucmm/ucmm_reconf.log and /var/cluster/ucmm/dlm*/*logs/* from all the nodes and contact your Sun service representative.


995249 You cannot specify zpools in a scalable service resource.

Description:

HAStoragePlus detected an inconsistency in its configuration: the resource group is defined as a scalable service, but HAStoragePlus supports zpools only in a failover service. You cannot specify zpools in this case.

Solution:

Remove the zpools from the service or change the scalable service for the zpool to a failover service.


995339 Restarting using scha_control RESTART

Description:

The fault monitor has detected problems in the RDBMS server. An attempt will be made to restart the RDBMS server on the same node.

Solution:

Determine the cause of the RDBMS failure.


995839 validate_options: HA N1 Grid Service Provisioning System Local Distributor %s Option %s is not set

Description:

A required option is not set in the start, stop, validate or probe command.

Solution:

Fix the appropriate command, for example by re-registering.


996075 fatal: Unable to resolve %s from nameserver

Description:

The low-level cluster machinery has encountered a fatal error. The rgmd will produce a core file and will cause the node to halt or reboot to avoid the possibility of data corruption.

Solution:

Save a copy of the /var/adm/messages files on all nodes, and of the rgmd core file. Contact your authorized Sun service provider for assistance in diagnosing the problem.


996230 scvxvmlg error - mknod(%s) failed

Description:

The program responsible for maintaining the VxVM namespace was unable to access the global device namespace. If configuration changes were recently made to VxVM diskgroups or volumes, this node may be unaware of those changes. Recently created volumes may be inaccessible from this node.

Solution:

Verify that the /global/.devices/node@N (N = this node's node number) is mounted globally and is accessible. If no configuration changes have been recently made to VxVM diskgroups or volumes and all volumes continue to be accessible from this node, then no further action is required. If changes have been made, the device namespace on this node can be updated to reflect those changes by executing '/usr/cluster/lib/dcs/scvxvmlg'. If the problem persists, contact your authorized Sun service provider to determine whether a workaround or patch is available.


996389 Fault-monitor successfully re-started.

Description:

The fault monitor of this ScalMountPoint Resource was restarted successfully. This message is an informational message.

Solution:

No user action is required.


996522 scha_resourcetype_get() failed: %s

Description:

A call to scha_resourcetype_get() failed.

Solution:

Contact your authorized Sun service provider to determine whether a workaround or patch is available.


996887 reservation message(%s) - attempted removal of scsi-3 keys from non-scsi-3 device %s

Description:

The device fencing program has detected scsi-3 registration keys on a device that is not configured for scsi-3 PGR use. The keys have been removed.

Solution:

This is an informational message; no user action is needed.


996902 Stopped the HA-NFS system fault monitor.

Description:

The HA-NFS system fault monitor was stopped successfully.

Solution:

No action required.


996942 Stop of HADB database completed successfully.

Description:

The resource was able to successfully stop the HADB database.

Solution:

This is an informational message; no user action is needed.


997568 modinstall of tcpmod failed

Description:

The Streams module that intercepts private interconnect communication could not be installed.

Solution:

Need a user action for this message.


997689 IP address %s is an IP address in resource %s and in resource %s.

Description:

The same IP address is being used in two resources. This is not a correct configuration.

Solution:

Delete one of the resources that is using the duplicated IP address.
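
To locate such conflicts before registering resources, the addresses configured for each resource can be compared. The following is a minimal sketch, assuming the addresses have already been collected into a dictionary. The function name, the resource names lh-rs-1 and lh-rs-2, and the 192.0.2.x documentation addresses are hypothetical.

```python
from collections import defaultdict


def find_shared_addresses(resource_addresses):
    """Map each IP address used by more than one resource to those resources.

    resource_addresses: dict of resource name -> iterable of IP address strings.
    Addresses are compared as literal strings, so they are assumed to be
    written in one consistent notation.
    """
    users = defaultdict(set)
    for resource, addresses in resource_addresses.items():
        for address in addresses:
            users[address].add(resource)
    # Keep only the addresses that appear in two or more resources.
    return {addr: sorted(names) for addr, names in users.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical resources and addresses, for illustration only.
    config = {
        "lh-rs-1": ["192.0.2.10", "192.0.2.11"],
        "lh-rs-2": ["192.0.2.10"],
    }
    print(find_shared_addresses(config))
```

Any address that the function reports must be removed from all but one of the listed resources.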


998022 Failed to restart the service: %s.

Description:

The attempt to restart the data service has failed.

Solution:

Check the syslog messages that occurred just before this message to determine whether there is an internal error. In case of an internal error, contact your Sun service provider. Otherwise, one of the following situations might have occurred. 1) The Start_timeout and Stop_timeout values are not appropriate. Check these values and adjust them if necessary. 2) The system lacks resources. Check whether the system is low on memory or the process table is full, and take appropriate action.


998351 <Monitor_Uri_List> extension property is not set. Fault Monitor will not do HTTP probing. asadmin list-domains command will be used for probing the health of the server

Description:

This is an informational message. If the Monitor_uri_list extension property is set, the JSAS Domain Admin Server probe will do thorough probing by sending an HTTP request to DAS and then reading the response. If this extension property is not set, the asadmin list-domains command will be used to get the status of DAS.

Solution:

If HTTP probing is desired, create an HTTP listener for the DAS and then set the Monitor_uri_list extension property.


998374 Validate - winbindd %s non-existent executable

Description:

The Samba executable winbindd either doesn't exist or is not executable.

Solution:

Check that the correct pathname for the Samba (s)bin directory was entered when registering the resource, and that the program exists and is executable.
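
The "exists and is executable" condition can be checked with a short script. The following is a minimal sketch, assuming Python is available on the node. The function name and the path shown are placeholders, not values taken from the data service.

```python
import os


def is_executable_file(path):
    """Return True if path is an existing regular file that the caller may execute."""
    # os.path.isfile rejects missing paths and directories;
    # os.access with X_OK checks execute permission for the current user.
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Placeholder path; substitute the Samba (s)bin directory that was
    # entered when the resource was registered.
    print(is_executable_file("/path/to/samba/sbin/winbindd"))
```

Run the check as the same user that starts the resource, because execute permission is evaluated for the calling user.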


998478 start_sge_schedd failed

Description:

The process sge_schedd failed to start for reasons other than it was already running.

Solution:

Check /var/adm/messages for any relevant cluster messages. Respond accordingly, then retry bringing the resource online.


998759 Database is ready for auto recovery but the Auto_recovery property is false.

Description:

All the Sun Cluster nodes that are able to run the HADB resource are running the resource, but the database cannot be started. If the auto_recovery extension property were set to true, the resource would attempt to start the database by running hadbm clear and the command, if any, specified in the auto_recovery_command extension property.

Solution:

The database must be recovered manually. Alternatively, if autorecovery is desired, set the auto_recovery extension property to true and optionally set auto_recovery_command.


998781 start_dhcp - %s %s failed

Description:

The DHCP resource tried to start the DHCP server by using in.dhcpd, but the attempt failed.

Solution:

The DHCP server will be restarted. Examine the other syslog messages occurring at the same time on the same node, to see if the cause of the problem can be identified.


999536 scf_transaction_property_new failed error %d: %s

Description:

An API call failed.

Solution:

Examine log files and syslog messages to determine the cause of the failure. Take corrective action based on any related messages. If the problem persists, report it to your Sun support representative for further assistance.


999960 NFS daemon %s has registered with TCP transport but not with UDP transport. Will restart the daemon.

Description:

While attempting to start the specified NFS daemon, the daemon started up. However it registered with TCP transport before it registered with UDP transport. This indicates that the daemon was unable to register with UDP transport.

Solution:

This is an informational message; no user action is needed. Make sure that the order of entries in /etc/netconfig is not changed on cluster nodes where HA-NFS is running.
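
One way to confirm that the entry order is unchanged is to extract the order of network IDs from /etc/netconfig on each node and compare the results. The following is a minimal sketch, assuming the usual netconfig layout in which each non-comment line starts with the network ID. The function name and the two sample entries are illustrative only.

```python
def netconfig_order(text):
    """Return the network IDs (first field) of a netconfig file, in file order."""
    order = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        order.append(stripped.split()[0])
    return order


if __name__ == "__main__":
    # Illustrative content; on a cluster node, read /etc/netconfig instead.
    sample = (
        "# sample entries\n"
        "udp  tpi_clts      v  inet  udp  /dev/udp  -\n"
        "tcp  tpi_cots_ord  v  inet  tcp  /dev/tcp  -\n"
    )
    print(netconfig_order(sample))
```

If the list differs between a node that was modified and a node with the original file, the order of entries has been changed.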